Front. Oncol., 04 March 2020

Sec. Cancer Epidemiology and Prevention

Volume 10 - 2020

Role of Genetic Ancestry in 1,002 Brazilian Colorectal Cancer Patients From Barretos Cancer Hospital

  • 1Molecular Oncology Research Centre, Barretos Cancer Hospital, Barretos, Brazil
  • 2Department of Medical Oncology, Barretos Cancer Hospital, Barretos, Brazil
  • 3Cancer Registry, Barretos Cancer Hospital, Barretos, Brazil
  • 4Department of Pathology, Barretos Cancer Hospital, Barretos, Brazil
  • 5IPATIMUP (Institute of Molecular Pathology and Immunology of the University of Porto), Porto, Portugal
  • 6i3S (Instituto de Investigação e Inovação em Saúde, Universidade Do Porto), Porto, Portugal
  • 7Nucleous of Epidemiology and Statistics, Barretos Cancer Hospital, Barretos, Brazil
  • 8Endoscopy Department, Barretos Cancer Hospital, Barretos, Brazil
  • 9Life and Health Sciences Research Institute (ICVS), Medical School, University of Minho, Braga, Portugal
  • 10ICVS/3B's-PT Government Associate Laboratory, Guimarães, Portugal

Background: Colorectal cancer (CRC) is the third most frequent and the second deadliest cancer worldwide. The ethnic structure of the population has been gaining prominence as a factor in cancer. The purpose of this study was to determine the genetic ancestry of Brazilian CRC patients. Moreover, we intended to investigate its impact on patients' clinicopathological features.

Methods: Retrospective observational cohort study with 1,002 patients with CRC admitted from 2000 to 2014 at Barretos Cancer Hospital. Following tumor DNA isolation, genetic ancestry was assessed using a specific panel of 46 ancestry informative markers. Survival rates were obtained by the Kaplan–Meier method, and the log-rank test was used to compare the survival curves. Multivariable Cox proportional regression models were used to estimate hazard ratios (HRs).

Results: We observed considerable admixture in the genetic composition, with the following average proportions: European 74.2%, African 12.7%, Asian 6.5%, and Amerindian 6.6%. The multivariate analysis for cancer-specific survival showed that clinical stage, lymphovascular invasion, and the presence of recurrence were associated with an increased relative risk of death from cancer (p < 0.05). High African proportion was associated with younger age at diagnosis, while high Amerindian proportion was associated with the mucinous histological subtype.

Conclusions: This represents the largest assessment of genetic ancestry in a population of Brazilian patients with CRC. Brazilian CRC patients exhibited clinicopathological features similar to those described in Western countries.

Impact: Genetic ancestry components corroborated the significant admixture, and importantly, patients with high African proportion develop cancer at a younger age.


Colorectal cancer (CRC) accounts for 1,849,518 new cases, approximately 10.2% of all neoplasms worldwide (1). CRC is the third most common neoplasm in men and the second in women (1–4). CRC mortality is also high, with 880,792 deaths having been estimated for 2018, corresponding to 9.2% of the total, with higher rates (52%) being observed in less developed regions of the world (1). The incidence of CRC varies more than 10-fold worldwide (1). The highest detection rates are observed in Australia, New Zealand, and countries of Europe and North America, and the lowest are found in the countries of Africa, South America, and Asia (1, 2). In Brazil, according to the National Cancer Institute (INCA), an estimated 17,380 new cases of colon and rectum cancer in men and 18,980 in women were expected for 2018, making it the third most common cancer in men and the second in women (4).

Several reasons account for these discrepancies, including distinct risk factors. Age is the primary risk factor, yet many other factors contribute to CRC development, including a previous history of colorectal neoplasia and/or adenomas or family history of CRC; a diet rich in red meat and saturated fats and poor in fruits and vegetables; obesity and sedentary lifestyle; smoking; diabetes mellitus; CRC-associated syndromes such as familial adenomatous polyposis and hereditary non-polyposis colorectal cancer (HNPCC or Lynch syndrome); and inflammatory diseases of the colon (2, 3, 5, 6). Besides the risk factors mentioned above, patient ethnicity has been reported as a risk and prognostic factor (7–12). The ethnic structure becomes more pressing in the Brazilian population due to its great admixture (13–17). Currently, genetic markers are available that determine the ethnic structure of each individual more accurately than self-declaration or physical traits (14, 18).

Despite the high incidence and mortality rate of CRC in Brazil, few studies have comprehensively described and characterized the main clinicopathological features of Brazilian patients with CRC (19, 20). Therefore, this study aimed to characterize the clinicopathological aspects of CRC patients, to determine their genetic ancestry, and to identify whether the genetic ancestry can influence patients' clinicopathological features and disease outcome.


Study Design and Data Source

We conducted a retrospective observational cohort study enrolling 1,002 patients with CRC admitted from 2000 to 2014 at Barretos Cancer Hospital, Barretos, São Paulo, Brazil. Of the 1,002 cases, 96.3% (965/1,002) were selected from the Department of Low-Digestive and 3.7% (37/1,002) were oncogenetic-based cases, comprising 1.8% (18/1,002) confirmed Lynch syndrome cases, 1.5% (15/1,002) confirmed familial adenomatous polyposis (FAP) syndrome cases, and 0.4% (4/1,002) cases of unclassified hereditary syndrome (21). Clinicopathological and treatment data of CRC patients were collected from patient medical records. The present study evaluated 21 variables. The seventh edition of the American Joint Committee on Cancer (AJCC) staging manual was used for tumor staging. The Institutional Ethics Committee approved the study (protocol number: 600/2012-CAAE: 02468812.30000.5437).

Genetic Ancestry Determination

DNA samples were recovered from formalin-fixed paraffin-embedded (FFPE) tissue of tumor specimens obtained from surgical or endoscopic procedures. The DNA was isolated using the DNA Micro kit (Qiagen), according to the method previously established by our group (22).

The ancestry of the patients was determined using ancestry informative markers (AIMs) as previously reported (14, 23–25). Briefly, 46 small insertion–deletion (INDEL) polymorphisms were ascertained to maximize the divergence between four major human population groups: Amerindian (AME), European (EUR), African (AFR), and East Asian (ASN). These markers were selected due to their high allele frequency divergence between different ancestral or geographically distant populations, including more than 1,000 individuals from 40 reference populations from the Human Genome Diversity Project (HGDP)-Centre d'Etude du Polymorphisme Humain (CEPH), plus individuals from Angola, Portugal, Taiwan, and indigenous Brazilians, which allowed ancestral proportions to be established in highly admixed individuals and populations, like the Brazilian one (14). Moreover, they were assembled in a simple multiplex reaction following a short amplicon strategy, adequate for challenging samples such as FFPE (15, 26, 27). The primer sequences and PCR conditions were according to Giolo et al. (14).

After DNA extraction, and multiplex PCR with 46 primers, the amplified products were further subjected to capillary electrophoresis and fragment analysis on an ABI 3500 Genetic Analyzer (Applied Biosystems) according to the manufacturer's instructions. These 46 INDELs are used mainly to estimate ancestry proportions in admixed populations and assess the structure of those populations. Two observers independently analyzed the electropherograms, and the genotypes were automatically assigned with GeneMapper Software v4.1 (Applied Biosystems).

The ancestry proportions were evaluated using the Structure Software v2.3.4 (23, 24, 28, 29), considering the four main population groups, AME, EUR, AFR, and ASN, as possible contributors to the current Brazilian genetic composition. Briefly, the data available for the HGDP-CEPH panel were used as a reference for the ancestral populations, and a supervised analysis was performed to estimate the ancestry proportions of the individuals involved in the study. The Structure runs, considering K = 4, consisted of 100,000 burn-in steps followed by 100,000 Markov Chain Monte Carlo iterations. The option “Use population information to test for migrants” was used with the admixture model, considering allele frequencies correlated, and updating allele frequencies using only individuals with POPFLAG = 1.
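As an illustration only, the run settings described above might map onto the Structure parameter files roughly as follows. This is a sketch under the assumption that the command-line parameter files (mainparams and extraparams) of Structure 2.3.x were used; the authors' actual files are not given in the text, and any setting not mentioned in the paragraph above is omitted here rather than guessed.

```
// mainparams (assumed file layout; values taken from the text above)
#define MAXPOPS   4        // K = 4 ancestral groups: AME, EUR, AFR, ASN
#define BURNIN    100000   // burn-in steps
#define NUMREPS   100000   // MCMC iterations after burn-in

// extraparams (assumed mapping of the options named in the text)
#define NOADMIX          0   // 0 = admixture model
#define FREQSCORR        1   // allele frequencies correlated
#define USEPOPINFO       1   // use population information
#define PFROMPOPFLAGONLY 1   // update allele frequencies from POPFLAG = 1 individuals only
```

With these settings, the reference individuals (POPFLAG = 1) would anchor the allele frequencies of the four groups, and the study samples would receive estimated membership proportions in each.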

Statistical Analyses

Patient and cancer characteristics were reported as frequencies (number and percentage). First, the continuous variables of genetic ancestry component were summarized as mean [standard deviation (SD)]. For the association of the genetic ancestry component (AFR, EUR, ASN, AME) by AIMs panel with patient and clinical characteristics, the chi-square test or Fisher's exact test was used. For this step, ancestry proportions were further categorically defined as low, intermediate, and high based on tercile distribution (Table 1 and Supplementary Table 1).
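The tercile categorization described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' SPSS procedure: the function name `tercile_labels`, the example proportions, and the handling of values falling exactly on a cut point are all assumptions.

```python
from statistics import quantiles


def tercile_labels(proportions):
    """Label each ancestry proportion as low, intermediate, or high by tercile.

    The cut points come from statistics.quantiles with n=3 (default
    'exclusive' method); values equal to a cut point go to the lower
    category. Both choices are assumptions made for this sketch.
    """
    lower, upper = quantiles(proportions, n=3)  # two cut points -> three groups
    labels = []
    for p in proportions:
        if p <= lower:
            labels.append("low")
        elif p <= upper:
            labels.append("intermediate")
        else:
            labels.append("high")
    return labels


# Hypothetical African ancestry proportions for nine patients
afr = [0.01, 0.02, 0.05, 0.10, 0.12, 0.20, 0.35, 0.40, 0.60]
print(list(zip(afr, tercile_labels(afr))))
```

The resulting categorical variable is what would then be cross-tabulated against each clinicopathological feature for the chi-square or Fisher's exact test.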


Table 1. Ancestry background categorization according to tercile based on percentage proportions for all four ethnic groups.

The overall survival (OS) and the cancer-specific survival (CSS) rates were obtained using the Kaplan–Meier method. Survival times were measured in months. Survival was defined as the period from diagnosis to the date of death or the time at which information was last obtained. For the analysis, the event of interest was death by any cause for OS and death related to cancer for CSS. Cases that were alive were censored for OS, and cases that were alive or dead from other causes were censored for CSS. Such information was obtained through direct consultation of the death certificate or medical records. The median follow-up of our sample was 62.0 months. The log-rank test was used to compare survival curves, and results were considered significant when p < 0.05.
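To make the survival methodology concrete, the following is a minimal pure-Python sketch of the Kaplan–Meier estimator and a two-group log-rank test. It is illustrative only: the study used SPSS, and the function names and toy data here are hypothetical. The p-value uses the chi-square distribution with one degree of freedom.

```python
from math import erfc, sqrt


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times: follow-up in months; events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:  # group ties at time t
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed  # deaths and censored cases leave the risk set
    return curve


def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    obs_minus_exp = 0.0
    var = 0.0
    for t in sorted({u for u, e, _ in pooled if e}):
        n = sum(1 for u, _, _ in pooled if u >= t)              # at risk, both groups
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)  # at risk, group A
        d = sum(e for u, e, _ in pooled if u == t)              # deaths at t
        d_a = sum(e for u, e, g in pooled if u == t and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            var += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if var == 0:
        raise ValueError("log-rank variance is zero; test is undefined")
    chi2 = obs_minus_exp ** 2 / var
    return chi2, erfc(sqrt(chi2 / 2))  # survival function of chi-square, 1 df


# Hypothetical follow-up data (months) for five patients
print(kaplan_meier([6, 12, 12, 20, 30], [1, 1, 0, 1, 0]))
```

In the OS analysis the event flag would mark death from any cause, whereas in the CSS analysis deaths from other causes would be coded as censored (0), matching the definitions given above.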

Multiple confirmatory models were used to check whether the genetic ancestry component (AFR, EUR, ASN, AME) by AIMs panel was related to the prognosis of CRC. Multivariable Cox proportional hazards regression models were used to estimate hazard ratios (HRs) and 95% confidence intervals (CIs) for the variables with p < 0.20 in univariate analyses, adjusted for treatment period and genetic ancestry components by AIMs panel. Fisher's exact test was used for association analysis.
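For readers less familiar with the model, the standard form of the Cox proportional hazards model and the quantities reported from it can be written as below. The notation is generic and is not taken from the paper: $h_0(t)$ denotes the baseline hazard, $x$ the covariate vector (here, clinicopathological variables, treatment period, and ancestry components), and $\hat\beta_j$ the estimated coefficient for covariate $j$.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right),
\qquad
\mathrm{HR}_j = \exp\!\left(\hat\beta_j\right),
\qquad
95\%\ \mathrm{CI}_j = \exp\!\left(\hat\beta_j \pm 1.96\,\mathrm{SE}(\hat\beta_j)\right)
```

An HR above 1 for a covariate corresponds to an increased hazard of death relative to the reference category, holding the other covariates fixed.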

For tabulation and statistical analysis, the IBM® SPSS® Statistics 21.0 software for Windows (IBM Corporation, Route 100, Somers, NY 10589) was used. The level of statistical significance was set at 0.05 for all analyses.


Clinicopathological Features

The present study included 1,002 cases, and the main clinicopathological features are summarized in Table 2. A detailed description of therapeutic regimens is shown in Supplementary Table 2. There were more men than women in the population (51.9%), most patients were between 50 and 75 years old at diagnosis (60.5%), and the majority lived in the South or Southeast regions of Brazil (82%). The distribution of the CRC patients according to the Brazilian state of origin is plotted in Supplementary Figure 1. The left colon was the leading primary tumor site, representing 46% of the cases, and adenocarcinoma was the main histological type, representing 93.5% of the cases. The cases were distributed in all stages, but clinical stages II and III were the most common, representing together 70.8% of the cases.


Table 2. Clinicopathological features of Brazilian colorectal cancer patients (n = 1,002).

Genetic Ancestry

The present study also aimed to evaluate the genetic ancestry of the patients, which was performed in 934/1,002 (93.2%) of the cases. In a small subset of cases (n = 68), the genetic ancestry could not be evaluated due to low quantity and poor-quality DNA. We observed a great admixture in genetic composition, with the following averages of ancestral proportions: AFR 12.7% (SD = 15.7%), EUR 74.2% (SD = 20.6%), ASN 6.5% (SD = 11.3%), and AME 6.6% (SD = 7.1%) (Figure 1). The average of each genetic ancestry component according to the Brazilian state of origin is plotted in Figure 2. The ancestry proportions were further categorically defined as low, intermediate, and high based on tercile distribution (Table 1 and Supplementary Table 1).


Figure 1. Individual ancestry estimates for the Brazilian colorectal cancer patients (n = 934).


Figure 2. Average of each genetic ancestry component according to the Brazilian state of origin. AFR- African; EUR- European; ASN- Asian; AME- Amerindian (Native American).

We further investigated the association of genetic ancestry with patients' clinicopathological characteristics (Table 3). We observed significant associations between the AFR component and younger age at diagnosis (p = 0.013), Brazilian region of origin (p < 0.001), and recurrence of the disease (p = 0.034). For the EUR component, we found significant associations with the region of origin (p < 0.001), adenocarcinoma (p = 0.023), higher histological grade (p = 0.040), and presence of synchronous tumors (p = 0.012). For the AME component, a significant association with the mucinous histological type (p = 0.033) was observed.


Table 3. Association between clinicopathological and genetic ancestry components by AIM-INDEL panel (n = 934).

Survival Analysis

An initial univariate analysis of survival was performed, including 1,002 individuals: 489 events occurred in OS, and 422 events occurred in CSS. The probability of patients living for more than 5 years was 58.2% for OS and 62.3% for CSS (Table 4). Several significant associations were observed between OS and CSS and patients' features, including gender, clinical stage, histological type, histological grade, lymphovascular invasion, perineural invasion, presence of recurrence, treatment period, neoadjuvant chemotherapy, adjuvant chemotherapy, and radiotherapy (Table 3 and Supplementary Figure 2). On the univariate survival analyses (OS and CSS), the genetic ancestry categorically defined as low, intermediate, and high based on terciles was not associated with CRC survival (Table 4).


Table 4. Kaplan-Meier estimates of overall survival and cancer-specific survival of colorectal cancer patients (n = 1,002).

The multivariate analysis for CSS adjusted by treatment period and genetic ancestry components showed that clinical stage, lymphovascular invasion, and the presence of recurrence were associated with an increased relative risk of death from cancer (p < 0.05), whereas adjuvant chemotherapy was associated with a lower risk of death (Table 5). These results are explained by the different therapeutic approaches used in distinct clinical stages (Supplementary Tables 3, 4).


Table 5. Multivariate analysis of cancer-specific survival associated with different clinicopathological characteristics and treatment of patients with colorectal cancer.


CRC is one of the most common neoplasms in men and women worldwide (3, 30, 31). Although its incidence is declining in the US and other western countries (32), in others, including Brazil, we are still witnessing an increase in the number of cases, and it is a major public health problem. In this study, we intended to characterize the genetic ancestry of an extensive series of 1,002 CRC patients admitted at the Barretos Cancer Hospital. Knowing that the Brazilian population is ethnically one of the most heterogeneous in the world (14, 18), with an essential contribution from the main ethnicities that formed the background of our population, we also intended to correlate the genetically measured ancestry components (EUR, AFR, ASN, and AME) with the different clinical–pathological factors and to assess their prognostic role.

There was a slight male predominance, with a male-to-female ratio of 1.08. Despite the similarities between genders, rates are higher for males (male-to-female ratio of 1.3) in the American population (2, 33), as well as in Europe (1) and Asia (34). Other studies report a higher incidence of colon cancer among women (4, 35).

The main studies divide the samples into three age categories: below 50, between 50 and 75, and above 75 years old. The age of 50 years is critical to differentiate between hereditary and sporadic CRC cases. This age limit has been used in the Amsterdam criteria (36, 37) and also to recommend screening colonoscopic examination for people at average risk for CRC (38, 39). Although it has been reported that 21 to 33% of patients are older than 75 years [Surveillance, Epidemiology, and End Results (SEER)] (40), they may account for more than 40% and are underrepresented in clinical studies. These clinical studies use an age of up to 75 years as an inclusion limit for treatment (41–44), mainly due to comorbidities. Therefore, we adopted 75 years or older as the upper age category (45).

The mean age at diagnosis in our population was 57.7 years (SD = 13.8), below the American age of 68 years (31) and the European age of 72 years (45). The predominant age group in our population was between 50 and 75 years old (60.5%), similar to that in the SEER (31) data.

Our population had a high proportion of patients younger than 50 years old (28.9%), higher than the 20% reported in studies including North American populations (31) and Asian patients (3–14%) (46). This finding can be due to the inclusion criteria and to the potential presence of some hereditary cases in the present analysis. In the present study, patients with a known and genetically confirmed familial history of Lynch or APC represented <4% of cases (21); however, we cannot rule out the existence of other hereditary cases in the cohort. Since 1992, the incidence in cases under 50 has increased by 1.5% per year (3, 45), especially from 20 to 34 years. According to the American College of Gastroenterology (39), colorectal cancer screening begins at age 50, except for those of African origin, for whom it is recommended to start at age 45 (47). Moreover, some studies even suggest initiating screening at 40 years old (48). In concordance with these findings, we observed that a higher African proportion in Brazilian CRC patients was associated with a younger age of disease onset.

The importance of primary tumor location, which is associated with distinct clinical–pathological features as well as differential prognosis, has been widely discussed. For this reason, we categorized the included cases into right colon, left colon, and rectum (49–53). In our population, 25% of the tumors were in the right colon. This percentage is within the range of other studies, which varied from 22.7 to 39% (49). However, in contrast to those studies, we did not find that laterality was associated with disease outcome.

Another critical variable is the TNM staging. In our study, the majority of cases were stage II (37.6%), followed by III (33.2%) and IV (16.7%). The percentage of stage IV at diagnosis is in agreement with several regions of the world (31, 45, 54).

Another goal of our study was to evaluate the main prognostic factors in our CRC patients. To this end, we estimated the OS and CSS and correlated them with the different variables collected and selected in the multivariate analysis. The median follow-up of our sample was 62.0 months, very similar to that of SEER, which was 65.2 months (31). In the study of OS and CSS, we investigated whether the variables selected in the multivariate analysis would be influenced by other variables such as the treatment period and the genetic ancestry components. Therefore, a multivariate analysis was performed following adjustment for both variables, namely, ancestry and treatment period (patients treated from 2000 to 2009 and from 2010 to 2014, when molecular targeted drugs, such as cetuximab, were introduced by the Department of Oncology of the Barretos Cancer Hospital).

The multivariate analysis for OS and CSS adjusted by genetic ancestry showed that the clinical stage, lymphovascular invasion, and recurrence of the disease were associated with an increased relative risk of death from cancer. In contrast, adjuvant chemotherapy was associated with a better outcome, as expected.

About 1/3 of our patients had lymphovascular invasion. The association of lymphovascular dissemination and adverse outcomes (55, 56) is well-described, besides being a known definer regarding therapeutics, especially in stage II (3).

The ancestry of individuals is important because of its association with specific pathologies and with immunological and therapeutic responses, yet in the vast majority of studies, it is not evaluated (57, 58). Currently, with the availability of molecular tools for genetic studies, self-declaration and/or family origin can no longer be a proxy/authentication of the ancestral origin of an individual or population, especially in regions with a high degree of population admixture such as Brazil (57). However, in a large number of studies, skin color alone is used to assert the ethnic origin of CRC patients (7–9, 12, 59). There is an extensive number of studies, based on self-declaration, suggesting that patients with black skin color have a higher incidence of CRC and lower survival (7, 12). However, it is unclear whether the ancestral component alone would influence survival or whether there are other confounding factors, such as less referral to screening methods (60), a higher number of diagnosed comorbidities (7), lack of access to treatment services (61), or low educational/economic level, which can justify this fact (11, 60).

There are several ways to analyze the genetic composition of a particular population, and the selection criteria of genetic markers may diverge between studies, originating different values of the ethnic groups (62) for the Brazilian population. In our study, we analyzed four major ethnic compositions (EUR, AFR, ASN, and AME) using AIMs according to previous studies (13, 14, 18, 63). The present study was retrospective, dependent on the collection of information in medical records, and there was no mention of self-declaration of skin color. Therefore, our data collection instrument did not contemplate this aspect. If we had this information, we could have carried out a cross-referencing of information between the genotype and the phenotype to try to evaluate the fidelity of the latter.

When ancestry was measured genetically, we did not evaluate the data individually, but rather the four ancestor components that form the demographic base of Brazil. Our study did not intend to assess the causal relationship between genetic ancestry and CRC cancer as already done in other studies (64), but rather to correlate them with the various clinical–pathological characteristics of the patients.

As expected, the predominant ancestral component was the EUR one, with an average of 74% followed by the AFR with 13%, and by the ASN and AME with 7%, agreeing with previous studies of the Brazilian population (13, 63, 65). In agreement with other studies (62), a predominance of the European ancestral component in the Southeast and South regions was observed (63). Some differences in our study were observed, for example, the African component concentrated more strongly in the north region, unlike other studies based on the mitochondrial DNA (mtDNA) and not on autosomal AIM-INDELs, where this happened more in the Northeast Brazilian region (13, 17, 62, 63). The contribution of Asian ancestry in the northeast region of 5% is very close to that of the regions known to be colonized by Asians, but this may perhaps be explained first by the small sample size representing this region in our study and/or the proximity of the gene pool between Amerindians and eastern Asians considering the modern history of these human groups (14). The high SDs identified in our study show how miscegenated our Brazilian population is.

When we evaluated the individual components separately, we found that the European ancestral component was significantly associated with the absence of synchronous tumors. The African component was associated with younger patients, in agreement with other international studies (7, 8, 10). The Amerindian component predominated in the Northern region which correlates with other studies and is corroborated by the IBGE (Brazilian Institute of Geography and Statistics) self-declaration assessment (13, 18, 63, 66). Interestingly, we observed an association of the Amerindian component with mucinous histological type.

In our study, there was no correlation between the different ancestry proportions and patient survival. However, some North American studies reported an association of African ancestry based on self-declaration with tumors located in the right colon (67) and that these would be associated with more aggressive behavior histopathology, which would lead to worse survival (7, 11, 12).

Finally, despite the exciting and important findings, this study harbors some limitations, such as its retrospective nature, based on the analysis of medical records, which often do not have complete and accurate information. The AIM panel could also arguably be larger. Nonetheless, the employed AIM-INDEL set harbors a sufficient number of markers sparsely distributed throughout the genome and is simply analyzed in a multiplexed short-amplicon strategy, which are desirable characteristics considering the challenging nature of the source tumor samples included in our study. Despite the large number of patients and their diverse geographic origin, the cohort does not represent all Brazilian states or the full ethnic diversity of the Brazilian population, so further studies are warranted to extend our findings.


This pioneering work determined the genetic ancestry profile of more than 1,000 Brazilian patients diagnosed with CRC from a single oncology reference center. We described the main clinicopathological features of the population and observed that patients with a high African proportion develop cancer at a younger age. The present study can contribute to drawing a nationwide portrait of Brazilian CRC patients and may help in the design of management strategies for these patients.

