Whole exome-seq and RNA-seq data reveal unique neoantigen profiles in Kenyan breast cancer patients

Wagutu, Godfrey; Gitau, John; Mwangi, Kennedy; Murithi, Mary; Melly, Elias; Harris, Alexandra R.; Sayed, Shahin; Ambs, Stefan; Makokha, Francis

doi:10.3389/fonc.2024.1444327

ORIGINAL RESEARCH article

Front. Oncol., 11 December 2024

Sec. Cancer Epidemiology and Prevention

Volume 14 - 2024 | https://doi.org/10.3389/fonc.2024.1444327

This article is part of the Research Topic: Accelerating Cancer Genomics Research in Sub-Saharan Africa

Godfrey Wagutu¹

John Gitau¹,²,³

Kennedy Mwangi⁴

Mary Murithi⁵

Elias Melly⁶

Alexandra R. Harris⁷

Shahin Sayed⁸

Stefan Ambs⁷

Francis Makokha¹*

¹Directorate of Research and Innovation, Mount Kenya University, Thika, Kenya
²African Institute for Mathematical Science, Kigali, Rwanda
³Center for Epidemiological Modeling and Analysis, Nairobi, Kenya
⁴International Livestock Research Institute, Nairobi, Kenya
⁵Department of Pre-Clinical, Kabarak University, Nakuru, Kenya
⁶National Cancer Institute, Nairobi, Kenya
⁷Laboratory of Human Carcinogenesis, National Cancer Institute, Bethesda, MD, United States
⁸Aga Khan University Hospital, Nairobi, Kenya

Background: The immune response against tumors relies on distinguishing between self and non-self, the basis of cancer immunotherapy. Neoantigens from somatic mutations are central to many immunotherapeutic strategies and understanding their landscape in breast cancer is crucial for targeted interventions. We aimed to profile neoantigens in Kenyan breast cancer patients using genomic DNA and total RNA from paired tumor and adjacent non-cancerous tissue samples of 23 patients.

Methods: We performed whole-exome sequencing (WES) and RNA sequencing, from which somatic mutations were identified and their expression quantified, respectively. Neoantigen prediction focused on the human leukocyte antigen (HLA) class most relevant to cancer immunity, HLA class I. HLA alleles were predicted from WES data of the adjacent non-cancerous tissue samples, identifying four alleles that were present in at least 50% of the patients. Neoantigens were deemed potentially immunogenic if their predicted median IC50 (half-maximal inhibitory concentration) binding scores were ≤500 nM and they were expressed [transcripts per million (TPM) >1] in tumor samples.

Results: An average of 1465 neoantigens covering 10260 genes had a median IC50 binding score ≤500 nM and expression >1 TPM in the 23 patients, and their presence correlated significantly with the somatic mutations (R² = 0.570, P=0.001). Of 58 genes reported in the catalog of somatic mutations in cancer (COSMIC, v99) to be commonly mutated in breast cancer, 44 (76%) produced ≥2 neoantigens among the 23 patients, with a mean of 10.5 and a range of 2 to 93. For the 44 genes, a total of 477 putative neoantigens were identified, predominantly derived from missense mutations (88%), followed by indels (6%) and frameshift mutations (6%). Notably, 78% of the putative breast cancer neoantigens were patient-specific. The HLA-C*06:01 allele was associated with the majority of neoantigens (194), followed by HLA-A*30:01 (131), HLA-A*02:01 (103), and HLA-B*58:01 (49). The genes of interest that produced putative neoantigens included MUC17, TTN, MUC16, AKAP9, NEB, RP1L1, CDH23, PCDHB10, BRCA2, TP53, TG, and RB1.

Conclusions: The unique neoantigen profiles in our patient group highlight the potential of immunotherapy in personalized breast cancer treatment as well as potential biomarkers for prognosis. The unique mutations producing these neoantigens, compared to other populations, provide an opportunity for validation in a much larger sample cohort.

Introduction

Breast cancer is among the most frequent causes of cancer-related mortality in women. Disease heterogeneity and limited immunogenicity contribute to the lethality of breast cancer (1). Immune evasion, an important hallmark of cancer, adds to the complexity of the cancer burden through induction of immunosuppression (2). Immune checkpoint blockade (CKB) therapy has been developed to target and block immune regulatory molecules (PD-1/PD-L1 and CTLA-4) and in the process reactivate T cell immunity (3). This approach has been reported to improve clinical responses and survival, especially in tumors with high mutational burdens, such as lung cancer and melanoma (4). However, CKB therapy is not successful in all patients and is more effective in tumors with a higher mutational burden (5). Another immunotherapy approach that has been tested in clinical studies is the targeting of tumor-associated antigens (TAAs), which are expressed in tumors at abnormally high levels and are rarely detectable in normal tissues (6). One limitation of this approach is that many TAAs are normal self-antigens and can thus be tolerated by T cells, resulting in a poor immune response (1). Given the lower mutational burden in breast cancer, both CKB and TAA-based immunotherapy have had limited success (7).

Tumor neoantigens are tumor-specific antigens derived from somatic mutations in expressed genes. They can be presented by major histocompatibility complex (MHC) molecules, both class I human leukocyte antigen (HLA-I) molecules on the surface of cancer cells and class II HLA molecules on professional antigen-presenting cells (8). This elicits anti-tumor immune responses that have the potential to eliminate tumor cells with minimal off-target effects (9). Neoantigens arise from various mutation types, including single nucleotide substitutions, insertions and deletions (INDELs), splice-site changes, stop codon gains and silent changes, which can result in translational frameshifts or novel open reading frames (1). As such, neoantigens offer an advantage over TAAs in that they are expressed only by cancer cells and not by normal cells, which enables specific recognition by the immune system (1). Although some neoantigens are shared among patients, most of them are patient-specific and are not subject to immune tolerance mechanisms (10). The specificity of neoantigens could provide an opportunity for future personalized therapy in a cancer with a low tumor mutational burden and high disease heterogeneity, such as breast cancer. Moreover, neoantigens can potentially be used as biomarkers in cancer immunotherapy to assess or predict the response of a patient to treatment (1).

Despite advancements in next-generation sequencing and high-performance computing that have improved cancer immunotherapy research and neoantigen-based treatments, there remains a scarcity of information regarding neoantigens in specific populations from sub-Saharan African countries such as Kenya. This lack of data poses a significant challenge in tailoring immunotherapeutic strategies for breast cancer patients in regions that have a high cancer burden, especially when compounded by germline ancestral factors and a distinct mutational spectrum that may influence tumor biology and immune response. Thus, it is critical to profile the neoantigen burden in this population to contribute to the global collection of breast cancer immunogenic antigens for future drug development. To this end, we sought to profile neoantigens in Kenyan women diagnosed with breast cancer in silico through analysis of whole exome and RNA sequencing data from 23 patients. We characterized the mutation burden for each patient using WES, identified gene expression patterns in tumor tissue, and predicted the putative neoantigens by integrating these datasets.

Materials and methods

Patients and samples

Tumor and adjacent normal tissue pairs were obtained from 23 breast cancer patients at the Aga Khan Hospital, Nairobi, Kenya and AIC Kijabe Hospital, Kijabe, Kenya between 2019 and 2021. Samples were collected through surgical excision, after which tissues were snap frozen in liquid nitrogen and temporarily stored at Aga Khan Hospital. Frozen tissue samples were shipped to the National Cancer Institute, Bethesda, MD, USA, for sequencing. Prior to tissue collection, all patients provided written informed consent and the study was approved by Research and Ethics Committees at Aga Khan University Hospital, Nairobi (Ref: 2018/REC-80) and AIC Kijabe Hospital (KH IERC-02718/2019).

Whole-exome sequencing and RNA-sequencing

Genomic DNA was extracted from the samples using the DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany), following the manufacturer’s instructions. Total RNA was extracted from the frozen tissues using TRIzol reagent (Invitrogen). WES was performed by Psomagen (https://www.psomagen.com/), a Clinical Laboratory Improvement Amendments (CLIA)-certified and College of American Pathologists (CAP)-accredited service provider, at a sequencing depth of 250x for tumor tissues and 150x for adjacent non-cancerous tissues, as previously described by us (11). Total RNA from the 23 sample pairs was processed by an NCI Leidos core facility, where library preparation was performed using the TruSeq Poly A kit (19). Samples were sequenced on a NovaSeq system with 150 bp paired-end reads and a depth of 30 million reads.

Reads mapping and variant calling

For WES, raw reads were quality checked using FastQC (12) and the results summarized using MultiQC (13). The reads were trimmed to remove low-quality sequence and adapter sequences using Trimmomatic (14) and quality-checked again using FastQC and MultiQC. All samples passed QC after trimming, and the reads were aligned to the hg38 human reference genome using BWA-MEM (15); >95% of the reads aligned properly to the genome. The aligned reads were deduplicated and read groups were added to the deduplicated BAM files using Picard. This was followed by base quality score recalibration in GATK (16). Prior to variant calling, a panel of normals (PoN) was built using MuTect2 from the reads of the non-cancerous tissue, in order to exclude artifacts and potential germline mutations in subsequent steps. Somatic variant calling was performed using MuTect2 (16) in paired tumor-normal mode with the panel of normals option. Variants were normalized using the variant tool set (vt; 17), filtered using GATK, and annotated for function and consequence using the Variant Effect Predictor (VEP; 18). Annotated variants were converted to MAF files using vcf2maf (19) and concatenated into a single file. The MAF file was imported into the R package maftools (20) for further processing.

For RNA-seq, quality checks were performed using FastQC and MultiQC, after which the reads were trimmed and quality checked again. All samples passed the quality check and the reads were pseudo-aligned to the hg38 reference genome using the Kallisto aligner (21) with default settings to obtain a count matrix. Alignment statistics showed that more than 50% of the reads mapped uniquely to the genome. The raw counts were normalized into estimated transcripts per million (TPM) and scaled using the average transcript length over samples and the library size by tximport (22).
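As a minimal illustration of what a TPM value expresses, the standard definition can be sketched in Python. This is not the authors' tximport code, and tximport's length-scaled counts involve additional steps; the transcript names, counts, and effective lengths below are hypothetical.

```python
def counts_to_tpm(counts, eff_lengths):
    """Convert estimated counts to TPM using effective transcript lengths.

    counts, eff_lengths: dicts keyed by transcript ID (hypothetical IDs here).
    """
    # Length-normalize each count to get a per-base read rate.
    rates = {tx: counts[tx] / eff_lengths[tx] for tx in counts}
    total = sum(rates.values())
    if total == 0:
        return {tx: 0.0 for tx in counts}
    # Rescale so that the values in one sample sum to one million.
    return {tx: rate / total * 1e6 for tx, rate in rates.items()}


# Hypothetical example values, for illustration only.
example_counts = {"tx_a": 100.0, "tx_b": 300.0, "tx_c": 0.0}
example_lengths = {"tx_a": 1000.0, "tx_b": 1500.0, "tx_c": 500.0}
tpm = counts_to_tpm(example_counts, example_lengths)
```

Because TPM values sum to one million within each sample, a fixed threshold such as TPM >1 refers to the same relative abundance in every sample regardless of sequencing depth.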

Variant expression annotation

VCF files containing the variants were annotated for expression using the vcf-expression-annotator (https://github.com/griffithlab/VAtools) with default settings, except that gene names were used instead of transcripts and the Ensembl ID version was ignored. The tool takes the output of Kallisto and adds the data contained in that file to the INFO column of the VEP-annotated VCF. Each annotated variant thereby receives an expression value (TPM), which is used to determine the level of variant expression during neoantigen filtering.

Neoantigen prediction

Human leukocyte antigen (HLA) class I alleles (HLA-A, -B and -C) were predicted from each patient’s normal-sample exome-seq data using HLA-HD v.1.2.1 (23). Here, the putative HLA reads are aligned to an imputed library of full-length HLA alleles. Neoantigens were then predicted using pVACseq (24) with the MHCflurry, MHCnuggetsI, SMM, and SMMPMBEC algorithms, keeping the default parameters except for turning off the VAF and coverage filters. Neoepitopes that could bind to the patient-specific HLA alleles were predicted using the Immune Epitope Database (IEDB; 25). This involved matching each patient’s HLA type to the existing IEDB list and keeping peptides of 9, 10 and 11 amino acids in length. Predicted epitopes were filtered to retain only those with high affinity (IC50 ≤ 500 nM) that were expressed (transcripts per million, TPM >1) in tumor samples. The bioinformatic analysis workflow is outlined in Figure 1.
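The final filtering step can be sketched as follows. This is a simplified illustration under stated assumptions, not the pVACseq implementation: the record layout, the peptide labels, and all numeric values are hypothetical, and only the two thresholds (median IC50 ≤ 500 nM, TPM >1) come from the text.

```python
from statistics import median

IC50_CUTOFF_NM = 500.0  # median predicted IC50 must be <= this value
TPM_CUTOFF = 1.0        # tumor expression must be strictly greater than this


def is_candidate(ic50_by_algorithm, tumor_tpm):
    """Return True if an epitope passes both the affinity and expression filters."""
    median_ic50 = median(ic50_by_algorithm.values())
    return median_ic50 <= IC50_CUTOFF_NM and tumor_tpm > TPM_CUTOFF


# Hypothetical epitope records: per-algorithm IC50 predictions (nM) and tumor TPM.
epitopes = [
    {"peptide": "PEP_1", "tpm": 5.2,
     "ic50": {"MHCflurry": 120.0, "MHCnuggetsI": 300.0, "SMM": 450.0, "SMMPMBEC": 900.0}},
    {"peptide": "PEP_2", "tpm": 8.0,
     "ic50": {"MHCflurry": 600.0, "MHCnuggetsI": 700.0, "SMM": 200.0, "SMMPMBEC": 800.0}},
    {"peptide": "PEP_3", "tpm": 0.4,
     "ic50": {"MHCflurry": 50.0, "MHCnuggetsI": 60.0, "SMM": 70.0, "SMMPMBEC": 80.0}},
]

retained = [e["peptide"] for e in epitopes if is_candidate(e["ic50"], e["tpm"])]
```

Using the median across algorithms means that a single outlying prediction (such as the 900 nM value for PEP_1) does not by itself exclude an epitope, whereas a strong binder that is not expressed in the tumor (PEP_3) is still removed.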

Figure 1. Workflow for neoantigen prediction from WES and RNA sequencing data. FASTQ files were quality checked, trimmed and aligned to the hg38 genome. Variant calling was performed following GATK best practices, while gene expression was quantified using Kallisto. Variants were annotated and expression data added, after which neoantigen prediction was performed with the pVACseq pipeline.

Sample summary statistics, pairwise Wilcoxon tests for differences in mutation and neoantigen abundance among the breast cancer (BC) subtypes, and visualization of the results were performed in R (26).
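For readers unfamiliar with the pairwise Wilcoxon (rank-sum) test, the idea can be sketched in Python. This is an illustrative re-implementation with an exact permutation p-value, which is one reasonable choice for very small groups; it is not the R code used in the study, and the group values are hypothetical.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(group_a, group_b):
    """Return (U statistic of group_a, exact two-sided permutation p-value)."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    offset = n_a * (n_a + 1) / 2
    u_obs = sum(ranks[:n_a]) - offset
    u_mean = n_a * n_b / 2
    extreme = total = 0
    # Enumerate every way of assigning n_a of the pooled ranks to group A.
    for combo in combinations(ranks, n_a):
        total += 1
        if abs(sum(combo) - offset - u_mean) >= abs(u_obs - u_mean) - 1e-9:
            extreme += 1
    return u_obs, extreme / total


# Hypothetical per-patient counts for two subtypes, for illustration only.
u_stat, p_value = rank_sum_test([8, 9, 10], [1, 2, 3, 4, 5])
```

Exhaustive enumeration is only practical for small groups such as those in a 23-patient cohort; larger groups would normally use a normal approximation instead.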

Results

Patients and sample characteristics

The demographic and clinical characteristics of the 23 breast cancer patients are summarized in Supplementary Table S1. We grouped the tumors into three subtypes based on expression of the hormone receptors (HR) and human epidermal growth factor receptor 2 (HER2) (7): those that were HER2+ regardless of HR status, those that were negative for both HR and HER2 (triple negative breast cancer; TNBC), and those that were HR+ and HER2-. The majority of the samples were HR+/HER2- (52.2%), followed by HER2+ (34.8%) and TNBC (13.0%). Most of the patients had invasive carcinoma (invasive ductal carcinoma, 78.26%; invasive carcinoma, 4.35%). For tumor grade, 65.22% of the patients had grade 3 tumors, while the rest (34.78%) had grade 2 tumors. Clinically, 39.13% of the patients were in stage II, 30.44% in stage III, and 8.7% in stage I (Supplementary Table S1).

Mutation profiles for the 23 patients

Across all genes, the average number of detected mutations in the 23 patients was 2809. Considering the different subtypes, TNBC had the highest average number of mutations at 3202, followed by HR+/HER2- at 2757 and HER2+ at 2740 (Supplementary Figure S1). From the catalog of somatic mutations in cancer (COSMIC, v99), we identified 73 genes reported to be mutated in breast cancer; of those, 62 (84.9%) had at least one mutation in our samples. The mutation characteristics are summarized in Figures 2A–F. In brief, the mutation frequency among the 62 genes ranged from 1 to 55 mutations per individual. The majority of the mutations were of the missense type, most of which were C>T substitutions (Figure 2A). The top 10 mutated genes among the 62 are shown in Figure 3. Four genes (MUC16, MUC17, TTN, RP1L1) were altered in more than 95% of the patients (Figure 3). Moreover, mutations in the gene pairs TP53-ERBB3 and PTEN-CFAP46 significantly co-occurred, while BRCA1 and MUC17 mutations were significantly mutually exclusive (P<0.05) (Figure 4). Furthermore, the single nucleotide substitution types were unevenly represented, with C>T the most frequent (Figure 5A). Transitions occurred more frequently than transversions (Figure 5B), and the proportion of each substitution type varied clearly among the 23 samples (Figure 5C).

Figure 2. Mutational profiles in 23 patients for 73 genes reported to be mutated in breast cancer. (A) abundance of variant classes among the total mutations, (B) variant types, comprising single nucleotide polymorphisms (SNP), insertions (INS) and deletions (DEL), (C) proportion of the different single nucleotide variants (SNV), (D) distribution of variants per sample, with colors representing the variant classes denoted in A, (E) summary of the distribution and numbers of variant classes in all samples, (F) top 10 mutated genes, with colors representing the variant classes and percentages indicating the proportion of samples in which each gene is mutated.

Figure 3. Top 10 genes mutated in >50% of the samples. Each color corresponds to a variant class listed at the bottom of the figure apart from gray, which indicates absence of mutation.

Figure 4. Probability that mutations in any two genes co-occur or are mutually exclusive among the breast cancer genes for the 23 Kenyan patients. The number in parentheses alongside each gene represents the number of missense mutations for that gene in the samples.
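The text does not state which test underlies the co-occurrence and mutual-exclusivity analysis. A common approach for a pair of genes is Fisher's exact test on a 2×2 table of mutated versus unmutated samples, and the sketch below assumes that approach; the counts are hypothetical and do not correspond to any gene pair in the study.

```python
from math import comb


def fisher_exact_two_sided(both, only_a, only_b, neither):
    """Two-sided Fisher's exact p-value for a 2x2 table of sample counts."""
    row1 = both + only_a   # samples with gene A mutated
    col1 = both + only_b   # samples with gene B mutated
    n = both + only_a + only_b + neither
    denom = comb(n, col1)

    def prob(k):
        # Hypergeometric probability of observing k double-mutant samples.
        return comb(row1, k) * comb(n - row1, col1 - k) / denom

    p_obs = prob(both)
    lo = max(0, row1 + col1 - n)
    hi = min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))


def interaction_direction(both, only_a, only_b, neither):
    """Label the pair by comparing observed and expected double-mutant counts."""
    n = both + only_a + only_b + neither
    expected = (both + only_a) * (both + only_b) / n
    return "co-occurrence" if both > expected else "mutual exclusivity"


# Hypothetical counts for 23 samples: the two genes are never mutated together.
p_value = fisher_exact_two_sided(both=0, only_a=6, only_b=12, neither=5)
direction = interaction_direction(both=0, only_a=6, only_b=12, neither=5)
```

With only 23 samples, such tests have limited power, and when many gene pairs are tested a multiple-testing correction would normally be considered.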

Figure 5. (A) Percentage of the various substitution types in all samples, (B) percentage of transversions (Tv; interchange of a purine and a pyrimidine) and transitions (Ti; interchange between two purines or between two pyrimidines) for all samples, (C) percentage of the substitutions in each of the samples, with colors denoting the types in A.
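The transition/transversion classification and the six-class substitution convention (C>A, C>G, C>T, T>A, T>C, T>G) can be sketched as below. This is an illustrative re-implementation rather than the maftools code, and the list of variants is hypothetical.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def classify(ref, alt):
    """Return 'Ti' for a transition and 'Tv' for a transversion."""
    if ref == alt:
        raise ValueError("ref and alt must differ for a substitution")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "Ti" if same_class else "Tv"


def collapse(ref, alt):
    """Express a substitution relative to the pyrimidine (C or T) reference base."""
    if ref in PURINES:
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


# Hypothetical single nucleotide variants as (ref, alt) pairs.
snvs = [("C", "T"), ("G", "A"), ("A", "C"), ("T", "C"), ("C", "G")]
class_counts = Counter(collapse(ref, alt) for ref, alt in snvs)
titv_counts = Counter(classify(ref, alt) for ref, alt in snvs)
```

Collapsing to the pyrimidine strand is why G>A changes are counted together with C>T: both describe the same event on opposite strands.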

Neoantigen burden

In an analysis that included all genes (10260), an average of 1465 neoantigens had a median IC50 binding score ≤500 nM and an expression level >1 TPM in the 23 patients, and their presence correlated significantly with the somatic mutations (R² = 0.570, P=0.001) (Figure 6). Of the 62 COSMIC genes that were mutated in the tumor tissue, 58 produced at least one neoantigen. After filtering for genes that produced at least two neoantigens, 44 genes remained, with a mean of 10.5 neoantigens and a range of 2 to 93. A total of 477 putative neoantigens were identified in these 44 genes across the 23 patients (Figure 7), predominantly derived from missense mutations (88%), followed by indels (6%) and frameshift mutations (6%) (Figure 8). The TNBC subtype produced the most neoantigens, with an average of 25, followed by HR+/HER2- with an average of 20 and HER2+ with an average of 19 (Supplementary Figure S1). Notably, 78% of the putative breast cancer neoantigens were patient-specific (Supplementary Table S2). The HLA-C*06:01 allele was associated with the majority of neoantigens (194), followed by HLA-A*30:01 (131), HLA-A*02:01 (103), and HLA-B*58:01 (49). The genes of interest that produced putative neoantigens include MUC17, TTN, MUC16, AKAP9, NEB, RP1L1, CDH23, PCDHB10, BRCA2, TP53, TG and RB1, among others (Figure 7; Supplementary Table S3).
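One way to compute the share of patient-specific neoantigens is sketched below. It assumes that a neoantigen counts as patient-specific when it is predicted in exactly one patient; the text does not spell out the definition used, and the patient IDs and neoantigen keys are hypothetical.

```python
from collections import defaultdict


def patient_specific_fraction(calls):
    """calls: iterable of (patient_id, neoantigen_key) pairs.

    Returns the fraction of distinct neoantigens seen in exactly one patient.
    """
    patients_by_neoantigen = defaultdict(set)
    for patient_id, key in calls:
        patients_by_neoantigen[key].add(patient_id)
    if not patients_by_neoantigen:
        return 0.0
    private = sum(1 for pts in patients_by_neoantigen.values() if len(pts) == 1)
    return private / len(patients_by_neoantigen)


# Hypothetical calls: one shared neoantigen and two private ones.
calls = [
    ("P01", "GENE1:pepA"), ("P02", "GENE1:pepA"),
    ("P01", "GENE2:pepB"),
    ("P03", "GENE3:pepC"), ("P03", "GENE3:pepC"),  # duplicate call, same patient
]
fraction = patient_specific_fraction(calls)
```

Collecting patients in a set means that a neoantigen predicted for several HLA alleles in the same patient is still counted as belonging to one patient.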

Figure 6. Correlation between tumor mutational burden and neoantigen burden for all the genes in the 23 patients (R² = 0.570, P=0.001). The neoantigens are filtered for high affinity (IC50 ≤ 500nM) and expression (transcripts per million, TPM>1) in tumor samples.
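The text does not say how R² was obtained. For a simple linear relationship between two variables it equals the squared Pearson correlation coefficient, which the following sketch computes; that choice is an assumption, and the paired counts are hypothetical rather than study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-patient mutation and neoantigen counts.
mutations = [1, 2, 3, 4, 5]
neoantigens = [2, 4, 5, 4, 5]
r_squared = pearson_r(mutations, neoantigens) ** 2
```

An R² of 0.570, as reported in the study, would mean that a little over half of the variance in neoantigen burden is explained by a linear dependence on mutational burden.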

Figure 7. Frequency of neoantigens derived from the COSMIC genes that were mutated in the tumor tissue and produced >1 neoantigen in the 23 patients.

Figure 8. Summary of mutation types that produced putative neoantigens for the COSMIC genes that were mutated in the tumor tissue in the 23 Kenyan patients.

Discussion

We analyzed the mutational burden and predicted the neoantigen repertoire in 23 Kenyan breast cancer patients using WES and RNA sequencing data. Among the breast cancer subtypes, the TNBC molecular subtype had the highest mutational and neoantigen burden, although the differences among the subtypes were not significant (Supplementary Figure S1, Supplementary Table S4). This is consistent with other studies (24). The origin of TNBC is not well understood, although it is reported to be heterogeneous in nature, relying on different signaling pathways such as JAK/STAT, PI3K/AKT/mTOR or NOTCH, cell cycle regulators (TP53) and genome integrity genes (BRCA1/2) (1). This makes the disease difficult to manage, because the molecular mechanisms driving it are not clearly understood. Yet the high mutational and neoantigen burden, combined with the patient specificity, may provide an untapped opportunity to design and optimize personalized immunotherapy for this subtype.

In contrast to most populations (Caucasian American, African American, Asian and European), where TP53, PIK3CA and GATA3 are the most mutated genes (11, 27, 28), in our study population three genes (MUC16, MUC17 and TTN) were mutated in over 50% of the samples and produced the highest number of neoantigens. MUC16 has been reported to take part in breast cancer progression and metastasis when overexpressed, due to its influence on cell cycle and survival through the JAK2/STAT3 pathway (29). It has been reported as one of the highly mutated genes in breast cancer (30). MUC16 has also been described as a marker for disease progression, recurrence, and chemotherapy response (31). A high mutation frequency for MUC17 and TTN has recently been reported as an unexpected finding in a study of early onset breast cancer (EOBC) in Taiwanese women (32). MUC17 may influence chemoresistance and has recently been reported as a driver gene in adult gliomas (33, 34). For TTN, Oh et al. (35) found that mutations in TTN correlate with tumor mutational burden and high microsatellite instability, which is associated with poor breast cancer prognosis. Thus, how mutations in MUC17 and TTN may relate to early onset of breast cancer in Kenyan patients (11) should be investigated further.

We found that TP53 mutations significantly co-occurred with ERBB3 mutations, as did mutations in PTEN and CFAP46, whereas BRCA1 and MUC17 mutations were significantly mutually exclusive. TP53 mutations are associated with tumor aggression and are found in about half of HER2-amplified tumors (36). TP53 mutations have been implicated in the poor prognosis of HER2+ subtypes compared to other subtypes (37). PTEN is a tumor suppressor gene whose mutation has been associated with initiation, progression, and metastasis of breast cancer (38). On the other hand, although the role of CFAP46 in breast cancer is not yet clear, gene fusion involving various other genes such as VTI1A (reported to cause the initiation of glioma and other cancers) has been reported to play a role in breast cancer (39).

Breast tumors with either germline or somatic BRCA1 mutations show no difference in their cancer biology, but inherited mutations in this gene confer a very high lifetime risk of developing breast cancer (40–42). This could be the reason such mutations do not necessarily need to co-occur with other gene mutations to initiate or promote breast cancer progression. In our study, BRCA1 was not among the highly mutated genes when all mutations were considered, but it was among the genes with a high number of missense mutations (Figure 4). In contrast, MUC17 mutations were among the most prevalent. Given the role of MUC17 mutations in chemoresistance and in early onset breast cancer (33, 34), their high prevalence and exclusive occurrence in the Kenyan samples, which are prone to early onset of breast cancer, should be investigated further.

Similar to most studies on neoantigen prediction in breast cancer, we found that neoantigen burden is positively correlated with tumor mutational burden and that neoantigens were largely patient-specific (7, 43). Although most of the top 10 mutated genes (80%) were also among the top 10 in the number of neoantigens generated, genes like TP53 and PIK3CA, which are reported to be highly mutated in most patient cohorts, were not among the top 10 mutated genes in this study but were among the genes generating the highest numbers of neoantigens (Figures 6, 7). The ARID1A gene, which showed a unique mutational profile in the Kenyan population compared to African American and Asian populations in exome data (11), was not among the highly mutated genes but did produce neoantigens. We found that neoantigens were derived predominantly from missense mutations (88%), compared to indels and frameshift mutations (12%). This is consistent with other studies, although the majority do not predict neoantigens from indels and frameshift mutations (44). Similar to other studies, the TNBC subtype had more neoantigens than the HR+/HER2- and HER2+ subtypes (7, 44).

In our small sample cohort, we were able to identify putative neoantigens that show patient specificity and are thus important for tailored treatment. Interestingly, the mutations and neoantigens in this population are predominantly derived from a unique set of genes (MUC16, MUC17, TTN) compared to other populations, which provides an opportunity for validation in a much larger sample cohort. We predicted neoantigens based on binding affinity to HLA class I only, as it is the most important class of antigen-binding proteins in cancer immunity. However, HLA class II-based neoantigens may also have a role in tumor immune response (45). Moreover, we did not investigate the expression of the predicted neoantigens on tumor cells alongside the MHC class I molecules, or their ability to activate T cells. As this is a discovery study, the findings need to be validated in a larger cohort while addressing the highlighted limitations.

Taken together, our findings corroborate the neoantigen profile reported in breast cancer and highlight the patient specificity of the mutational and neoantigen signatures in breast cancer in the Kenyan population. We also describe putative neoantigens that could be used as markers for breast cancer diagnosis, treatment monitoring, and development of novel immunotherapy.

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

Ethics statement

The studies involving humans were approved by the Research and Ethics Committee, The Aga Khan University Hospital, Nairobi (Ref: 2018/REC-80) and the Research and Ethics Committee, AIC Kijabe Hospital, Kijabe (KH IERC-02718/2019). The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.

Author contributions

GW: Data curation, Formal analysis, Methodology, Visualization, Writing – original draft, Writing – review & editing. JG: Formal analysis, Writing – original draft, Writing – review & editing. KM: Formal analysis, Writing – original draft, Writing – review & editing. MM: Formal analysis, Writing – original draft, Writing – review & editing. EM: Writing – original draft, Writing – review & editing. AH: Resources, Writing – original draft, Writing – review & editing. SS: Data curation, Investigation, Resources, Writing – original draft, Writing – review & editing. SA: Resources, Writing – original draft, Writing – review & editing. FM: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing.

Funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This work was funded by the National Research Fund, Kenya, which supported sample collection, and by the Center for Cancer Research, National Cancer Institute, USA, which supported the sequencing work.

Acknowledgments

We would like to thank the patients for their consent to provide samples and Aga Khan University Hospital (Nairobi) and AIC Kijabe Hospital (Kijabe) for granting access to patient samples.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2024.1444327/full#supplementary-material

Supplementary Figure 1 | Statistical pairwise test (Wilcoxon’s test) for differences in mutational burden (A) and neoantigens counts (B) for the 23 samples.

References

1. Benvenuto M, Focaccetti C, Izzi V, Masuelli L, Modesti A, Bei R. Tumor antigens heterogeneity and immune response-targeting neoantigens in breast cancer. Semin Cancer Biol. (2021) 72:65–75. doi: 10.1016/j.semcancer.2019.10.023

2. Bates JP, Derakhshandeh R, Jones L, Webb TJ. Mechanisms of immune evasion in breast cancer. BMC Cancer. (2018) 18:556. doi: 10.1186/s12885-018-4441-3

3. Touchaei ZA, Vahidi S. MicroRNAs as regulators of immune checkpoints in cancer immunotherapy: Targeting PD-1/PD-L1 and CTLA-4 pathways. Cancer Cell Int. (2024) 24:102. doi: 10.1186/s12935-024-03293-6

4. Shiravand Y, Khodadadi F, Kashani SMA, Hosseini-Fard SR, Hosseini S, Sadeghirad H, et al. Immune checkpoint inhibitors in cancer therapy. Curr Oncol. (2022) 29:3044–60. doi: 10.3390/curroncol29050247

5. Brahmer J, Reckamp KL, Baas P, Crinò L, Eberhardt WEE, Poddubskaya E, et al. Nivolumab versus docetaxel in advanced squamous-cell non–small-cell lung cancer. New Engl J Med. (2015) 373:123–35. doi: 10.1056/NEJMoa1504627

6. Valilou SF, Rezaei N. Tumor antigens. In: Vaccines for Cancer Immunotherapy. Cambridge, Massachusetts, US: Elsevier (2019). p. 61–74. doi: 10.1016/B978-0-12-814039-0.00004-7

7. Narang P, Chen M, Sharma AA, Anderson KS, Wilson MA. The neoepitope landscape of breast cancer: Implications for immunotherapy. BMC Cancer. (2019) 19:200. doi: 10.1186/s12885-019-5402-1

8. Blass E, Ott PA. Advances in the development of personalized neoantigen-based therapeutic cancer vaccines. Nat Rev Clin Oncol. (2021) 18:215–29. doi: 10.1038/s41571-020-00460-2

9. Pan R-Y, Chung W-H, Chu M-T, Chen S-J, Chen H-C, Zheng L, et al. Recent development and clinical application of cancer vaccine: targeting neoantigens. J Immunol Res. (2018) 2018:1–9. doi: 10.1155/2018/4325874

10. Yarchoan M, Johnson BA, Lutz ER, Laheru DA, Jaffee EM. Targeting neoantigens to augment antitumour immunity. Nat Rev Cancer. (2017) 17:209–22. doi: 10.1038/nrc.2016.154

11. Tang W, Zhang F, Byun JS, Dorsey TH, Yfantis HG, Ajao A, et al. Population-specific mutation patterns in breast tumors from African American, European American, and Kenyan patients. Cancer Res Commun. (2023) 3:2244–55. doi: 10.1158/2767-9764.CRC-23-0165

12. Andrews S. FastQC: a quality control tool for high throughput sequence data (2010). Available online at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc (accessed April 15, 2024).

13. Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: Summarize analysis results for multiple tools and samples in a single report. Bioinformatics. (2016) 32:3047–8. doi: 10.1093/bioinformatics/btw354

14. Bolger AM, Lohse M, Usadel B. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics. (2014) 30:2114–20. doi: 10.1093/bioinformatics/btu170

15. Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. (2013). doi: 10.48550/ARXIV.1303.3997

16. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. (2010) 20:1297–303. doi: 10.1101/gr.107524.110

17. Tan A, Abecasis GR, Kang HM. Unified representation of genetic variants. Bioinformatics. (2015) 31:2202–4. doi: 10.1093/bioinformatics/btv112

18. McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GRS, Thormann A, et al. The Ensembl Variant Effect Predictor. Genome Biol. (2016) 17:122. doi: 10.1186/s13059-016-0974-4

19. Kandoth C, Gao J, Mattioni M, Struck A, Boursin Y, Penson A, et al. mskcc/vcf2maf: vcf2maf v1.6.16 (v1.6.16). (2018). doi: 10.5281/ZENODO.593251. Computer software.

20. Mayakonda A, Lin D-C, Assenov Y, Plass C, Koeffler HP. Maftools: Efficient and comprehensive analysis of somatic variants in cancer. Genome Res. (2018) 28:1747–56. doi: 10.1101/gr.239244.118

21. Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-seq quantification. Nat Biotechnol. (2016) 34:525–7. doi: 10.1038/nbt.3519

22. Soneson C, Love MI, Robinson MD. Differential analyses for RNA-seq: Transcript-level estimates improve gene-level inferences. F1000Research. (2016) 4:1521. doi: 10.12688/f1000research.7563.2

23. Kawaguchi S, Higasa K, Shimizu M, Yamada R, Matsuda F. HLA-HD: An accurate HLA typing algorithm for next-generation sequencing data. Hum Mutat. (2017) 38:788–97. doi: 10.1002/humu.23230

24. Hundal J, Carreno BM, Petti AA, Linette GP, Griffith OL, Mardis ER, et al. pVAC-Seq: A genome-guided in silico approach to identifying tumor neoantigens. Genome Med. (2016) 8:11. doi: 10.1186/s13073-016-0264-5

25. Vita R, Mahajan S, Overton JA, Dhanda SK, Martini S, Cantrell JR, et al. The immune epitope database (IEDB): 2018 update. Nucleic Acids Res. (2019) 47:D339–43. doi: 10.1093/nar/gky1006

26. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing (2023). Available at: https://www.R-project.org/ (accessed April 15, 2024).

27. Pan J-W, Zabidi MMA, Ng P-S, Meng M-Y, Hasan SN, Sandey B, et al. The molecular landscape of Asian breast cancers reveals clinically relevant population-specific differences. Nat Commun. (2020) 11:6433. doi: 10.1038/s41467-020-20173-5

28. Pipek O, Alpár D, Rusz O, Bödör C, Udvarnoki Z, Medgyes-Horváth A, et al. Genomic landscape of normal and breast cancer tissues in a Hungarian pilot cohort. Int J Mol Sci. (2023) 24:8553. doi: 10.3390/ijms24108553

29. Lakshmanan I, Ponnusamy MP, Das S, Chakraborty S, Haridas D, Mukhopadhyay P, et al. MUC16 induced rapid G2/M transition via interactions with JAK2 for increased proliferation and anti-apoptosis in breast cancer cells. Oncogene. (2012) 31:805–17. doi: 10.1038/onc.2011.297

30. Wang X, Guda C. Integrative exploration of genomic profiles for triple negative breast cancer identifies potential drug targets. Medicine. (2016) 95:e4321. doi: 10.1097/MD.0000000000004321

31. Felder M, Kapur A, Gonzalez-Bosquet J, Horibata S, Heintz J, Albrecht R, et al. MUC16 (CA125): Tumor biomarker to cancer therapy, a work in progress. Mol Cancer. (2014) 13:129. doi: 10.1186/1476-4598-13-129

32. Midha MK, Huang Y-F, Yang H-H, Fan T-C, Chang N-C, Chen T-H, et al. Comprehensive cohort analysis of mutational spectrum in early onset breast cancer patients. Cancers. (2020) 12:2089. doi: 10.3390/cancers12082089

33. Al Amri WS, Allinson LM, Baxter DE, Bell SM, Hanby AM, Jones SJ, et al. Genomic and expression analyses define MUC17 and PCNX1 as predictors of chemotherapy response in breast cancer. Mol Cancer Ther. (2020) 19:945–55. doi: 10.1158/1535-7163.MCT-19-0940

34. Machado GC, Ferrer VP. MUC17 mutations and methylation are associated with poor prognosis in adult-type diffuse glioma patients. J Neurological Sci. (2023) 452:120762. doi: 10.1016/j.jns.2023.120762

35. Oh J-H, Jang SJ, Kim J, Sohn I, Lee J-Y, Cho EJ, et al. Spontaneous mutations in the single TTN gene represent high tumor mutation burden. NPJ Genomic Med. (2020) 5:33. doi: 10.1038/s41525-019-0107-6

36. Marvalim C, Datta A, Lee SC. Role of p53 in breast cancer progression: An insight into p53 targeted therapy. Theranostics. (2023) 13:1421–42. doi: 10.7150/thno.81847

37. Dumay A, Feugeas J, Wittmer E, Lehmann-Che J, Bertheau P, Espié M, et al. Distinct tumor protein p53 mutants in breast cancer subgroups. Int J Cancer. (2013) 132:1227–31. doi: 10.1002/ijc.27767

38. Chen J, Sun J, Wang Q, Du Y, Cheng J, Yi J, et al. Systemic deficiency of PTEN accelerates breast cancer growth and metastasis. Front Oncol. (2022) 12:825484. doi: 10.3389/fonc.2022.825484

39. Tsuge S, Saberi B, Cheng Y, Wang Z, Kim A, Luu H, et al. Detection of novel fusion transcript VTI1A-CFAP46 in hepatocellular carcinoma. Gastrointestinal Tumors. (2019) 6:11–27. doi: 10.1159/000496795

40. Milne RL, Antoniou AC. Genetic modifiers of cancer risk for BRCA1 and BRCA2 mutation carriers. Ann Oncol. (2011) 22:i11–7. doi: 10.1093/annonc/mdq660

41. den Brok WD, Schrader KA, Sun S, Tinker AV, Zhao EY, Aparicio S, et al. Homologous recombination deficiency in breast cancer: A clinical review. JCO Precis Oncol. (2017) 1:1–13. doi: 10.1200/PO.16.00031

42. Bodily WR, Shirts BH, Walsh T, Gulsuner S, King M-C, Parker A, et al. Effects of germline and somatic events in candidate BRCA-like genes on breast-tumor signatures. PLoS One. (2020) 15:e0239197. doi: 10.1371/journal.pone.0239197

43. Animesh S, Ren X, An O, Chen K, Lee SC, Yang H, et al. Exploring the neoantigen burden in breast carcinoma patients. (2022). doi: 10.1101/2022.03.03.482669

44. Morisaki T, Kubo M, Umebayashi M, Yew PY, Yoshimura S, Park J-H, et al. Neoantigens elicit T cell responses in breast cancer. Sci Rep. (2021) 11:13590. doi: 10.1038/s41598-021-91358-1

45. Alspach E, Lussier DM, Miceli AP, Kizhvatov I, DuPage M, Luoma AM, et al. MHC-II neoantigens shape tumour immunity and response to immunotherapy. Nature. (2019) 574:696–701. doi: 10.1038/s41586-019-1671-8

Keywords: neoantigen, breast cancer, exome-seq, RNA-seq, somatic mutations, Kenya

Citation: Wagutu G, Gitau J, Mwangi K, Murithi M, Melly E, Harris AR, Sayed S, Ambs S and Makokha F (2024) Whole exome-seq and RNA-seq data reveal unique neoantigen profiles in Kenyan breast cancer patients. Front. Oncol. 14:1444327. doi: 10.3389/fonc.2024.1444327

Received: 05 June 2024; Accepted: 25 November 2024;
Published: 11 December 2024.

Edited by:

Zodwa Dlamini, Pan African Cancer Research Institute (PACRI), South Africa

Reviewed by:

Sambhavi Animesh, Massachusetts General Hospital and Harvard Medical School, United States
Serge Théophille Soubeiga, Research Institute for Health Sciences (IRSS), Burkina Faso

Copyright © 2024 Wagutu, Gitau, Mwangi, Murithi, Melly, Harris, Sayed, Ambs and Makokha. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Francis Makokha, fmakokha@mku.ac.ke

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.