ORIGINAL RESEARCH article
Front. Oncol.
Sec. Cancer Genetics
Volume 15 - 2025 | doi: 10.3389/fonc.2025.1568205
This article is part of the Research Topic: Cancer Epidemiology and Etiology Evaluation in Latin American Population
Integrating Next-Generation Sequencing and Artificial Intelligence for the Identification and Validation of Pathogenic Variants in Colorectal Cancer
Provisionally accepted
- 1School of Medicine and Health Sciences, Center for Research in Genetics and Genomics (CIGGUR), Institute of Translational Medicine (IMT), Bogotá, Cundinamarca, Colombia
- 2Coloproctology Department, Hospital Universitario Mayor - Méderi - Universidad del Rosario, Bogotá, Colombia
- 3Cancer Research Group (CRG), Faculty of Medicine, Universidad de Las Américas, Quito, Ecuador
- 4Rosario University, Bogotá, Colombia
Background: Colorectal cancer (CRC) is a multifactorial disease in which both genetic and environmental factors play critical roles in development and progression. Identifying pathogenic germline variants is a valuable tool for early diagnosis, for implementing surveillance strategies, and for identifying individuals at increased cancer risk. Next-generation sequencing (NGS) has facilitated comprehensive multigene analysis in both hereditary and sporadic cases of CRC.
Methods: We analyzed 100 unselected Colombian patients with CRC to identify pathogenic (P) and likely pathogenic (LP) germline variants, classified according to the guidelines established by the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP). Using the BoostDM artificial intelligence method, we identified oncodriver germline variants with potential implications for disease progression. We assessed the model's accuracy in predicting germline variants by comparing its results with the AlphaMissense pathogenicity prediction model. Additionally, a minigene assay was employed for the functional validation of intronic mutations.
Results: Our findings revealed that 12% of the patients carried pathogenic/likely pathogenic (P/LP) variants according to ACMG/AMP criteria. Using BoostDM, we identified oncodriver variants in 65% of the cases. These results highlight the significance of expanded multigene analysis and the integration of artificial intelligence in detecting germline variants associated with CRC. The average overall AUC for the comparison between BoostDM and AlphaMissense was 0.788 for the entire BoostDM dataset and 0.803 for the genes within our panel, with individual gene AUC values ranging from 0.606 to 0.983. Functional validation through the minigene assay revealed aberrant transcripts, potentially linked to the molecular etiology of the disease.
Conclusions: Our study provides insights into the prevalence and frequency of P/LP germline variants in unselected Colombian CRC patients identified through NGS. Integrating advanced genomic analysis and artificial intelligence proved instrumental in enhancing variant detection beyond conventional methods. Our functional validation results provide insights into the potential pathogenicity of intronic variants. These findings underscore the necessity of a multifaceted approach to unravel the complex genetic landscape of CRC.
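As an illustration of the AUC comparison reported above, the following is a minimal sketch of how a per-gene AUC and its average might be computed. It assumes, for illustration only, that BoostDM driver calls serve as binary labels and AlphaMissense pathogenicity scores as the ranking variable; the abstract does not specify the exact setup, so the authors' procedure may differ. The gene names and values are hypothetical toy data, not results from the study.

```python
from typing import Dict, List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive
    scores higher than a randomly chosen negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: (binary driver labels, continuous pathogenicity scores).
per_gene: Dict[str, Tuple[List[int], List[float]]] = {
    "GENE_A": ([1, 1, 0, 0], [0.9, 0.7, 0.4, 0.2]),
    "GENE_B": ([1, 0, 1, 0], [0.8, 0.6, 0.5, 0.1]),
}

# One AUC per gene, then an unweighted mean across genes.
gene_auc = {gene: roc_auc(l, s) for gene, (l, s) in per_gene.items()}
mean_auc = sum(gene_auc.values()) / len(gene_auc)

for gene, auc in gene_auc.items():
    print(f"{gene}: AUC = {auc:.3f}")
print(f"mean AUC = {mean_auc:.3f}")
```

The pairwise formulation is quadratic in the number of variants, which is acceptable for a small panel; a rank-based implementation would be preferable for larger datasets.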
Keywords: next-generation sequencing (NGS), pathogenic germline variants, artificial intelligence, minigene assay, functional validation, colorectal cancer
Received: 19 Feb 2025; Accepted: 22 Apr 2025.
Copyright: © 2025 Rodriguez-Salamanca, Angulo-Aguado, Orjuela Amarillo, Duque, Sierra, Contreras, Figueroa, Restrepo, López-Cortés, Cabrera, Morel and FONSECA. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: DORA JANETH FONSECA, Rosario University, Bogotá, Colombia
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.