Lung cancer (LC) is a leading cause of cancer deaths globally. Its lethality is due in large part to the paucity of accurate screening markers. Precision medicine includes the use of omics technology and novel analytic approaches for biomarker development. We combined artificial intelligence (AI) with DNA methylation analysis of circulating cell-free tumor DNA (ctDNA) to identify putative biomarkers for LC and to elucidate its pathogenesis.
Illumina Infinium MethylationEPIC BeadChip array analysis was used to measure cytosine (CpG) methylation changes across the genome in LC. Six different AI platforms, including support vector machine (SVM) and deep learning (DL), were used to identify CpG biomarkers and to detect LC. Training and validation sets were generated, and 10-fold cross-validation was performed. Gene enrichment analysis using g:Profiler and GREAT was used to elucidate LC pathogenesis.
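The 10-fold cross-validation step can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration: the synthetic methylation beta values, the function names, and the nearest-centroid scorer, which stands in for the SVM and DL platforms named above and is not the authors' pipeline. It shows how each sample is scored exactly once by a model that never saw it during training, and how the pooled scores yield an AUC.

```python
import random

def stratified_folds(labels, k=10, seed=0):
    """Assign every sample index to one of k folds, preserving class balance."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def centroid_score(train_x, train_y, x):
    """Stand-in classifier (NOT an SVM): squared distance to the control
    centroid minus squared distance to the case centroid. Higher = more case-like."""
    def centroid(cls):
        rows = [r for r, y in zip(train_x, train_y) if y == cls]
        return [sum(col) / len(rows) for col in zip(*rows)]
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return dist(x, centroid(0)) - dist(x, centroid(1))

def auc(scores, labels):
    """AUC as the fraction of case/control pairs where the case scores higher
    (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def cross_validated_auc(x, y, k=10, seed=0):
    """Score each sample with a model trained on the other k-1 folds, then pool."""
    scores = [0.0] * len(y)
    for fold in stratified_folds(y, k, seed):
        held_out = set(fold)
        train_x = [r for i, r in enumerate(x) if i not in held_out]
        train_y = [l for i, l in enumerate(y) if i not in held_out]
        for i in fold:
            scores[i] = centroid_score(train_x, train_y, x[i])
    return auc(scores, y)

# Synthetic data (assumption): 20 cases and 20 controls, 5 CpG beta values each,
# clipped to the valid [0, 1] range for methylation beta values.
rng = random.Random(1)
y = [1] * 20 + [0] * 20
x = [[min(1.0, max(0.0, rng.gauss(0.7 if label else 0.3, 0.1))) for _ in range(5)]
     for label in y]

cv_auc = cross_validated_auc(x, y)
print(f"10-fold cross-validated AUC on synthetic data: {cv_auc:.2f}")
```

Stratifying the folds keeps cases and controls in every training split, which matters when the cohort is small. Any classifier that returns a continuous score could replace `centroid_score` without changing the rest of the loop.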
Using a stringent genome-wide association study (GWAS) significance threshold of p < 5 × 10⁻⁸, we identified 4389 CpGs (cytosine methylation loci) in coding genes and 1812 CpGs in non-protein-coding DNA regions that were differentially methylated in LC. SVM and three other AI platforms achieved an area under the receiver operating characteristic curve (AUC) of 1.00 (95% CI 0.90-1.00) for LC detection. DL achieved an AUC of 1.00 (95% CI 0.95-1.00) with 100% sensitivity and specificity. High diagnostic accuracy was also achieved using only intragenic or only intergenic CpG loci. Gene enrichment analysis found dysregulation of molecular pathways involved in the development of small cell and non-small cell LC.
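The significance filter described above can be sketched as follows. The threshold of 5 × 10⁻⁸ is the conventional genome-wide cutoff, equivalent to a Bonferroni correction of 0.05 across roughly one million independent tests. The CpG identifiers, p-values, and record layout below are hypothetical and serve only to show the strict inequality and the split into loci inside and outside coding genes; they are not data from the study.

```python
GENOME_WIDE_ALPHA = 5e-8  # threshold quoted in the text

# Hypothetical records: (CpG identifier, p-value, lies within a coding gene?)
cpg_results = [
    ("cpg_A", 3.2e-12, True),
    ("cpg_B", 4.9e-8, False),
    ("cpg_C", 5.0e-8, True),   # equal to the threshold, so not strictly below it
    ("cpg_D", 1.0e-3, False),
]

def significant_cpgs(results, alpha=GENOME_WIDE_ALPHA):
    """Keep only loci whose p-value is strictly below the threshold."""
    return [record for record in results if record[1] < alpha]

def count_by_region(results):
    """Count retained loci inside coding genes versus outside them."""
    intragenic = sum(1 for _, _, in_gene in results if in_gene)
    return {"intragenic": intragenic, "intergenic": len(results) - intragenic}

hits = significant_cpgs(cpg_results)
counts = count_by_region(hits)
print([name for name, _, _ in hits], counts)
```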
AI combined with DNA methylation analysis of ctDNA achieved high LC detection rates. Moreover, many of the epigenetically altered genes are known to be involved in the biology of neoplasms in general and of LC in particular.