AUTHOR=Cong Dan , Zhao Yanan , Zhang Wenlong , Li Jun , Bai Yuansong
TITLE=Applying machine learning algorithms to develop a survival prediction model for lung adenocarcinoma based on genes related to fatty acid metabolism
JOURNAL=Frontiers in Pharmacology
VOLUME=14
YEAR=2023
URL=https://www.frontiersin.org/journals/pharmacology/articles/10.3389/fphar.2023.1260742
DOI=10.3389/fphar.2023.1260742
ISSN=1663-9812
ABSTRACT=
Background: The progression of lung adenocarcinoma (LUAD) may be related to abnormal fatty acid metabolism (FAM). The present study investigated the relationship between FAM-related genes and LUAD prognosis.
Methods: LUAD samples from The Cancer Genome Atlas were collected. The scores of FAM-associated pathways from the Kyoto Encyclopedia of Genes and Genomes website were calculated using the single sample gene set enrichment analysis. ConsensusClusterPlus and cumulative distribution function were used to classify molecular subtypes for LUAD. Key genes were obtained using limma package, Cox regression analysis, and six machine learning algorithms (GBM, LASSO, XGBoost, SVM, random forest, and decision trees), and a RiskScore model was established. According to the RiskScore model and clinical features, a nomogram was developed and evaluated for its prediction performance using a calibration curve. Differences in immune abnormalities among patients with different subtypes and RiskScores were analyzed by the Estimation of STromal and Immune cells in MAlignant Tumours using Expression data, CIBERSORT, and single sample gene set enrichment analysis. Patients’ drug sensitivity was predicted by the pRRophetic package in R language.
Results: LUAD samples had lower scores of FAM-related pathways. Three molecular subtypes (C1, C2, and C3) were defined. Analysis on differential prognosis showed that the C1 subtype had the most favorable prognosis, followed by the C2 subtype, and the C3 subtype had the worst prognosis. The C3 subtype had lower immune infiltration. A total of 12 key genes (SLC2A1, PKP2, FAM83A, TCN1, MS4A1, CLIC6, UBE2S, RRM2, CDC45, IGF2BP1, ANGPTL4, and CD109) were screened and used to develop a RiskScore model. Survival chance of patients in the high-RiskScore group was significantly lower. The low-RiskScore group showed higher immune score and higher expression of most immune checkpoint genes. Patients with a high RiskScore were more likely to benefit from the six anticancer drugs we screened in this study.
Conclusion: We developed a RiskScore model using FAM-related genes to help predict LUAD prognosis and develop new targeted drugs.