AUTHOR=Jiawei Zhou, Min Mu, Yingru Xing, Xin Zhang, Danting Li, Yafeng Liu, Jun Xie, Wangfa Hu, Lijun Zhang, Jing Wu, Dong Hu TITLE=Identification of Key Genes in Lung Adenocarcinoma and Establishment of Prognostic Mode JOURNAL=Frontiers in Molecular Biosciences VOLUME=7 YEAR=2020 URL=https://www.frontiersin.org/journals/molecular-biosciences/articles/10.3389/fmolb.2020.561456 DOI=10.3389/fmolb.2020.561456 ISSN=2296-889X ABSTRACT=Background

The development of human tumors is associated with the abnormal expression of various functional genes, and the large tumor databases now available have yet to be mined in depth. Multigene prediction models make it possible to estimate patient prognosis promptly.

Materials and Methods

We selected three RNA expression profiles (GSE32863, GSE10072, and GSE43458) from the lung adenocarcinoma (LUAD) data of the Gene Expression Omnibus (GEO) and identified the differentially expressed genes (DEGs) between tumor and normal tissue using the GEO2R program. We then analyzed the transcriptome data of 479 LUAD samples (54 normal tissue samples and 425 cancer tissue samples) and their clinical follow-up data from The Cancer Genome Atlas (TCGA) database. Kaplan–Meier (KM) curves and receiver operating characteristic (ROC) curves were used to assess the prediction model. Multivariate Cox analysis was used to identify independent predictors. TCGA pancreatic adenocarcinoma datasets were used to establish a nomogram model.

Results

We found 98 genes significantly related to prognosis using KM and Cox analysis, six of which were also DEGs in the GEO datasets. Multivariate analysis showed that no single gene could serve as an independent predictor of prognosis. However, the risk score calculated by weighting these six genes could serve as an independent prognostic predictor. Cox analysis with multiple covariates, including age, gender, tumor stage, and TNM classification, showed that the risk score remained an independent risk factor for patient survival (p = 0.013) and had acceptable reliability (area under the curve, AUC = 0.665). By combining the risk score with various clinical features, we constructed a nomogram model that showed high consistency in predicting 3- and 5-year survival rates (concordance = 0.751) and high accuracy as tested by ROC (AUC = 0.71; AUC = 0.708).
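
As an illustration of how a weighted multigene risk score of this kind is commonly computed, the following minimal sketch sums each gene's expression value multiplied by a coefficient and then splits patients at the cohort median. The gene names, coefficient values, and the median cutoff are all hypothetical placeholders chosen for illustration; the abstract does not report the six genes, their weights, or the grouping rule.

```python
from statistics import median

# Hypothetical weights for six placeholder genes. In practice these would
# come from a fitted Cox model; the values below are NOT from the paper.
COEFFICIENTS = {
    "GENE_A": 0.30,
    "GENE_B": -0.20,
    "GENE_C": 0.15,
    "GENE_D": 0.10,
    "GENE_E": -0.05,
    "GENE_F": 0.25,
}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Return the weighted sum of expression values: sum(coef_g * expr_g)."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def stratify(scores):
    """Label each patient 'high' or 'low' relative to the cohort median score.

    The median split is an assumption of this sketch, not a rule stated
    in the abstract. Scores equal to the median are labeled 'low'.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low")
            for pid, score in scores.items()}
```

Under these assumptions, calling `risk_score` once per patient and passing the resulting dictionary of scores to `stratify` yields the two groups that a KM comparison would then contrast.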

Conclusion

We propose a method to predict the prognosis of LUAD by weighting multiple genes and have constructed a nomogram model suitable for the prognostic evaluation of LUAD, which could provide a new tool for identifying therapeutic targets and evaluating treatment efficacy in LUAD.