ORIGINAL RESEARCH article

Front. Mol. Biosci., 22 June 2022
Sec. Molecular Diagnostics and Therapeutics
This article is part of the Research Topic Cancer Diagnostics in Solid Tumors - From Pathology to Precision Oncology

DNA Damage Response Gene-Based Subtypes Associated With Clinical Outcomes in Early-Stage Lung Adenocarcinoma

Yang Zhao1, Bei Qing2, Chunwei Xu3, Jing Zhao4, Yuchen Liao4, Peng Cui4, Guoqiang Wang4, Shangli Cai4, Yong Song3, Liming Cao5* and Jianchun Duan6*
  • 1Department of Thoracic Surgery, Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai, China
  • 2Department of Thoracic Surgery, The Second Xiangya Hospital, Central South University, Changsha, China
  • 3Department of Respiratory Medicine, Jinling Hospital, Nanjing University School of Medicine, Nanjing, China
  • 4Burning Rock Biotech, Guangzhou, China
  • 5Department of Respiratory Medicine, Xiangya Hospital, Central South University, Changsha, China
  • 6CAMS Key Laboratory of Translational Research on Lung Cancer, State Key Laboratory of Molecular Oncology, Department of Medical Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences Peking Union Medical College, Beijing, China

DNA damage response (DDR) pathways play a crucial role in lung cancer. In this retrospective analysis, we aimed to develop a prognostic model and molecular subtype based on the expression profiles of DDR-related genes in early-stage lung adenocarcinoma (LUAD). A total of 1,785 lung adenocarcinoma samples from one RNA-seq dataset of The Cancer Genome Atlas (TCGA) and six microarray datasets of Gene Expression Omnibus (GEO) were included in the analysis. In the TCGA dataset, a DNA damage response gene (DRG)–based signature consisting of 16 genes was constructed to predict the clinical outcomes of LUAD patients. Patients in the low-DRG score group had better outcomes and lower genomic instability. Then, the same 16 genes were used to develop DRG-based molecular subtypes in the TCGA dataset to stratify early-stage LUAD into two subtypes (DRG1 and DRG2) which had significant differences in clinical outcomes. The Kappa test showed good consistency between molecular subtype and DRG (K = 0.61, p < 0.001). The DRG subtypes were significantly associated with prognosis in the six GEO datasets (pooled estimates of hazard ratio, OS: 0.48 (0.41–0.57), p < 0.01; DFS: 0.50 (0.41–0.62), p < 0.01). Furthermore, patients in the DRG2 group benefited more from adjuvant therapy than standard-of-care, which was not observed in the DRG1 group. In summary, we constructed a DRG-based molecular subtype that had the potential to predict the prognosis of early-stage LUAD and guide the selection of adjuvant therapy for early-stage LUAD patients.

Introduction

Lung cancer was the leading cause of global cancer mortality in 2020, with an estimated 1.8 million deaths worldwide (Sung et al., 2021). Non–small cell lung cancer (NSCLC) represents 85% of all lung cancers. Based on histology, NSCLC can be further divided into lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LSCC), large-cell carcinoma, etc. (Bender, 2014). The survival of patients with NSCLC is largely determined by the tumor stage at diagnosis. Only 15% of patients with late-stage disease (stages III–IV) are alive after 5 years, which makes NSCLC one of the cancers with the worst prognosis (Necchi et al., 2017). Although the 5-year survival rate increases to approximately 60% and 40% for stage I and stage II patients, respectively, around 30–55% of them experience disease recurrence within 5 years after surgery (Howington et al., 2013; Wang et al., 2017). In recent years, immuno-oncology (IO)–based strategies, such as immune checkpoint inhibitors (ICIs) used alone, in combination with other ICIs, or in combination with chemotherapy, have brought revolutionary improvements in the treatment of a subset of patients with lung cancer (Listì et al., 2019; Passiglia et al., 2021). Besides these breakthroughs in cancer therapy, it is also important to improve recurrence prediction and clinical management, because progress in lung cancer screening is increasing the number of tumors detected at an early stage.

The rapid development of high-throughput technologies, especially DNA microarrays and RNA-sequencing, has facilitated the exploration of several expression-based gene signatures for risk stratification in NSCLC patients. Beer et al. proposed a 50-gene signature to identify low- and high-risk stage I lung adenocarcinomas using microarray analysis (Beer et al., 2002). The Director’s Challenge Consortium validated the performance of several such prognostic models in a large multi-site cohort with 442 lung adenocarcinomas (Chen et al., 2007; Shedden et al., 2008; Sun et al., 2008). In addition, a 14-gene expression signature (RT-PCR–based) has been commercialized to stratify different risk groups for resected non-squamous NSCLC patients (Kratz et al., 2012). A 25–immune gene signature and a 31–proliferation gene signature both have shown promising clinical utility for risk stratification and individualized management in NSCLC patients (Wistuba et al., 2013; Li et al., 2017). However, none of these signatures was further analyzed in patients with and without adjuvant therapy to validate the potential clinical utility in the guidance of adjuvant therapy.

Genomic instability is one of the key hallmarks of cancer, and DNA damage response (DDR) plays a significant role in maintaining genomic integrity (Hanahan and Weinberg, 2011). The DDR system is a complex signaling network which involves eight pathways: base excision repair (BER), mismatch repair (MMR), homologous recombination repair (HRR), nonhomologous end joining (NHEJ), checkpoint factors (CPF), Fanconi anemia (FA), nucleotide-excision repair (NER), and DNA translesion synthesis (TLS) (Scarbrough et al., 2016). These pathways operate collectively to detect diverse types of DNA lesions and activate signaling mechanisms to boost the repair machine (Jackson and Bartek, 2009). Previous studies have demonstrated that the DDR pathways play significant roles in cancer progression and the response to cancer therapies. Several prognostic models, based on DDR genes, have been constructed for glioblastoma, ovarian cancer, and low-grade gliomas (Knijnenburg et al., 2018; Gobin et al., 2019; Sun et al., 2019; Pang et al., 2020). However, the DDR genes identified in these prognostic models vary widely between different cancers, suggesting that DDR genes may exert different molecular effects in different cellular environments. The relationships of various DDR genes with prognosis in lung adenocarcinoma are not well-established.

In this study, we aimed to identify and validate a group of DDR genes to stratify early-stage LUAD patients into different subtypes with different prognoses and guide the use of adjuvant therapy.

Materials and Methods

Molecular and Clinical Data

The LUAD dataset of The Cancer Genome Atlas (TCGA) and six microarray datasets of Gene Expression Omnibus (GEO) were included in the analysis. For the TCGA dataset, RNA-sequencing data (FPKM format), genetic mutations, copy number variant (CNV), and clinical features, including age, sex, tumor stage, histology subtype, adjuvant treatment, and follow-up information, were obtained from the GDC (https://portal.gdc.cancer.gov/). In addition, normalized microarray data and the corresponding clinical characteristics of patients with early-stage (stages I and II) lung adenocarcinoma from six GEO cohorts (GSE31210, GSE37745, GSE68465, GSE30219, GSE72094, and GSE13213) were obtained for further external validation in this study.

DDR Gene–Based Signature Construction

A total of 200 DDR-related genes were curated and analyzed to identify prognosis-related markers (Scarbrough et al., 2016). These genes used in the study are listed in Supplementary Table S1. Univariable Cox regression and LASSO Cox regression analyses with minimum partial likelihood deviance were performed to select genes associated with OS. We defined the risk score using the following formula

$$\text{risk score} = \sum_{k=1}^{n} \left(\text{coef}_k \times \text{expr}_k\right),$$

where n is the number of markers. The nearest neighbor estimation method was applied to identify the best cutoff point of risk score to stratify patients into high- and low-risk subgroups. Kaplan–Meier (KM) analysis and receiver operating characteristic (ROC) curve were used to assess the performance of the signature.

Association Between DDR Signature and Genome Features and Gene Expression

In order to explore the potential molecular mechanisms of the DDR-gene–based signature, the associations of the DDR signature with somatic mutation, CNV, genomic scar signature, and gene expression data were analyzed based on TCGA data.

DDR Molecular Subtype Identification and Validation

Unsupervised clustering with the hierarchical cluster algorithm (based on Euclidean distance and Ward’s linkage) of the expression profiles of the genes in the DDR signature was performed to identify molecular subtype in early-stage LUAD. The default parameters of the hclust function were used to perform the classification. The cluster number was selected as 2. It was further validated in six GEO datasets.
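The clustering step described above can be illustrated with a minimal Python sketch. It is not the authors' code (the study used the R function hclust); it is a plain implementation of agglomerative clustering under Ward's minimum-variance criterion on Euclidean distances, stopped when two clusters remain. The function name `ward_two_clusters` and the toy expression profiles are hypothetical. R's hclust offers two Ward variants ("ward.D" and "ward.D2"), and the text does not say which was used, so the sketch should be read as the textbook criterion only.

```python
from itertools import combinations


def ward_two_clusters(profiles):
    """Agglomerative clustering with Ward's criterion, stopped at two clusters.

    profiles: list of equal-length expression vectors, one per sample.
    Returns one label (0 or 1) per sample.
    """
    # Each cluster is stored as (member indices, centroid vector).
    clusters = [([i], list(vec)) for i, vec in enumerate(profiles)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b were merged.
        na, nb = len(a[0]), len(b[0])
        sq_dist = sum((x - y) ** 2 for x, y in zip(a[1], b[1]))
        return na * nb / (na + nb) * sq_dist

    while len(clusters) > 2:
        # Pick the pair of clusters whose merge is cheapest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: merge_cost(clusters[p[0]], clusters[p[1]]),
        )
        a, b = clusters[i], clusters[j]
        na, nb = len(a[0]), len(b[0])
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(a[1], b[1])]
        merged = (a[0] + b[0], centroid)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]

    labels = [0] * len(profiles)
    for idx in clusters[1][0]:
        labels[idx] = 1
    return labels


# Hypothetical toy profiles (3 genes, 6 samples); not data from the study.
toy_profiles = [
    [1.0, 1.1, 0.9], [1.2, 0.9, 1.0], [0.8, 1.0, 1.1],
    [5.0, 5.2, 4.9], [5.1, 4.8, 5.0], [4.9, 5.0, 5.3],
]
labels = ward_two_clusters(toy_profiles)
print(labels)
```

Which cluster receives label 0 or 1 is arbitrary; in the study the two clusters were named DRG1 and DRG2 according to their expression patterns and outcomes.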

Statistical Analyses

R software v4.0.2 was used for all the bioinformatics and statistical analyses, including data preprocessing, LASSO Cox regression, CNV and mutation visualization, and ROC analysis. The KM method and log-rank test were adopted to generate and evaluate the statistical significance of the survival curves between groups. The specificity and sensitivity of the signature were evaluated using the ROC curve, and the area under the curve (AUC) of distinct survival time was quantified using R-package pROC. The Kappa consistency test was used to analyze the consistency between the two grouping methods. The Cox proportional hazards model was applied to identify the independence of the signature. The prognostic values of single genes in signatures were assessed using the “ezcox” function of the ezcox package. R-package SubgrPlots was used for subgroup analysis, which was visualized using the Forester package. A propensity score matching (PSM) analysis was performed according to a 1:1 ratio between the two subgroups (with or without adjuvant therapy) to adjust for clinicopathologic characteristics bias using the MatchIt package. Heat maps of TCGA–LUAD and GEO datasets were generated using the pheatmap package. The maftools package was used to visualize the mutation landscape in the TCGA–LUAD dataset. Two-sided p < 0.05 was considered to be statistically significant.
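As a reminder of what the KM method computes, the following Python sketch implements the product-limit estimator on a handful of patients. It is an illustration only, written in Python rather than the R used in the study; the function name `kaplan_meier` and the follow-up times are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times:  follow-up time per patient.
    events: 1 if the event (death or recurrence) was observed, 0 if censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times in months; not data from the study.
times = [5, 8, 8, 12, 16, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients never lower the curve directly; they only shrink the risk set used at later event times, which is why the estimator can use incomplete follow-up.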

Results

Construction of a 16-Gene Signature

A total of 1,785 primary LUAD tumors and their clinicopathological features were downloaded from TCGA and GEO databases, and the baseline characteristics are summarized in Supplementary Table S2. To identify the survival-related genes, univariable Cox regression was performed in the 200 DDR-related genes with the TCGA–LUAD dataset (n = 500), and 46 DDR genes were identified to be significantly associated with OS. Then, ten-fold cross-validation of LASSO Cox was implemented using the “glmnet” package, and 16 genes (PCNA, XRCC5, XRCC6, RFC3, FANCL, NEIL1, NEIL3, NBN, ERCC1, REV3L, REV1, HFM1, DDB1, EXO1, RAD23B, and POLD2) were identified to be the most informative and were used to construct a risk score (Figure 1A). In brief, NEIL1, HFM1, REV3L, and REV1 genes were protective factors (all HRs < 1, p < 0.05), while the other genes were risk factors for the prognosis in patients with LUAD (all HRs > 1, p < 0.05) (Figure 1B). Then, we established a DNA damage response gene (DRG)–based signature for each patient based on the following formula: DRG = (−0.0618*PCNA)+(0.2175*XRCC5)+(0.1094*XRCC6)+(−0.0929*RFC3)+(0.1894*FANCL)+(−0.0008*NEIL1)+(0.0095*NEIL3)+(0.1651*NBN)+(0.1594*ERCC1)+(−0.0817*REV3L)+(−0.0640*REV1)+(−0.0119*HFM1)+(0.1600*DDB1)+(0.1070*EXO1)+(0.2896*RAD23B)+(0.0548*POLD2). The best cutoff of 21.96 was used to stratify the patients into high- or low-risk groups. The AUCs for 1-, 3-, and 5-year overall survival (OS) rate predictions for the DRG of the TCGA–LUAD dataset were 0.716, 0.707, and 0.644, respectively (Figure 1C). The KM curves revealed significantly higher OS with lower DRG (HR = 0.38, 95% CI: 0.28–0.53, p < 0.001, Figure 1D). Similar results for disease-free survival (DFS) were obtained. The AUCs for 1-, 3-, and 5-year DFS were 0.650, 0.622, and 0.589, respectively (Figure 1E), and the association between the DRG and DFS was significant (log-rank p < 0.001; HR = 0.60, 95% CI: 0.45–0.81, Figure 1F).
Next, we tested the independent prognostic prediction value of the DRG. After adjusting for clinical features, including age, sex, tumor stage, and smoking, as well as driver gene mutations (EGFR, KRAS, ALK, ROS1, BRAF, and TP53), the DRG served as an independent prognostic biomarker for predicting outcomes (OS, HR: 0.42 (0.30–0.58); DFS, HR: 0.43 (0.31–0.53), Table 1).
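The DRG formula and cutoff reported above amount to a weighted sum followed by a threshold. A minimal Python sketch, assuming expression values are supplied on the same scale as the TCGA training data: the coefficients and the cutoff of 21.96 are copied from the text, while the function names, the handling of a score exactly equal to the cutoff, and the flat expression profiles in the usage lines are hypothetical.

```python
# Coefficients as printed in the DRG formula in the text.
DRG_COEFFICIENTS = {
    "PCNA": -0.0618, "XRCC5": 0.2175, "XRCC6": 0.1094, "RFC3": -0.0929,
    "FANCL": 0.1894, "NEIL1": -0.0008, "NEIL3": 0.0095, "NBN": 0.1651,
    "ERCC1": 0.1594, "REV3L": -0.0817, "REV1": -0.0640, "HFM1": -0.0119,
    "DDB1": 0.1600, "EXO1": 0.1070, "RAD23B": 0.2896, "POLD2": 0.0548,
}
DRG_CUTOFF = 21.96  # best cutoff reported for the TCGA-LUAD cohort


def drg_score(expression):
    """Weighted sum of the 16 signature genes for one patient.

    expression: dict mapping gene symbol to its expression value.
    """
    return sum(coef * expression[gene] for gene, coef in DRG_COEFFICIENTS.items())


def risk_group(expression, cutoff=DRG_CUTOFF):
    # Assumption: scores strictly above the cutoff are labelled high risk;
    # the text does not state how a score equal to the cutoff is handled.
    return "high" if drg_score(expression) > cutoff else "low"


# Hypothetical flat profiles, used only to exercise the functions.
flat_low = {gene: 1.0 for gene in DRG_COEFFICIENTS}
flat_high = {gene: 20.0 for gene in DRG_COEFFICIENTS}
print(round(drg_score(flat_low), 4), risk_group(flat_low))
print(round(drg_score(flat_high), 4), risk_group(flat_high))
```

Because the cutoff was derived from one cohort and one expression scale, applying it unchanged to data from another platform would need care; the study instead used clustering of the same 16 genes for the external cohorts.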

FIGURE 1. Selection of prognostic markers. (A) Tuning parameter (λ) selection in the LASSO model using 10-fold cross-validation via minimum criteria. (B) Forest plot showing the results of univariable Cox regression analyses. (C) Predictive value of 16 genes in the overall survival of patients in the LUAD dataset. (D) Kaplan–Meier curves of overall survival for high- and low-risk patient groups in the TCGA–LUAD dataset. Patients were divided into two groups with a cutoff score of 21.96. (E) Predictive value of 16 genes in the disease-free survival of patients in the LUAD dataset. (F) Kaplan–Meier curves of disease-free survival for high- and low-risk patient groups in the TCGA–LUAD dataset. Patients were divided into two groups with a cutoff score of 21.96.

TABLE 1. Univariable analysis and multivariable Cox regression analyses of OS and DFS in TCGA cohorts.

Association Between the 16 Genes and Clinicopathological Factors

To further study the underlying mechanism of the DRG, we explored the molecular function and the association with prognosis of genes in the DRG. Most of them had positive coefficients in this regression equation with HR > 1, indicating poor prognostic genes, while REV3L, REV1, NEIL1, and HFM1 had negative coefficients with HR < 1 (Figure 2A). To depict the genomic and expression alterations of the 16 DDR genes, we further described the prevalence of somatic mutations, CNV, and mRNA expression of the 16 genes in LUAD patients (Figure 2B). Of the 486 LUAD patients, 63 (13.0%) harbored at least one mutation in the signature genes. Among them, HFM1 had the highest mutation frequency (4%), followed by EXO1 and REV3L, while there were no mutations in ERCC1, NEIL1, and PCNA. Meanwhile, CNV analysis showed that EXO1, NBN, and POLD2 had frequent CNV gains. Furthermore, the mRNA expression of these genes was significantly higher in patients with CNV gain, suggesting that CNV alteration may be an important contributor to their altered mRNA expression. Moreover, protein expression levels of 13 genes were obtained from The Human Protein Atlas (THPA). Representative IHC images revealed that these proteins had higher expression in lung adenocarcinoma tissues than in normal lung tissues (Supplementary Figure S1 and Supplementary Table S3).

FIGURE 2. Mechanism of validation in mutation, CNV, mRNA expression, and genome instability. (A) Sankey plot showing the correlations among 16 genes, DDR pathways, and prognostic value. (B) Left panel is the CNV variation frequency of 16 genes, and the deletion frequency is shown by gray dots; the middle panel is the mutation frequency of 16 genes between high- and low-DDR score samples. The right panel is the expression of 16 genes between normal tissues and tumor tissues. Tumor, gray; normal, yellow. (C–O) Comparison of the genome instability (NtAI, LST, LOH, HRD, Aneuploidy Score (AS), AS_del, AS_amp, TMB, SNP, and indel) and expression pattern of TP53, ATM, and ATR between high-risk and low-risk patients in the TCGA dataset. High, gray; low, yellow. (P) Expression profiles of 16 genes between high- and low-risk groups in the stages I and II TCGA–LUAD dataset. p-values for continuous variables were calculated using the Wilcoxon rank-sum test. Pearson’s chi-square test was used to test the categorical variables.

To identify the biological significance of the genes in the DRG signature, GO analysis was conducted, and the results revealed that these genes were enriched in DNA-dependent DNA replication, nucleotide-excision repair, DNA recombination, and DNA geometric change. Furthermore, the outcomes of KEGG pathway analysis illustrated that these genes were mainly enriched in the Fanconi anemia pathway, base excision repair, and homologous recombination (Supplementary Figure S2).

Next, we investigated the associations between the two groups and various genomic features. The high-risk group was associated with higher aneuploidy score (AS), tumor mutational burden (TMB), SNP, and indel burden than the low-risk group (Figure 2C–H). Higher mRNA expression of ATM was observed in the low-risk group than in the high-risk group, whereas no such difference was observed for TP53 and ATR (Figure 2I–K). We also observed that samples in the high-risk group exhibited higher genomic instability, as measured by telomeric allelic imbalance (TAI), large-scale state transitions (LST), loss of heterozygosity (LOH), and an incorporated homologous recombination deficiency (HRD) score (Figure 2L–O). These results showed the heterogeneity in genomic scar and DDR checkpoint gene expression between the two groups.

Molecular Subtype Identification

As shown in Figure 2P, two expression patterns of the 16 genes were identified from the expression heat map of these signature genes in patients with stages I and II LUAD from the TCGA cohort. Patients in the low-risk group had better clinical outcomes (OS and DFS) and showed significantly higher expressions of REV1, REV3L, HFM1, and NEIL1, while the other genes had significantly lower expressions in this group. Meanwhile, the low-risk group had a higher percentage of patients with stage I disease than the high-risk group (chi-square test, p = 0.0025). The abovementioned results demonstrated that the DRG-related genes could be used to classify the early-stage LUAD patients.

Unsupervised hierarchical clustering (based on Euclidean distance and Ward’s linkage) of the expression profiles of DDR genes was used to identify molecular subtype instead of the formula derived from the TCGA cohort. The expression profile of the 16 genes was used to develop a DRG-related molecular subtype to stratify early-stage LUAD into two subtypes (DRG1 and DRG2) with statistically significant differences in clinical outcomes. A clustering heat map was generated to illustrate that the expressions of DRG-related genes were significantly different between the two subtypes (Figure 3A). The Kappa consistency test revealed the consistency of the two methods (DRG and molecular subtype, K = 0.61, p < 0.001, Figure 3A). As shown in Figure 3B, 71.4% (142/199) of the low-risk DRG patients were grouped into DRG1 subtype, and 89.9% (169/188) of the high-risk DRG patients were grouped into DRG2 subtype. Similar results were also documented in the Kaplan–Meier analysis (DFS, log-rank p = 0.001; HR = 0.57, 95% CI: 0.40–0.80; OS, log-rank p < 0.001; HR = 0.43, 95% CI: 0.28–0.65, Figure 3C,D). After adjusting for clinical factors, the molecular subtype remained an independent prognostic molecular classifier for DFS and OS (DFS, HR = 0.60, 95% CI: 0.41–0.86, p = 0.006; OS, HR = 0.50, 95% CI: 0.32–0.76, p = 0.011, Figure 3E). These results indicated that DRG-related genes could stratify early-stage LUAD into two molecular subtypes with distinct prognosis.
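The agreement figures quoted above are enough to rebuild the 2 × 2 table and recompute Cohen's kappa by hand. The Python sketch below does this; the counts 142/199 and 169/188 come from the text, the off-diagonal cells are obtained by subtraction, and the function name `cohen_kappa` is hypothetical. It is an illustration of the statistic, not the authors' R code.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table.

    Rows are the categories assigned by method 1, columns those of method 2.
    """
    n = sum(sum(row) for row in table)
    # Observed agreement: share of patients on the diagonal.
    observed = sum(table[i][i] for i in range(len(table))) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


# Rows: low-risk and high-risk DRG groups; columns: DRG1 and DRG2 subtypes.
agreement = [
    [142, 199 - 142],  # 142 of 199 low-risk patients fell into DRG1
    [188 - 169, 169],  # 169 of 188 high-risk patients fell into DRG2
]
kappa = cohen_kappa(agreement)
print(round(kappa, 2))
```

With these counts the statistic rounds to 0.61, in line with the reported K.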

FIGURE 3. Molecular subtype identification. (A) Expression profiles of 16 genes between DRG1 and DRG2 in the stages I and II TCGA–LUAD dataset. p-values for continuous variables were calculated using the Wilcoxon rank-sum test. The consistency of DRG and molecular subtype was tested using the Kappa consistency test. Pearson’s chi-square test was used to test the categorical variables. (B) Venn plot presenting the intersection of patients shared by molecular subtype and DRG. (C) Kaplan–Meier curves showing DFS between DRG1 (yellow) and DRG2 (gray) in patients with early-stage LUAD. (D) Kaplan–Meier curves showing OS between DRG1 (yellow) and DRG2 (gray) in patients with early-stage LUAD. (E) Multivariable analysis of DFS and OS with a Cox proportional hazards model in early-stage lung carcinoma.

Validation in GEO Datasets and Meta-analysis

In order to validate the molecular subtype and prognostic prediction of the DRG-related genes, RNA expression microarray data from a total of 1,285 patients with stage I–II LUAD were collected. The expression patterns of these genes and the survival status of patients in each GEO dataset are shown in Figure 4 and Supplementary Figure S3. The patients in the DRG1 subtype had a longer OS and DFS than those in the DRG2 subtype (GSE31210: OS, log-rank p < 0.001, HR = 0.28, 95% CI: 0.15–0.55; DFS, log-rank p < 0.001, HR = 0.33, 95% CI: 0.20–0.55. GSE37745: OS, log-rank p = 0.003, HR = 0.47, 95% CI: 0.28–0.78; DFS, log-rank p = 0.039, HR = 0.39, 95% CI: 0.16–0.98. GSE68465: OS, log-rank p = 0.003, HR = 0.60, 95% CI: 0.43–0.84; DFS, log-rank p < 0.001, HR = 0.50, 95% CI: 0.37–0.68. GSE30219: OS, log-rank p = 0.002, HR = 0.40, 95% CI: 0.22–0.72; DFS, log-rank p < 0.001, HR = 0.22, 95% CI: 0.09–0.53. GSE72094: OS, log-rank p = 0.002, HR = 0.49, 95% CI: 0.32–0.77. GSE13213: OS, log-rank p < 0.001, HR = 0.22, 95% CI: 0.01–0.49).

FIGURE 4. Validation in the GEO datasets and meta-analysis. (A–D) Kaplan–Meier curves showing overall survival between DRG1 (yellow) and DRG2 (gray) in GSE31210, GSE37745, GSE68465, and GSE30219. (E–H) Kaplan–Meier curves showing disease-free survival between DRG1 (yellow) and DRG2 (gray) in GSE31210, GSE37745, GSE68465, and GSE30219. (I) Pooled estimates of overall survival. (J) Pooled estimates of disease-free survival.

A meta-analysis was performed with a fixed-effects model, and the results indicated that compared with the DRG2 subtype, patients with the DRG1 subtype exhibited better OS (HR = 0.48, 95% CI: 0.41–0.57, p < 0.01, Figure 4I) and DFS (HR = 0.50, 95% CI: 0.41–0.62, p < 0.01, Figure 4J) in the overall dataset. Heterogeneity was not significant in either pooled analysis (OS, p = 0.63; DFS, p = 0.43).
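A fixed-effects pooled estimate of this kind can be approximated from the published per-cohort hazard ratios by inverse-variance weighting on the log scale. The Python sketch below does so for OS, using the six HRs and 95% CIs quoted in the validation paragraph. It assumes each standard error can be recovered from the width of the reported interval, which is only an approximation of what a meta-analysis package would compute from the original model fits, and the names `os_results` and `pool_fixed_effect` are hypothetical. The reported interval for GSE13213 is not symmetric around its HR on the log scale, so its derived standard error is particularly rough.

```python
import math

# (HR, lower 95% CI, upper 95% CI) for OS, as quoted in the text.
os_results = {
    "GSE31210": (0.28, 0.15, 0.55),
    "GSE37745": (0.47, 0.28, 0.78),
    "GSE68465": (0.60, 0.43, 0.84),
    "GSE30219": (0.40, 0.22, 0.72),
    "GSE72094": (0.49, 0.32, 0.77),
    "GSE13213": (0.22, 0.01, 0.49),
}


def pool_fixed_effect(results, z=1.96):
    """Inverse-variance fixed-effect pooling of hazard ratios.

    Returns (pooled HR, lower 95% limit, upper 95% limit).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for hr, lower, upper in results.values():
        log_hr = math.log(hr)
        # Approximate the standard error from the CI width on the log scale.
        se = (math.log(upper) - math.log(lower)) / (2 * z)
        weight = 1 / se ** 2
        weighted_sum += weight * log_hr
        total_weight += weight
    pooled = weighted_sum / total_weight
    half_width = z / math.sqrt(total_weight)
    return (math.exp(pooled),
            math.exp(pooled - half_width),
            math.exp(pooled + half_width))


pooled_hr, pooled_lower, pooled_upper = pool_fixed_effect(os_results)
print(round(pooled_hr, 2), round(pooled_lower, 2), round(pooled_upper, 2))
```

With these inputs the pooled HR comes out close to the reported 0.48. The interval from this approximation need not match the published one, since the published analysis worked from the fitted models rather than from rounded summary statistics.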

Then, we tested whether the molecular subtype could serve as an independent prognostic factor for early-stage lung adenocarcinoma. In multivariable analysis, the associations of DDR subtypes and prognosis were still significant (Tables 2, 3), which confirmed that the selected DDR genes could stratify patients with different prognoses.

TABLE 2. Univariable analysis and multivariable Cox regression analyses of OS in six validation cohorts.

TABLE 3. Univariable analysis and multivariable Cox regression analyses of DFS in four validation cohorts.

Subgroup Analysis

A stratification analysis was conducted to assess whether clinical factors had interaction effects on the DRG subtypes. Patients in the TCGA and GSE31210 datasets were stratified based on clinical factors, such as age (≤60/>60), sex (female/male), stage (I/II), smoking (ever/never), and adjuvant treatment (no/yes). As shown in Figure 5A,B and Supplementary Figure S4A,B, patients in the DRG1 subtype had higher OS and DFS than the DRG2 subtype irrespective of their age, sex, and smoking status. Meanwhile, a significant interaction (p = 0.01) between adjuvant treatment and DRG subtypes was observed in early-stage LUAD patients. Furthermore, we examined the association between adjuvant treatment and prognosis in DRG1 and DRG2 subtypes. We found that in the DRG2 subtype, patients with adjuvant treatment tended to have longer OS and DFS than patients without adjuvant treatment (OS, log-rank p = 0.259, HR = 0.33, 95% CI: 0.04–2.5; DFS, log-rank p = 0.105, HR = 0.32, 95% CI: 0.08–1.4), while in the DRG1 subtype, the results were opposite (OS, log-rank p = 0.001, HR = 5.3, 95% CI: 1.7–16; DFS, log-rank p < 0.001, HR = 6.7, 95% CI: 3.0–15). The trend in the DRG2 subtype did not reach statistical significance, possibly because the sample size was limited (Supplementary Figure S4C,D and Figure 5C,D). After the patients were matched by propensity score, similar results were observed (Supplementary Figure S5).

FIGURE 5. Expression pattern of 16 genes is a prognostic biomarker and predicts adjuvant therapy benefits in the GSE31210 dataset. (A) Subgroup analyses of overall survival to estimate the clinical prognostic value between DRG1 and DRG2 as independent clinical factors. (B) Subgroup analyses of disease-free survival to estimate the clinical prognostic value between DRG1 and DRG2 in independent clinical factors. (C) Kaplan–Meier curves of overall survival between patients treated with or without adjuvant therapy. (D) Kaplan–Meier curves of disease-free survival between patients treated with or without adjuvant therapy.

Discussion

In this study, we trained and validated 16 DDR genes with prognostic values and classification effects in early-stage lung adenocarcinoma and classified patients into two subtypes, DRG1 and DRG2. Furthermore, we found that the DRG1 patients without adjuvant therapy and the DRG2 patients with adjuvant therapy tended to have longer survival than other patients in the corresponding subtypes.

The DDR system comprises eight pathways with diverse biological functions to maintain genomic integrity. In this study, we discovered that the 16 identified DDR genes were mainly involved in TLS, NER, and BER pathways. REV3L and REV1 are involved in TLS, and their lower expression was associated with worse prognosis. In human cells, when the expression of TLS genes decreases, DNA replication stress increases the accumulation of stalled forks and double-strand breaks (DSBs), resulting in genome instability and poor survival (Ghosal and Chen, 2013). These two genes encode an important DNA polymerase and a deoxycytidyl transferase, which play significant roles in maintaining genome stability in the event of DNA damage (Sasatani et al., 2020; Zhou et al., 2020). Lower REV3L expression has also been reported to be associated with lower DFS and OS (Agulló-Ortuño et al., 2020), which is consistent with our findings.

Furthermore, we discovered that genes with higher expression in the DRG2 subtype were mainly involved in NER and BER pathways. RAD23B, DDB1, and ERCC family genes (ERCC1, ERCC5, and ERCC6) are key genes in the NER pathway. Many studies have reported that they are significantly correlated with prognosis in different cancer types, such as colorectal cancer, pancreatic cancer, and gastric cancer (Luo et al., 2018; Zhang et al., 2019; Li et al., 2021). NEIL3, PCNA, RFC3, and POLD2 play important roles in the BER pathway; they are recruited to DNA lesions, where they cooperatively cleave and repair the damaged bases (Robertson et al., 2009; Hurst et al., 2021; Wang et al., 2021). Zhao et al. found that NEIL3 activated cell cycle progression, leading to poor prognosis (Zhao et al., 2021). Gong et al. discovered that RFC3 was involved in the epithelial–mesenchymal transition in lung adenocarcinoma, resulting in worse survival (Gong et al., 2019). In tumorigenesis, increased DNA replication stress results in the increased generation of reactive oxygen species (ROS), leading to DNA damage (Jackson and Bartek, 2009). Accumulating evidence supports that NER and BER pathways are involved in the repair of oxidative DNA lesions. Therefore, high expression of NER and BER genes suggests that more oxidative DNA lesions are being generated, which leads to genome instability and poor prognosis (Melis et al., 2013). Thus, an imbalance between DNA damage and repair can increase genome instability and promote tumor cell proliferation, which might contribute to worse survival.

The use of adjuvant therapy in early-stage (IA–IIB) lung adenocarcinoma is controversial in NCCN guidelines and mainly depends on the physician’s experience (Zheng and Bueno, 2015). Although several studies have constructed various gene expression signatures to stratify LUAD patients, none of them has provided sufficient evidence about whether the high-risk patients could benefit from adjuvant therapy (Chen et al., 2007; Shedden et al., 2008; Sun et al., 2008). In our study, we classified LUAD patients into DRG1 and DRG2 subtypes and explored the interaction between these subtypes and adjuvant therapy in the GSE31210 dataset, which was a relatively rigorous clinical trial with clear inclusion and exclusion criteria. The patients in GSE31210 received no neoadjuvant therapy before surgery, and their stages were pathologically defined. Based on the GSE31210 cohort, we found that in the DRG2 subtype, patients who received adjuvant therapy tended to have longer survival than those who did not, whereas in the DRG1 subtype, the patients without adjuvant therapy had a better prognosis. Several previous studies also revealed that low TLS activity, including low expression of REV3L, enhanced the chemosensitivity of cancer (Wang et al., 2015; Yang et al., 2015; Agulló-Ortuño et al., 2020), which supports our finding that patients in the DRG2 subtype may benefit from adjuvant chemotherapy. In summary, the different clinical benefits of adjuvant therapy in the two subtypes suggest that DRG subtypes have the potential to guide the selection of adjuvant therapy for early-stage LUAD patients.

In the present study, we identified novel DDR-gene expression subtypes and explored the association with prognosis and adjuvant therapy. However, there are still some limitations in our study. The current study was a retrospective analysis with a limited sample size in a public database. In addition, mRNA expression in our study was measured by RNA-seq or microarray, whose results are less stable than those of RT-PCR or IHC; validation by RT-PCR or IHC would make the evidence more solid, and these platforms would also be more cost-effective in clinical application scenarios (Pisapia et al., 2022). Nevertheless, our findings were validated in six independent cohorts to reduce false-positive results, and they were further examined in the adjuvant therapy subgroups to assess their role in guiding therapy selection. In the future, prospective studies with large sample sizes are required to confirm the clinical utility of the 16 DDR-gene expression subtypes on the platform of RT-PCR or IHC.

In summary, we explored the association between DDR gene expression and prognosis in patients with stage I or II LUAD. Sixteen DDR gene–related subtypes were constructed to predict prognosis and guide the use of adjuvant therapy. More research studies are warranted to further confirm the clinical utility of the 16 DDR-gene classifiers.

Data Availability Statement

The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding authors.

Author Contributions

Conceptualized and designed by JD and LC; administrative support provided by JD and LC; provision of study materials or patients by YZ, BQ, CX, JZ, YL, PC, GW, SC, and YS; collection and assembly of data by BQ and JZ. Data analysis and interpretation conducted by YZ, CX, and YL. All authors contributed to the writing of manuscript. All authors have provided final approval of the manuscript.

Funding

This work was supported by the General Program of National Natural Science Foundation of China (81972905).

Conflict of Interest

JZ, YL, PC, GW, and SC are employees of Burning Rock Biotech.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors, and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmolb.2022.901829/full#supplementary-material.

References

Agulló-Ortuño, M. T., García-Ruiz, I., Díaz-García, C. V., Enguita, A. B., Pardo-Marqués, V., Prieto-García, E., et al. (2020). Blood mRNA Expression of REV3L and TYMS as Potential Predictive Biomarkers from Platinum-Based Chemotherapy Plus Pemetrexed in Non-small Cell Lung Cancer Patients. Cancer Chemother. Pharmacol. 85, 525–535. doi:10.1007/s00280-019-04008-9

Beer, D. G., Kardia, S. L. R., Huang, C.-C., Giordano, T. J., Levin, A. M., Misek, D. E., et al. (2002). Gene-Expression Profiles Predict Survival of Patients with Lung Adenocarcinoma. Nat. Med. 8, 816–824. doi:10.1038/nm733

Bender, E. (2014). Epidemiology: The Dominant Malignancy. Nature 513, S2–S3. doi:10.1038/513S2a

Chen, H.-Y., Yu, S.-L., Chen, C.-H., Chang, G.-C., Chen, C.-Y., Yuan, A., et al. (2007). A Five-Gene Signature and Clinical Outcome in Non-small-cell Lung Cancer. N. Engl. J. Med. 356, 11–20. doi:10.1056/nejmoa060096

Ghosal, G., and Chen, J. (2013). DNA Damage Tolerance: a Double-Edged Sword Guarding the Genome. Transl. Cancer Res. 2, 107–129. doi:10.3978/j.issn.2218-676X.2013.04.01

Gobin, M., Nazarov, P. V., Warta, R., Timmer, M., Reifenberger, G., Felsberg, J., et al. (2019). A DNA Repair and Cell-Cycle Gene Expression Signature in Primary and Recurrent Glioblastoma: Prognostic Value and Clinical Implications. Cancer Res. 79, 1226–1238. doi:10.1158/0008-5472.CAN-18-2076

Gong, S., Qu, X., Yang, S., Zhou, S., Li, P., and Zhang, Q. (2019). RFC3 Induces Epithelial-mesenchymal Transition in Lung Adenocarcinoma Cells through the Wnt/β-catenin Pathway and Possesses Prognostic Value in Lung Adenocarcinoma. Int. J. Mol. Med. 44. doi:10.3892/ijmm.2019.4386

Hanahan, D., and Weinberg, R. A. (2011). Hallmarks of Cancer: The Next Generation. Cell 144, 646–674. doi:10.1016/j.cell.2011.02.013

Howington, J. A., Blum, M. G., Chang, A. C., Balekian, A. A., and Murthy, S. C. (2013). Treatment of Stage I and II Non-Small Cell Lung Cancer: Diagnosis and Management of Lung Cancer, 3rd ed: American College of Chest Physicians Evidence-Based Clinical Practice Guidelines. Chest 143, e278S–e313S. doi:10.1378/chest.12-2359

Hurst, V., Challa, K., Shimada, K., and Gasser, S. M. (2021). Cytoskeleton Integrity Influences XRCC1 and PCNA Dynamics at DNA Damage. Mol. Biol. Cell. 32, br6. doi:10.1091/mbc.E20-10-0680

Jackson, S. P., and Bartek, J. (2009). The DNA-Damage Response in Human Biology and Disease. Nature 461, 1071–1078. doi:10.1038/nature08467

Knijnenburg, T. A., Wang, L., Zimmermann, M. T., Chambwe, N., Gao, G. F., Cherniack, A. D., et al. (2018). Genomic and Molecular Landscape of DNA Damage Repair Deficiency across the Cancer Genome Atlas. Cell Rep. 23, 239–254.e6. doi:10.1016/j.celrep.2018.03.076

Kratz, J. R., He, J., Van Den Eeden, S. K., Zhu, Z. H., Gao, W., Pham, P. T., et al. (2012). A Practical Molecular Assay to Predict Survival in Resected Non-squamous, Non-small-cell Lung Cancer: Development and International Validation Studies. Lancet 379, 823–832. doi:10.1016/S0140-6736(11)61941-7

Li, B., Cui, Y., Diehn, M., and Li, R. (2017). Development and Validation of an Individualized Immune Prognostic Signature in Early-Stage Nonsquamous Non-Small Cell Lung Cancer. JAMA Oncol. 3, 1529. doi:10.1001/jamaoncol.2017.1609

Li, J., Tian, L., Jing, Z., Guo, Z., Nan, P., Liu, F., et al. (2021). Cytoplasmic RAD23B Interacts with CORO1C to Synergistically Promote Colorectal Cancer Progression and Metastasis. Cancer Lett. 516, 13–27. doi:10.1016/j.canlet.2021.05.033

Listì, A., Barraco, N., Bono, M., Insalaco, L., Castellana, L., Cutaia, S., et al. (2018). Immuno-targeted Combinations in Oncogene-Addicted Non-small Cell Lung Cancer. Transl. Cancer Res. 8, S55–S63. doi:10.21037/tcr.2018.10.04

Luo, S. S., Liao, X. W., and Zhu, X. D. (2018). Prognostic Value of Excision Repair Cross-Complementing mRNA Expression in Gastric Cancer. Biomed. Res. Int. 2018. doi:10.1155/2018/6204684

Melis, J. P. M., Van Steeg, H., and Luijten, M. (2013). Oxidative DNA Damage and Nucleotide Excision Repair. Antioxidants Redox Signal. 18, 2409–2419. doi:10.1089/ars.2012.5036

Necchi, A., Joseph, R. W., Loriot, Y., Hoffman-Censits, J., Perez-Gracia, J. L., Petrylak, D. P., et al. (2017). Atezolizumab in Platinum-Treated Locally Advanced or Metastatic Urothelial Carcinoma: Post-Progression Outcomes from the Phase II IMvigor210 Study. Ann. Oncol. 28, 3044–3050. doi:10.1093/annonc/mdx518

Pang, F.-M., Yan, H., Mo, J.-L., Li, D., Chen, Y., Zhang, L., et al. (2020). Integrative Analyses Identify a DNA Damage Repair Gene Signature for Prognosis Prediction in Lower Grade Gliomas. Future Oncol. 16, 367–382. doi:10.2217/fon-2019-0764

Passiglia, F., Galvano, A., Gristina, V., Barraco, N., Castiglia, M., Perez, A., et al. (2021). Is There Any Place for PD-1/CTLA-4 Inhibitors Combination in the First-Line Treatment of Advanced NSCLC?-a Trial-Level Meta-Analysis in PD-L1 Selected Subgroups. Transl. Lung Cancer Res. 10, 3106–3119. doi:10.21037/TLCR-21-52

Pisapia, P., Pepe, F., Baggi, A., Barberis, M., Galvano, A., Gristina, V., et al. (2022). Next Generation Diagnostic Algorithm in Non-Small Cell Lung Cancer Predictive Molecular Pathology: The KWAY Italian Multicenter Cost Evaluation Study. Crit. Rev. Oncology/Hematology 169, 103525. doi:10.1016/j.critrevonc.2021.103525

Robertson, A. B., Klungland, A., Rognes, T., and Leiros, I. (2009). DNA Repair in Mammalian Cells: Base Excision Repair: the Long and Short of it. Cell. Mol. Life Sci. 66, 981–993. doi:10.1007/s00018-009-8736-z

Sasatani, M., Zaharieva, E. K., and Kamiya, K. (2020). The In Vivo Role of Rev1 in Mutagenesis and Carcinogenesis. Genes Environ. 42, 9. doi:10.1186/s41021-020-0148-1

Scarbrough, P. M., Weber, R. P., Iversen, E. S., Brhane, Y., Amos, C. I., Kraft, P., et al. (2016). A Cross-Cancer Genetic Association Analysis of the DNA Repair and DNA Damage Signaling Pathways for Lung, Ovary, Prostate, Breast, and Colorectal Cancer. Cancer Epidemiol. Biomarkers Prev. 25, 193–200. doi:10.1158/1055-9965.EPI-15-0649

Shedden, K., Taylor, J. M. G., Enkemann, S. A., Tsao, M. S., Yeatman, T. J., Gerald, W. L., et al. (2008). Gene Expression-Based Survival Prediction in Lung Adenocarcinoma: A Multi-Site, Blinded Validation Study. Nat. Med. 14, 822–827. doi:10.1038/nm.1790

Sun, Z., Wigle, D. A., and Yang, P. (2008). Non-Overlapping and Non-Cell-Type-Specific Gene Expression Signatures Predict Lung Cancer Survival. J. Clin. Oncol. 26, 877–883. doi:10.1200/JCO.2007.13.1516

Sun, H., Cao, D., Ma, X., Yang, J., Peng, P., Yu, M., et al. (2019). Identification of a Prognostic Signature Associated with DNA Repair Genes in Ovarian Cancer. Front. Genet. 10, 839. doi:10.3389/fgene.2019.00839

Sung, H., Ferlay, J., Siegel, R. L., Laversanne, M., Soerjomataram, I., Jemal, A., et al. (2021). Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA A Cancer J. Clin. 71, 209–249. doi:10.3322/caac.21660

Wang, W., Sheng, W., Yu, C., Cao, J., Zhou, J., Wu, J., et al. (2015). REV3L Modulates Cisplatin Sensitivity of Non-small Cell Lung Cancer H1299 Cells. Oncol. Rep. 34, 1460–1468. doi:10.3892/or.2015.4121

Wang, X., Janowczyk, A., Zhou, Y., Thawani, R., Fu, P., Schalper, K., et al. (2017). Prediction of Recurrence in Early Stage Non-Small Cell Lung Cancer Using Computer Extracted Nuclear Features from Digital H&E Images. Sci. Rep. 7, 13543. doi:10.1038/s41598-017-13773-7

Wang, W., Yin, Q., Guo, S., and Wang, J. (2021). NEIL3 Contributes toward the Carcinogenesis of Liver Cancer and Regulates PI3K/Akt/mTOR Signaling. Exp. Ther. Med. 22, 1053. doi:10.3892/etm.2021.10487

Wistuba, I. I., Behrens, C., Lombardi, F., Wagner, S., Fujimoto, J., Raso, M. G., et al. (2013). Validation of a Proliferation-Based Expression Signature as Prognostic Marker in Early Stage Lung Adenocarcinoma. Clin. Cancer Res. 19, 6261–6271. doi:10.1158/1078-0432.CCR-13-0596

Yang, L., Shi, T., Liu, F., Ren, C., Wang, Z., Li, Y., et al. (2015). REV3L, a Promising Target in Regulating the Chemosensitivity of Cervical Cancer Cells. PLoS One 10, e0120334. doi:10.1371/journal.pone.0120334

Zhang, Y., Lei, Y., Xu, J., Hua, J., Zhang, B., Liu, J., et al. (2019). Role of Damage DNA-Binding Protein 1 in Pancreatic Cancer Progression and Chemoresistance. Cancers 11, 1998. doi:10.3390/cancers11121998

Zhao, C., Liu, J., Zhou, H., Qian, X., Sun, H., Chen, X., et al. (2021). NEIL3 May Act as a Potential Prognostic Biomarker for Lung Adenocarcinoma. Cancer Cell Int. 21, 228. doi:10.1186/s12935-021-01938-4

Zheng, Y., and Bueno, R. (2015). Commercially Available Prognostic Molecular Models in Early-Stage Lung Cancer: a Review of the Pervenio Lung RS and Myriad myPlan Lung Cancer Tests. Expert Rev. Mol. Diagnostics 15, 589–596. doi:10.1586/14737159.2015.1028371

Zhou, Y.-K., Li, X.-P., Yin, J.-Y., Zou, T., Wang, Z., Wang, Y., et al. (2020). Association of Variations in Platinum Resistance-Related Genes and Prognosis in Lung Cancer Patients. J. Cancer 11, 4343–4351. doi:10.7150/jca.44410

Keywords: DNA damage response, signature, prognostic, molecular subtype, early-stage, lung adenocarcinoma

Citation: Zhao Y, Qing B, Xu C, Zhao J, Liao Y, Cui P, Wang G, Cai S, Song Y, Cao L and Duan J (2022) DNA Damage Response Gene-Based Subtypes Associated With Clinical Outcomes in Early-Stage Lung Adenocarcinoma. Front. Mol. Biosci. 9:901829. doi: 10.3389/fmolb.2022.901829

Received: 22 March 2022; Accepted: 11 May 2022;
Published: 22 June 2022.

Edited by:

Umberto Malapelle, University of Naples Federico II, Italy

Reviewed by:

Valerio Gristina, University of Palermo, Italy
Ramya Sivakumar, University of Washington, United States
Xiawei Cheng, East China University of Science and Technology, China

Copyright © 2022 Zhao, Qing, Xu, Zhao, Liao, Cui, Wang, Cai, Song, Cao and Duan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Liming Cao, clming@csu.edu.cn; Jianchun Duan, duanjianchun79@163.com

†These authors have contributed equally to this work and share first authorship