- 1Department of Respiratory and Infectious Disease of Geriatrics, The First Hospital of China Medical University, Shenyang, China
- 2Department of Medical Oncology, The First Hospital of China Medical University, Shenyang, China
- 3Key Laboratory of Anticancer Drugs and Biotherapy of Liaoning Province, The First Hospital of China Medical University, Shenyang, China
- 4Liaoning Province Clinical Research Center for Cancer, Shenyang, China
- 5Marlene and Stewart Greenebaum Comprehensive Cancer Center, University of Maryland, Baltimore, Baltimore, MD, United States
Patients with EGFR-mutant non-small-cell lung cancer (NSCLC) greatly benefit from EGFR-tyrosine kinase inhibitors (EGFR-TKIs) while the prognosis of patients who lack EGFR-sensitive mutations (EGFR wild type, EGFR-WT) remains poor due to a lack of effective therapeutic strategies. There is an urgent need to explore the key genes that affect the prognosis and develop potentially effective drugs in EGFR-WT NSCLC patients. In this study, we clustered functional modules related to the survival traits of EGFR-WT patients using weighted gene co-expression network analysis (WGCNA). We used these data to establish a two-gene prognostic signature based on the expression of CYP11B1 and DNALI1 by combining the least absolute shrinkage and selection operator (LASSO) algorithms and Cox proportional hazards regression analysis. Following the calculation of risk score (RS) based on the two-gene signature, patients with high RSs showed a worse prognosis. We further explored targeted drugs that could be effective in patients with a high RS by the connectivity map (CMap). Surprisingly, multiple HDAC inhibitors (HDACis) such as trichostatin A (TSA) and vorinostat (SAHA) that may have efficacy were identified. Also, we proved that HDACis could inhibit the proliferation and metastasis of NSCLC cells in vitro. Taken together, our study identified prognostic biomarkers for patients with EGFR-WT NSCLC and confirmed a novel potential role for HDACis in the clinical management of EGFR-WT patients.
Introduction
Lung cancer has the highest morbidity and mortality in China and around the world. Most patients presented with lung cancer at a late stage owing to hidden onset and unspecific symptoms associated with the disease (1, 2). Lung cancer is generally classified into non-small-cell lung cancer (NSCLC) and small cell lung cancer (SCLC). However, this traditional classification according to histological assessment fails to account for the complex prognosis and drug resistance associated with the disease (3).
Radiotherapy combined with chemotherapy is the major treatment strategy for SCLC, whereas targeted therapy has become the first-line treatment for NSCLC patients carrying specific driver mutations (4–6). Epidermal growth factor receptor (EGFR)-tyrosine kinase inhibitors (TKI) such as gefitinib and erlotinib were the first targeted therapy for NSCLC. They have been widely applied in the clinical application for NSCLC patients carrying EGFR-sensitive mutations such as in-frame deletions at exon 19 and exon 21 point mutations (L858R). Also, EGFR-TKIs have significantly prolonged disease-free survival (DFS) compared with platinum-based chemotherapy (7, 8). However, only 20–30% of all NSCLC patients with EGFR-sensitive mutations can benefit from EGFR-TKIs. For patients with no EGFR gene mutations or an unknown mutation status, platinum-based doublet chemotherapy regimens remain the standard first-line therapy (9, 10). In these cases, the tumor response rate is estimated to be less than 10% and overall survival (OS) is only slightly improved (11). There is an unmet need to develop a novel therapy and to improve the prognosis for patients with EGFR wild-type (EGFR-WT) NSCLC.
The rapid development of bioinformatics analysis has allowed the development of novel biomarkers that can predict prognosis in patients with lung cancer (such as PD-L1 (12, 13), GLUT1 (14), and Ki-67 (15, 16)). However, little effort has been focused on the identification of specific biomarkers for EGFR-WT patients. Thioredoxin reductases 1 (TrxR1) has been reported to be related to the poor prognosis in EGFR-WT and ALK-negative NSCLC (17). As the statistical power of individual biomarkers is considered to be weak, it is necessary to establish a gene signature biomarker to improve the accuracy of prognosis prediction (18–20). Weighted gene co-expression network analysis (WGCNA) is a systems biology approach that clusters genes with a high co-expression relationship into the same module (21). WGCNA has been widely used to assess the functions of transcriptome systems (22), to identify gene modules related to clinical parameters and to investigate cancer biomarkers (23–25). However, WGCNA has not yet been reported to reveal the prognostic prediction of biomarkers in EGFR-WT NSCLC patients.
In this study, WGCNA was conducted on the expression profiles of EGFR-WT NSCLC patients and a two-gene prognosis signature was obtained by LASSO COX regression. We performed connectivity map (CMap) database analysis to identify HDACis as potential drugs to effectively target EGFR-WT NSCLC patients with a high risk score (RS). Our findings provided a further understanding for prognosis prediction and clinical treatment of EGFR-WT NSCLC patients.
Materials and Methods
The flow chart for this research is shown in Figure 1.
Data Acquisition and Consolidation
GSE31852 database expression profile and clinical data were downloaded from the GEO database (Table S1 ) as a training set, and 62 EGFR-WT patients with complete survival data were selected for further analysis (Table S2). Gene expression profiles of these samples were annotated by using the Human Gene 1.0 ST Array (Table S3, Affymetrix, Santa Clara, CA) according to Affymetrix protocols (Table S4). Probes with no gene or duplicate-gene annotation were excluded.
GSE31210 database (Tables S5-S8) was selected as the validation set, which contains the expression profile information, gene mutation status, and progression-free survival (PFS) of 226 NSCLC expression profiles (127 EGFR mutations, 20 KRAS mutations, 11 EML4-ALK fusion mutations, and 68 EGFR/KRAS/ALK-WT cases). Samples of 423 EGFR-WT and 75 EGFR-mutant LUAD patients from The Cancer Genome Atlas Program were downloaded from the The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/) as another validation dataset (Table S9).
WGCNA Network Construction
R package “WGCNA” was used for the automatic construction of a co-expression network. Firstly, a hierarchical clustering analysis of the samples was undertaken to ensure that there is little difference between the samples in the GSE31852 dataset (Figure S1A). The co-expression similarity matrix of gene expression was defined according to the Pearson correlation coefficient. Following the selection of an appropriate soft threshold β, the unweighted co-expression similarity matrix was converted into a weighted adjacency matrix. Then, the topological overlap matrix (TOM) was constructed using the degree of dissimilarity between the nodes and the dissimilarity index was defined between the nodes (26). Finally, by using the dynamic tree-cutting algorithm, the TOM was modified and the network modules were initially identified by satisfying conditions such that the difference between these modules is less than 0.25, or the similarity exceeds 0.75 (Table S10).
R package “WGCNA” was used sequentially to visualize the constructed network module and elucidate the correlation between external information. Modules with a significance P < 0.05 in the correlation test were defined being related to the trait. All genes in modules related to prognosis (time and status) were included in the construction of a prognostic risk signature for EGFR-WT patients.
LASSO Regression and Multivariate COX Regression
R packages “glmnet” and “survival” were used to perform COX regression analysis through the LASSO algorithm. Those parameters with non-zero regression coefficients in the LASSO regression results were further included in the multivariate COX regression analysis. Genes with statistical significance in the multivariate Cox regression analysis were used to calculate their weighted gene expression values to establish RSs for each patient. The RS formula was established as follows:
ExpmRNA represents the expression level of each gene, and βmRNA denotes the regression coefficient of the gene in the multi-factor COX regression model.
Internal and External Verification of the Prognostic Risk Score Signature
X-tile software was used to calculate the best cut-off value of the patient’s RS. According to the best cut-off value, all patients were divided into a high-RS group and a low-RS group. Kaplan–Meier survival analysis was performed using the “survival” package with the “log-rank” method. Both the consistency parameter C-index of the survival model and the accuracy of the prediction model in the training set were validated by the resampling method for internal cross-validation using R package “boot.” R package “survivalROC” was used to plot the ROC curve and calculate the area under the curve (AUC).
Gene Set Enrichment Analysis for Biological Function
GSEA Version 3.0 software was employed to enrich the main biological function pathways in the high-RS group, referring to “c2.cp.kegg.v6.2.symbols.gmt” and “h.all.v7.1.symbols.gmt” gene sets taken from the MsigDB database. All processes were performed according to the default parameters of the GSEA software. The number of random combinations was set to 1,000 and the results were sorted according to normalized enrichment scores (NES).
Differential Gene Screening and Targeted Drug Prediction
Differentially expressed genes (DEGs) of the high-RS group versus low-RS group with |log fold change (log FC)| > 0.585 and p-value < 0.05 were analyzed by the R package “limma” (Table S11). Then, to find those drugs targeting high-RS patients, the differential gene sets were input into the CMap drug database (http://www.broadinstitute.org/cmap). The results included genes, diseases, or drug networks that were similar with, or opposite to, the expression profile. A positive score meant that the change in the expression profile caused by a drug was similar to the input gene expression profile. Conversely, a negative score indicated that the change in the expression profile caused by a drug was opposite that in the input gene expression profile. A drug with a negative score may reverse the corresponding gene expression in the disease and thus serves as a potential targeting drug for the disease (Table S12). Potential compound drugs were selected for verification according to the correlation score (less than 90) of the drugs (Table S12) based on published data from the literature (26).
Cell Culture and Reagents
Lung adenocarcinoma cell lines A549 and H1299 without EGFR mutations were purchased from the Type Culture Collection of the Chinese Academy of Sciences (Shanghai, China), and cultured in RPMI-1640 (GibcoBRL, USA) supplemented with 10% fetal bovine serum (FBS), penicillin (100 U/ml), and streptomycin (100 mg/ml), in a humid atmosphere containing 5% CO2 at 37°C. Trichostatin A (TSA, HY-15144) and vorinostat (SAHA, HY-10221) were purchased from MedChem Express (Monmouth Junction, NJ, USA).
MTT Assay
Approximately 2,000 cells per well in 96-well plates were treated with various concentrations of TSA or SAHA for 48 or 72 h. Then, we added 20 μl of 3-(4,5-dimethylthiazolyl-2-yl)-2,5-diphenyltetrazolium bromide (MTT) (5 mg/ml) to each well and the cells were incubated for another 4 h at 37°C. After removing the medium, cells were lysed in 200 ml dimenthylsulfoxide (DMSO) at room-temperature, and the optical density (OD) was measured at a wavelength of 570 nm with a microplate reader (Bio-Rad Laboratories, Hercules, CA, USA).
Transwell Assay
Cells were treated with HDACis for 24 h, collected, and resuspend in serum-free media. Transwell chambers (Corning, NY, USA) were plated into a 24-well plate; 2 × 104 cells in 200 μl of serum-free medium were seeded onto the upper chamber and 500 μl of medium with 10% FBS was added to the lower chamber with or without HDACis. After incubation for 24 h, the chambers were fixed with methanol and cells on the upper membrane were removed. Cells in the lower membrane were stained with Wright-Giemsa dye and the number of cells counted.
Statistical Analysis
All statistical analyses and visualization were performed using GraphPad Prism 7.0 and R (version 3.6.2). Student’s t-test was used to compare the differences between the two groups: P < 0.05 was considered statistically significant.
Results
Data Collation and Patient Characteristics
In order to identify the key prognostic biomarkers for EGFR-WT NSCLC patients, we performed a systematic analysis of the GSE31852 dataset that included 62 patients with complete progression-free survival (PFS) information. The general information of these patients is summarized in Table 1. These patients included 24 lung adenocarcinomas, accounting for 38.7% of the samples, and six lung squamous cell carcinomas, accounting for 9.7% of the sample. The cohort consisted of 21 men and 16 women, accounting for 33.9 and 25.8% of the samples respectively. Six patients had no previous history of smoking, and 31 patients were former smokers. According to the expression profiles, all of the patients were clustered gene expression datasets (Figure S1A). As no obvious outliers in the expression profiles were observed, all of the 62 EGFR-WT patients were included in the subsequent analysis.
Construction of a Scale-Free Network by WGCNA
WGCNA was conducted on the gene expression profile data to identify the gene network modules co-expressed in EGFR-WT patients and to explore the relationship between these gene network modules and prognosis. Firstly, a soft threshold (β = 18) with an R2 > 0.9 was defined to establish an adjacency matrix (Figures 2A, B). All genes were clustered into 21 modules with different colors (Figure 2C). The gray module in which the genes were not clustered was excluded from the analysis. The similarity between each module was less than 0.75 (Figures 2D, E).
Figure 2 Construction of the weighted gene co-expression network analysis (WGCNA) network in EGFR-WT NSCLC patients and module identification. (A) Network topology for different soft-threshold powers. (B) The mean connectivity for different soft threshold powers. (C) One-step network construction by weighted gene co-expression network analysis (WGCNA). (D) Hierarchical clustering tree for clustering modules. (E) Heatmap of the correlation between identified modules (F) Correlation of modules and traits. Colors represent the correlation coefficient, while the number on each module represents the associated P-value.
We calculated the module eigengenes (ME) value of the samples which represented the gene expression pattern of each module. We found that the gene expression of every sample in each module was different whereas the gene expression of each sample in every module was similar. These results suggested that the modules composed of genes with similar expression patterns may have important biological functions (Figure S1B). Then, according to gene significance (GS) which was defined as the association of a single gene with external information (clinical pathology parameters), the correlation between the survival parameters (PFS time and state) and related modules was explored (Table S10, Figure 2F). Among each of the 20 modules, the turquoise and green modules were both significantly associated with PFS (P < 0.05) compared to the other traits. These modules were most closely related to PFS time and PFS status with correlation coefficients of −0.42 (P < 0.0001) and −0.35 (P < 0.0001), respectively. These results indicated that the gene expression patterns in the turquoise and green modules were most closely correlated with prognosis in the EGFR-WT patients.
Construction of Two-Gene Prognostic Signature-Based RS for EGFR-WT NSCLC Patients
As each module in the WGCNA network based on gene expression profiles could be regarded as a characteristic for EGFR-WT patients, the turquoise module (Figure S2A) and green modules (Figure S2B) may contain the prognostic prediction genes for EGFR-WT patients. The genes of these two modules were combined as candidate gene sets for prognostic markers in EGFR-WT patients. To further select the key genes related to prognosis, we performed the Least Absolute Shrinkage and Selection Operator (LASSO) regression combined with multivariate COX regression analysis (Figures 3A and S2C, D). CYP11B1and DNALI1 were identified as independent prognostic factors for PFS of EGFR-WT NSCLC patients by univariate and multivariate Cox regression (Table 2). Based on the expression of independent prognostic factors and their corresponding coefficients in the regression analysis, the two-gene signature-based RS of each EGFR-WT patient was derived based on the following formula:
Figure 3 Identification of the two-gene prognostic signature-based risk score (RS) for EGFR-WT NSCLC patients. (A) Six robust markers obtained by LASSO regression. (B) The distribution of RSs in GSE31852. Top: two groups for EGFR-WT patients according to the best cut-off of risk score. Middle: relationship between RS (X-axis) and PFS time (Y-axis). Bottom: heatmap plot for the expression of genes in the two-gene signature. (C) Kaplan-Meier curve of PFS probability based on the RS in EGFR-WT NSCLC. (D) ROC curves were used to compare the predictive ability of the two-gene prognostic signature for 1, 2, and 6-month survival probabilities. PFS status: SD, stable disease; PD, progressive disease.
Table 2 Univariate and multivariate Cox regression analysis between six candidate markers and progression-free survival (PFS) after Least Absolute Shrinkage and Selection Operator (LASSO) analysis.
According to the best cut-off value (RS = 19.1), the patients were divided into high- and low-RS groups (Figure 3B). Kaplan-Meier survival analysis showed that the PFS of patients in the high-RS group was significantly shorter than that in the low-RS group (Figure 3C, HR = 3.99, 95% CI = 2.04–7.82, P < 0.0001). The C-index was 0.8625 (95% CI = 0.7579–0.9671, P < 0.001) by internal cross-validation.
To assess the predictive ability of the model for short-term PFS, the receiver operating characteristic (ROC) curves of the two-gene signature-based on the RS and gene expression were drawn at 1, 2, and 6 months, and the AUC was determined. The results showed that the two-gene RS model was a good indicator of short-term PFS (Figure 3D). The AUCs for the two-gene RS at 1, 2, and 6 months were 0.736, 0.781, and 0.848, respectively. These values were significantly better than those obtained based on simple gene expression for predicting PFS at each time point (Figures S2E, F).
Taken together, these results indicated that the two-gene RS signature established by the PFS-related modules of WGCNA reflected the survival of EGFR-WT patients and had better predictive power than independent prognostic gene expression.
External Validation of Two-Gene RS for EGFR-WT Patients
To validate the significance of the two-gene signature for the prognosis of PFS in EGFR-WT NSCLC patients, the GSE31210 database was selected for external verification. All patients were divided into high- or low-RS groups according to the best cut-off value (RS = 10.1) and analyzed based on the Kaplan-Meier survival curves. In patients with EGFR mutations, the PFS of low-RS group was significantly shorter than that of high-RS group (Figures S3A, B). In contrast, in patients with EGFR-WT (including KRAS mutations, ALK fusion, and none of three mutations named as triple-negative type), the PFS of high-RS patients tended to be shorter although it failed to reach statistical significance (P > 0.05) potentially due to the small sample size (Figures 4A, B). Further refinement of the grouping indicated that the two-gene RS signature had better performance in predicting PFS for patients with triple-negative disease (HR = 3.11, 95% CI = 1.35–7.19, P < 0.01) (Figures 4C, D). The C-index results of the prognostic model in the three classifications of populations also showed that the RS could fit the true situation of PFS for patients with triple-negative lung cancer (Table 3). However, no significant difference in the long-term OS was found between the high- and low-RS groups regardless of the types of mutations (Figure S4).
Figure 4 The verification of the two-gene prognostic signature for non-small-cell lung cancer (NSCLC) patients with EGFR-WT in GSE31210 and The Cancer Genome Atlas (TCGA) databases. The distribution of RSs in patients with (A) EGFR-WT, (C) EGFR/ALK/KRAS wild type in GSE31210, and (E) EGFR-WT in TCGA. Top: two groups for patients according to the best cut-off of RS. Middle: relationship between RS and PFS information. Bottom: heatmap plot for the expression of genes in the two-gene signature. Kaplan-Meier curve of PFS probability based on the RS in patients with (B) EGFR-WT, (D) EGFR/ALK/KRAS wild type in GSE31210, and (F) EGFR-WT in TCGA.
Our results demonstrated that the two-gene RS was significant for survival prediction in EGFR-WT patients particularly in those who did not have EGFR/KRAS/ALK mutations. The two-gene signature was also verified in TCGA database (Figures S3C, D and 4E, F). All NSCLC patients were also divided into high- and low-RS groups and analyzed from the Kaplan-Meier survival curves. The survival time of high-RS patients was significantly shorter than that of low-RS patients (HR = 2.1, 95% CI = 1.51–2.91, P < 0.0001), suggesting that the two-gene signature had a generalized and adaptive capacity in predicting the prognosis of EGFR-WT NSCLC patients.
Verification of HDAC Inhibitors as Potential Targeted Drugs for EGFR-WT NSCLC
To clarify the biological characteristics of high-RS populations, we performed GSEA on the gene expression profile of EGFR-WT NSCLC patients using the tumor-related HALLMARK pathway gene set and the classic Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway gene set. Our results showed that both pathways were enriched in metastasis-related pathways including epithelial-mesenchymal transition, apical junction, focal adhesion, ECM receptor interaction and regulation of actin cytoskeleton. These data indicated that the poor prognosis of high-RS patients may be attributed to the enhancement of tumor metastatic ability (Figure 5A).
Figure 5 Compounds predicted to target EGFR-WT non-small-cell lung cancer (NSCLC) patients with a high risk score (RS) and verification in vitro. (A) Gene Set Enrichment Analysis (GSEA) of EGFR-WT NSCLC patients with high RS according to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways (top) and HALLMARK pathways (bottom). (B) Heatmap for the analysis of differentially expressed mRNA. (C) Schematic illustration for of the drug screening procedure through querying of the connectivity map (CMap).
To ascertain potential drug targeting in the EGFR-WT high-RS population, we performed differential gene analysis between the high- and low-RS groups, and identified 25 up-regulated differential expression genes (DEGs) and 36 down-regulated DEGs (Table S11, Figure 5B). We compared these DEGs with the CMap database, aiming to identify the drugs that interacted with the DEGs as detailed in the Materials and Methods (Figure 5C). Twenty-four candidate drugs were identified, and interestingly, nine were classified as HDACi (Tables 4, S12). Considering the safety and feasibility of HDCAis for clinical applications, we verified two common HDACis, TSA, and SAHA in EGFR-WT lung cancer cells (A549 and H1299). Both HDACis significantly inhibited the proliferation and migration of these cell models (Figures 6A, B), indicating the potential of TSA and SAHA as novel treatment strategies in EGFR-WT lung cancer.
Figure 6 Verification of the effect of HDACis on EGFR-WT non-small-cell lung cancer (NSCLC) cells in vitro. (A) Inhibition of cell proliferation of trichostatin A (TSA) and vorinostat (SAHA) as assessed by MTT assay in A549 and H1299 cells. (B) Inhibition of migration ability by TSA and SAHA as assessed by Transwell assays in A549 and H1299 cells. ****P < 0.0001.
Discussion
In the current study, we established a RS based on a two-gene signature for EGFR-WT NSCLC patients and found that EGFR-WT patients with a high RS had a worse prognosis. Mechanistically, a high RS might cause poor prognosis by activating multiple metastasis-related pathways such as epithelial-mesenchymal transition and ECM receptor interaction. Furthermore, HDACis were screened out as potential targeted drugs for EGFR-WT NSCLC and found to inhibit cell proliferation and migration of EGFR-WT NSCLC cells.
Although well-known driver genes such as EGFR and ALK with sensitive mutations could strongly predict the efficacy of targeted drugs in NSCLC patients, about 50% of NSCLC patients without driver gene mutations do not benefit from the personalized targeted therapy (27, 28). For those patients that do not have corresponding mutations or have unknown mutation status, platinum-based doublet chemotherapy remains the standard first-line regimen which has poor efficacy (9, 10). In these patients, docetaxel or pemetrexed could be used as second-line single-agent chemotherapies. However, the tumor response rate was less than 10% and the OS was only slightly improved with these treatments (11). The current known tumor-driver mutations are not sufficient to fully predict the drug response of lung cancer. This study aimed to ascertain the characteristics of gene expression profiles in EGFR-WT NSCLC patients and provide evidence for the clinical treatment for NSCLC patients without specific mutations.
The rapid development of NGS technology and bioinformatics has allowed major progress in the prediction of NSCLC prognosis. In addition to the TNM stage, multi-gene prognostic signatures based on transcriptome sequencing have been developed for prognosis. However, the majority of studies of gene signatures for predicting prognosis in lung cancer have focused on RSs related to specific mechanisms, starting with functional molecules or cell components, such as immune infiltration (29–31), EMT scores (32), hypoxic or metabolic catabolites (33, 34), and non-coding RNA (35, 36). Little research has considered specific mutations.
Few gene signature studies have considered a scale-free property of the gene interaction network using WGCNA (37). WGCNA bridges the gap from individual genes to systems biology networks and can be utilized to identify hub genes that play key roles in the disease by converting the relationship between genes from a constant probability to the value adding a correlation weight (38). This allows the conversion from a random network to a scale-free network (39). Unfortunately, WGCNA on EGFR-WT NSCLC patients has not yet been reported.
In the present study, we used WGCNA to cluster the gene expression pattern of the EGFR-WT population, studied the co-expressed gene modules and correlated the gene modules with patient prognostic information for the first time. Due to the limited omics data containing information on mutations, the accuracy and comprehensiveness of our two-gene signature need to be validated in a large cohort of lung cancer patients. Also, we observed that the two-gene signature predicted the prognosis of patients without known driver gene mutations (EGFR/KRAS/ALK-WT) better than that of patients with EGFR-WT alone. Thus, the accumulation of next-generation sequencing (NGS) data in clinical tumor assessment and treatment may allow the further verification of a two-gene signature in a larger multi-omics database.
Considering the genes in WGCNA modules might be interrelated, they were not suitable for direct prognostic signature construction as this may result in multi-collinearity. We performed LASSO regression on the genes in the WGCNA modules to screen for the prognostic factors for the subsequent multivariate COX regression model. These analyses led to the selection of CYP11B1 and DNALI1 genes to construct a prognostic risk signature for EGFR-WT patients.
CYP11B1 encodes the cytochrome P450 family 11 subfamily B member 1 protein that is mostly expressed in the adrenal glands and is related to excessive cortisol secretion (40). In cancer, CYP11B1 is not only related to aldosterone- and cortisol-co-secreting adrenal tumors (41, 42), but it also affects the drug response of breast cancer (43), gastrointestinal tumors (44), leukemia (45), and other tumors. DNALI1 encodes dynein axonemal light intermediate chain 1 protein which is a human homologue of p28 in Chlamydia. DNALI1 is widely expressed in the human testis, ovaries and other tissues (28, 46), but the functions of DNALI1 in physiological processes and tumor development remain unclear. Limited studies have reported that DNALI1 is down-regulated in breast cancer (47) and negatively correlated with poor prognosis (48, 49). However, the expression pattern and function of both genes in lung cancer, as well as their influence on prognosis are unknown.
In this study, we analyzed the predictive significance of both molecules in the prognosis of NSCLC patients from online databases (Figure S5). The prognosis of patients with high expression of CYP11B1 or low expression of DNALI1 was significantly poor (P < 0.0001). Furthermore, the efficiency of prognostic prediction in the EGFR-WT population from RS based on the two-gene signature was better than that by individual genes (Figures S2E, F). The RS based on the two-gene signature had good predictive significance for the prognosis of EGFR-WT patients. This may provide a good reference value for clinical decision-making in EGFR-WT patients, particularly in patients without known driver mutations.
From the online drug database named CMap, HDACis were screened out as potential drugs to improve the prognosis of EGFR-WT patients with a high RS based on the two-gene signature.
CMap is a database developed by Broad Research Institute to reveal the functional relationships between small molecule compounds, genes, and disease states (50). CMap was employed to compare the differentially expressed gene list with the reference gene sets after specific treatments in the database. A correlation score (−100 to 100) was obtained according to the enrichment of differentially expressed genes in the reference gene expression profile. A positive score indicated that the up-regulation or down-regulation pattern of input genes was similar to the pattern of reference gene expression treated with different drugs. In contrast, a negative score indicated that the drugs regulated the expression of genes in an opposite direction. Finally, all treatments in the database were ranked according to the correlation score with the reference gene expression profile.
We found that among a total of 24 drug candidates with scores less than −90, 9 were HDACis, accounting for more than one-third of those identified. HDACis, such as romidepsin and vorinostat, are approved by the Food and Drug Administration (FDA) in cutaneous T cell lymphoma therapy (51, 52). However, although multiple preclinical studies have shown that HDAC inhibitors play a significant anti-cancer role in vitro or in animal models (53, 54), HDACis monotherapy clinical trials for lung cancer have failed (55–57). The results in the present study independently predicted that HDACis may be potential drugs for patients with high-RS EGFR-WT NSCLC and implicated the possibility of HDACis as single drugs for lung cancer therapy. Furthermore, HDACis were found to inhibit the proliferation and migration capacity of EGFR-WT NSCLC cells which was consistent with previous studies. Our research provides support for the independent application of HDAC inhibitors in EGFR wild-type NSCLC. Large-scale clinical trials should be carried out to confirm the efficacy of HDACis in EGFR-WT NSCLC patients.
In conclusion, the current study showed that a two-gene signature could effectively predict the survival of EGFR-WT patients, especially in NSCLC patients without known gene mutations. Our data also indicate that HDACis, such as TSA or SAHA, might be potentially effective clinical drugs for high-RS EGFR-WT patients. This study may fill the gap in lung cancer data analysis on one-specific mutant population, highlights the need for differential analysis of different oncomutations in cancer and also provides clues for the clinical treatment of EGFR-WT NSCLC patients.
Data Availability Statement
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.
Author Contributions
YW and CZ analyzed the data and drafted the manuscript and all figures. WL and DW helped interpreted the data. YCheng and YChen completed analysis of drug prediction. KH and YL completed the in vitro experiments. JQ edited language grammars and all tables. XC and XH designed the study and revised the manuscript. All authors contributed to the article and approved the submitted version.
Funding
This study was supported by the National Natural Science Foundation of China (Grant No. 81972197;No. 81472193), the Key Research and Development Program of Liaoning Province (2018225060), the Natural Science Foundation of Liaoning Province (2019-ZD-777), the Technological Special Project of Liaoning Province of China (2019020176- JH1/103), the Science and Technology Plan Project of Liaoning Province (No. 2013225585), Science and Technology Plan Project of Shenyang City (19–112–4–099).
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2021.620154/full#supplementary-material
References
1. Chen W, Zheng R, Baade PD, Zhang S, Zeng H, Bray F, et al. Cancer statistics in China, 2015. CA Cancer J Clin (2016) 66(2):115–32. doi: 10.3322/caac.21338
2. Carney DN. Lung cancer–time to move on from chemotherapy. N Engl J Med (2002) 346(2):126–8. doi: 10.1056/NEJM200201103460211
3. Gridelli C, Rossi A, Carbone DP, Guarize J, Karachaliou N, Mok T, et al. Non-small-cell lung cancer. Nat Rev Dis Primers (2015) 1:15009. doi: 10.1038/nrdp.2015.9
4. Mok TS, Wu YL, Thongprasert S, Yang CH, Chu DT, Saijo N, et al. Gefitinib or carboplatin-paclitaxel in pulmonary adenocarcinoma. N Engl J Med (2009) 361(10):947–57. doi: 10.1056/NEJMoa0810699
5. Wu YL, Zhou C, Hu CP, Feng J, Lu S, Huang Y, et al. Afatinib versus cisplatin plus gemcitabine for first-line treatment of Asian patients with advanced non-small-cell lung cancer harbouring EGFR mutations (LUX-Lung 6): an open-label, randomised phase 3 trial. Lancet Oncol (2014) 15(2):213–22. doi: 10.1016/S1470-2045(13)70604-1
6. Ramalingam SS, Vansteenkiste J, Planchard D, Cho BC, Gray JE, Ohe Y, et al. Overall Survival with Osimertinib in Untreated, EGFR-Mutated Advanced NSCLC. N Engl J Med (2020) 382(1):41–50. doi: 10.1056/NEJMoa1913662
7. Zhong WZ, Wang Q, Mao WM, Xu ST, Wu L, Shen Y, et al. Gefitinib versus vinorelbine plus cisplatin as adjuvant treatment for stage II-IIIA (N1-N2) EGFR-mutant NSCLC (ADJUVANT/CTONG1104): a randomised, open-label, phase 3 study. Lancet Oncol (2018) 19(1):139–48. doi: 10.1016/S1470-2045(17)30729-5
8. Yue D, Xu S, Wang Q, Li X, Shen Y, Zhao H, et al. Erlotinib versus vinorelbine plus cisplatin as adjuvant therapy in Chinese patients with stage IIIA EGFR mutation-positive non-small-cell lung cancer (EVAN): a randomised, open-label, phase 2 trial. Lancet Respir Med (2018) 6(11):863–73. doi: 10.1016/S2213-2600(18)30277-7
9. Azzoli CG, Temin S, Aliff T, Baker S Jr, Brahmer J, Johnson DH, et al. 2011 Focused Update of 2009 American Society of Clinical Oncology Clinical Practice Guideline Update on Chemotherapy for Stage IV Non-Small-Cell Lung Cancer. J Clin Oncol (2011) 29(28):3825–31. doi: 10.1200/JCO.2010.34.2774
10. Scagliotti GV, Parikh P, von Pawel J, Biesma B, Vansteenkiste J, Manegold C, et al. Phase III study comparing cisplatin plus gemcitabine with cisplatin plus pemetrexed in chemotherapy-naive patients with advanced-stage non-small-cell lung cancer. J Clin Oncol (2008) 26(21):3543–51. doi: 10.1200/JCO.2007.15.0375
11. Hanna N, Shepherd FA, Fossella FV, Pereira JR, De Marinis F, von Pawel J, et al. Randomized phase III trial of pemetrexed versus docetaxel in patients with non-small-cell lung cancer previously treated with chemotherapy. J Clin Oncol (2004) 22(9):1589–97. doi: 10.1200/JCO.2004.08.163
12. Takada K, Toyokawa G, Shoji F, Okamoto T, Maehara Y. The Significance of the PD-L1 Expression in Non-Small-Cell Lung Cancer: Trenchant Double Swords as Predictive and Prognostic Markers. Clin Lung Cancer (2018) 19(2):120–9. doi: 10.1016/j.cllc.2017.10.014
13. Aguiar PN Jr, De Mello RA, Hall P, Tadokoro H, Lima Lopes G. PD-L1 expression as a predictive biomarker in advanced non-small-cell lung cancer: updated survival data. Immunotherapy (2017) 9(6):499–506. doi: 10.2217/imt-2016-0150
14. Zhang B, Xie Z, Li B. The clinicopathologic impacts and prognostic significance of GLUT1 expression in patients with lung cancer: A meta-analysis. Gene (2019) 689:76–83. doi: 10.1016/j.gene.2018.12.006
15. Wei DM, Chen WJ, Meng RM, Zhao N, Zhang XY, Liao DY, et al. Augmented expression of Ki-67 is correlated with clinicopathological characteristics and prognosis for lung cancer patients: an up-dated systematic review and meta-analysis with 108 studies and 14,732 patients. Respir Res (2018) 19(1):150. doi: 10.1186/s12931-018-0843-7
16. Grant L, Banerji S, Murphy L, Dawe DE, Harlos C, Myal Y, et al. Androgen Receptor and Ki67 Expression and Survival Outcomes in Non-small Cell Lung Cancer. Horm Cancer (2018) 9(4):288–94. doi: 10.1007/s12672-018-0336-7
17. Chen G, Chen Q, Zeng F, Zeng L, Yang H, Xiong Y, et al. The serum activity of thioredoxin reductases 1 (TrxR1) is correlated with the poor prognosis in EGFR wild-type and ALK negative non-small cell lung cancer. Oncotarget (2017) 8(70):115270–9. doi: 10.18632/oncotarget.23252
18. Chen HY, Yu SL, Chen CH, Chang GC, Chen CY, Yuan A, et al. A five-gene signature and clinical outcome in non-small-cell lung cancer. N Engl J Med (2007) 356(1):11–20. doi: 10.1056/NEJMoa060096
19. Kadara H, Behrens C, Yuan P, Solis L, Liu D, Gu X, et al. A five-gene and corresponding protein signature for stage-I lung adenocarcinoma prognosis. Clin Cancer Res (2011) 17(6):1490–501. doi: 10.1158/1078-0432.CCR-10-2703
20. Okayama H, Schetter AJ, Ishigame T, Robles AI, Kohno T, Yokota J, et al. The expression of four genes as a prognostic classifier for stage I lung adenocarcinoma in 12 independent cohorts. Cancer Epidemiol Biomarkers Prev (2014) 23(12):2884–94. doi: 10.1158/1055-9965.EPI-14-0182
21. Zhang B, Horvath S. A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol (2005) 4:Article17. doi: 10.2202/1544-6115.1128
22. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics (2008) 9:559. doi: 10.1186/1471-2105-9-559
23. Link VM, Duttke SH, Chun HB, Holtman IR, Westin E, Hoeksema MA, et al. Analysis of Genetically Diverse Macrophages Reveals Local and Domain-wide Mechanisms that Control Transcription Factor Binding and Function. Cell (2018) 173(7):1796–809 e17. doi: 10.1016/j.cell.2018.04.018
24. Zhang XJ, Cheng X, Yan ZZ, Fang J, Wang X, Wang W, et al. An ALOX12-12-HETE-GPR31 signaling axis is a key mediator of hepatic ischemia-reperfusion injury. Nat Med (2018) 24(1):73–83. doi: 10.1038/nm.4451
25. Doering TA, Crawford A, Angelosanto JM, Paley MA, Ziegler CG, Wherry EJ. Network analysis reveals centrally connected genes and pathways involved in CD8+ T cell exhaustion versus memory. Immunity (2012) 37(6):1130–44. doi: 10.1016/j.immuni.2012.08.021
26. Subramanian A, Narayan R, Corsello SM, Peck DD, Natoli TE, Lu X, et al. A Next Generation Connectivity Map: L1000 Platform and the First 1,000,000 Profiles. Cell (2017) 171(6):1437–52 e17. doi: 10.1016/j.cell.2017.10.049
27. Riely GJ, Politi KA, Miller VA, Pao W. Update on epidermal growth factor receptor mutations in non-small cell lung cancer. Clin Cancer Res (2006) 12(24):7232–41. doi: 10.1158/1078-0432.CCR-06-0658
28. Fukuoka M, Wu YL, Thongprasert S, Sunpaweravong P, Leong SS, Sriuranpong V, et al. Biomarker analyses and final overall survival results from a phase III, randomized, open-label, first-line study of gefitinib versus carboplatin/paclitaxel in clinically selected patients with advanced non-small-cell lung cancer in Asia (IPASS). J Clin Oncol (2011) 29(21):2866–74. doi: 10.1200/JCO.2010.33.4235
29. Li B, Cui Y, Diehn M, Li R. Development and Validation of an Individualized Immune Prognostic Signature in Early-Stage Nonsquamous Non-Small Cell Lung Cancer. JAMA Oncol (2017) 3(11):1529–37. doi: 10.1001/jamaoncol.2017.1609
30. Song Q, Shang J, Yang Z, Zhang L, Zhang C, Chen J, et al. Identification of an immune signature predicting prognosis risk of patients in lung adenocarcinoma. J Transl Med (2019) 17(1):70. doi: 10.1186/s12967-019-1824-4
31. Chen F, Yang Y, Zhao Y, Pei L, Yan H. Immune Infiltration Profiling in Nonsmall Cell Lung Cancer and Their Clinical Significance: Study Based on Gene Expression Measurements. DNA Cell Biol (2019) 38(11):1387–401. doi: 10.1089/dna.2019.4899
32. Chae YK, Chang S, Ko T, Anker J, Agte S, Iams W, et al. Epithelial-mesenchymal transition (EMT) signature is inversely associated with T-cell infiltration in non-small cell lung cancer (NSCLC). Sci Rep (2018) 8(1):2918. doi: 10.1038/s41598-018-21061-1
33. Heiden BT, Chen G, Hermann M, Brown RKJ, Orringer MB, Lin J, et al. 18F-FDG PET intensity correlates with a hypoxic gene signature and other oncogenic abnormalities in operable non-small cell lung cancer. PloS One (2018) 13(7):e0199970. doi: 10.1371/journal.pone.0199970
34. Lim SL, Jia Z, Lu Y, Zhang H, Ng CT, Bay BH, et al. Metabolic signatures of four major histological types of lung cancer cells. Metabolomics (2018) 14(9):118. doi: 10.1007/s11306-018-1417-x
35. Lin T, Fu Y, Zhang X, Gu J, Ma X, Miao R, et al. A seven-long noncoding RNA signature predicts overall survival for patients with early stage non-small cell lung cancer. Aging (Albany NY) (2018) 10(9):2356–66. doi: 10.18632/aging.101550
36. Wang X. Improving microRNA target prediction by modeling with unambiguously identified microRNA-target pairs from CLIP-ligation studies. Bioinformatics (2016) 32(9):1316–22. doi: 10.1093/bioinformatics/btw002
37. Xie H, Xie C. A Six-Gene Signature Predicts Survival of Adenocarcinoma Type of Non-Small-Cell Lung Cancer Patients: A Comprehensive Study Based on Integrated Analysis and Weighted Gene Coexpression Network. BioMed Res Int (2019) 2019:4250613. doi: 10.1155/2019/4250613
38. Duan H, Ge W, Zhang A, Xi Y, Chen Z, Luo D, et al. Transcriptome analyses reveal molecular mechanisms underlying functional recovery after spinal cord injury. Proc Natl Acad Sci U S A (2015) 112(43):13360–5. doi: 10.1073/pnas.1510176112
39. Barabasi AL, Oltvai ZN. Network biology: understanding the cell’s functional organization. Nat Rev Genet (2004) 5(2):101–13. doi: 10.1038/nrg1272
40. Zhu W, Chen Z, Li Q, Tan G, Hu G. Inhibitors of 11beta-Hydroxylase (CYP11B1) for Treating Diseases Related to Excess Cortisol. Curr Med Chem (2016) 23(6):623–33. doi: 10.2174/0929867323666160122114947
41. Murakami M, Rhayem Y, Kunzke T, Sun N, Feuchtinger A, Ludwig P, et al. In situ metabolomics of aldosterone-producing adenomas. JCI Insight (2019) 4(17):e130356. doi: 10.1172/jci.insight.130356
42. Fallo F, Castellano I, Gomez-Sanchez CE, Rhayem Y, Pilon C, Vicennati V, et al. Histopathological and genetic characterization of aldosterone-producing adenomas with concurrent subclinical cortisol hypersecretion: a case series. Endocrine (2017) 58(3):503–12. doi: 10.1007/s12020-017-1295-4
43. Brixius-Anderko S, Scott EE. Structure of human cortisol-producing cytochrome P450 11B1 bound to the breast cancer drug fadrozole provides insights for drug design. J Biol Chem (2019) 294(2):453–60. doi: 10.1074/jbc.RA118.006214
44. Ravegnini G, Urbini M, Simeon V, Genovese C, Astolfi A, Nannini M, et al. An exploratory study by DMET array identifies a germline signature associated with imatinib response in gastrointestinal stromal tumor. Pharmacogenomics J (2019) 19(4):390–400. doi: 10.1038/s41397-018-0050-4
45. Haznedaroglu IC, Malkan UY. Local bone marrow renin-angiotensin system in the genesis of leukemia and other malignancies. Eur Rev Med Pharmacol Sci (2016) 20(19):4089–111.
46. Fagerberg L, Hallstrom BM, Oksvold P, Kampf C, Djureinovic D, Odeberg J, et al. Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics. Mol Cell Proteomics (2014) 13(2):397–406. doi: 10.1074/mcp.M113.035600
47. Parris TZ, Danielsson A, Nemes S, Kovacs A, Delle U, Fallenius G, et al. Clinical implications of gene dosage and gene expression patterns in diploid breast carcinoma. Clin Cancer Res (2010) 16(15):3860–74. doi: 10.1158/1078-0432.CCR-10-0889
48. Tian W, Li Y, Zhang J, Li J, Gao J. Combined analysis of DNA methylation and gene expression profiles of osteosarcoma identified several prognosis signatures. Gene (2018) 650:7–14. doi: 10.1016/j.gene.2018.01.093
49. Ye Z, Wang F, Yan F, Wang L, Li B, Liu T, et al. Bioinformatic identification of candidate biomarkers and related transcription factors in nasopharyngeal carcinoma. World J Surg Oncol (2019) 17(1):60. doi: 10.1186/s12957-019-1605-9
50. Lamb J, Crawford ED, Peck D, Modell JW, Blat IC, Wrobel MJ, et al. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science (2006) 313(5795):1929–35. doi: 10.1126/science.1132939
51. Mann BS, Johnson JR, Cohen MH, Justice R, Pazdur R. FDA approval summary: vorinostat for treatment of advanced primary cutaneous T-cell lymphoma. Oncologist (2007) 12(10):1247–52. doi: 10.1634/theoncologist.12-10-1247
52. Grant C, Rahman F, Piekarz R, Peer C, Frye R, Robey RW, et al. Romidepsin: a new therapy for cutaneous T-cell lymphoma and a potential therapy for solid tumors. Expert Rev Anticancer Ther (2010) 10(7):997–1008. doi: 10.1586/era.10.88
53. Shieh JM, Tang YA, Hu FH, Huang WJ, Wang YJ, Jen J, et al. A histone deacetylase inhibitor enhances expression of genes inhibiting Wnt pathway and augments activity of DNA demethylation reagent against nonsmall-cell lung cancer. Int J Cancer (2017) 140(10):2375–86. doi: 10.1002/ijc.30664
54. Wu YF, Ou CC, Chien PJ, Chang HY, Ko JL, Wang BY. Chidamide-induced ROS accumulation and miR-129-3p-dependent cell cycle arrest in non-small lung cancer cells. Phytomedicine (2019) 56:94–102. doi: 10.1016/j.phymed.2018.09.218
55. Traynor AM, Dubey S, Eickhoff JC, Kolesar JM, Schell K, Huie MS, et al. Vorinostat (NSC# 701852) in patients with relapsed non-small cell lung cancer: a Wisconsin Oncology Network phase II study. J Thorac Oncol (2009) 4(4):522–6. doi: 10.1097/jto.0b013e3181952478
56. Reid T, Valone F, Lipera W, Irwin D, Paroly W, Natale R, et al. Phase II trial of the histone deacetylase inhibitor pivaloyloxymethyl butyrate (Pivanex, AN-9) in advanced non-small cell lung cancer. Lung Cancer (2004) 45(3):381–6. doi: 10.1016/j.lungcan.2004.03.002
Keywords: non-small cell lung cancer, histone deacetylase inhibitor, WGCNA, EGFR wild type, metastasis, proliferation
Citation: Wang Y, Zheng C, Lu W, Wang D, Cheng Y, Chen Y, Hou K, Qi J, Liu Y, Che X and Hu X (2021) Bioinformatics-Based Identification of HDAC Inhibitors as Potential Drugs to Target EGFR Wild-Type Non-Small-Cell Lung Cancer. Front. Oncol. 11:620154. doi: 10.3389/fonc.2021.620154
Received: 22 October 2020; Accepted: 26 January 2021;
Published: 08 March 2021.
Edited by:
Dong-Hua Yang, St. John’s University, United StatesReviewed by:
Wenliang Li, University of Texas Health Science Center at Houston, United StatesGloryn Chia, National University of Singapore, Singapore
Jun Chen, Second Affiliated Hospital of Dalian Medical University, China
Copyright © 2021 Wang, Zheng, Lu, Wang, Cheng, Chen, Hou, Qi, Liu, Che and Hu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Xuejun Hu, xjhu@cmu.edu.cn; Xiaofang Che, xfche@cmu.edu.cn
†These authors have contributed equally to this work