- Department of Oral and Maxillofacial Surgery, NanFang Hospital, Southern Medical University, Guangzhou, China
Objectives: Oral squamous cell carcinoma (OSCC) is the most common oral cancer with a poor prognosis owing to limited understanding of the disease mechanisms. The aim of this study was to explore and identify the potential biomarkers in OSCC by integrated bioinformatics analysis.
Materials and Methods: Expression profiles of long non-coding RNAs (lncRNAs), microRNAs (miRNAs), and messenger RNAs (mRNAs) were downloaded from The Cancer Genome Atlas (TCGA) and differentially expressed RNAs (DERNAs) were subsequently identified in OSCC by bioinformatics analysis. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were used to analyze DERNAs. Then, the competing endogenous RNA (ceRNA) network was constructed in Cytoscape and the protein -protein interaction (PPI) network was established in the STRING database. We established a risk model to predict the overall survival of OSCC on the basis of DElncRNAs with Kaplan–Meier analysis and combined with logrank p test. Furthermore, we identified potential biomarkers by combining univariate Cox regression with overall survival rate, which were then validated in Gene Expression Omnibus (GEO), OSCC cell lines and OSCC specimens.
Results: A total of 1,919 DEmRNAs, 286 DElncRNAs and 111 DEmiRNAs were found to be dysregulated in OSCC. A ceRNA network included 46 DElncRNAs,7 DEmiRNAs and 10 DEmRNAs, and the PPI network included 712 DEmRNAs including 31 hub genes. Moreover, a 7 lncRNAs risk model was established and four genes (CMA1, GNA14, HCG22, HOTTIP) were identified as biomarkers on overall survival in patients with OSCC.
Conclusions: This study successfully constructed a ceRNA network and a PPI network which play a crucial role in OSCC. A risk model was established to predict the prognosis, and four DERNAs are revealed with overall survival in patients with OSCC, suggesting that they may be potential biomarkers in tumor diagnosis and treatment.
Introduction
Oral squamous cell carcinoma (OSCC) is one of the most common oral cancers worldwide (1) and has the most severe impact on the quality of life of patients. The most significant risk factors in OSCC are cigarettes, alcohol and betel nut consumption, which seem to have a synergistic effect. One recent study showed that it may be associated with the infection of human papillomavirus (HPV) (2). Although medical equipment and treatment methods are more advanced in recently years, the 5-year overall survival rate of OSCC remains only 40–50% (3) on account of relatively low responsiveness to treatment, severe drug resistance (4), diagnosis at terminal stage, etc., whereas the 5-year overall survival rate can rise markedly to more than 85% in patients diagnosed with early-staged tumors (5). Therefore, early diagnosis is very important for improving the prognosis of patients with OSCC.
As is known, more than 70% of the human genome can be transcribed to functional products (6). Among them, long non-coding RNAs (lncRNAs), RNA transcripts longer than 200 nucleotides and those with limited protein-coding potential are in the majority. LncRNAs affect various aspects of cellular homeostasis in OSCC, including proliferation, survival, migration, and genomic stability (7). Moreover, cancer-specific lncRNA expression patterns appear more tissue- and stage-specific than those of protein-coding genes, indicating the potential development of lncRNAs as powerful alternative biomarkers and therapeutic targets (8). Meanwhile, recent research demonstrated that lncRNA combined with mRNA biomarkers may improve diagnosis (9). However, there are fewer effective lncRNAs and mRNAs biomarkers used for diagnosis in OSCC.
In 2011, Salmena et al. put forward a competing endogenous RNA (ceRNA) hypothesis (10). MiRNA as a regulatory molecule regulates mRNA expression by targeting the 3'-UTR (11), which typically inhibits the translation and the stability of mRNAs (12). However, lncRNAs can compete with the miRNA binding to the mRNA to regulate gene expression. Recently, increasing evidence indicated that this hypothesis was closely related to the development and initiation of OSCC. For example, lncRNA TUG1 through sponging miR-524-5p to mediate DLX1 expression promotes proliferative and migratory potentials in OSCC (13). In addition, protein-protein interaction (PPI) also plays a crucial role in various cancers. PPI is composed of proteins interacting with each other to participate in various biological processes such as biological signal transmission, regulation of gene expression, energy metabolism, and cell cycle regulation. For example, c-Myc, MMP-2, and GSK3β regulated the expression of MMP9 to accelerate OSCC progression and invasion (14). However, fewer studies have been reported on the ceRNA and PPI networks in OSCC.
In this study, we aim to analyze the differentially expressed profile of non-coding RNAs(ncRNAs) in the OSCC and establish a Cox regression model to predict the overall survival of patients with OSCC. Further, we analyzed and predicted the functions of the ceRNA and PPI networks. This study will contribute to understanding the molecular mechanism and provide new therapeutic targets for OSCC.
Materials and Methods
Data Preparation
All primitive data of OSCC (oral cavity, floor of mouth, palate, buccal mucosa, the anterior 2/3 of the tongue, gingiva, and so on) from The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/) were downloaded through GDC Data Transfer Tool, including the RNA-Seq and miRNA-Seq of Transcriptome Profiling and Clinical data. Then, we excluded three samples because of their low-quality clinical data. Finally, 316 OSCC samples and 32 normal control samples were collected in our study. Gene expression profiles of OSCC were obtained from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/), including GSE9844 (26 OSCC samples and 12 normal controls), GSE30784 (167 OSCC samples and 45 normal controls), and GSE74530 (6 OSCC samples and 6 normal controls).
Collection of OSCC Specimens
A total of 49 pairs of OSCC specimens and normal adjacent tissues (about more than 1.5 cm from the edge of the tumor) were collected at Nanfang Hospital, Southern Medical University (Guangzhou, China), and written informed consent was obtained from all patients. The tissue specimens were stored in RNA Wait and then frozen at −80°C. All tumor tissues and normal adjacent tissues were pathologically confirmed as squamous cell carcinoma and normal tissues, respectively.
Differentially Expressed Gene Analysis
EdgeR (Version 3.8) package in R software was used to analyze and identify differentially expressed RNAs (DERNAs) and differentially expressed miRNAs (DEmiRNAs) with the thresholds of |log2 (fold change [FC])|>2.0 and FDR (adjusted P-value) <0.01 (15). Then, we used the annotation file in GTF format (Homo_sapiens.GRCh38.95.chr.gtf) to identify and annotate differentially expressed long non-coding RNA (DElncRNAs) with the thresholds of |log2FC|>2.0 and FDR <0.01. The heatmap and volcano were constructed by the gplots package in R software.
Functional Enrichment Analysis
We employed DAVID (https://david.ncifcrf.gov/) to obtain information for Gene Ontology (GO) including biological processes, the cellular component and molecular function. The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis was used to annotate the potential functions. A significance level of P < 0.05 was set as the cutoff criteria and the plots were constructed by the gplots package in R software.
Protein-Protein Interaction Analysis
The DEmRNAs were enrolled in a protein-protein interaction (PPI) network through the STRING database (https://string-db.org/) with a confidence score >0.9, and the PPI network was visualized in Cytoscape (Version 3.7.1) software. Moreover, genes with degree> = 25 were selected as hub genes. Subsequently, module analysis (16) of the PPI network was performed using the Molecular Complex Detection (MCODE) tool of Cytoscape software, and GO and KEGG analysis of the modules was carried out using the DAVID database.
Construction of the ceRNA Network
According to the hypothesis of ceRNA, a lncRNA-miRNA-mRNA network was constructed. Relevant miRNA-target data were obtained from the miRcode database (http://www.mircode.org/) (17). Then, the DElncRNA-DEmiRNA interactions were predicted according to the miRcode database. Furthermore, target DEmRNAs were predicted for DEmiRNAs using miRDB (http://www.mirdb.org/) (18), miRTarBase(http://mirtarbase.mbc.nctu.edu.tw/) (19) and TargetScan database (http://www.targetscan.org/) (20), and only the miRNA-mRNA interactions that existed in all the three databases were enrolled in the ceRNA network. Eventually, Cytoscape (Version 3.7.1) was employed to establish the lncRNA-miRNA-mRNA network.
Cox Risk Regression Establishment and Validation
The lncRNAs raw data were transformed and normalized in a log2(x+1) manner (21). OSCC samples were randomly divided into a training set and a validation set. Univariate Cox regression was used to select prognosis-associated genes (p < 0.05). Subsequently, we performed Cox regression analysis combined with LASSO to establish a prognostic risk score model, and the penalty regularization parameter lambda (λ) was chosen through cross-validation with an n-fold equal to 10 by using the R package glmnet (21). Lambda.min was identified to pick out the variables. According to these variables, a stepwise regression was performed to establish the Cox model. Finally, a validation set and Kaplan–Meier survival curves along with a logrank p test were applied to validate its accuracy. In addition, receiver operating characteristic (ROC) analysis was used to estimate the predictive power of this signature.
Cell Culture
The human OSCC cell lines SCC9, SCC15, SCC25, CAL27, and KB and the normal oral epithelial cell line HOK were obtained from the Institute of Antibody Engineering, Southern Medical University (Guangzhou, China). Cells HOK, SCC9, SCC15, and SCC25 were cultured in Dulbecco's modified Eagle's medium F12 (DMEM/F12) (Invitrogen, Carlsbad, CA, USA), CAL27 in α-MEM (Invitrogen, Carlsbad, CA, USA) and KB in RPMI 1640 (Invitrogen, Carlsbad, CA, USA), supplemented with 10% fetal bovine serum (FBS) at 37 °C with 5% CO2.
RNA Extraction and qRT-PCR
Total RNA was extracted from tissues and cells in a TRIzol reagent (Takara) manner. RNA reverse transcription to cDNA was performed with a Reverse Transcription Kit (Vazyme). Quantitative real-time Polymerase Chain Reaction (qRT-PCR) analyses used SYBR Green I (Vazyme) in triplicate. The results were normalized to the expression of GAPDH. The primer sequences are as follows. HCG22 forward primer (5′-3′): CTTCTGCTGCTCCTGCTTCT; reverse primer (5′-3′): ACTCCATCTCTCCAGGTCCC. HOTTIP forward primer (5′-3′): CCTAAAGCCA CGCTTCTTTG; reverse primer (5′-3′): TGCAGGCTGGAGATCCTACT. GNA14 forward primer (5′-3′) CCCA ACAAGATGTGCTTCGC; reverse primer (5′-3′) TCCGTCTTTCCGATCGTTGG. CMA1 forward primer (5′-3′) TCAGCTGTGTGTGGGCAATC; reverse primer (5′-3′) CTTTGCATCCG ACCGTCCAT. GAPDH forward primer (5′-3′): CGCTGAGTACGTCGTGGAGTC; GAPDH reverse primer: (5′-3′) GCTGATGATCTTGAGGCTGTTGTC.
Western Blot Analysis
Cells and OSCC tissues were lysed in RIPA lysis buffer. According to the procedure, proteins were separated by electrophoresis, transferred to membranes and then sealed with 5% skim milk. The primary antibodies including CMA1 (dilution 1:1,000), GNA14 (dilution 1:1,000) and α-tublin (dilution 1:1,000) were incubated in 4°C for one night. Subsequently, goat anti-mouse and goat anti-rabbit secondary antibodies were incubated for 1 h at room temperature, and finally immunoreactive bands were visualized with a chemiluminescence system.
Biomarkers Screening and Validation
The status and survival time of OSCC patients were extracted. Subsequently, the mRNA was enrolled in the PPI and ceRNA networks, and lncRNA and miRNA identified in ceRNA were selected for screening biomarkers. We used univariate Cox regression to screen prognostic factors (p < 0.05), and those prognostic factors whose expression levels were significantly relevant to patients' overall survival (p < 0.01) were selected as primitive biomarkers. Ultimately, the gene expression profile from the GEO and OSCC cell lines and tissues were used to verify these biomarkers. In addition, a combination of two or more biomarkers was performed to predict OSCC overall survival according to the gene expression in TCGA.
Statistical Analysis
Statistical analyses were performed using SPSS23.0 software (IBM). The differences between groups were tested using a two-tailed Student's t-test. The survival analysis between two groups was conducted by logrank test. P-values <0.05 were considered statistically significant. Differences were considered significant if *p < 0.05; **p < 0.01; ***p < 0.001; or ****p < 0.0001.
Results
Identification of DEmRNA, DEmiRNAs, and DElncRNAs
RNA expression profiles and corresponding clinical data of 316 OSCC patients and 32 normal controls were downloaded from TCGA database. With the cut-off criteria unified, |log2FoldChange|>2 and FDR <0.01, 1919 DEmRNAs (673 upregulated and 1246 downregulated, Figure 1A), 286 DElncRNAs (192 upregulated and 94 downregulated, Figure 1B) and 111 DEmiRNAs (61 upregulated and 50 downregulated, Figure 1C) were sorted out.
Figure 1. Distributions of differentially expressed genes in oral squamous cell carcinoma (OSCC) (|log2FC| >2.0 and adjusted P-value < 0.01) between 316 tumor tissues and 32 normal tissues. The ascending normalized expression level in the heatmaps is colored from green to red. Red means gene upregulation, green indicates downregulation and black means normal expression. Furthermore, each column represents a sample and each row represents a differentially expressed gene. The heatmaps plot 1919 DEmRNAs (A), 192 DElncRNAs (B), and 111 DEmiRNAs (C). Similar with heatmaps, red stands for upregulations, green stands for downregulation and black stands for normal expression in volcanoes. Each point represents a gene.
Functional Analysis of DERNAs
GO and KEGG enrichment analysis were used to explore the potential function of DERNAs. The results indicated that these genes were mainly enriched in tissue development, cell differentiation and calcium binding (Figures 2A,B). Moreover, the majority of genes were located in the extracellular region. In addition, KEGG pathway analyses demonstrate that the most significant pathways were the calcium signaling, protein digestion and absorption and focal adhesion pathways (Figures 2C,D).
Figure 2. GO and KEGG pathway analysis of DERNAs. (A) The numbers of genes enriched in each GO category. The y axis represents the GO categories including the biological process, cellular component and molecular function, and the x axis represents the enrichment score. Furthermore, the color stands for p-value. (B) GO analysis contains the biological process (BP, in green), cellular component (CC, in blue) and molecular function (MF, in purple). The y axis represents the target gene and the x axis represents the biological process. (C) The most important pathways in DERNAs. The y axis represents the pathways and the x axis represents enriched gene numbers, and the color means adjust P-value. (D) The Netplot of KEGG pathways, mean enrichment of genes in different pathways. The number adjacent to nodes stands for gene ID.
Protein-Protein Interaction (PPI) Network Analysis
A total of 712 proteins and 3,181 edges were selected in the PPI network. The confidence score is >0.9. A total of 31 hub genes were selected from PPI network with degree> = 25 (Figure 3A). Among them, 6 mRNAs including GNA14, GRM5, KRT3, KRT26, TACR1, and HTR2C were related to overall survival. In addition, according to module analysis, three modules were identified in the PPI network (Figures 3B–E). Surprisingly, most hub genes, including all of these associated with overall survival (Supplementary Figure 1), were enriched in module 2 indicating that module 2 plays a vital role in the PPI network. According to the GO terms analysis, three modules related to cell adhesion, tissue development, and protein binding played a crucial role in cancer. With respect to KEGG enrichment analysis, modules 1 and 2 regulated metabolic pathways such as protein digestion and absorption (Tables 1, 2). Moreover, modules 1 and 3 were significantly relevant to the PI3K-Akt signaling and calcium signaling pathways, which were associated with the occurrence and development of tumors (Table 3).
Figure 3. Protein-protein network (PPI) based on the DEmRNAs with a combined score was >0.90. (A) 31 hug genes were listed because the degree was >25 in the PPI network. (B) Module analysis of DEmRNA enrolled in PPI network with the criterion cut. Height = 0.8, min size = 10. Same color means it belongs to the same module. And 3 modules were visualized in Cytoscape (C–E). The connection between the nodes indicates the potential interaction between different mRNA, and red represents the hub gene in the PPI network. Meanwhile, GO and KEGG analysis of 3 modules was performed in DAVID (Tables 1–3).
Construction of ceRNA Network in OSCC
A total of 46 lncRNAs, 10 mRNAs and 7 miRNAs were enrolled in the ceRNA network (Figure 4A). We employed miRcode to predict the potential miRNA target by DElncRNAs. As a result, 46 of 286 DElncRNAs and 7 of 111 DEmiRNAs formed 119 lncRNA-miRNA pairs (Table 4). Then, miRDB, miRanda and TargetScan were used to predict the miRNA-mRNA pairs. Only the miRNA-mRNA interactions that exist in all three databases were brought into the ceRNA network (Table 5). Finally, there were 10 DEmRNAs could target to the 7 miRNAs (Figure 4B). Therefore, according to Table 6, the ceRNA network including 46 lncRNAs, 10 mRNAs and 7 miRNAs was completely constructed, and the lncRNA-miRNA-mRNA network was visualized in Cytoscape (Version 3.7.1).
Figure 4. CeRNA network was visualized in Cytoscape (A). The yellow represents miRNA upregulation and purple downregulation. The red means lncRNA upregulation and green downregulation. The mazarine indicates mRNA upregulation and blue downregulation. Gray edges indicate lncRNA-miRNA-mRNA interactions. (B) Venn diagram of DEmiRNAs might target DEmRNAs.
Table 5. The miRDB, miRanda, and TargetScan database revealed interactions between DEmiRNAs and DEmRNAs.
Establishment and Validation of Cox Regression Model
A total of 316 OSCC samples were randomly divided into a training set and validation set. Subsequently, combined univariate Cox regression with a LASSO Cox regression model along with 10-fold cross-validation, 11 variables (AC073130.1, AFAP1-AS1, AQP4-AS1, C11orf97, HOTTIP, HOXA11-AS, LINC00460, LINC01234, LINC01391, SLC8A1-AS1, and SRGAP3-AS2) were identified (Figures 5A,B). Furthermore, a stepwise regression was performed according to the 11 lncRNAs. Consequently, 7 lncRNAs including AFAP1-AS1, AQP4-AS1, C11orf97, HOTTIP, LINC00460, LINC01234, and SLC8A1-AS1 were harvested in Cox regression. Risk score = (0.06844*AFAP1-AS1)–(1.7559* C11orf97) +(0.18417*HOTTIP)+(0.16046* LINC00460)+ (0.09473*LINC01234)–(0.20486*SLC8A1-AS1)- (0.20076*AQP4-AS1) (Figure 5C). Afterwards, the OSCC patients were divided into high risk and low risk group based on median value of Cox regression model. The distribution of the risk score along with the corresponding survival data and the 7 lncRNAs expression demonstrated that the high-risk OSCC patients tended to experience shorter survival time, and low-risk patients were opposite (Figures 5D,E). Correspondingly, the protective genes in high risk group expression level were lower than low group. On the contrary, risky genes were higher in high-risk group (Figure 5F).
Figure 5. (A) LASSO coefficient profiles of the genes associated with the overall survival of OSCC. (B) Partial likelihood deviance was plotted vs. log(c). The vertical dotted line indicates the lambda value with the minimum error and the largest lambda value, where the deviance is within one SE of the minimum. (C) Forest map based on the risk score model. Left vertical dotted line indicates protective genes and right risk genes. (D) Overview of risk evaluation models. The y axis represents percentage and x axis log2 (risk score). (E) The scatter diagram based on survival time and log2 (risk score). The red means death and green life. The higher log2 (risk score) is, the shorter the time survival. (F) Differentially expressed lncRNAs which were enrolled in the risk model heatmap.
Kaplan–Meier curves along with logrank p test were used to evaluate its accuracy in the training and validation set. According to the risk formula, obvious differences in survival analysis were determined in high- and low-risk groups (Supplementary Figures 2A,B). Meanwhile, stratified analysis in disparate grades was further carried out and indicated that risk level was relevant to prognosis (Supplementary Figures 2C,D). Subsequently, the ROC was plotted, and its area Under the Curve (AUC) is 0.776 (Supplementary Figure 2E).
Screening Biomarkers
A total of 6 genes (GNA14, CMA1, DKK1, HOXC6, HCG22, and HOTTIP) were identified as primitive biomarkers. The results of univariate Cox regression indicated that there were 36 mRNAs (Table 7-3) and 6 lncRNAs (Table 7-1) regarded as prognostic factors (p < 0.05). However, none of the miRNAs were related to prognosis (Table 7-2). Meanwhile, following the combined Kaplan–Meier curves with logrank p test, the genes were clearly associated with overall survival (p < 0.01) and selected as primitive biomarkers. Finally,4 mRNAs (Figures 6A–D) and 2 lncRNAs (Figures 6E,F) were identified. Subsequently, GEO expression profiles were used to verify the 6 biomarkers. As expected, most of the biomarker expression levels in GEO were consistent with TCGA. However, lncRNA HOTTIP upregulated in TCGA and there was no difference in GEO.
Figure 6. Combining Kaplan–Meier survival analysis with univariate Cox regression to screen biomarkers in OSCC patients. Kaplan–Meier survival curves and GEO gene expression profiles for the three protective genes, GNA14 (A), CMA1 (B) and HCG22 (C), and risky genes DKK1 (D), HOXC6 (E), and HOTTIP (F) were associated with overall survival in OSCC. GNA14:G protein subunit alpha 14; CMA1:chymase 1; DKK1: dickkopf WNT signaling pathway inhibitor 1; HOXC6:homeobox C6; HOTTIP:HOXA distal transcript antisense RNA; HCG22:HLA complex group 22.
Validation for Biomarkers
A total of 2 mRNAs (GNA14 and CMA1) and 2 lncRNAs (HCG22 and HOTTIP) were differentially expressed in OSCC cell lines and OSCC tissues. Our results revealed that CMA1, GNA14, and HCG22 had low expression in OSCC cell lines and tissues. However, lncRNA HOTTIP was highly expressed in tumor tissues compared with adjacent normal tissues. Meanwhile, Kaplan- Meier analysis suggested that low expression of GNA14, CMA1, and HCG22 were related to poor survival (Figures 7A–C). High HOTTIP expression showed obviously poorer overall survival than those with low HOTTIP expression (Figure 7D). Unfortunately, there was no difference for DKK1 and HOXC6 in OSCC tissues, indicating that the 2 mRNAs might not be biomarkers in OSCC (Supplementary Figure 3A). Meanwhile, the protein levels of CMA1 and GNA14 were detected in OSCC cell lines and tissues (Supplementary Figure 3B). Finally, a combination of these biomarkers can predict overall survival better (Supplementary Figures 3A,B).
Figure 7. The expression levels of 2 mRNAs, GNA14 (A) and CMA1 (B), and 2 lncRNAs HCG22 (C), and HOTTIP (D) in 5 OSCC cell lines and 49 pairs adjacent tissues and tumor tissues. HOK is used as control. Then, Kaplan–Meier analysis along with logrank p was used to compare the survival of the low expression group and the high expression group. However, mRNA levels of DKK1 (E) and HOXC6 (F) showed no difference in OSCC tissues. *p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001.
Discussion
In the past decades, the 5-year survival rate of OSCC has remained low (3) in spite of advances in surgical treatment and radiotherapy. Hence, it is vital to find and identify biomarkers for early diagnosis and prognosis of OSCC.
In this study, a total of 1,919 DEmRNAs, 286 DElncRNAs and 111 DEmiRNAs were identified. GO analysis revealed that the function of DERNAs is mainly associated with tissue development, cell differentiation and calcium binding, which play a vital role in tumorigenesis. In addition, KEGG pathways analysis showed that DERNAs mainly enriched in the protein digestion and absorption, calcium signaling and focal adhesion pathways. Among these pathways, the calcium signaling and focal adhesion pathways are significantly associated with cancers. For example, abnormal Ca2+ level is related to glioblastoma and gastric adenocarcinoma (22, 23). In addition, research shows that focal adhesion is relevant to therapy resistance and plays a vital role in carcinogenesis, tumor progression and metastasis (24).
In the present study, 712 mRNAs were enrolled in the PPI network, and module analysis was performed. The majority of these genes were relevant to the PI3K-Akt signaling and calcium signaling pathways, which are associated with occurrence and development of tumors (25, 26). Meanwhile, the PI3K-Akt signaling pathway also played a significant role in OSCC (27). Six hub genes associated with overall survival—GNA14, GRM5, KRT3, KRT26, TACR1, and HTR2C—were enriched in module 2, indicating that module 2 plays a vital role in the PPI network and OSCC prognosis. To our surprise, GNA14 was identified as a potential biomarker by PCR assay and as a hub gene by bioinformatics analysis, which indicated that GNA14 may play a crucial role in OSCC carcinogenesis.
In our ceRNA network, 2 lncRNAs (HOTIP,HCG22) were identified as prognostic biomarkers. Recently, increasing studies showed that aberrant expression of HOTTIP is associated with various cancers. For instance, knockdown of HOTTIP suppresses growth and invasion and induces apoptosis of oral tongue squamous carcinoma cells (28). However, HOTTIP expression has no difference in GEO, which may be associated with racial difference. In addition, bioinformatics analysis confirmed that HCG22 is differentially expressed in various cancers (29), and the mechanism in OSCC should be researched.
Because of poor prognosis and high recurrence, we constructed a risk score formula by comprehensive analysis of lncRNA to predict overall survival in OSCC. Finally, 7 lncRNA were enrolled in Cox regression and it can predict overall survival accurately. In oral cancer, overexpression of the lncRNA AFAP1-AS1 is associated with the proliferation, invasion and survival of tongue squamous cell carcinoma via the Wnt/β-catenin signaling pathway. LINC00460/EZH2/ KLF2 and LINC00460/miR-149-5p/CUL4A crosstalk serve as critical effectors in CRC tumorigenesis and progression (30). Chen et al. showed that LINC01234 expression was significantly upregulated in gastric cancer tissues and functioned as a ceRNA to regulate CBFB expression by sponging miR-204-5p (31). However, there is no report on lncRNA C11orf97, SLC8A1-AS1, and AQP4-AS1. Whether these lncRNAs play important roles in the development and progression of OSCC remains to be further investigated. Meanwhile, univariate Cox regression and multivariate Cox regression analysis suggested that HOTTIP was related to prognosis, which indicated that HOTTIP was regarded as an independent prognostic biomarker.
Eventually, four biomarkers including GNA14, CMA1, HOTTIP, and HCG22 were selected as biomarkers in OSCC by combining comprehensive analysis with PCR validation. Neuhaus J et al. indicated that aberrant CMA1 expression was found in Prostate Cancer (32). In addition, GNA14 was identified as biomarkers and a hub gene, suggesting that GNA14 was obviously relevant to OSCC. Our analysis shows that lncRNA HOTTIP and HCG22 are also biomarkers of OSCC. HOTTIP and HCG22 can interact with hsa-mir-21 in the ceRNA network, which promote oral cancer invasion via the Wnt/β-catenin pathway (33). Accordingly, exploring their mechanism in OSCC may provide new therapeutic targets. Furthermore, DKK1 and HOXC6 were excluded because there was no difference in mRNA level in OSCC tissues. The reasons may be racial difference or limited sample size.
We successfully constructed the ceRNA network, which reveals the potential mechanisms between lncRNA and mRNA. It may provide a new vision on therapeutic targets in OSCC by exploring the underlying mechanisms. At the same time, we also constructed a 7 lncRNA risk score model which is positively associated with overall survival in OSCC. We can put forward reasonable therapy at an appropriate time according to the risk model. Meanwhile, determining revisiting patients and follow-up improve the overall survival in OSCC based on risk level. Eventually, based on the 4 biomarkers, it may be beneficial to realize early-stage diagnosis and provide new therapeutic targets in OSCC.
Though the study might have crucial clinical importance, we still need to consider several limitations. First, in terms of sample numbers, both the TCGA database and clinical specimens which were collected at Nanfang Hospital are far from inadequate. We need to collect more information to continue verifying its accuracy. Second, the information from TCGA may show bias. Although we have validated it in cell lines and clinical specimens, we should continue to do further research. Third, the function and mechanism of biomarkers in OSCC need to be further studied in vivo and in vitro.
Conclusion
In summary, our study identifies that 2 mRNAs and 2 lncRNAs might be novel important prognostic factors and potential treatment targets for OSCC. Furthermore, we established a disordered lncRNA-miRNA-mRNA ceRNA network which is beneficial to understanding the relationship between lncRNA and mRNA and provides efficient strategies for subsequent functional studies of lncRNAs. Meanwhile, construction of the PPI network provides a new vision for OSCC treatment, and the risk score model is helpful for improving the overall survival in OSCC.
Data Availability Statement
Publicly available datasets were analyzed in this study. This data can be found here: https://portal.gdc.cancer.gov/, https://www.ncbi.nlm.nih.gov/geo/, https://string-db.org/, http://www.mircode.org/, http://www.mirdb.org/, http://mirtarbase.mbc.nctu.edu.tw/, http://www.targetscan.org.
Ethics Statement
The study protocol was approved by the Ethics Committees of Nanfang Hospital of Guangdong Province (NO: NFEC-2018-027).
Author Contributions
GH: design and initiation of the study, quality control of data and algorithms, data analysis and interpretation, manuscript preparation and editing. QW: data acquisition. ZZ: statistical analysis. TS: patient recruitment and clinical support and oversight. X-ZL: study concept, design and initiation of the study. All authors are revision and approval the final version of the paper.
Funding
This study was supported by the National Natural Science Foundation of China (Grant Number: 81472536), the Science and Technology Planning Project of Guangdong Province (Grant Number: 2017A020215181), the Scientific Research Launch Project of Southern Medical University (Grant Numbers: CX2018N016, PY2018N031), the Project of Guangdong Provincial Department of Education (Grant Number: 2018KTSCX026) and the NanFang Hospital President Funding (Grant Number: 2014027).
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2019.01054/full#supplementary-material
Supplementary Figure 1. 6 hub genes associated with overall survival. (A) GNA14, (B) GRM5, (C) (KRT3), (D) KRT26, (E) TRAC1, (F) HTR2C.
Supplementary Figure 2. Kaplan–Meier analysis along with logrank p was used to compare the survival of the low-risk group and high-risk group. (A) Survival analysis of training set. (B) Survival analysis of validation set. (C) OSCC patients with grade I and II. (D) OSCC patients with grade IV and III. (E) ROC based on risk score model.
Supplementary Figure 3. The mRNA expression level of DKK1 and HOXC6 in OSCC (A). The protein level of CMA1 and GNA14 in OSCC cell lines and tissues (B). Combination of two and more biomarkers predicted OSCC patients overall survival.
Abbreviations
OSCC, oral squamous cell carcinoma; TCGA, The Cancer Genome Atlas; DERNAs, differentially expressed RNAs; DEmRNAs, differentially expressed mRNAs; DElncRNAs, differentially expressed lncRNAs; DEmiRNAs, differentially expressed miRNAs; GEO, Gene Expression Omnibus; ceRNA, competing endogenous RNA; PPI, protein-protein interaction; DMEM, Dulbecco's modified Eagle's medium; FBS, fetal bovine serum; MREs, miRNA response elements; ROC, receiver operating characteristic; GO, Gene ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; ncRNAs, non-cording RNAs. 3'-UTR, 3'-Untranslated region.
References
1. Leemans CR, Braakhuis BJ, Brakenhoff RH. The molecular biology of head and neck cancer. Nat Rev Cancer. (2011) 11:9–22. doi: 10.1038/nrc2982
2. Ang KK, Harris J, Wheeler R, Weber R, Rosenthal DI, Nguyen-Tan PF, et al. Human papillomavirus and survival of patients with oropharyngeal cancer. N Engl J Med. (2010)363:24–35. doi: 10.1056/NEJMoa0912217
3. Zini A, Czerninski R, Sgan-Cohen HD. Oral cancer over four decades: epidemiology, trends, histology, and survival by anatomical sites. J Oral Pathol Med. (2010) 39:299–305. doi: 10.1111/j.1600-0714.2009.00845.x
4. Vermorken JB, Herbst RS, Leon X, Amellal N, Baselga J. Overview of the efficacy of cetuximab in recurrent and/or metastatic squamous cell carcinoma of the head and neck in patients who previously failed platinum-based therapies. Cancer-Am Cancer Soc. (2008) 112:2710–9. doi: 10.1002/cncr.23442
5. Takahashi H, Yanamoto S, Yamada S, Umeda M, Shigeta T, Minamikawa T, et al. Effects of postoperative chemotherapy and radiotherapy on patients with squamous cell carcinoma of the oral cavity and multiple regional lymph node metastases. Int J Oral Maxillofac Surg. (2014) 43:680–5. doi: 10.1016/j.ijom.2013.11.013
6. Hangauer MJ, Vaughn IW, McManus MT. Pervasive transcription of the human genome produces thousands of previously unidentified long intergenic noncoding RNAs. PLoS Genet. (2013) 9:e1003569. doi: 10.1371/journal.pgen.1003569
7. Zhang L, Meng X, Zhu XW, Yang DC, Chen R, Jiang Y, Xu T. Long non-coding RNAs in Oral squamous cell carcinoma: biologic function, mechanisms and clinical implications. Mol Cancer. (2019) 18:102. doi: 10.1186/s12943-019-1021-3
8. Wang J, Zhang X, Chen W, Hu X, Li J, Liu C. Regulatory roles of long noncoding RNAs implicated in cancer hallmarks. Int J Cancer. (2019). doi: 10.1002/ijc.32277. [Epub ahead of print].
9. Shao T, Huang J, Zheng Z, Wu Q, Liu T, Lv X. SCCA, TSGF, and the long non-coding RNA AC007271.3 are effective biomarkers for diagnosing oral squamous cell carcinoma. Cell Physiol Biochem. (2018) 47:26–38. doi: 10.1159/000489741
10. Salmena L, Poliseno L, Tay Y, Kats L, Pandolfi PP. A ceRNA hypothesis: the Rosetta Stone of a hidden RNA language? Cell. (2011) 146:353–8. doi: 10.1016/j.cell.2011.07.014
11. El-Sakka H, Kujan O, Farah CS. Assessing miRNAs profile expression as a risk stratification biomarker in oral potentially malignant disorders: a systematic review. Oral Oncol. (2018)77:57–82. doi: 10.1016/j.oraloncology.2017.11.021
12. Cao M, Zheng L, Liu J, Dobleman T, Hu S, Go V, et al. MicroRNAs as effective surrogate biomarkers for early diagnosis of oral cancer. Clin Oral Investig. (2018) 22:571–81. doi: 10.1007/s00784-017-2317-6
13. Liu S, Liu LH, Hu WW, Wang M. Long noncoding RNA TUG1 regulates the development of oral squamous cell carcinoma through sponging miR-524-5p to mediate DLX1 expression as a competitive endogenous RNA. J Cell Physiol. (2019) 234:20206–16. doi: 10.1002/jcp.28620
14. Pramanik KK, Nagini S, Singh AK, Mishra P, Kashyap T, Nath N, et al. Glycogen synthase kinase-3β mediated regulation of matrix metalloproteinase-9 and its involvement in oral squamous cell carcinoma progression and invasion. Cell Oncol. (2018)41:47–60. doi: 10.1007/s13402-017-0358-0
15. Lun AT, Chen Y, Smyth GK. It's DE-licious: a recipe for differential expression analyses of RNA-seq experiments using quasi-likelihood methods in edgeR. Methods Mol Biol. (2016) 1418:391–416. doi: 10.1007/978-1-4939-3578-9_19
16. Bader GD, Hogue CW. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinform. (2003) 4:2. doi: 10.1186/1471-2105-4-2
17. Jeggari A, Marks DS, Larsson E. miRcode: a map of putative microRNA target sites in the long non-coding transcriptome. Bioinformatics. (2012) 28:2062–3. doi: 10.1093/bioinformatics/bts344
18. Wang X. Improving microRNA target prediction by modeling with unambiguously identified microRNA-target pairs from CLIP-ligation studies. Bioinformatics. (2016) 32:1316–22. doi: 10.1093/bioinformatics/btw002
19. Chou CH, Shrestha S, Yang CD, Chang NW, Lin YL, Liao KW, et al. miRTarBase update 2018: a resource for experimentally validated microRNA-target interactions. Nucleic Acids Res. (2018) 46:D296–302. doi: 10.1093/nar/gkx1067
20. Fromm B, Billipp T, Peck LE, Johansen M, Tarver JE, King BL, et al. A uniform system for the annotation of vertebrate microRNA genes and the evolution of the human microRNAome. Annu Rev Genet. (2015) 49:213–42. doi: 10.1146/annurev-genet-120213-092023
21. Mao Y, Fu Z, Dong L, Zheng Y, Dong J, Li X. Identification of a 26-lncRNAs risk model for predicting overall survival of cervical squamous cell carcinoma based on integrated bioinformatics analysis. DNA Cell Biol. (2019) 38:322–32. doi: 10.1089/dna.2018.4533
22. Waugh MG. Chromosomal instability and phosphoinositide pathway gene signatures in glioblastoma multiforme. Mol Neurobiol. (2016) 53:621–30. doi: 10.1007/s12035-014-9034-9
23. Dai L, Zhuang L, Zhang B, Wang F, Chen X, Xia C, et al. DAG/PKCdelta and IP3/Ca(2)(+)/CaMK IIβ operate in parallel to each other in PLCγ1-driven cell proliferation and migration of human gastric adenocarcinoma cells, through Akt/mTOR/S6 pathway. Int J Mol Sci. (2015) 16:28510–22. doi: 10.3390/ijms161226116
24. Eke I, Cordes N. Focal adhesion signaling and therapy resistance in cancer. Semin Cancer Biol. (2015) 31:65–75. doi: 10.1016/j.semcancer.2014.07.009
25. Monteith GR, McAndrew D, Faddy HM, Roberts-Thomson SJ. Calcium and cancer: targeting Ca2+ transport. Nat Rev Cancer. (2007) 7:519–30. doi: 10.1038/nrc2171
26. Faes S, Dormond O. PI3K and AKT: unfaithful partners in cancer. Int J Mol Sci. (2015) 16:21138–52. doi: 10.3390/ijms160921138
27. Wang H, Wu Q, Liu Z, Luo X, Fan Y, Liu Y, et al. Downregulation of FAP suppresses cell proliferation and metastasis through PTEN/PI3K/AKT and Ras-ERK signaling in oral squamous cell carcinoma. Cell Death Dis. (2014) 5:e1155. doi: 10.1038/cddis.2014.122
28. Mu M, Li Y, Zhan Y, Li X, Zhang B. Knockdown of HOXA transcript at the distal tip suppresses the growth and invasion and induces apoptosis of oral tongue squamous carcinoma cells. Onco Targets Ther. (2018) 11:8033–44. doi: 10.2147/OTT.S174637
29. Li X, Xiao X, Chang R, Zhang C. Comprehensive bioinformatics analysis identifies lncRNA HCG22 as a migration inhibitor in esophageal squamous cell carcinoma. J Cell Biochem. (2019). doi: 10.1002/jcb.29218. [Epub ahead of print].
30. Lian Y, Yan C, Xu H, Yang J, Yu Y, Zhou J, et al. A novel lncRNA, LINC00460, affects cell proliferation and apoptosis by regulating KLF2 and CUL4A expression in colorectal cancer. Mol Ther Nucleic Acids. (2018) 12:684–97. doi: 10.1016/j.omtn.2018.06.012
31. Chen X, Chen Z, Yu S, Nie F, Yan S, Ma P, et al. Long Noncoding RNA LINC01234 functions as a competing endogenous RNA to regulate CBFB expression by sponging miR-204-5p in gastric cancer. Clin Cancer Res. (2018) 24:2002–14. doi: 10.1158/1078-0432.CCR-17-2376
32. Neuhaus J, Schiffer E, Mannello F, Horn LC, Ganzer R, Stolzenburg JU. Protease expression levels in prostate cancer tissue can explain prostate cancer-associated seminal biomarkers-an explorative concept study. Int J Mol Sci. (2017) 18:E976. doi: 10.3390/ijms18050976
Keywords: competing endogenous RNA, protein-protein interaction, long non-coding RNA, biomarker, oral squamous cell carcinoma
Citation: Huang G, Wu Q, Zheng Z, Shao T and Lv X-Z (2019) Identification of Candidate Biomarkers and Analysis of Prognostic Values in Oral Squamous Cell Carcinoma. Front. Oncol. 9:1054. doi: 10.3389/fonc.2019.01054
Received: 08 July 2019; Accepted: 27 September 2019;
Published: 18 October 2019.
Edited by:
Edward Wenge Wang, City of Hope National Medical Center, United StatesReviewed by:
Keqiang Zhang, City of Hope National Medical Center, United StatesYuanfei Yao, City of Hope National Medical Center, United States
Omar Kujan, University of Western Australia, Australia
Copyright © 2019 Huang, Wu, Zheng, Shao and Lv. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Xiao-Zhi Lv, bHh6c3VyZ2VvbiYjeDAwMDQwOzEyNi5jb20=