METHODS article

Front. Genet., 27 July 2021
Sec. Genomic Assay Technology
This article is part of the Research Topic: Novel approaches and concepts of biomarker discovery for cancer

mRBioM: An Algorithm for the Identification of Potential mRNA Biomarkers From Complete Transcriptomic Profiles of Gastric Adenocarcinoma

Changlong Dong1,2,3, Nini Rao1,2,3*, Wenju Du1,2,3, Fenglin Gao1,2,3, Xiaoqin Lv1,2,3, Guangbin Wang1,2,3, Junpeng Zhang1,2,3
  • 1Center for Informational Biology, School of Life Sciences and Technology, University of Electronic Science and Technology of China, Chengdu, China
  • 2School of Life Sciences and Technology, University of Electronic Science and Technology of China, Chengdu, China
  • 3Key Laboratory for NeuroInformation of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China

Purpose: In this work, an algorithm named mRBioM was developed for the identification of potential mRNA biomarkers (PmBs) from complete transcriptomic RNA profiles of gastric adenocarcinoma (GA).

Methods: mRBioM initially extracts differentially expressed (DE) RNAs (mRNAs, miRNAs, and lncRNAs). Next, mRBioM calculates the total information amount of each DE mRNA based on the coexpression network, including three types of RNAs and the protein-protein interaction network encoded by DE mRNAs. Finally, PmBs were identified according to the variation trend of total information amount of all DE mRNAs. Four PmB-based classifiers without learning and with learning were designed to discriminate the sample types to confirm the reliability of PmBs identified by mRBioM. PmB-based survival analysis was performed. Finally, three other cancer datasets were used to confirm the generalization ability of mRBioM.

Results: mRBioM identified 55 PmBs (41 upregulated and 14 downregulated) related to GA. The list included thirteen PmBs that have been verified as biomarkers or potential therapeutic targets of gastric cancer, and some PmBs were newly identified. Most PmBs were primarily enriched in pathways closely related to the occurrence and development of gastric cancer. The cancer-related factor, which requires no learning, achieved sensitivity, specificity, and accuracy of 0.90, 1, and 0.90, respectively, in the classification of the GA and control samples. Average accuracy, sensitivity, and specificity of the three classifiers with machine learning ranged within 0.94–0.98, 0.94–0.97, and 0.97–1, respectively. The prognostic risk score model constructed from 4 PmBs significantly (p < 0.001) separated 269 GA patients into high-risk (n = 134) and low-risk (n = 135) groups. Classification performance equivalent to that for GA was achieved on the complete transcriptomic RNA profiles of colon adenocarcinoma, lung adenocarcinoma, and hepatocellular carcinoma using PmBs identified by mRBioM.

Conclusions: GA-related PmBs have high specificity and sensitivity and strong prognostic risk prediction. mRBioM also generalizes well. These PmBs may have good application prospects for early diagnosis of GA and may help to elucidate the mechanism governing the occurrence and development of GA. Additionally, mRBioM is expected to be applicable to the identification of other cancer-related biomarkers.

Introduction

Gastric cancer is a global health problem, with more than 1 million patients being diagnosed worldwide each year. Gastric cancer remains the third leading cause of cancer-related death, despite a worldwide decline in morbidity and mortality over the past 5 years (Bray et al., 2018; Thrift and El-Serag, 2020). Gastric adenocarcinoma (GA) is a type of gastric cancer caused by malignant transformation of gastric gland cells. Incidence of GA accounts for approximately 95% of gastric malignancies (Lawrence, 2004), and GA pathogenesis has not been fully elucidated. Five-year survival rate of early gastric cancer can reach >90% (Tan, 2019), and 5-year survival rate of patients with advanced gastric cancer is only 20–40% (Siegel et al., 2016; Song Z. et al., 2017). Therefore, an improvement in early diagnosis and treatment of GA can decrease GA incidence and mortality.

Several studies have suggested that molecular biomarkers are important for early diagnosis, treatment, and evaluation of prognosis of cancer (Parker et al., 2009; Collins and Varmus, 2015; Pellegrini et al., 2015). According to the central dogma of biology, RNA carries genetic and regulatory information that reflects the state of the cells. RNA biomarkers have considerably higher sensitivity and specificity for the detection of cancer samples compared with those of protein biomarkers and can more dynamically reflect cellular states and regulatory processes to provide additional cellular information compared with that provided by DNA biomarkers (Xi et al., 2017). Furthermore, miRNAs can regulate gene expression by binding to mRNAs or related proteins (Bartel, 2009). LncRNAs can competitively bind miRNAs as competing endogenous RNAs (ceRNAs) to regulate gene expression and cellular functions (Xia et al., 2014; Song Y. X. et al., 2017). Therefore, mRNAs occupy a key position in the complex regulatory processes involving three types of biomolecules. Abnormal expression of mRNAs in the key positions of the regulatory network can easily bias the overall stability of the network. mRNAs may cause abnormal activation of one or more signaling pathways, which also leads to abnormal expression or function of the biomolecules in these signaling pathways to promote physiological and tissue disorders, such as cancer (Lu et al., 2016; Duan et al., 2020; Hu et al., 2020; Wei et al., 2020). mRNAs that occupy the key positions are more likely to be biomarkers.

Many mRNA biomarkers associated with occurrence and development of GA were identified using experimental and computational methods. Representative studies can be summarized as follows. Yoon et al. (2019) confirmed that the activation of KRAS in GA cells stimulates epithelial-to-mesenchymal transition to form cancer stem-like cells, thereby promoting metastasis. Huang C. et al. (2020) found that overexpression of DGKi in GA indicates poor prognosis, and the MAPK signaling pathway may be one of the key pathways that regulate occurrence and development of GA by DGKi. Necula et al. (2020) showed that overexpression of COL10A1 in GA patients is associated with poor survival and that COL10A1 can be used as a potential biomarker for early detection of GA. Wang (2017) identified 446 differentially expressed (DE) mRNAs in the gene expression profile related to gastric cancer, used these DE mRNAs to construct a protein-protein interaction network, and finally identified five key mRNAs in the protein-protein interaction network (COL5A2, TOP2A, KIF20A, FN1, and PRC1). However, existing GA-related mRNA biomarkers are not sufficient to provide accurate GA diagnosis in the clinic and thoroughly elucidate GA pathogenesis. Identification of GA-related mRNA markers with high sensitivity and specificity is of great significance for early diagnosis, targeted therapy, and analysis of prognosis of GA. Therefore, this study first proposes an algorithm to identify potential mRNA biomarkers (PmBs) related to GA based on complete transcriptomic RNA (including mRNA, lncRNA, and miRNA) profiles of GA. The proposed algorithm evaluates the potential of an mRNA with abnormal expression as GA biomarker in the regulation of transcriptional coexpression and at the protein-protein interaction level. The integrated analysis of multiple omics data objectively avoids the problems of signal noise and high inaccuracy caused by single omics analysis. 
Then, the sample classification power and prognostic relevance of PmBs were analyzed to assess their reliability and value for assistance with clinical diagnosis. The novel contributions of this paper are as follows:

1. A novel algorithm named mRBioM was developed for the identification of potential mRNA biomarkers from complete transcriptomic profiles of GA.

2. A cancer-related factor was proposed to distinguish whether a single sample is cancer or normal, which may have good application prospects in the personalized diagnosis of cancers.

3. The mRBioM-based prognostic risk score model was constructed to assess the overall survival rate of cancer patients.

Materials and Methods

Data Collection

The complete transcriptome TCGA-STAD dataset of RNAs (including mRNA, lncRNA, and microRNA) of GA patients published by various countries was obtained from the Genomic Data Commons of National Cancer Institute in July, 2019. The pathological tissue types of the source data were limited to GA. The dataset included 279 GA patients and the corresponding clinical information (Table 1). The dataset included 257 cases that had only GA tissue samples, 20 cases that had GA and paired paracancerous tissue samples, and 2 cases that had only paracancerous tissue samples. Detailed information about these 299 samples is shown in Supplementary Table 1.

Table 1. Statistics of clinical information of included 279 GA patients.

TCGA-STAD was organized into five subsets for various studies: dataset 1 for GA-related PmB identification, datasets 2–4 for evaluation of PmB classification, and dataset 5 for survival analysis, as shown in Figure 1A. Three other cancer-related RNA transcriptomic profiles were downloaded from the Genomic Data Commons database in May 2020 and were used to verify the generalization ability of mRBioM: TCGA-COAD, including 478 cases of colon cancer and 41 cases of normal tissues; TCGA-LUAD, including 533 cases of lung adenocarcinoma and 59 cases of normal tissues; and TCGA-LIHC, including 371 cases of liver cancer and 50 cases of normal tissues. The characteristics of the three datasets are shown in Figure 1B.

Figure 1. Data organization and utilization. (A) Five subsets from the TCGA-STAD dataset. (B) TCGA-COAD, TCGA-LUAD, and TCGA-LIHC. C, cancer sample; N, adjacent normal sample; CF, cancer-related factor; CFth, threshold of CF; ML, machine learning.

mRBioM Algorithm

The amount of information for a molecule can determine whether this molecule is in a key position in the regulatory network (Teschendorff et al., 2014). Thus, mRBioM identified PmBs by evaluating the amount of information for each DE mRNA based on the transcriptional coexpression relationships between DE mRNAs, miRNAs, and lncRNAs and in the PPI network. The steps of the mRBioM algorithm are described below.

DE RNA Analysis

The limma package of R (Ritchie et al., 2015) was used to identify DE RNAs from dataset 1 containing 20 GA and 20 paracancer samples (a total of 40 samples) from TCGA-STAD. Dataset 1 was preprocessed by cleaning and standardization; next, the logarithm of the expression fold change (FC) of each RNA in GA vs. adjacent normal samples was calculated. The log2FC value and corresponding corrected p-value (represented by Padj) of each RNA were used to determine whether an RNA was differentially expressed. The screening conditions for DE RNAs in this study were Padj < 0.05 or Padj < 0.01, depending on the RNA type, and |log2FC| > 1.
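The screening rule can be illustrated with a small Python sketch. It is not the limma workflow used in the study; it only applies the thresholds to already computed statistics. The per-type Padj cutoffs follow the values reported in the Results (0.01 for mRNAs and miRNAs, 0.05 for lncRNAs), while the `RnaStat` record, the function names, and the toy RNA identifiers are hypothetical.

```python
from typing import NamedTuple


class RnaStat(NamedTuple):
    """Differential-expression statistics for one RNA (hypothetical record type)."""
    name: str
    rna_type: str   # "mRNA", "miRNA", or "lncRNA"
    log2fc: float   # log2 fold change, GA vs. adjacent normal
    padj: float     # corrected p-value


# Padj cutoffs per RNA type as reported in the Results section.
PADJ_CUTOFF = {"mRNA": 0.01, "miRNA": 0.01, "lncRNA": 0.05}


def is_de(stat, min_abs_log2fc=1.0):
    """Return True when the RNA passes both the fold-change and Padj screens."""
    return abs(stat.log2fc) > min_abs_log2fc and stat.padj < PADJ_CUTOFF[stat.rna_type]


def select_de(stats):
    """Keep only the RNAs that satisfy the screening conditions."""
    return [s for s in stats if is_de(s)]


# Toy input with made-up identifiers and values.
toy_stats = [
    RnaStat("rna_a", "mRNA", 2.3, 0.001),
    RnaStat("rna_b", "mRNA", 1.5, 0.03),
    RnaStat("rna_c", "lncRNA", -1.2, 0.03),
    RnaStat("rna_d", "miRNA", 0.8, 0.0001),
]
de_names = [s.name for s in select_de(toy_stats)]
```

In this toy input, `rna_b` fails the stricter mRNA Padj cutoff and `rna_d` fails the fold-change cutoff, so only `rna_a` and `rna_c` are retained.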

Calculation of the Coexpression Correlation Coefficient Matrix for RNAs

Suppose that we identified N, J, and K DE mRNAs, DE miRNAs, and DE lncRNAs, respectively. The expression vector of each DE RNA in all samples was extracted from dataset 1. Pearson correlation coefficients Mxy and Lxz between DE mRNA x (x = 1, …, N) and DE miRNA y (y = 1, …, J) and between DE mRNA x and DE lncRNA z (z = 1, …, K), respectively, were calculated according to Eqs. (1) and (2).

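$$M_{xy} = \frac{\sum_{i=1}^{40} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{40} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{40} (y_i - \bar{y})^2}} \tag{1}$$
$$L_{xz} = \frac{\sum_{i=1}^{40} (x_i - \bar{x})(z_i - \bar{z})}{\sqrt{\sum_{i=1}^{40} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{40} (z_i - \bar{z})^2}} \tag{2}$$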

where xi, yi, and zi are the i-th elements, and x̄, ȳ, and z̄ are the average values of all elements, in the expression vectors of DE mRNA x, DE miRNA y, and DE lncRNA z, respectively. Pearson correlation coefficients between all DE mRNAs and DE miRNAs and between all DE mRNAs and DE lncRNAs constitute two correlation coefficient matrices, which are represented by M (N × J) and L (N × K), respectively.
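As a minimal sketch of this step, the Pearson coefficient and the resulting correlation matrix can be computed in plain Python as follows. The function names and the toy expression vectors are illustrative assumptions; the study itself does not specify an implementation.

```python
import math


def pearson(u, v):
    """Pearson correlation of two equal-length expression vectors.

    Assumes neither vector is constant (otherwise the denominator is zero).
    """
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (norm_u * norm_v)


def correlation_matrix(mrna_rows, other_rows):
    """Rows index DE mRNAs, columns index DE miRNAs (for M) or DE lncRNAs (for L)."""
    return [[pearson(x, y) for y in other_rows] for x in mrna_rows]


# Toy expression vectors over four samples (made-up values).
mrna = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]]
mirna = [[2.0, 4.0, 6.0, 8.0], [1.0, 3.0, 2.0, 5.0], [5.0, 1.0, 4.0, 2.0]]
M = correlation_matrix(mrna, mirna)  # 2 x 3 matrix
```

The same function applied to the lncRNA expression vectors yields L. Significance testing of each coefficient (p < 0.05) is a separate step that is not shown here.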

Calculation of the Amount of Information for DE mRNA in the Coexpression Network

The connection of each molecule in the regulatory network is influenced by many factors, such as environment and diet, and has a degree of uncertainty that accounts for the amount of information for each molecule (Teschendorff et al., 2014). In this study, we propose to use the information rate of a DE mRNA in the transcriptional coexpression networks to measure the uncertainty of its connection and then use Shannon’s information entropy theory to estimate the amount of coexpression information for a DE mRNA.

The information rate for DE mRNA x in the coexpression network between DE mRNAs and DE miRNAs was defined as the ratio of a significant Pearson correlation coefficient (p < 0.05) in the x-th row of M, which corresponds to DE mRNA x, to the sum of all significant Pearson correlation coefficients (p < 0.05) in that row. It measures the degree of correlation between DE mRNA x and DE miRNA y (y = 1, …, J′). All information rates for DE mRNA x associated with the DE miRNAs constitute the information rate vector px defined by Eq. (3). Similarly, the information rate vector qx for DE mRNA x in the coexpression network of DE mRNAs and DE lncRNAs is defined according to Eq. (4).

$$p_x = \frac{M'_x}{\sum_{y=1}^{J'} M'_{xy}} \tag{3}$$
$$q_x = \frac{L'_x}{\sum_{z=1}^{K'} L'_{xz}} \tag{4}$$

where M′x and L′x are the vectors composed of the Pearson correlation coefficients with statistical p-values less than 0.05 in the x-th row of M and L, respectively; M′xy (y = 1, 2, …, J′) and L′xz (z = 1, 2, …, K′) are the individual Pearson correlation coefficients with statistical p-values less than 0.05 in the x-th row of M and L, respectively; and J′ and K′ are the corresponding numbers of significant coefficients.

According to Shannon’s information entropy theory, the amount of coexpression information for DE mRNA x (expressed as SRNAx) is estimated by Eq. (5).

$$S_{RNA_x} = \sum_{y=1}^{J'} -p_{xy} \log_2 p_{xy} + \sum_{z=1}^{K'} -q_{xz} \log_2 q_{xz} \tag{5}$$

where pxy is the y-th information rate in px, y = 1, 2, …, J′; qxz is the z-th information rate in qx, z = 1, 2, …, K′.
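The following Python sketch illustrates how the information rates and the coexpression information of one DE mRNA could be computed from its significant correlation coefficients. Two points are assumptions that the text does not state: the p < 0.05 filtering is taken to have been done upstream, and absolute values of the coefficients are used so that the rates form a valid probability distribution even when some correlations are negative. All function names are hypothetical.

```python
import math


def information_rates(significant_corrs):
    """Eq. (3)/(4) style rates: each coefficient divided by the row sum.

    Assumption: magnitudes are used, so negative correlations still give
    non-negative rates that sum to 1.
    """
    magnitudes = [abs(c) for c in significant_corrs]
    total = sum(magnitudes)
    if total == 0:
        return []  # no significant partners, hence no information
    return [m / total for m in magnitudes]


def entropy_bits(rates):
    """Shannon entropy in bits; zero rates contribute nothing."""
    return -sum(p * math.log2(p) for p in rates if p > 0)


def coexpression_information(sig_mirna_corrs, sig_lncrna_corrs):
    """Eq. (5): entropy over miRNA partners plus entropy over lncRNA partners."""
    p_x = information_rates(sig_mirna_corrs)
    q_x = information_rates(sig_lncrna_corrs)
    return entropy_bits(p_x) + entropy_bits(q_x)


# Toy row: two equally strong miRNA partners and a single lncRNA partner.
s_rna = coexpression_information([0.5, -0.5], [0.8])
```

Under this reading, an mRNA whose significant correlations are spread evenly over many partners receives a larger amount of information than one dominated by a single partner.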

Estimation of the Amount of Information for DE mRNA in the Protein-Protein Interaction Network

We constructed a protein-protein interaction network based on the protein interaction information of all DE mRNAs acquired from the online STRING database1. Higher protein-protein connectivity score in the protein-protein interaction network corresponds to greater amount of interaction information between two proteins (Szklarczyk et al., 2019). Therefore, we used cs to measure the amount of protein interaction information (represented by SPPIx) that corresponds to DE mRNA x according to Eq. (6).

$$S_{PPI_x} = 1 + \sum_{j \in N} cs_{xj} \tag{6}$$

where csxj (0 ≤ csxj ≤ 1) represents the connection score between a protein encoded by DE mRNA x and a protein encoded by another DE mRNA j (j ∈ N, j ≠ x).

Identification of PmBs Associated With GA

The sum of SRNAx and SPPIx normalized by maximum was used as the total information amount of DE mRNA x (denoted by Sx) according to Eq. (7).

$$S_x = \frac{S_{RNA_x}}{\max\{S_{RNA_x}\}} + \frac{S_{PPI_x}}{\max\{S_{PPI_x}\}} \tag{7}$$

All DE mRNAs were sorted according to Sx (x = 1, 2, …, N), and PmBs were identified based on the change trend of Sx (x = 1, 2, …, N). The number of identified PmBs was recorded as Q. Gene Ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses of PmBs were performed by the clusterProfiler R package to investigate the functions of PmBs (Yu et al., 2012).
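A short Python sketch of the protein interaction term, the max-normalized total, and the ranking is given below. The dictionary layout, function names, and toy values are assumptions made for illustration. The cutoff that separates PmBs from the remaining DE mRNAs is read off the trend of the sorted curve in the study, so no automatic cutoff is implemented here.

```python
def ppi_information(connection_scores):
    """Eq. (6): one plus the sum of connection scores to the other DE mRNAs."""
    return 1.0 + sum(connection_scores)


def total_information(s_rna, s_ppi):
    """Eq. (7): each term divided by its maximum over all DE mRNAs, then added.

    Both arguments are dicts keyed by mRNA name; assumes each has a positive maximum.
    """
    max_rna = max(s_rna.values())
    max_ppi = max(s_ppi.values())
    return {g: s_rna[g] / max_rna + s_ppi[g] / max_ppi for g in s_rna}


def rank_by_information(total):
    """mRNA names sorted from the largest to the smallest total information."""
    return sorted(total, key=total.get, reverse=True)


# Toy values for three hypothetical DE mRNAs.
s_rna = {"gene_a": 2.0, "gene_b": 1.0, "gene_c": 0.5}
s_ppi = {
    "gene_a": ppi_information([0.9, 0.7]),
    "gene_b": ppi_information([]),       # no interaction partners
    "gene_c": ppi_information([0.4]),
}
total = total_information(s_rna, s_ppi)
ranking = rank_by_information(total)
```

The constant 1 in the protein interaction term keeps an mRNA without any interaction partner from receiving a zero score at the protein level.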

Evaluation of Sample Classification Power of PmBs

We designed four classifiers based on PmBs to discriminate the positive GA and negative control samples to illustrate the value of PmBs identified by mRBioM in auxiliary clinical diagnosis. The performance of the four classifiers was evaluated by sensitivity, specificity, and accuracy.
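For reference, the three evaluation measures follow the usual confusion-matrix definitions, with GA as the positive class and control as the negative class. The sketch below uses hypothetical counts and a hypothetical function name.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: GA samples called GA / called control.
    tn/fp: control samples called control / called GA.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Made-up counts: 10 GA samples (9 detected) and 5 controls (all correct).
sens, spec, acc = classification_metrics(tp=9, fn=1, tn=5, fp=0)
```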

Cancer-Related Factor

The cancer-related factor of a sample was determined by the expression values of PmBs in the samples and was used to discriminate the sample types. The cancer-related factor value of a sample was defined as the ratio of the average logarithm values of the expression of upregulated and downregulated PmBs in the sample according to Eq. (8).

$$CF = \frac{\frac{1}{n_{up}} \sum_{i=1}^{n_{up}} \log_2 Ex_{up}(i)}{\frac{1}{n_{dn}} \sum_{j=1}^{n_{dn}} \log_2 Ex_{dn}(j)} \tag{8}$$

where CF indicates the value of the cancer-related factor; nup and Exup(i) are the number of upregulated PmBs and the expression value of the i-th upregulated PmB in a sample, respectively. Similarly, ndn and Exdn(j) are the number of downregulated PmBs and the expression value of the j-th downregulated PmB.
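As a minimal sketch, the cancer-related factor of a single sample can be computed as below. The function name and the toy expression values are hypothetical, and the code assumes strictly positive expression values and a non-zero mean log2 expression of the downregulated PmBs.

```python
import math


def cancer_related_factor(up_expr, down_expr):
    """Eq. (8): mean log2 expression of upregulated PmBs divided by
    mean log2 expression of downregulated PmBs, for one sample.

    Assumes all expression values are > 0 and the denominator is non-zero.
    """
    mean_up = sum(math.log2(v) for v in up_expr) / len(up_expr)
    mean_down = sum(math.log2(v) for v in down_expr) / len(down_expr)
    return mean_up / mean_down


# Toy sample: upregulated PmBs at 8, downregulated PmBs at 4 (made-up values).
cf = cancer_related_factor([8.0, 8.0], [4.0, 4.0])  # 3 / 2 = 1.5
```

Because upregulated PmBs rise and downregulated PmBs fall in GA tissue, the ratio is expected to be larger in cancer samples than in adjacent normal samples.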

To identify the best CF threshold for discriminating positive and negative samples, we randomly selected n (here, n = 10) GA samples and n adjacent normal samples from the mRNA expression profile of dataset 1; only the expression values of the Q PmBs from each sample were used, forming dataset 2 (C = 10, N = 10). Next, the expression profiles of the 2n samples in dataset 2 were converted into a new, synthetic expression profile containing n samples. The expression value vector Sm (dimension 1 × n) of the m-th (m = 1, 2, …, Q) PmB in the synthetic expression profile was calculated according to Eq. (9).

$$S_m = \frac{S_{tm} + S_{nm} \times \frac{\sum_{i=1}^{n} S_{tm}(i)}{\sum_{i=1}^{n} S_{nm}(i)}}{2} \tag{9}$$

where Stm and Snm (each of dimension 1 × n) are the expression value vectors of the m-th (m = 1, 2, …, Q) PmB in the GA and control samples of dataset 2, respectively. Stm(i) and Snm(i) are the i-th elements (i = 1, 2, …, n) of Stm and Snm, respectively.

Next, Eq. (8) was used to calculate the CF of the i-th sample in the generated expression profile (denoted as CFi, i = 1, 2, …, n), and the geometric mean value of the CF values of n samples (Eq. 10) was used as the threshold of CF (denoted as CFth).

$$CF_{th} = \sqrt[n]{\prod_{i=1}^{n} CF_i} \tag{10}$$

Finally, the samples of dataset 2 were excluded from TCGA-STAD, and the remaining samples, restricted to the expression values of the Q PmBs, were used to form dataset 3 (C = 267, N = 12), which was used to test the ability of the cancer-related factor to recognize the GA samples. If the CF of a sample was greater than CFth, the sample was identified as GA (positive); otherwise, the sample was identified as control (negative).
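The threshold procedure can be sketched in Python as follows. The sketch reads Eq. (9) as the element-wise average of the GA vector and the control vector after the control vector has been rescaled by the ratio of the two vector sums; this reading, the function names, and the toy numbers are assumptions rather than a confirmed implementation.

```python
import math


def synthetic_vector(tumor, normal):
    """Assumed reading of Eq. (9): rescale the control vector to the GA
    vector's total, then average the two vectors element by element."""
    scale = sum(tumor) / sum(normal)
    return [(t + n * scale) / 2 for t, n in zip(tumor, normal)]


def geometric_mean(values):
    """Eq. (10): n-th root of the product, computed via logarithms.

    Assumes all values are positive.
    """
    return math.exp(sum(math.log(v) for v in values) / len(values))


def classify(cf, cf_threshold):
    """A sample is called GA only when its CF exceeds the threshold."""
    return "GA" if cf > cf_threshold else "control"


# Toy data: one PmB measured in two GA and two control samples.
s_m = synthetic_vector([4.0, 6.0], [1.0, 1.0])   # [4.5, 5.5]
cf_th = geometric_mean([1.0, 4.0])               # 2.0 up to rounding
label = classify(1.1, 0.9725)                    # "GA"
```

In a full pipeline, the CF of each of the n synthetic samples would be computed from the synthetic vectors of all Q PmBs, and the geometric mean of those n values would serve as CFth.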

Classifiers With Machine Learning

Three classifiers with machine learning based on random forest (RF) (Wang H. et al., 2020), support vector machine (SVM) (Zhang et al., 2017; Zhang and Liu, 2017), and naive Bayes (NB) (Dou et al., 2015) were constructed using the normalized expression values of PmBs as the classification feature implemented by randomForest R package (Liaw and Wiener, 2002), the svm function of the e1071 R package (Meyer et al., 2019), and the NaiveBayes function of the klaR R package (Weihs et al., 2005), respectively. Of course, there are other improved Bayesian models that can replace NB classification algorithms (Nagarajan et al., 2013; Thapa et al., 2018). Since the unbalanced sample size between the GA and control groups will affect the classification effect of the three classifiers, we used the downsampling method to randomly extract 28 samples from 277 GA samples and retain all 22 adjacent normal samples in TCGA-STAD, which formed validation dataset 4 (C = 28, N = 22). Finally, the performance of the three classifiers with machine learning was confirmed on dataset 4 by using the fivefold cross-validation method.
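The sampling side of this design (downsampling the GA group, then splitting the balanced set into five folds) can be sketched in plain Python. The classifiers themselves were fitted with R packages and are not reproduced here. Keeping both classes in every fold (stratification), the fixed random seed, and all names below are assumptions; the text only states that downsampling and fivefold cross-validation were used.

```python
import random


def downsample(majority, n_keep, seed=0):
    """Randomly keep n_keep samples from the larger class, without replacement."""
    rng = random.Random(seed)
    return rng.sample(majority, n_keep)


def stratified_folds(cancer, normal, k=5, seed=0):
    """Deal shuffled samples of each class round-robin into k folds.

    Stratification is an assumption; it keeps both classes in every fold.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label, group in (("C", cancer), ("N", normal)):
        shuffled = list(group)
        rng.shuffle(shuffled)
        for i, sample in enumerate(shuffled):
            folds[i % k].append((sample, label))
    return folds


# Placeholder sample identifiers matching the reported group sizes.
ga_ids = [f"ga_{i}" for i in range(277)]
normal_ids = [f"normal_{i}" for i in range(22)]
ga_subset = downsample(ga_ids, 28)
folds = stratified_folds(ga_subset, normal_ids, k=5)
```

Each fold then serves once as the test set while the other four folds are used for training.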

PmBs-Based Survival Analysis

To exclude patients who died from other factors, we removed 10 patients with missing survival time or survival of less than 30 days from the cohort of 279 patients in TCGA-STAD, and used the transcription profiles of the remaining 269 GA patients with 55 PmBs to form dataset 5 for survival analysis. The average survival time of GA patients in dataset 5 was 21.575 ± 17.506 months, and 105 GA patients had died by the end of follow-up, accounting for 39% of the total cohort.

Clinical information about patients (Supplementary Table 1) and dataset 5 (C = 269) were integrated, and a univariate Cox regression model of the survival R package (Peterschmitt et al., 2018) was used to identify survival-related PmBs that have a significant impact on survival time (p < 0.05); then, a multivariate Cox regression model was used to determine T survival-related PmBs to construct a prognostic risk model (Lossos et al., 2004) used to calculate the survival-based risk score of a patient (Eq. 11).

$$\text{Risk score} = \sum_{t=1}^{T} Exp_{PmB(t)} \times W_{PmB(t)} \tag{11}$$

where ExpPmB(t) is the expression value of the t-th survival-related PmB in the patient sample, and WPmB(t) is the corresponding multivariate Cox regression coefficient of the t-th survival-related PmB, t = 1, 2, …, T.

Then, the median of the risk scores of all patients in dataset 5 was used as the cutoff value to divide the patients into the high- and low-risk groups. Finally, Kaplan–Meier analysis was used to assess the overall survival rate of patients in the high- and low-risk groups, and the log-rank test was used to determine whether there is a significant difference in the overall survival rate of patients in the high-risk vs. low-risk groups. In addition, we used the survivalROC package (Kamarudin et al., 2017) of R to perform ROC curve analysis to evaluate the sensitivity and specificity of the prognostic risk model.
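A minimal Python sketch of the risk score and the median split is shown below. The gene names, weights, and expression values are made up for illustration and are not the fitted Cox coefficients. Assigning a patient whose score equals the median to the low-risk group is an assumption, although it is consistent with the reported group sizes (134 high-risk and 135 low-risk among 269 patients).

```python
from statistics import median


def risk_score(expression, weights):
    """Eq. (11): weighted sum of the survival-related PmB expression values."""
    return sum(weights[gene] * expression[gene] for gene in weights)


def split_by_median(scores):
    """Divide patients into high- and low-risk groups at the median score.

    Assumption: a score equal to the median falls in the low-risk group.
    """
    cutoff = median(scores.values())
    high = [p for p, s in scores.items() if s > cutoff]
    low = [p for p, s in scores.items() if s <= cutoff]
    return cutoff, high, low


# Hypothetical weights and expression values for three patients.
weights = {"gene_1": 0.5, "gene_2": -1.0}
patients = {
    "p1": {"gene_1": 2.0, "gene_2": 1.0},
    "p2": {"gene_1": 4.0, "gene_2": 1.0},
    "p3": {"gene_1": 2.0, "gene_2": 3.0},
}
scores = {p: risk_score(expr, weights) for p, expr in patients.items()}
cutoff, high_risk, low_risk = split_by_median(scores)
```

The two groups produced this way are the inputs to the Kaplan-Meier analysis and the log-rank test.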

Results

DE mRNAs and PmBs in GA

A total of 170 DE mRNAs (|log2FC| > 1, Padj < 0.01), 623 DE lncRNAs (|log2FC| > 1, Padj < 0.05), and 52 DE miRNAs (|log2FC| > 1, Padj < 0.01) were obtained. Figure 2A shows the volcano plots of significantly DE RNAs, and the details of all DE mRNAs are shown in Supplementary Table 2. The results of the protein-protein interaction network analysis are shown in the attached file “string_protein_interactions_170.tsv.”

Figure 2. DE RNAs and the screening results of PmBs. (A) Volcano plot of DE RNAs (circles: DE mRNA, squares: DE miRNA, triangles: DE lncRNA); red dots represent up-regulated DE RNAs, and blue dots represent down-regulated DE RNAs. (B) The total information amount plot of DE mRNAs; the abscissa represents symbols of mRNA (part of the symbols is displayed), and the ordinate is the total information amount of each DE mRNA. (C) Heatmap of the PmBs of adjacent normal group vs. GA group. DE, differentially expressed; PmBs, potential mRNA biomarkers; N, adjacent normal sample; C, cancer sample; TIA, total amount of information.

The total information amount for each DE mRNA was calculated by mRBioM, and the curve of the total information amount of all DE mRNAs, sorted in descending order, is shown in Figure 2B. The curve drops sharply after the orange area and then levels off. Therefore, a total of 55 DE mRNAs with total information amount corresponding to the orange region were identified as PmBs for further study (Table 2). A literature search confirmed that 13 PmBs were related to GA (23.64%), and 27 PmBs were related to other cancers (49.09%) (Table 2). The expression distribution of 55 PmBs is shown in Figure 2C, corresponding to 41 upregulated PmBs (lower right corner vs. lower left corner) and 14 downregulated PmBs (upper right corner vs. upper left corner).

Table 2. The identified PmBs and their total information amount.

Functional Enrichment Analysis of PmBs in GA

GO and KEGG functional enrichment analyses were performed by clusterProfiler of R using 55 PmBs to investigate the potential functions of these biomarkers. As shown in Figure 3A, the GO terms indicated that these 55 PmBs were mainly concentrated in chromatin binding (p < 0.05). The results of KEGG analysis with p < 0.05 suggested that these 55 PmBs were mainly related to pathways closely associated with occurrence and development of cancer, such as mitophagy-animal, ribosome biogenesis in eukaryotes, MAPK signaling pathway, cAMP signaling pathway, central carbon metabolism, microRNAs in cancer, and renal cell carcinoma (Figure 3B).

Figure 3. GO and KEGG enrichment analysis of PmBs. (A) GO enrichment analysis. (B) KEGG enrichment analysis. GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; GA, gastric adenocarcinoma.

Sample Classification Power of Cancer-Related Factor

CFth of dataset 2 was 0.9725, and the remaining samples in TCGA-STAD formed dataset 3 to test the sample classification power of cancer-related factor (Table 3). Table 3 shows that accuracy, sensitivity, and specificity achieved by cancer-related factor were 0.90, 0.89, and 1, respectively. The ROC curve of cancer-related factor is shown in Figure 4A, and the area under the ROC curve (AUC) reached 0.9494. The cancer-related factor constructed by PmBs has high specificity and sensitivity and low computational complexity and does not require training; thus, it has great potential application in auxiliary clinical diagnosis.

Table 3. Performance of cancer-related factor.

Figure 4. ROC curve analysis for the four classifiers. (A) CF. (B) RF-based, SVM-based and NB-based classifiers. ROC, receiver operating characteristic; CF, cancer factor; RF, random forest; SVM, support vector machine; NB, naive Bayes.

Sample Classification Power of Classifiers With Machine Learning

The results of fivefold cross-validation of RF-based, SVM-based, and NB-based classifiers using dataset 4 are shown in Table 4. For the RF-based, SVM-based, and NB-based classifiers, respectively, average accuracy was 0.94, 0.98, and 0.96; average sensitivity was 0.94, 0.97, and 0.94; and average specificity was 1, 1, and 0.97. The average ROC curves of the three classifiers are shown in Figure 4B, and all three AUCs were above 0.99. This finding provides further evidence that PmBs can be potential markers related to GA.

Table 4. Results of fivefold cross-validation of three classifiers with machine learning.

Survival-Related PmBs in GA

Fourteen survival-related PmBs (LMNB2, BGN, IRAK1, MFSD12, FKBP10, SOX4, SLC12A7, DNMT1, SLC1A5, TIMP1, ENTPD6, GPX3, HELZ2, and PMEPA1) were identified by univariate Cox regression analysis, and the detailed results are shown in Supplementary Table 3. Multivariate Cox regression analysis identified LMNB2, BGN, MFSD12, and SOX4 (refer to Supplementary Table 4), which can be used to construct a prognostic risk model. The risk score of the i-th (i = 1, 2, …, 269) GA sample was calculated as follows:

$$\text{Risk score}(i) = -0.5295 \times Exp_{LMNB2}(i) + 0.2133 \times Exp_{BGN}(i) - 0.6516 \times Exp_{MFSD12}(i) + 0.2814 \times Exp_{SOX4}(i)$$

where Expα(i) (α is LMNB2, BGN, MFSD12, or SOX4) is the expression value of a survival-related PmB in the i-th GA sample.

The median of the risk scores of all GA samples (−8.34 in this case) was used as the cutoff value, and 269 patients were divided into the high-risk (> −8.34, n = 134) and low-risk (< −8.34, n = 135) groups. Kaplan-Meier survival analysis of patients in the high- and low-risk groups showed that the difference between the two groups was significant (p < 0.0001). As shown in Figure 5A, the average survival time of patients in the high-risk group was shorter, and the number of deaths was higher, than those of patients in the low-risk group. In addition, the results of ROC analysis showed that the AUC value of the prognostic risk model constructed using 4 PmBs was 0.7742 (Figure 5B), suggesting good specificity and sensitivity.

Figure 5. Survival analysis based on four prognostic PmBs. (A) Kaplan–Meier curves analysis for overall survival of patients between the high- and low-risk groups; the upper panel represents the Kaplan-Meier curve for the high and low risk groups, the lower panel shows the cumulative number of deaths. (B) ROC analysis for prognostic risk model with the 4 PmBs.

Generalization Ability

Generalization ability of mRBioM was assessed on three other complete transcriptomic datasets, including TCGA-COAD (colonic adenocarcinoma), TCGA-LUAD (lung adenocarcinoma), and TCGA-LIHC (hepatocellular carcinoma), downloaded from the TCGA database, and the results are shown in Table 5. Average accuracy and sensitivity of CF were between 0.92 and 0.99, and average specificity was 1. Average accuracy and sensitivity of the RF-based, SVM-based, and NB-based classifiers were between 0.94 and 0.99, and average specificity was above 0.95. Therefore, the classifiers constructed with PmBs have good sample classification power in 3 other cancer datasets, indicating that the mRBioM algorithm has good generalization ability and can effectively identify potential cancer-related mRNA markers in other cancers.

Table 5. Generalization ability verification results.

Discussion

This study proposed the mRBioM algorithm to identify potential mRNA biomarkers from the complete transcriptomic RNA profiles of GA. Unlike existing algorithms, mRBioM evaluates the potential of each DE mRNA as a biomarker by combining the corresponding amount of information at the transcription and protein levels based on the information entropy theory. Fifty-five DE mRNAs were identified as PmBs associated with GA. These 55 PmBs were used to construct four sample classifiers, including cancer-related factor, RF-based, SVM-based, and NB-based classifiers, to illustrate the reliability of PmBs identified by mRBioM. Good sensitivity, specificity, and accuracy of classification were achieved by the four classifiers. Four of the fifty-five PmBs had good ability for prognostic evaluation of the overall survival of GA patients. TCGA-COAD, TCGA-LUAD, and TCGA-LIHC datasets confirmed the generalization ability of mRBioM. The classifiers constructed from the identified PmBs performed well across several classification algorithms and cancer-related datasets, so mRBioM is expected to be useful in further research on cancer-related biomarker identification.

Thirteen of 55 PmBs (Table 2) were confirmed by the data of the literature to play certain roles in occurrence and development of GA and were biomarkers or potential therapeutic targets of GA. For example, GPRC5A and SOX9 have been shown to be related to occurrence and development of GA (Liu et al., 2016; Wang H. et al., 2020), and their expression levels changed more than fourfold in the GA vs. adjacent control samples according to the result of DE RNA analysis (log2FC > 2). FCMET has been confirmed as a resistance factor in GA (Ebert et al., 2019), and Wang R. G. et al. (2020) demonstrated that FKBP10 may be a crucial player mediating cell proliferation, invasion, and migration by regulating the PI3K signaling pathway in GA. Twenty-seven of 55 PmBs (Table 2) were shown to be associated with other cancers according to the data of the literature. Thus, mRBioM identified some new GA-related mRNAs. We attempted to extract additional DE mRNAs as PmBs related to GA. However, adding PmBs did not improve the classification powers of the four classifiers, and these extra PmBs were not associated with prognosis. Thus, our strategy for PmBs screening according to the change trend of the total information amount for all PmBs was effective.

Notably, the value ranges of the cancer-related factor calculated in most cancer and adjacent normal samples of four cancer-related datasets were 0.9–1.4 and 0.7–0.9, respectively. Additionally, the thresholds of cancer-related factors (CFth) in all four datasets were approximately 1. The values of cancer-related factors and their corresponding thresholds showed good consistency and robustness in all four datasets. Although the classification performance of cancer-related factor is slightly worse than that of three classifiers with machine learning, the approach does not require training and has considerably lower computational complexity than that of three classifiers with machine learning. Importantly, the method requires only a small number of cancer and adjacent samples to determine the threshold and evaluates whether a single sample corresponds to cancer. Thus, the cancer-related factor may have good application prospects in the personalized diagnosis of cancers.

Among the 55 GA-related PmBs, LMNB2, BGN, MFSD12, and SOX4 were identified and combined into a prognostic risk scoring model. To date, there is no experimental evidence that LMNB2, BGN, and MFSD12 are associated with GA; they are new PmBs identified in this study. LMNB2 belongs to the lamin family and is closely related to the occurrence, development, and prognosis of liver cancer (Kong et al., 2020; Li X. N. et al., 2020). BGN is an important member of the leucine-rich small proteoglycan family and an important component of the extracellular matrix. Clinical studies have shown that upregulation of BGN is related to poor prognosis in patients with various types of cancer (Zhao S. F. et al., 2020). MFSD12, also known as PP3501, is a nuclear protein (Wang et al., 2012). Bioinformatic analysis revealed that upregulated MFSD12 is a key promoter of cell proliferation, a potential prognostic biomarker, and a therapeutic target in melanoma (Wei et al., 2019). SOX4 is a key transcription factor involved in the occurrence and development of many cancers (Liu et al., 2018; Wang et al., 2018; Ding et al., 2019) and has been shown to be related to the proliferation, migration, and invasion of GA cells and to the prognosis of GA patients (Fang et al., 2012; Dong et al., 2018; Shao et al., 2020). The model built from these four PmBs showed good sensitivity and specificity (AUC = 0.7742), and the risk score calculated by the model can effectively predict the risk of GA patients (p < 0.0001, hazard ratio = 2.845, 95% CI: 2.033–3.981).
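As a rough illustration of how a four-gene risk score can stratify patients, the sketch below assumes a common form for such models: a weighted sum of expression values, followed by a median cutoff. Both the linear form and the cutoff are assumptions rather than details confirmed in this section, and the coefficients and expression values are hypothetical placeholders, not the fitted values from the study.

```python
from statistics import median
from typing import Dict, List

# Hypothetical coefficients for illustration only (NOT the article's values)
COEFFICIENTS: Dict[str, float] = {
    "LMNB2": 0.30,
    "BGN": 0.20,
    "MFSD12": 0.25,
    "SOX4": 0.15,
}


def risk_score(expression: Dict[str, float]) -> float:
    """Weighted sum of the four genes' expression values (assumed model form)."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def split_by_median(scores: List[float]) -> List[str]:
    """Label a patient 'high' when the score is strictly above the cohort median."""
    cutoff = median(scores)
    return ["high" if score > cutoff else "low" for score in scores]


# Made-up expression profiles for five patients (every gene set to level k)
patients = [{gene: float(k) for gene in COEFFICIENTS} for k in range(1, 6)]

scores = [risk_score(p) for p in patients]
groups = split_by_median(scores)
```

With an odd number of patients and a strict comparison, the patient at the median falls into the low-risk group, so the low-risk group is larger by one. The two groups would then be compared with a survival analysis, which is outside the scope of this sketch.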

In conclusion, our study proposes the mRBioM algorithm, which identifies PmBs from the complete transcriptomic RNA profiles of GA by integrating and analyzing information at the transcriptome and proteome levels. mRBioM identified 55 PmBs related to the occurrence, development, and prognosis of GA, which may serve as potential biomarkers for the early diagnosis, treatment, and prognosis of GA. mRBioM can also be applied to other cancers for cancer-related biomarker identification. However, this study has limitations. mRBioM is a computational method, and the reliability of the GA-related PmBs it identified was confirmed only by computational methods; thus, further experimental studies are needed to verify their clinical value.

Data Availability Statement

The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author/s.

Ethics Statement

Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. The patients/participants provided their written informed consent to participate in this study.

Author Contributions

CD and NR designed the study and wrote the manuscript. CD, FG, and XL conducted the computer experiments. NR, WD, GW, and JZ analyzed the results, revised the manuscript, and offered advice on it. All authors participated in the critical review and revision of this manuscript, contributed to the article, and approved the submitted version.

Funding

The present study was funded by the National Natural Science Foundation of China (61872405 and 61720106004) and the Key R&D Project of Sichuan Province (2020YFS0243).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2021.679612/full#supplementary-material

Abbreviations

mRBioM, mRNA Biomarkers; GA, gastric adenocarcinoma; PmBs, potential mRNA biomarkers; DE, differentially expressed; FC, fold change; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; RF, random forest; SVM, support vector machine; NB, naive Bayes; AUC, area under the ROC curve.

Footnotes

  1. http://string-db.org/

References

Baek, J. H., Yun, H. S., Kwon, G. T., Lee, J., Kim, J. Y., Jo, Y., et al. (2019). PLOD3 suppression exerts an anti-tumor effect on human lung cancer cells by modulating the PKC-delta signaling pathway. Cell Death Dis. 10:156. doi: 10.1038/s41419-019-1405-8

Barbarulo, A., Iansante, V., Chaidos, A., Naresh, K., and Bubici, C. (2012). Poly(ADP-ribose) polymerase family member 14 (PARP14) is a novel effector of the JNK2-dependent pro-survival signal in multiple myeloma. Oncogene 32, 4231–4242. doi: 10.1038/onc.2012.448

Bartel, D. P. (2009). MicroRNAs: target recognition and regulatory functions. Cell 136, 215–233. doi: 10.1016/j.cell.2009.01.002

Bray, F., Ferlay, J., Soerjomataram, I., Siegel, R. L., Torre, L. A., and Jemal, A. (2018). Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 68, 394–424. doi: 10.3322/caac.21492

Brown, T. C., Murtha, T. D., Rubinstein, J. C., Korah, R., and Carling, T. (2018). SLC12A7 alters adrenocortical carcinoma cell adhesion properties to promote an aggressive invasive behavior. Cell Commun. Signal. 16:27. doi: 10.1186/s12964-018-0243-0

Cai, M., Sikong, Y., Wang, Q., Zhu, S., Pang, F., and Cui, X. (2019). Gpx3 prevents migration and invasion in gastric cancer by targeting NF-κB/Wnt5a/JNK signaling. Int. J. Clin. Exp. Pathol. 12, 1194–1203.

Chen, D., Qin, Y., Dai, M., Li, L., Liu, H., Zhou, Y., et al. (2020). BGN and COL11A1 regulatory network analysis in colorectal cancer (CRC) reveals that BGN influences CRC cell biological functions and interacts with miR-6828-5p. Cancer Manag. Res. 12, 13051–13069. doi: 10.2147/CMAR.S277261

Chen, W., Xu, Y., Zhong, J., Wang, H., Weng, M., Cheng, Q., et al. (2016). MFHAS1 promotes colorectal cancer progress by regulating polarization of tumor-associated macrophages via STAT6 signaling pathway. Oncotarget 7, 78726–78735. doi: 10.18632/oncotarget.12807

Chen, Z., Wu, W., Huang, Y., Xie, L., Li, Y., Chen, H., et al. (2019). RCC2 promotes breast cancer progression through regulation of Wnt signaling and inducing EMT. J. Cancer 10, 6837–6847. doi: 10.7150/jca.36430

Cheriyath, V., Glaser, K. B., Waring, J. F., Baz, R., Hussein, M. A., and Borden, E. C. (2007). G1P3, an IFN-induced survival factor, antagonizes TRAIL-induced apoptosis in human myeloma cells. J. Clin. Invest. 117, 3107–3117. doi: 10.1172/JCI31122

Collins, F. S., and Varmus, H. (2015). A new initiative on precision medicine. N. Engl. J. Med. 372, 793–795. doi: 10.1056/NEJMp1500523

Ding, L., Zhao, Y., Dang, S., Wang, Y., Li, X., Yu, X., et al. (2019). Circular RNA circ-DONSON facilitates gastric cancer growth and invasion via NURF complex dependent activation of transcription factor SOX4. Mol. Cancer 18:45. doi: 10.1186/s12943-019-1006-2

Dong, X., Chen, R., Lin, H., Lin, T., and Pan, S. (2018). lncRNA BG981369 inhibits cell proliferation, migration, and invasion, and promotes cell apoptosis by SRY-related high-mobility group box 4 (SOX4) signaling pathway in human gastric cancer. Med. Sci. Monit. 24, 718–726. doi: 10.12659/msm.905965

Dou, Y., Guo, X., Yuan, L., Holding, D. R., and Zhang, C. (2015). Differential expression analysis in RNA-Seq by a naive bayes classifier with local normalization. Biomed. Res. Int. 2015:789516. doi: 10.1155/2015/789516

Duan, F., Mei, C., Yang, L., Zheng, J., Lu, H., Xia, Y., et al. (2020). Vitamin K2 promotes PI3K/AKT/HIF-1alpha-mediated glycolysis that leads to AMPK-dependent autophagic cell death in bladder cancer cells. Sci. Rep. 10:7714. doi: 10.1038/s41598-020-64880-x

Ebert, K., Mattes, J., Kunzke, T., Zwingenberger, G., and Luber, B. (2019). MET as resistance factor for afatinib therapy and motility driver in gastric cancer cells. PLoS One 14:e0223225. doi: 10.1371/journal.pone.0223225

Fang, C. L., Hseu, Y. C., Lin, Y. F., Hung, S. T., Tai, C., Uen, Y. H., et al. (2012). Clinical and prognostic association of transcription factor SOX4 in gastric cancer. PLoS One 7:e52804. doi: 10.1371/journal.pone.0052804

Gabrovska, P. N., Smith, R. A., Tiang, T., Weinstein, S. R., Haupt, L. M., and Griffiths, L. R. (2011). Semaphorin–plexin signalling genes associated with human breast tumourigenesis. Gene 489, 63–69. doi: 10.1016/j.gene.2011.08.024

Gao, Y., Xie, M., Guo, Y., Yang, Q., Hu, S., and Li, Z. (2020). Long non-coding RNA FGD5-AS1 regulates cancer cell proliferation and chemoresistance in gastric cancer through miR-153-3p/CITED2 Axis. Front. Genet. 11:715. doi: 10.3389/fgene.2020.00715

Guccini, I., Revandkar, A., D’Ambrosio, M., Colucci, M., Pasquini, E., Mosole, S., et al. (2021). Senescence Reprogramming by TIMP1 Deficiency Promotes Prostate Cancer Metastasis. Cancer Cell 39, 68–82.e9. doi: 10.1016/j.ccell.2020.10.012

Hou, P., Shi, P., Jiang, T., Yin, H., Chu, S., Shi, M., et al. (2020). DKC1 enhances angiogenesis by promoting HIF-1alpha transcription and facilitates metastasis in colorectal cancer. Br. J. Cancer 122, 668–679. doi: 10.1038/s41416-019-0695-z

Hu, Y., Zhang, Y., Ding, M., and Xu, R. (2020). LncRNA TMPO-AS1/miR-126-5p/BRCC3 axis accelerates gastric cancer progression and angiogenesis via activating PI3K/Akt/mTOR pathway. J. Gastroenterol. Hepatol. doi: 10.1111/jgh.15362 [Epub ahead of print].

Huang, C., Zhao, J., Luo, C., and Zhu, Z. (2020). Overexpression of DGKI in gastric cancer predicts poor prognosis. Front. Med. (Lausanne) 7:320. doi: 10.3389/fmed.2020.00320

Huang, L., Liu, S., Lei, Y., Wang, K., Xu, M., Chen, Y., et al. (2016). Systemic immune-inflammation index, thymidine phosphorylase and survival of localized gastric cancer patients after curative resection. Oncotarget 7, 44185–44193. doi: 10.18632/oncotarget.9923

Huang, N., Dai, W., Li, Y., Sun, J., Ma, C., and Li, W. (2020). LncRNA PCAT-1 upregulates RAP1A through modulating miR-324-5p and promotes survival in lung cancer. Arch. Med. Sci. 16, 1196–1206. doi: 10.5114/aoms.2019.84235

Huang, Y., Hu, K., Zhang, S., Dong, X., Yin, Z., Meng, R., et al. (2018). S6K1 phosphorylation-dependent degradation of Mxi1 by beta-Trcp ubiquitin ligase promotes Myc activation and radioresistance in lung cancer. Theranostics 8, 1286–1300. doi: 10.7150/thno.22552

Kamarudin, A. N., Cox, T., and Kolamunnage-Dona, R. (2017). Time-dependent ROC curve analysis in medical research: current methods and applications. BMC Med. Res. Methodol. 17:53. doi: 10.1186/s12874-017-0332-6

Killian, A., Sarafan-Vasseur, N., Sesboue, R., Le Pessot, F., Blanchard, F., Lamy, A., et al. (2006). Contribution of the BOP1 gene, located on 8q24, to colorectal tumorigenesis. Genes Chromosomes Cancer 45, 874–881. doi: 10.1002/gcc.20351

Kim, J. C., Ha, Y. J., Tak, K. H., Roh, S. A., Kwon, Y. H., Kim, C. W., et al. (2018). Opposite functions of GSN and OAS2 on colorectal cancer metastasis, mediating perineural and lymphovascular invasion, respectively. PLoS One 13:e0202856. doi: 10.1371/journal.pone.0202856

Kong, W., Wu, Z., Yang, M., Zuo, X., Yin, G., and Chen, W. (2020). LMNB2 is a prognostic biomarker and correlated with immune infiltrates in hepatocellular carcinoma. IUBMB Life 72, 2672–2685. doi: 10.1002/iub.2408

Kudo, Y., Yasui, W., Ue, T., Yamamoto, S., Yokozaki, H., Nikai, H., et al. (1997). Overexpression of Cyclin-dependent Kinase-activating CDC25B Phosphatase in Human Gastric Carcinomas. Jpn. J. Cancer Res. 88, 947–952. doi: 10.1111/j.1349-7006.1997.tb00313.x

Lawrence, W. (2004). Gastric adenocarcinoma. Curr. Treat. Options Gastroenterol. 7, 149–157. doi: 10.1007/s11938-004-0036-y

Li, Q., Lai, Q., He, C., Fang, Y., Yan, Q., Zhang, Y., et al. (2019). RUNX1 promotes tumour metastasis by activating the Wnt/beta-catenin signalling pathway and EMT in colorectal cancer. J. Exp. Clin. Cancer Res. 38:334. doi: 10.1186/s13046-019-1330-9

Li, X. N., Yang, H., and Yang, T. (2020). miR-122 inhibits hepatocarcinoma cell progression by targeting LMNB2. Oncol. Res. 28, 41–49. doi: 10.3727/096504019X15615433287579

Li, Y., Li, W., Lin, J., Lv, C., and Qiao, G. (2020). miR-146a enhances the sensitivity of breast cancer cells to paclitaxel by downregulating IRAK1. Cancer Biother. Radiopharm. doi: 10.1089/cbr.2020.3873 [Epub ahead of print].

Liaw, A., and Wiener, M. (2002). Classification and Regression by randomForest. R News, 2, 18-22, R Package Version 4.6-14, 2018.

Lin, H., Huang, B., Wang, H., Liu, X., Hong, Y., Qiu, S., et al. (2018). MTHFD2 overexpression predicts poor prognosis in renal cell carcinoma and is associated with cell proliferation and vimentin-modulated migration and invasion. Cell Physiol. Biochem. 51, 991–1000. doi: 10.1159/000495402

Lin, Y. S., Tsai, K. L., Chen, J. N., and Wu, C. S. (2020). Mangiferin inhibits lipopolysaccharide-induced epithelial-mesenchymal transition (EMT) and enhances the expression of tumor suppressor gene PER1 in non-small cell lung cancer cells. Environ. Toxicol. 35, 1070–1081. doi: 10.1002/tox.22943

Liu, H., Zhang, Y., Hao, X., Kong, F., Li, X., Yu, J., et al. (2016). GPRC5A overexpression predicted advanced biological behaviors and poor prognosis in patients with gastric cancer. Tumor Biol. 37, 503–510. doi: 10.1007/s13277-015-3817-0

Liu, Q., Li, Y., Lv, W., Zhang, G., Tian, X., Li, X., et al. (2018). UCA1 promotes cell proliferation and invasion and inhibits apoptosis through regulation of the miR129-SOX4 pathway in renal cell carcinoma. Onco Targets Ther. 11, 2475–2487. doi: 10.2147/OTT.S160192

Lossos, I. S., Czerwinski, D. K., Alizadeh, A. A., Wechser, M. A., Tibshirani, R., Botstein, D., et al. (2004). Prediction of survival in diffuse large-B-cell lymphoma based on the expression of six genes. N. Engl. J. Med. 350, 1828–1837. doi: 10.1056/NEJMoa032520

Lu, L., Wang, J., Wu, Y., Wan, P., and Yang, G. (2016). Rap1A promotes ovarian cancer metastasis via activation of ERK/p38 and notch signaling. Cancer Med. 5, 3544–3554. doi: 10.1002/cam4.946

Lu, M., Ding, N., Zhuang, S., and Li, Y. (2020). LINC01410/miR-23c/CHD7 functions as a ceRNA network to affect the prognosis of patients with endometrial cancer and strengthen the malignant properties of endometrial cancer cells. Mol. Cell Biochem. 469, 9–19. doi: 10.1007/s11010-020-03723-9

Ma, H., Wu, Z., Peng, J., Li, Y., and Liao, W. (2018). Inhibition of SLC1A5 sensitizes colorectal cancer to cetuximab: SLC1A5 inhibition enhances the efficacy of cetuximab. Int. J. Cancer 142, 2578–2588. doi: 10.1002/ijc.31274

Meyer, D., Dimitriadou, E., Hornik, K., Weingessel, A., and Leisch, F. (2019). e1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien. R Package Version 1.7-6, 2021.

Nagarajan, R., Scutari, M., and Lebre, S. (2013). Bayesian Networks in R with Applications in Systems Biology. Springer. R Package Version 4.6-1, 2020.

Necula, L., Matei, L., Dragu, D., Pitica, I., Neagu, A. I., Bleotu, C., et al. (2020). High plasma levels of COL10A1 are associated with advanced tumor stage in gastric cancer patients. World J. Gastroenterol. 26, 3024–3033. doi: 10.3748/wjg.v26.i22.3024

Parker, J. S., Mullins, M., Cheang, M. C., Leung, S., Voduc, D., Vickery, T., et al. (2009). Supervised risk predictor of breast cancer based on intrinsic subtypes. J. Clin. Oncol. 27, 1160–1167. doi: 10.1200/JCO.2008.18.1370

Pellegrini, K. L., Sanda, M. G., and Moreno, C. S. (2015). RNA biomarkers to facilitate the identification of aggressive prostate cancer. Mol. Aspects Med. 45, 37–46. doi: 10.1016/j.mam.2015.05.003

Peterschmitt, M. J., Cox, G. F., Ibrahim, J., MacDougall, J., Underhill, L. H., Patel, P., et al. (2018). A pooled analysis of adverse events in 393 adults with Gaucher disease type 1 from four clinical trials of oral eliglustat: evaluation of frequency, timing, and duration. Blood Cells Mol. Dis. 68, 185–191. doi: 10.1016/j.bcmd.2017.01.006

Ritchie, M. E., Phipson, B., Wu, D., Hu, Y., Law, C. W., Shi, W., et al. (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43:e47. doi: 10.1093/nar/gkv007

Shao, J. P., Su, F., Zhang, S. P., Chen, H. K., Li, Z. J., Xing, G. Q., et al. (2020). miR-212 as potential biomarker suppresses the proliferation of gastric cancer via targeting SOX4. J. Clin. Lab. Anal. 34:e23511. doi: 10.1002/jcla.23511

Sharad, S., Dobi, A., Srivastava, S., Srinivasan, A., and Li, H. (2020). PMEPA1 gene isoforms: a potential biomarker and therapeutic target in prostate cancer. Biomolecules 10:1221. doi: 10.3390/biom10091221

Siegel, R. L., Miller, K. D., and Jemal, A. (2016). Cancer statistics, 2016. CA Cancer J. Clin. 66, 7–30. doi: 10.3322/caac.21332

Song, Y. X., Sun, J. X., Zhao, J. H., Yang, Y. C., Shi, J. X., Wu, Z. H., et al. (2017). Non-coding RNAs participate in the regulatory network of CLDN4 via ceRNA mediated miRNA evasion. Nat. Commun. 8:289. doi: 10.1038/s41467-017-00304-1

Song, Z., Wu, Y., Yang, J., Yang, D., and Fang, X. (2017). Progress in the treatment of advanced gastric cancer. Tumour Biol. 39:1010428317714626. doi: 10.1177/1010428317714626

Szklarczyk, D., Gable, A. L., Lyon, D., Junge, A., Wyder, S., Huerta-Cepas, J., et al. (2019). STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47, D607–D613. doi: 10.1093/nar/gky1131

Tan, Z. (2019). Recent advances in the surgical treatment of advanced gastric cancer: a review. Med. Sci. Monit. 25, 3537–3541. doi: 10.12659/MSM.916475

Teschendorff, A. E., Sollich, P., and Kuehn, R. (2014). Signalling entropy: a novel network-theoretical framework for systems analysis and interpretation of functional omic data. Methods 67, 282–293. doi: 10.1016/j.ymeth.2014.03.013

Thapa, S., Lomholt, M. A., Krog, J., Cherstvy, A. G., and Metzler, R. (2018). Bayesian analysis of single-particle tracking data using the nested-sampling algorithm: maximum-likelihood model selection applied to stochastic-diffusivity data. Phys. Chem. Chem. Phys. 20, 29018–29037. doi: 10.1039/C8CP04043E

Thrift, A. P., and El-Serag, H. B. (2020). Burden of gastric cancer. Clin. Gastroenterol. Hepatol. 18, 534–542. doi: 10.1016/j.cgh.2019.07.045

Tsai, M. M., Huang, H. W., Wang, C. S., Lee, K. F., Tsai, C. Y., Lu, P. H., et al. (2016). MicroRNA-26b inhibits tumor metastasis by targeting the KPNA2/c-jun pathway in human gastric cancer. Oncotarget 7, 39511–39526. doi: 10.18632/oncotarget.8629

Uehara, T., Kikuchi, H., Miyazaki, S., Iino, I., Setoguchi, T., Hiramatsu, Y., et al. (2016). Overexpression of Lysophosphatidylcholine Acyltransferase 1 and Concomitant Lipid alterations in gastric cancer. Ann. Surg. Oncol. 23(Suppl. 2), 206–213. doi: 10.1245/s10434-015-4459-6

Wang, H., Sham, P., Tong, T., and Pang, H. (2020). Pathway-based single-cell RNA-Seq classification, clustering, and construction of gene-gene interactions networks using random forests. IEEE J. Biomed. Health Inform. 24, 1814–1822. doi: 10.1109/JBHI.2019.2944865

Wang, N., Liu, W., Zheng, Y., Wang, S., Yang, B., Li, M., et al. (2018). CXCL1 derived from tumor-associated macrophages promotes breast cancer metastasis via activating NF-kappaB/SOX4 signaling. Cell Death Dis. 9:880. doi: 10.1038/s41419-018-0876-3

Wang, P., Liu, G. Z., Wang, J. F., and Du, Y. Y. (2020). SNHG3 silencing suppresses the malignant development of triple-negative breast cancer cells by regulating miRNA-326/integrin alpha5 axis and inactivating Vav2/Rac1 signaling pathway. Eur. Rev. Med. Pharmacol. Sci. 24, 5481–5492. doi: 10.26355/eurrev_202005_21333

Wang, Q., Liu, J., You, Z., Yin, Y., Liu, L., Kang, Y., et al. (2021). LncRNA TINCR favors tumorigenesis via STAT3-TINCR-EGFR-feedback loop by recruiting DNMT1 and acting as a competing endogenous RNA in human breast cancer. Cell Death Dis. 12:83. doi: 10.1038/s41419-020-03188-0

Wang, R. G., Zhang, D., Zhao, C. H., Wang, Q. L., Qu, H., and He, Q. S. (2020). FKBP10 functioned as a cancer-promoting factor mediates cell proliferation, invasion, and migration via regulating PI3K signaling pathway in stomach adenocarcinoma. Kaohsiung J. Med. Sci. 36, 311–317. doi: 10.1002/kjm2.12174

Wang, Y. (2017). Transcriptional regulatory network analysis for gastric cancer based on mRNA microarray. Pathol. Oncol. Res. 23, 785–791. doi: 10.1007/s12253-016-0159-1

Wang, Y., Ma, C., Zhang, H., and Wu, J. (2012). Novel protein pp3501 mediates the inhibitory effect of sodium butyrate on SH-SY5Y cell proliferation. J. Cell. Biochem. 113, 2696–2703. doi: 10.1002/jcb.24145

Wei, C. Y., Zhu, M. X., Lu, N. H., Peng, R., Yang, X., Zhang, P. F., et al. (2019). Bioinformatics-based analysis reveals elevated MFSD12 as a key promoter of cell proliferation and a potential therapeutic target in melanoma. Oncogene 38, 1876–1891. doi: 10.1038/s41388-018-0531-6

Wei, Y., Chen, X., Liang, C., Ling, Y., Yang, X., Ye, X., et al. (2020). A noncoding regulatory RNAs network driven by Circ-CDYL acts specifically in the early stages hepatocellular carcinoma. Hepatology 71, 130–147. doi: 10.1002/hep.30795

Weihs, C., Ligges, U., Luebke, K., and Raabe, N. (2005). klaR Analyzing German Business Cycles. Data Analysis and Decision Support, 335-343. R Package Version 0.6-15, 2020.

Xi, X., Li, T., Huang, Y., Sun, J., Zhu, Y., Yang, Y., et al. (2017). RNA biomarkers: frontier of precision medicine for cancer. Noncoding RNA 3:9. doi: 10.3390/ncrna3010009

Xia, T., Liao, Q., Jiang, X., Shao, Y., Xiao, B., Xi, Y., et al. (2014). Long noncoding RNA associated-competing endogenous RNAs in gastric cancer. Sci. Rep. 4:6088. doi: 10.1038/srep06088

Xu, Y., Liu, Z., and Guo, K. (2012). Expression of FHL1 in gastric cancer tissue and its correlation with the invasion and metastasis of gastric cancer. Mol. Cell Biochem. 363, 93–99. doi: 10.1007/s11010-011-1161-2

Yoon, C., Till, J., Cho, S. J., Chang, K. K., Lin, J. X., Huang, C. M., et al. (2019). KRAS activation in gastric adenocarcinoma stimulates epithelial-to-mesenchymal transition to cancer stem-like cells and promotes metastasis. Mol. Cancer Res. 17, 1945–1957. doi: 10.1158/1541-7786.MCR-19-0077

Yu, G., Wang, L. G., Han, Y., and He, Q. Y. (2012). clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16, 284–287. doi: 10.1089/omi.2011.0118

Zhang, G., Zhang, W., Li, B., Stringer-Reasor, E., Chu, C., Sun, L., et al. (2017). MicroRNA-200c and microRNA-141 are regulated by a FOXP3-KAT2B axis and associated with tumor metastasis in breast cancer. Breast Cancer Res. 19:73. doi: 10.1186/s13058-017-0858-x

Zhang, X., and Liu, S. (2017). RBPPred: predicting RNA-binding proteins from sequence using SVM. Bioinformatics 33, 854–862. doi: 10.1093/bioinformatics/btw730

Zhang, Z., Wang, J., Mao, J., Li, F., Chen, W., and Wang, W. (2020). Determining the clinical value and critical pathway of GTPBP4 in lung adenocarcinoma using a bioinformatics strategy: a study based on datasets from the cancer genome atlas. Biomed. Res. Int. 2020:5171242. doi: 10.1155/2020/5171242

Zhao, R., Liu, Z., Xu, W., Song, L., Ren, H., Ou, Y., et al. (2020). Helicobacter pylori infection leads to KLF4 inactivation in gastric cancer through a TET1-mediated DNA methylation mechanism. Cancer Med. 9, 2551–2563. doi: 10.1002/cam4.2892

Zhao, S. F., Yin, X. J., Zhao, W. J., Liu, L. C., and Wang, Z. P. (2020). Biglycan as a potential diagnostic and prognostic biomarker in multiple human cancers. Oncol. Lett. 19, 1673–1682. doi: 10.3892/ol.2020.11266

Keywords: complete transcriptomic profiles, biomarkers, sample classification, prognosis, generalization ability

Citation: Dong C, Rao N, Du W, Gao F, Lv X, Wang G and Zhang J (2021) mRBioM: An Algorithm for the Identification of Potential mRNA Biomarkers From Complete Transcriptomic Profiles of Gastric Adenocarcinoma. Front. Genet. 12:679612. doi: 10.3389/fgene.2021.679612

Received: 12 March 2021; Accepted: 06 May 2021;
Published: 27 July 2021.

Edited by:

Guini Hong, Gannan Medical University, China

Reviewed by:

Andrey Cherstvy, University of Potsdam, Germany
Dong Wang, Southern Medical University, China

Copyright © 2021 Dong, Rao, Du, Gao, Lv, Wang and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Nini Rao, raonn@uestc.edu.cn
