ORIGINAL RESEARCH article

Front. Aging Neurosci., 18 October 2021
Sec. Neurocognitive Aging and Behavior
This article is part of the Research Topic Biomarkers from Multi-tracer and Multi-modal Neuroimaging in Age-related Neurodegenerative Diseases

Application of Machine Learning and Weighted Gene Co-expression Network Algorithm to Explore the Hub Genes in the Aging Brain

Keping Chai1*†, Jiawei Liang2†, Xiaolin Zhang3, Panlong Cao1, Shufang Chen1, Huaqian Gu1, Weiping Ye1, Rong Liu3, Wenjun Hu2, Caixia Peng4,5*, Gang Logan Liu2*, Daojiang Shen1*
  • 1Department of Pediatrics, Zhejiang Hospital, Hangzhou, China
  • 2College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
  • 3Key Laboratory of Ministry of Education for Neurological Disorders, Department of Pathophysiology, School of Basic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
  • 4Key Laboratory for Molecular Diagnosis of Hubei Province, Tongji Medical College, The Central Hospital of Wuhan, Huazhong University of Science and Technology, Wuhan, China
  • 5Central Laboratory, Tongji Medical College, The Central Hospital of Wuhan, Huazhong University of Science and Technology, Wuhan, China

Aging is a major risk factor contributing to neurodegeneration and dementia. However, it remains unclear how aging promotes these diseases. Here, we use machine learning and weighted gene co-expression network analysis (WGCNA) to explore the relationship between aging and gene expression in the human frontal cortex and to reveal potential biomarkers and therapeutic targets of neurodegeneration and dementia related to aging. The transcriptional profiling data of the human frontal cortex from individuals ranging from 26 to 106 years old were obtained from the GEO database in NCBI. A Self-Organizing Feature Map (SOM) was used to find the clusters in which gene expression is downregulated with aging. For WGCNA, co-expressed genes were first clustered into different modules, and modules of interest were identified by calculating the correlation coefficient between each module and the phenotypic trait (age). Next, the overlapping genes between differentially expressed genes (DEGs, between the young and aged groups) and genes in the module of interest were identified. A Random Forest classifier was applied to obtain the most significant genes among the overlapping genes. These significant genes were further characterized through network analysis. Through WGCNA, the greenyellow module was found to be highly negatively correlated with age and to function mainly in long-term potentiation and calcium signaling pathways. Through step-by-step filtering of the module genes by overlapping with downregulated DEGs in the aged group and Random Forest classifier analysis, we found that MAPT, KLHDC3, RAP2A, RAP2B, ELAVL2, and SYN1 were co-expressed and highly correlated with aging.

Introduction

The brain is highly sensitive to aging, and many neurological diseases are promoted by aging. An important issue is how normal brain aging transitions to pathological aging, giving rise to neurodegenerative disorders (Wyss-Coray, 2016; Hou et al., 2019; Juan and Adlard, 2019). Despite this central role in disease pathogenesis and morbidity, the aging of the brain is not well understood at the molecular level. Several hypotheses, such as DNA damage, loss of neural circuits and synapses, and mitochondrial dysfunction, have been proposed (Lu et al., 2004; Yankner et al., 2008; Stern, 2012; Hou et al., 2019). Exploring molecular changes in the aging brain can provide a basis for a better understanding of neurodegenerative diseases and dementia.

SOM is a clustering and classification method based on a neural network (Furukawa, 2009). Like other centroid-based clustering algorithms such as K-means, SOM finds a set of centroids (also called codebook vectors) and then maps each object in the data set to its most similar centroid. In neural network terms, each neuron corresponds to a centroid. In our study, we applied SOM to the gene expression matrix to cluster genes with highly similar expression patterns and to find the pattern in which gene expression decreases with aging.
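The mapping of genes to codebook vectors can be sketched in a few lines. The following is a minimal, standard-library Python illustration of SOM training, not the analysis pipeline used in this study; the function names (`best_matching_unit`, `train_som`), the 2 × 2 grid, the learning-rate and neighborhood schedules, and the toy expression values are all assumptions chosen for illustration.

```python
import math
import random


def best_matching_unit(codebook, x):
    """Index of the neuron whose codebook vector is closest to x (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(codebook[j], x)),
    )


def train_som(rows, grid_w, grid_h, epochs=20, lr0=0.5, seed=0):
    """Train a small rectangular SOM and return one codebook vector per neuron."""
    rng = random.Random(seed)
    dim = len(rows[0])
    n_neurons = grid_w * grid_h
    # Initialise every codebook vector from a randomly chosen input row.
    codebook = [list(rng.choice(rows)) for _ in range(n_neurons)]
    coords = [(i % grid_w, i // grid_w) for i in range(n_neurons)]
    sigma0 = max(grid_w, grid_h) / 2.0
    n_steps = epochs * len(rows)
    step = 0
    for _ in range(epochs):
        order = list(range(len(rows)))
        rng.shuffle(order)
        for idx in order:
            x = rows[idx]
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighborhood radius shrinks, floor 0.5
            bx, by = coords[best_matching_unit(codebook, x)]
            for j, (cx, cy) in enumerate(coords):
                # Gaussian neighborhood around the best-matching unit on the grid.
                h = math.exp(-((cx - bx) ** 2 + (cy - by) ** 2) / (2.0 * sigma ** 2))
                w = codebook[j]
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return codebook


# Toy usage: hypothetical genes (rows) measured in 5 samples ordered by age.
genes = [
    [9.0, 8.0, 6.5, 5.0, 4.0],  # decreasing with age
    [8.5, 8.2, 6.0, 5.5, 3.8],  # decreasing with age
    [3.0, 4.0, 5.5, 7.0, 8.0],  # increasing with age
    [2.5, 4.2, 5.0, 7.5, 8.4],  # increasing with age
]
codebook = train_som(genes, grid_w=2, grid_h=2)
assignment = [best_matching_unit(codebook, g) for g in genes]
print(assignment)
```

Each gene is assigned to the neuron with the nearest codebook vector; inspecting the codebook vectors then shows which neurons carry a decreasing-with-age profile.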

Weighted gene co-expression network analysis (WGCNA) is an algorithm used to describe the correlation patterns of gene expression in microarray data (Langfelder and Horvath, 2008). WGCNA can be used for clustering genes with highly correlated expression into modules, for relating the modules to phenotypes to find the most trait-related module, and for summarizing these co-expressed gene clusters by the module eigengene or hub genes. Random forest (RF) is an ensemble machine learning algorithm built from decision trees. Like individual decision trees, random forests can be used for both regression and classification. In this study, we used an RF classifier to classify the age groups based on the gene expression matrix and then selected the most significant genes for further analysis. Further topological network analysis can identify the key players within modules and thus facilitate the discovery of candidate biomarkers or therapeutic targets.

In this study, we performed machine learning and WGCNA on publicly accessible transcriptome data obtained from the human frontal cortex of individuals at different ages. We identified 17 co-expression modules. By calculating the correlation coefficient between each module and the age phenotype, we obtained a module of interest. Next, we identified the overlapping genes between differentially expressed genes (DEGs of the aged group compared to the young group) and genes in the module of interest. Using these overlapping genes, we conducted GO and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses and further identified the central players within the module through network analysis. We concluded that the ELAVL2, RAP2A, RAP2B, KLHDC3, and CALM1 genes are significantly associated with aging and may be novel biomarkers involved in neurodegeneration and dementia.

Results

Self-Organizing Feature Map Construction and Cluster Identification

The expression matrix of GSE1572 was used as the input dataset. After removing one abnormal sample, 30 samples remained and were used as SOM input features (Figure 1A). The expression data of each gene (more than 11,000 genes in total) across all samples were used as input data. We set the number of output neurons of the network to 100 and obtained the trained neural network (Figure 1C and Supplementary Figure 1). The weight matrix (30 × 100) corresponding to each feature was used as the input for hierarchical clustering to group the 100 neurons again. The 100 neurons were clustered into six categories (Figures 1B,C). SOM clustering showed that the expression of genes in neurons 100, 99, and 89 gradually decreased with age. Next, we checked the expression levels of the genes in these three neurons (Figures 1B,D). This revealed that 240 genes, including MAPT, MAP2, MAPK3, SYN2, RAP2A, RAP2B, KLHDC3, and CALM1, were gradually downregulated with aging.

Figure 1. SOM clustering of genes based on microarray data. (A) Flow chart of SOM clustering; xjn refers to the expression level of gene j in the nth sample, and neuron i refers to the ith cluster. (B) Hierarchical clustering of the SOM clustering results; the 100 sub-clusters were divided into six major clusters. (C) The expression trend of genes in each neuron across the samples (Neuron 1–100, from bottom to top, from left to right). (D) Heatmap of gene expression in neurons 89, 99, and 100.

Weighted Gene Co-expression Network Construction and Module Identification

Before WGCNA, the genes detected in GSE1572 were filtered according to the procedure described in the “Materials and Methods” section, and 5,000 genes were retained. The microarray data of the 30 samples were then read into R for hierarchical clustering (Supplementary Figure 2A). Finally, the 30 sets of data were matched to age (Supplementary Figure 2B). WGCNA was performed to identify gene co-expression networks associated with age. In the co-expression network, the degree of association between a module and other modules can be evaluated by the mean connectivity and the scale independence. Specifically, the closer the mean connectivity is to 0 and the closer the scale independence is to 1, the lower the correlation between modules. In this study, we set the threshold of scale independence to 0.9. We found that when the power value reaches 12, the scale independence reaches 0.9 and the mean connectivity is close to 0 (Supplementary Figure 3). By calculating the correlation coefficients between genes, the genes were clustered according to their expression patterns, so that genes with similar patterns were assigned to the same module. Seventeen co-expressed modules, ranging in size from 37 to 1,524 genes (each assigned a color for reference), were identified (Supplementary Table 1 and Figure 2).
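The two quantities used to choose the power value can be illustrated with a short sketch. The Python code below is a simplified, standard-library illustration and not the WGCNA implementation: it raises absolute Pearson correlations to a candidate power, sums them into a per-gene connectivity, and scores scale independence as the R² of a straight-line fit of log10 p(k) against log10 k over equal-width bins. The function names, the binning scheme, and the simulated expression matrix are assumptions.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(expr):
    """Gene-by-gene Pearson correlation matrix; expr is a list of per-gene sample vectors."""
    n = len(expr)
    cor = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            cor[i][j] = cor[j][i] = pearson(expr[i], expr[j])
    return cor


def connectivity(cor, power):
    """Per-gene connectivity k_i = sum over j != i of |cor_ij| ** power."""
    n = len(cor)
    return [sum(abs(cor[i][j]) ** power for j in range(n) if j != i) for i in range(n)]


def scale_free_r2(k, n_bins=10):
    """R^2 of a linear fit of log10 p(k) on log10 k, using equal-width bins of k."""
    lo, hi = min(k), max(k)
    if hi <= lo:
        return 0.0
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    sums = [0.0] * n_bins
    for v in k:
        b = min(int((v - lo) / width), n_bins - 1)
        counts[b] += 1
        sums[b] += v
    # One point per non-empty bin: (log10 of mean connectivity, log10 of bin frequency).
    pts = [
        (math.log10(sums[b] / counts[b]), math.log10(counts[b] / len(k)))
        for b in range(n_bins)
        if counts[b] > 0 and sums[b] > 0
    ]
    if len(pts) < 3:
        return 0.0
    xs, ys = zip(*pts)
    return pearson(xs, ys) ** 2


# Toy usage: 30 simulated genes sharing a common trend across 12 samples.
rng = random.Random(0)
trend = [rng.gauss(0, 1) for _ in range(12)]
expr = [[t * rng.uniform(0.2, 1.0) + rng.gauss(0, 1) for t in trend] for _ in range(30)]
cor = correlation_matrix(expr)
for power in (1, 6, 12):
    k = connectivity(cor, power)
    print(power, round(sum(k) / len(k), 4), round(scale_free_r2(k), 3))
```

Because every absolute correlation is at most 1, the mean connectivity can only fall as the power increases, which is why the power is chosen as a trade-off between a high scale-independence score and a connectivity that is not yet negligible.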

Figure 2. WGCNA of the microarray data. (A) Network analysis of gene expression in aging identifies distinct modules of co-expressed genes. (B) Pearson correlation coefficient between age and each module eigengene; numbers in brackets indicate the corresponding p-values. (C) Correlation between gene significance (GS) for the clinical trait of age and module membership (MM) for genes in the greenyellow module. Cor represents the absolute correlation coefficient between MM and GS.

Finding the Module of Interest, Functional Annotation, and Identification of the Overlapping Genes Between Differentially Expressed Genes in Young/Old Individuals and Genes in the Module of Interest Verified in Weighted Gene Co-expression Network Analysis

To identify the modules most significantly associated with age, the Pearson’s correlation coefficient between each module and age was calculated. The strongest negative association in the module-trait relationship was found between the greenyellow module and age (cor = −0.83, p < 0.001, Figure 2B). Thus, the greenyellow module was selected as the module of interest in subsequent analyses. To confirm the correlation between the module of interest and age, the labeledHeatmap function was used to calculate the correlation of module membership with gene significance (age) in the greenyellow module. The results showed a significant correlation of module membership with gene significance for age (cor = 0.81, p < 0.0001) in the greenyellow module (Figure 2C). To find the DEGs between young and aged individuals, the frontal cortical samples were grouped into individuals ≤42 and ≥73 years old, and the limma package was used (see section “Materials and Methods” for age grouping criteria). About 4% of the genes analyzed were significantly changed (1.5-fold change or more, Figure 3A). Next, we performed overlap analysis between the downregulated DEGs and the genes in the greenyellow module using an online Venn tool; we found that 45 genes in the greenyellow module were also downregulated DEGs (Figures 3B–D). These genes are highly related to aging and showed decreased expression during aging, suggesting that they might play important roles in age-related degeneration.

Figure 3. Identifying the overlapping genes between downregulated DEGs in the aged group and genes in the greenyellow module. (A) Heatmap of the expression of DEGs. (B) Heatmap of the gene expression in the greenyellow module. (C) Venn diagram of the overlap between downregulated DEGs and genes in the greenyellow module. (D) Heatmap showing the expression of the overlapping genes in the different samples.

Identifying Hub Genes and Gene Functional Annotation

The overlapping genes identified above were subjected to GO functional and KEGG pathway enrichment analyses. The enriched biological processes were mainly modulation of chemical synaptic transmission and regulation of trans-synaptic signaling. The enriched cellular components were mainly postsynaptic density and axon part; the enriched molecular functions were mainly primary active transmembrane transporter activity and P-P-bond-hydrolysis-driven transmembrane transporter activity (Figure 4). In KEGG pathway analysis, the calcium signaling pathway (p = 1.1498E-06; Table 1) and the MAPK signaling pathway (p = 0.000027; Table 1) were the most significant pathways involving the overlapping genes.

Figure 4. GO enrichment analysis of the overlapping genes. X-axis shows the terms of GO pathway and Y-axis shows the number of genes.

Table 1. KEGG pathway analysis of the overlapping genes.

Identification of the Most Significant Genes and Network Construction

To identify the most important genes related to aging, the overlapping genes were further filtered by RF classification. Gene counts were input into the RF classifier model; unimportant genes, such as ABI2, YWHAZ, MAPK9, and RAN, were removed, and the 21 retained genes were used for the subsequent analysis (Figure 5A). To ascertain the significance of the genes and analyze the network in the corresponding modules, PPI maps were constructed with GeneMANIA and STRING (Figures 5B,C). Hub genes in the network, including MAPT, PAK1, RAP2A, RAP2B, KLHDC3, TPPP, and ELAVL2, were identified. In the single-cell sequencing database Tabula Muris, we found that the distributions of KLHDC3 and RAP2A across brain cell types are very similar, with expression mainly in oligodendrocytes and neurons.

Figure 5. Identifying the most important genes via RF and the cellular distribution of the important genes in the brain. (A) Random Forest algorithm result. The blue box plot corresponds to the minimum, average, and maximum Z scores of a color attribute. The red, yellow, and green boxes represent the Z scores of rejected, tentative, and confirmed genes, respectively. (B) The PPI network of important genes via GeneMANIA. (C) The PPI network of important genes via STRING. (D) Scatterplot showing the distribution of different cell types in the t-SNE embedding. (E,F) KLHDC3 and RAP2A expression in different cell types.

Discussion

In this study, the dataset GSE1572 includes samples from individuals ranging in age from 26 to 106 years old; such multi-sample, age-resolved data are well suited to SOM clustering and WGCNA. First, we performed SOM on the whole-genome expression data. The SOM algorithm is usually used for data feature extraction, clustering, and classification (Furukawa, 2009). In this study, we used SOM to cluster genes in the expression matrix. In the SOM clustering results, neurons 100, 89, and 99 were found to be related to aging. The genes in these neurons, such as MAPT, MAP2, MAPK3, SYN2, RAP2A, RAP2B, KLHDC3, and CALM1, were gradually downregulated with age. Although SOM can identify some clusters of genes related to aging, this method has certain shortcomings: the large number of genes found makes it hard to screen for key genes, and the gene clusters have poor biological interpretability. To find the genes most relevant to aging more accurately, a weighted gene co-expression network was constructed, and we identified 17 co-expressed modules. The expression changes of genes in the same module across samples are highly similar, indicating consistent effects and potential interactions of the proteins encoded by these genes in the same pathways during the aging process. Using the Pearson’s correlation coefficient between each module and age, we obtained the module of interest. To identify the significant genes, we took the intersection of the genes in the greenyellow module and the DEGs that were downregulated in the aged group and obtained 45 genes. Furthermore, we found that these overlapping genes of the greenyellow module and the DEGs also exist in the gene clusters found by SOM, which further supports the idea that these genes may be related to aging.

Further KEGG pathway and GO functional enrichment analyses indicated the calcium signaling pathway, long-term potentiation, and the MAPK signaling pathway as the most significant pathways in the module. To identify the genes most strongly related to aging, we further used a machine learning algorithm, Random Forest: the expression of the above 45 genes was input as feature values into the model for training, and 21 key genes were finally screened out.

In our previous studies (Liang et al., 2018; Chai et al., 2021), we took samples of different brain regions from different Braak stages (GSE131617) and found that microglia-mediated immune system activation plays a crucial role in the early stages of Alzheimer’s disease. The samples used in the present study are exclusively frontal cortex samples from individuals of different ages and carry no clinical diagnosis or pathological changes, which makes them more suitable for discovering the changes in the brain during the aging process.

Analysis of hub genes showed that SYN2 might play an important role in aging. In the Cell Component (CC) enrichment analysis, postsynaptic density and distal axon were identified as the most significant CCs in the network. In the Biological Process (BP) enrichment analysis, synaptic vesicle localization was revealed to be a significant BP in the network. SYN2 belongs to a multigene family encoding synaptic vesicle (SV) phosphoproteins implicated in the regulation of synaptic transmission and plasticity (Luk et al., 2012). Previous studies showed that SYN2 knockout mice display emotional and spatial memory deficits that are aggravated during aging (Corradi et al., 2008; Boido et al., 2010). In the co-expression network constructed in the present study, the expression of SYN2 decreases with increasing age. We suspect that the decreased expression of SYN2 is either a result of synapse impairment or loss during aging, or an upstream factor that induces synaptic dysfunction.

In the co-expression network, MAPT and MAP2 were identified as hub genes. MAPT encodes microtubule-associated protein tau, which promotes the stability and assembly of microtubules in neuronal axons (Dehmelt and Halpain, 2005; Irwin et al., 2013; Wang and Mandelkow, 2016; Saha and Sen, 2019; Vogels et al., 2019). This is in accordance with the fact that distal axon is a significant CC in the GO enrichment analysis. In age-related tauopathy, tau pathology has been considered a significant marker of neurodegeneration. The MAP2 gene encodes the dendritic marker MAP-2, which is also a microtubule-associated protein (Friedrich and Aszódi, 1991; Dehmelt and Halpain, 2005). Microtubules are key players in neuronal activities and axoplasmic flow under physiological conditions. In our study, we found that the expression of MAPT and MAP2 decreases with increasing age, which may be a result of neurite degeneration during aging. However, genes that encode other cytoskeletal proteins, such as tubulin, were not identified as hub genes in aging. This result indicates that the microtubule-associated proteins tau and MAP-2 may participate in aging-related pathogenesis through mechanisms other than cytoskeletal stability.

Analysis of hub genes also showed that RAP2A and RAP2B were hub genes in the co-expression network. RAP2A and RAP2B belong to the small GTPase superfamily (Emery et al., 2017). Most studies of RAP2A and RAP2B focus on their functions in tumors (Zheng et al., 2017; Zhang et al., 2020). RAP2A is overexpressed in a multitude of human cancers and plays an important role in cytoskeleton rearrangement, arteriogenesis, and cell migration. In neurons, RAP2 was found to stimulate dendritic pruning, reduce synaptic density, and cause removal of synaptic AMPA receptors, suggesting that RAP2 plays a role in regulating synaptic functions (Kawabe et al., 2010; Hu et al., 2019). In our study, we found that RAP2A and RAP2B interacted and co-localized with MAP2 in the co-expression network and the STRING network. Therefore, RAP2A and RAP2B may have a similar function to MAP2 or cooperate with it. We speculate that the main function of RAP2A in the brain also involves regulation of dendritic development and plasticity.

To our surprise, KLHDC3 was found to be mainly co-expressed with RAP2A and RAP2B in the co-expression network. Its related pathways are the Unfolded Protein Response (UPR) and metabolism of proteins, and a few studies have reported its function in the brain (Niculescu et al., 2015). In our study, KLHDC3 and RAP2A showed consistent distributions across cell types in the brain (Figures 5D–F), so we speculate that they may participate in similar functions in the brain. The decrease in KLHDC3 expression with age may also play a role in the impairment of dendritic and synaptic plasticity during aging. Further studies are needed to reveal the function of KLHDC3 in neurons.

Finally, ELAVL2 was characterized as a hub gene together with PAK1, MAPT, RAP2A, and RAP2B in the same module. Some studies report that ELAVL2-regulated pathways are involved in normal human brain function and that their disruption may play a role in neurodevelopmental disorders such as autism spectrum disorder (ASD) (Berto et al., 2016; Ohi et al., 2017; Kato et al., 2019). However, the function of ELAVL2 in the aging brain has not yet been reported. In our study, ELAVL2 was found to be co-localized with PAK1 and to be co-expressed and interact with tau. Both tau and PAK1 are involved in axonal guidance and neuronal migration (Dehmelt and Halpain, 2005; Koth et al., 2014). Therefore, we speculate that ELAVL2 may play a role consistent with those of tau and PAK1 in neurons.

In summary, through machine learning and WGCNA on microarray data from the human frontal cortex, we found that RAP2A, RAP2B, KLHDC3, and ELAVL2 may be associated with aging. The proteins encoded by these genes may act in the brain in coordination with tau, MAP-2, and the SYN and CALM families in neurodegenerative diseases, and may be novel biomarkers of neurodegenerative diseases caused by aging.

Materials and Methods

Data Acquisition and Preprocessing

The data used in this paper were obtained from the GEO (Gene Expression Omnibus) database in NCBI (footnote 1), under accession number GSE1572 (Lu et al., 2004). The platform is the Affymetrix Human Genome U95 Version 2 Array [HG_U95Av2]. Gene expression in the frontal cortex of 18 normal males and 12 normal females aged 26–106 years was measured. The normalized data were downloaded to obtain the expression matrix, and data filtering was performed before WGCNA. For data filtering, the standard deviation of each gene’s expression was calculated, the genes were ranked by decreasing standard deviation, the top 5,000 genes with the largest standard deviations were selected, and probes without corresponding annotation information were removed. There were about 11,000 genes in the dataset; after preprocessing, we kept 5,000 genes for further analysis.
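The variance-based filtering step can be sketched as follows. This standard-library Python example is an illustration under stated assumptions, not the code used in the study: the function name `filter_probes`, the probe and gene identifiers, and the choice to drop unannotated probes before ranking (the text does not state the order) are all hypothetical.

```python
import statistics


def filter_probes(expr, annotation, n_keep):
    """Keep the n_keep annotated probes with the largest standard deviation.

    expr:       dict mapping probe ID -> list of expression values (one per sample)
    annotation: dict mapping probe ID -> gene symbol (probes missing here are dropped)
    """
    # Assumption: unannotated probes are removed first, then the rest are ranked.
    annotated = [p for p in expr if annotation.get(p)]
    ranked = sorted(annotated, key=lambda p: statistics.stdev(expr[p]), reverse=True)
    return ranked[:n_keep]


# Toy usage with hypothetical probes; "p4" varies the most but has no annotation.
expr = {
    "p1": [1.0, 1.0, 1.0, 1.0],
    "p2": [1.0, 5.0, 1.0, 5.0],
    "p3": [2.0, 3.0, 2.0, 3.0],
    "p4": [0.0, 10.0, 0.0, 10.0],
}
annotation = {"p1": "GENE_A", "p2": "GENE_B", "p3": "GENE_C"}
print(filter_probes(expr, annotation, n_keep=2))
```

In the study itself the corresponding cut-off was 5,000 genes out of roughly 11,000.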

Finding Genes With Highly Similar Expression Pattern Through Self-Organizing Feature Map Algorithm

The SOM clustering was performed with the kohonen package in R 3.4.2 (Furukawa, 2009). The 31 frontal cortical samples were treated as 31 input features. The expression counts of each gene in the 31 samples were used as input data. By feeding these data into the SOM model to cluster the genes, we obtained the clusters in which gene expression decreases with aging.

Construction of Weighted Gene Co-expression Network and Identification of Significant Modules

Data were processed using R 3.4.2. To ensure that the results of network construction were reliable, abnormal samples were removed. Then, the weighted gene co-expression network was constructed with the WGCNA package in R 3.4.2. First, the Pearson correlation coefficient was calculated to assess the similarity of the gene expression profiles. Second, the correlation coefficients between genes were weighted by a power function to obtain a scale-free network. A gene module is a cluster of genes that are densely interconnected in terms of co-expression. Hierarchical clustering was used to identify gene modules, and different modules were represented by different colors. The dynamic tree cut method was used to identify the modules: the adjacency matrix was converted to a topological overlap matrix (TOM), and modules were detected by cluster analysis.
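The conversion from correlations to a weighted adjacency and then to a topological overlap matrix can be written down compactly. The Python sketch below uses the standard unsigned definitions, a_ij = |cor_ij|^β and TOM_ij = (Σ_u a_iu·a_uj + a_ij) / (min(k_i, k_j) + 1 − a_ij); it is an illustration with hypothetical function names and toy numbers, not the WGCNA package code.

```python
def soft_threshold_adjacency(cor, power):
    """Unsigned weighted adjacency: a_ij = |cor_ij| ** power, with a_ii = 1."""
    n = len(cor)
    return [[1.0 if i == j else abs(cor[i][j]) ** power for j in range(n)] for i in range(n)]


def tom_from_adjacency(adj):
    """Topological overlap matrix for an unsigned weighted network."""
    n = len(adj)
    # Connectivity of each node, excluding the self-connection.
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Connection strength shared through every third node u.
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u != i and u != j)
            t = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = t
    return tom


# Toy usage: a hypothetical 3-gene correlation matrix and a power of 2.
cor = [
    [1.0, 0.9, -0.2],
    [0.9, 1.0, 0.3],
    [-0.2, 0.3, 1.0],
]
adj = soft_threshold_adjacency(cor, power=2)
tom = tom_from_adjacency(adj)
dissimilarity = [[1.0 - t for t in row] for row in tom]  # input for hierarchical clustering
print(tom[0][1], dissimilarity[0][1])
```

The dissimilarity 1 − TOM is what is typically passed to hierarchical clustering before the tree is cut into modules.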

Correlation Analysis of Gene Modules With Clinical Phenotype

To detect the associations of modules with the clinical phenotype (age), first, the age data and gene expression data were matched using the match function. Second, the association of each module eigengene (ME) with age was calculated by Pearson’s correlation analysis. Modules showing significant association with age were obtained. Finally, to further confirm the modules with significant correlation to age, the correlation coefficient between module membership (gene expression level) and gene significance (GS, for assessing the association of genes with phenotypes) was calculated using the labeledHeatmap function, and the p-values were obtained.
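These correlations can be illustrated with a small self-contained sketch. In the Python code below, the module eigengene is taken as the first principal component of the z-scored module genes (computed by power iteration and sign-aligned with the average expression), module membership (MM) as the correlation of each gene with the eigengene, and gene significance (GS) as the correlation of each gene with age. The function names, the ages, and the expression values are hypothetical, and p-values are omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def zscore(v):
    """Centre to mean 0 and scale to unit sample standard deviation."""
    m = sum(v) / len(v)
    sd = math.sqrt(sum((a - m) ** 2 for a in v) / (len(v) - 1))
    return [(a - m) / sd for a in v]


def module_eigengene(module_expr, n_iter=200):
    """First principal component (across samples) of the z-scored module genes."""
    z = [zscore(g) for g in module_expr]
    n_samples = len(z[0])
    # Start from one gene's profile: a constant vector would be orthogonal to z-scored rows.
    v = list(z[0])
    for _ in range(n_iter):
        scores = [sum(a * b for a, b in zip(g, v)) for g in z]  # Z v, one score per gene
        v = [sum(z[i][s] * scores[i] for i in range(len(z))) for s in range(n_samples)]  # Z^T (Z v)
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    # Align the sign with the average standardized expression of the module.
    mean_profile = [sum(g[s] for g in z) / len(z) for s in range(n_samples)]
    if pearson(v, mean_profile) < 0:
        v = [-a for a in v]
    return v


# Toy usage: three hypothetical module genes measured in six individuals.
ages = [26, 40, 55, 73, 90, 106]
module = [
    [10.0, 9.0, 8.0, 6.0, 5.0, 3.0],
    [7.0, 7.5, 6.0, 5.0, 4.5, 4.0],
    [12.0, 11.0, 9.0, 9.0, 7.0, 6.0],
]
me = module_eigengene(module)
print("module-trait correlation:", round(pearson(me, ages), 3))
mm = [pearson(g, me) for g in module]    # module membership per gene
gs = [pearson(g, ages) for g in module]  # gene significance for age per gene
print("MM:", [round(v, 3) for v in mm])
print("GS:", [round(v, 3) for v in gs])
```

For a module whose genes all decline with age, the eigengene correlates negatively with age, and genes with high MM also tend to have large absolute GS.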

Finding the Overlapping Genes Between the Differentially Expressed Genes (DEGs in Aged Compared to Young Group) and Genes in the Module of Interest Verified by Weighted Gene Co-expression Network Analysis

The frontal cortical samples were grouped into individuals ≤42 years (young group) and ≥73 years (aged group), and the limma package was used to find the DEGs; the group of individuals ≤42 years old showed the most homogeneous pattern of gene expression, and the group ≥73 years old was also relatively homogeneous. Moreover, these two age groups were negatively correlated with each other. In contrast, the middle age group, ranging in age from 45 to 71, exhibited much greater heterogeneity, with some cases resembling the young group and others resembling the aged group (Lu et al., 2004; Ritchie et al., 2015). Next, the overlapping genes between the downregulated DEGs and the genes in the module of interest were identified using an online Venn tool (footnote 2).
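The grouping and overlap logic can be sketched as follows. This Python example is a deliberately simplified stand-in: it flags downregulated genes by a plain fold-change rule on group means, whereas limma fits linear models with moderated statistics. The function name, the gene identifiers, the assumption of linear-scale intensities, and the toy values are hypothetical; only the age cut-offs (≤42 and ≥73 years) and the 1.5-fold threshold come from the text.

```python
def downregulated_genes(expr, ages, young_max=42, old_min=73, fold=1.5):
    """Genes whose mean expression in the aged group is at least `fold` times lower.

    expr: dict mapping gene -> list of linear-scale expression values (one per sample)
    ages: list of ages, in the same sample order as the expression values
    """
    young = [i for i, a in enumerate(ages) if a <= young_max]
    old = [i for i, a in enumerate(ages) if a >= old_min]
    down = set()
    for gene, values in expr.items():
        mean_young = sum(values[i] for i in young) / len(young)
        mean_old = sum(values[i] for i in old) / len(old)
        if mean_old > 0 and mean_young / mean_old >= fold:
            down.add(gene)
    return down


# Toy usage with hypothetical genes; the 55-year-old sample falls in neither group.
ages = [26, 40, 55, 73, 90]
expr = {
    "GENE_A": [300, 320, 200, 150, 140],  # down in the aged group
    "GENE_B": [100, 110, 105, 95, 100],   # unchanged
    "GENE_C": [50, 60, 80, 120, 130],     # up in the aged group
    "GENE_D": [200, 220, 150, 100, 110],  # down in the aged group
}
module_of_interest = {"GENE_A", "GENE_C", "GENE_E"}
overlap = downregulated_genes(expr, ages) & module_of_interest
print(sorted(overlap))
```

The set intersection in the last step plays the role of the Venn diagram: it keeps only genes that are both downregulated and members of the module of interest.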

Gene Ontology and Kyoto Encyclopedia of Genes and Genomes Pathway Enrichment Analyses, Identification of Hub Genes, and Protein-Protein Interaction Analysis

For the overlapping genes, Gene Ontology (GO) functional enrichment and KEGG pathway analyses were performed using GSAT (Zhang et al., 2005; footnote 3) and the GOplot package in R 3.4.2. A p-value < 0.05 was considered to indicate significant enrichment. These genes were also analyzed using cytoHubba in Cytoscape for identification of hub genes. The identified hub genes were further confirmed and analyzed using GeneMANIA (Warde-Farley et al., 2010; footnote 4). The STRING network was constructed with the online tool STRING (footnote 5).
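Over-representation p-values of this kind are commonly computed with a hypergeometric test; whether the tools used here applied exactly this test with these settings is an assumption. The Python sketch below shows the calculation for a single pathway, with a hypothetical function name and made-up counts.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_in_pathway, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_background genes, of which n_in_pathway belong to the pathway."""
    total = comb(n_background, n_selected)
    upper = min(n_in_pathway, n_selected)
    tail = sum(
        comb(n_in_pathway, k) * comb(n_background - n_in_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy usage with hypothetical counts: 5,000 background genes, a pathway of 150 genes,
# 45 selected genes, 6 of which fall in the pathway.
p = hypergeom_enrichment_p(5000, 150, 45, 6)
print(p, p < 0.05)
```

In practice such p-values are usually corrected for multiple testing across all pathways examined.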

Application of Random Forest Algorithm to Find the Most Important Genes Related to Aging

The frontal cortical samples were grouped into individuals ≤42 years (young) and ≥73 years (old). The counts of the overlapping genes were input into a random forest classifier model to predict which group each sample belongs to, and the overlapping genes that were most important for accurate classification were identified.
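The idea of ranking genes by their contribution to the young/old classification can be illustrated with a much smaller model. The Python sketch below is a simplified stand-in for a random forest, not the classifier used in the study: each "tree" is a single-split decision stump fitted on a bootstrap sample with a random subset of features, and a gene's importance is the average Gini impurity decrease it achieves. All names, parameters, and data are hypothetical.

```python
import math
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split_gain(values, labels):
    """Largest Gini decrease achievable by thresholding a single feature."""
    parent = gini(labels)
    n = len(values)
    best = 0.0
    for t in sorted(set(values))[:-1]:  # a constant feature offers no split
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def stump_forest_importance(X, y, n_trees=200, seed=0):
    """Average Gini decrease per feature over bootstrapped one-split stumps."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    mtry = max(1, int(math.sqrt(p)))  # number of features tried per stump
    importance = [0.0] * p
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of individuals
        feats = rng.sample(range(p), mtry)           # random feature subset
        yb = [y[i] for i in rows]
        gains = {f: best_split_gain([X[i][f] for i in rows], yb) for f in feats}
        f_best = max(gains, key=gains.get)
        importance[f_best] += gains[f_best] / n_trees
    return importance


# Toy usage: 8 individuals (0 = young, 1 = old) and 3 hypothetical genes.
y = [0, 0, 0, 0, 1, 1, 1, 1]
X = [
    [9.0, 7.0, 3.0], [8.5, 7.0, 3.0], [9.2, 7.0, 3.0], [8.8, 7.0, 3.0],
    [5.0, 7.0, 3.0], [4.5, 7.0, 3.0], [5.2, 7.0, 3.0], [4.8, 7.0, 3.0],
]
print(stump_forest_importance(X, y))
```

Only the first gene separates the two groups, so it is the only one that accumulates importance; a full random forest grows deeper trees and typically reports permutation-based or impurity-based importances in the same spirit.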

Exploring the Cellular Distribution of the Identified Genes

Using the single-cell RNA-seq database Tabula Muris (footnote 6; Tabula Muris Consortium et al., 2018), the cellular distribution of the identified important genes was further explored.

Data Availability Statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE1572.

Author Contributions

KC contributed to the study design, performed the experiments, and contributed to the writing of the manuscript. JL contributed to the study design and the writing of the manuscript. XZ, PC, SC, WY, HG, RL, and WH conducted the experiments. CP, GL, and DS provided critical devices and contributed to the study design. All authors read and approved the final manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (No. 31970964) and the Natural Science Foundation of Hubei Province, China (No. 2019CFB436).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Acknowledgments

We acknowledge the GEO database for providing its platform and the contributors for uploading their meaningful datasets.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnagi.2021.707165/full#supplementary-material

Footnotes

  1. ^ https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE1572
  2. ^ http://bioinformatics.psb.ugent.be/webtools/Venn/
  3. ^ http://www.webgestalt.org/option.php
  4. ^ http://genemania.org
  5. ^ http://string-db.org
  6. ^ https://tabula-muris.ds.czbiohub.org

References

Berto, S., Usui, N., Konopka, G., and Fogel, B. L. (2016). ELAVL2-regulated transcriptional and splicing networks in human neurons link neurodevelopment and autism. Hum. Mol. Genet. 25, 2451–2464. doi: 10.1093/hmg/ddw110

Boido, D., Farisello, P., Cesca, F., Ferrea, E., Valtorta, F., Benfenati, F., et al. (2010). Cortico-hippocampal hyperexcitability in synapsin I/II/III knockout mice: age-dependency and response to the antiepileptic drug levetiracetam. Neuroscience 171, 268–283. doi: 10.1016/j.neuroscience.2010.08.046

Chai, K., Liang, J., Zhang, X., Gu, H., Cao, P., Ye, W., et al. (2021). ARHGDIB Plays a Novel Role in the Braak Stages of Alzheimer’s Diseases via the Immune Response Mediated by Microglia. bioRxiv [Preprint] doi: 10.21203/rs.3.rs-474315/v1

Corradi, A., Zanardi, A., Giacomini, C., Onofri, F., Valtorta, F., Zoli, M., et al. (2008). Synapsin-I- and synapsin-II-null mice display an increased age-dependent cognitive impairment. J. Cell Sci. 121, 3042–3051. doi: 10.1242/jcs.035063

Dehmelt, L., and Halpain, S. (2005). The MAP2/Tau family of microtubule-associated proteins. Genome Biol. 6:204. doi: 10.1186/gb-2004-6-1-204

Emery, A. C., Xu, W., Eiden, M. V., and Eiden, L. E. (2017). Guanine nucleotide exchange factor Epac2-dependent activation of the GTP-binding protein Rap2A mediates cAMP-dependent growth arrest in neuroendocrine cells. J. Biol. Chem. 292, 12220–12231. doi: 10.1074/jbc.M117.790329

Friedrich, P., and Aszódi, A. (1991). MAP2: a sensitive cross-linker and adjustable spacer in dendritic architecture. FEBS Lett. 295, 5–9. doi: 10.1016/0014-5793(91)81371-e

Furukawa, T. (2009). SOM of SOMs. Neural. Netw. 22, 463–478. doi: 10.1016/j.neunet.2009.01.012

Hou, Y., Dan, X., Babbar, M., Wei, Y., Hasselbalch, S. G., Croteau, D. L., et al. (2019). Ageing as a risk factor for neurodegenerative disease. Nat. Rev. Neurol. 15, 565–581. doi: 10.1038/s41582-019-0244-7

Hu, Y., Hong, X.-Y., Yang, X.-F., Ma, R.-H., Wang, X., Zhang, J.-F., et al. (2019). Inflammation-dependent ISG15 upregulation mediates MIA-induced dendrite damages and depression by disrupting NEDD4/Rap2A signaling. Biochim. Biophys. Acta Mol. Basis Dis. 1865, 1477–1489. doi: 10.1016/j.bbadis.2019.02.020

Irwin, D. J., Lee, V. M.-Y., and Trojanowski, J. Q. (2013). Parkinson’s disease dementia: convergence of α-synuclein, tau and amyloid-β pathologies. Nat. Rev. Neurosci. 14, 626–636. doi: 10.1038/nrn3549

Juan, S. M. A., and Adlard, P. A. (2019). Ageing and cognition. Subcell. Biochem. 91, 107–122. doi: 10.1007/978-981-13-3681-2_5

Kato, Y., Iwamori, T., Ninomiya, Y., Kohda, T., Miyashita, J., Sato, M., et al. (2019). ELAVL2-directed RNA regulatory network drives the formation of quiescent primordial follicles. EMBO Rep. 20:e48251. doi: 10.15252/embr.201948251

Kawabe, H., Neeb, A., Dimova, K., Young, S. M., Takeda, M., Katsurabayashi, S., et al. (2010). Regulation of Rap2A by the ubiquitin ligase Nedd4-1 controls neurite development. Neuron 65, 358–372. doi: 10.1016/j.neuron.2010.01.007

Koth, A. P., Oliveira, B. R., Parfitt, G. M., Buonocore, J. de Q., and Barros, D. M. (2014). Participation of group I p21-activated kinases in neuroplasticity. J. Physiol. Paris 108, 270–277. doi: 10.1016/j.jphysparis.2014.08.007

Langfelder, P., and Horvath, S. (2008). WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9:559. doi: 10.1186/1471-2105-9-559

Liang, J.-W., Fang, Z.-Y., Huang, Y., Liuyang, Z.-Y., Zhang, X.-L., Wang, J.-L., et al. (2018). Application of weighted gene co-expression network analysis to explore the key genes in Alzheimer’s disease. J. Alzheimers Dis. 65, 1353–1364. doi: 10.3233/JAD-180400

Lu, T., Pan, Y., Kao, S.-Y., Li, C., Kohane, I., Chan, J., et al. (2004). Gene regulation and DNA damage in the ageing human brain. Nature 429, 883–891. doi: 10.1038/nature02661

Luk, K. C., Kehm, V., Carroll, J., Zhang, B., O’Brien, P., Trojanowski, J. Q., et al. (2012). Pathological α-synuclein transmission initiates Parkinson-like neurodegeneration in nontransgenic mice. Science 338, 949–953. doi: 10.1126/science.1227157

Niculescu, A. B., Levey, D. F., Phalen, P. L., Le-Niculescu, H., Dainton, H. D., Jain, N., et al. (2015). Understanding and predicting suicidality using a combined genomic and clinical risk assessment approach. Mol. Psychiatry 20, 1266–1285. doi: 10.1038/mp.2015.112

Ohi, K., Shimada, T., Yasuyama, T., Kimura, K., Uehara, T., and Kawasaki, Y. (2017). Spatial and temporal expression patterns of genes around nine neuroticism-associated loci. Prog. Neuropsychopharmacol. Biol. Psychiatry 77, 164–171. doi: 10.1016/j.pnpbp.2017.04.019

Ritchie, M. E., Phipson, B., Wu, D., Hu, Y., Law, C. W., Shi, W., et al. (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47. doi: 10.1093/nar/gkv007

Saha, P., and Sen, N. (2019). Tauopathy: a common mechanism for neurodegeneration and brain aging. Mech. Ageing Dev. 178, 72–79. doi: 10.1016/j.mad.2019.01.007

Stern, Y. (2012). Cognitive reserve in ageing and Alzheimer’s disease. Lancet Neurol. 11, 1006–1012. doi: 10.1016/S1474-4422(12)70191-6

Tabula Muris Consortium (2018). Single-cell transcriptomics of 20 mouse organs creates a Tabula Muris. Nature 562, 367–372. doi: 10.1038/s41586-018-0590-4

Vogels, T., Murgoci, A.-N., and Hromádka, T. (2019). Intersection of pathological tau and microglia at the synapse. Acta Neuropathol. Commun. 7:109. doi: 10.1186/s40478-019-0754-y

Wang, Y., and Mandelkow, E. (2016). Tau in physiology and pathology. Nat. Rev. Neurosci. 17, 5–21. doi: 10.1038/nrn.2015.1

Warde-Farley, D., Donaldson, S. L., Comes, O., Zuberi, K., Badrawi, R., Chao, P., et al. (2010). The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res. 38, W214–W220. doi: 10.1093/nar/gkq537

Wyss-Coray, T. (2016). Ageing, neurodegeneration and brain rejuvenation. Nature 539, 180–186. doi: 10.1038/nature20411

Yankner, B. A., Lu, T., and Loerch, P. (2008). The aging brain. Annu. Rev. Pathol. Mech. Dis. 3, 41–66. doi: 10.1146/annurev.pathmechdis.2.010506.092044

Zhang, B., Kirov, S., and Snoddy, J. (2005). WebGestalt: an integrated system for exploring gene sets in various biological contexts. Nucleic Acids Res. 33, W741–W748. doi: 10.1093/nar/gki475

Zhang, J., Wei, Y., Min, J., Wang, Y., Yin, L., Cao, G., et al. (2020). Knockdown of RAP2A gene expression suppresses cisplatin resistance in gastric cancer cells. Oncol. Lett. 19, 350–358. doi: 10.3892/ol.2019.11086

Zheng, X., Zhao, W., Ji, P., Zhang, K., Jin, J., Feng, M., et al. (2017). High expression of Rap2A is associated with poor prognosis of patients with hepatocellular carcinoma. Int. J. Clin. Exp. Pathol. 10, 9607–9613.

Keywords: WGCNA (weighted gene co-expression network analyses), SOM (self-organization map), aging brain, random forest, machine learning

Citation: Chai K, Liang J, Zhang X, Cao P, Chen S, Gu H, Ye W, Liu R, Hu W, Peng C, Liu GL and Shen D (2021) Application of Machine Learning and Weighted Gene Co-expression Network Algorithm to Explore the Hub Genes in the Aging Brain. Front. Aging Neurosci. 13:707165. doi: 10.3389/fnagi.2021.707165

Received: 09 May 2021; Accepted: 27 September 2021;
Published: 18 October 2021.

Edited by:

Ping Wu, Fudan University, China

Reviewed by:

Hudson Sousa Buck, Universidade de São Paulo, Brazil
Liping Sun, The First Affiliated Hospital of China Medical University, China

Copyright © 2021 Chai, Liang, Zhang, Cao, Chen, Gu, Ye, Liu, Hu, Peng, Liu and Shen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Keping Chai; Caixia Peng; Gang Logan Liu; Daojiang Shen

†These authors have contributed equally to this work

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.