ORIGINAL RESEARCH article

Front. Aging Neurosci., 02 March 2023
Sec. Alzheimer's Disease and Related Dementias
This article is part of the Research Topic Advancing Clinical Neuroscience by Multi-Omic Driven Approaches Towards Personalized Medicine: Opportunity, Challenges, and the Future

Associating brain imaging phenotypes and genetic risk factors via a hypergraph based netNMF method

Junli Zhuang1, Jinping Tian2, Xiaoxing Xiong3*, Taihan Li4*, Zhengwei Chen5, Rong Chen5, Jun Chen5, Xiang Li6
  • 1Department of Vascular Surgery, Renmin Hospital of Wuhan University, Wuhan, China
  • 2Faculty of Medicine, Jianghan University, Wuhan, China
  • 3Central Laboratory, Renmin Hospital of Wuhan University, Wuhan, China
  • 4Department of Clinical Laboratory, The First Affiliated Hospital of Shenzhen University, Shenzhen Second People's Hospital, Shenzhen, China
  • 5Department of Radiology, Hubei Provincial Hospital of Traditional Chinese Medicine, Wuhan, China
  • 6School of Health, Wuhan University, Wuhan, China

Abstract: Alzheimer’s disease (AD) is a severe neurodegenerative disease for which there is currently no effective treatment. Mild cognitive impairment (MCI) is an early stage that may progress to AD. Effective early diagnosis of AD and MCI therefore has important clinical significance.

Methods: To this end, this paper proposed a hypergraph-based netNMF (HG-netNMF) algorithm for integrating structural magnetic resonance imaging (sMRI) of AD and MCI with corresponding gene expression profiles.

Results: Hypergraph regularization assumes that regions of interest (ROIs) and genes lie on a non-linear low-dimensional manifold, so it can capture the intrinsic structure of the two data modalities and mine high-order correlation features between them. Further, this paper used the HG-netNMF algorithm to construct a brain structure connection network and a protein-protein interaction (PPI) network with potential role relationships, mined the risk ROIs and key genes of both, and conducted a series of bioinformatics analyses.

Conclusion: Finally, this paper used the risk ROIs and key genes of the AD and MCI groups to construct diagnostic models. The AUCs of the AD group and the MCI group were 0.8 and 0.797, respectively.

1. Introduction

Alzheimer’s disease (AD) is a neurodegenerative disease with insidious onset and progressive development. The most common early symptom is difficulty remembering recent events (Scheltens et al., 2021). As the disease progresses, patients gradually lose their ability to care for themselves and eventually die from complications such as infection (Scheltens et al., 2021). The exact cause of AD is still unknown, but it has a long-standing preclinical stage: mild cognitive impairment (MCI). The typical symptom of MCI is memory loss, possibly with damage to one or more other cognitive domains (Carlson and Prusiner, 2021). Still, the impairment is not enough to affect the patient’s daily life, and the diagnostic criteria for AD are not yet met. Previous studies show that structural and functional abnormalities of the brain are associated with phenotypic or molecular abnormalities in AD (Simrén et al., 2021). However, the source or cause of these abnormalities is unclear.

In recent years, imaging genetic analysis related to brain diseases has attracted much attention. Imaging genetics explores the effect of genetic variation on brain structure, metabolism, and function through association analysis of radiomics data (such as sMRI, PET, fMRI; Jiang et al., 2018) and genetic data (such as DNA methylation, SNP, gene expression; Du et al., 2020).

The association mechanism of imaging genetics is gradually being revealed through robust association analysis algorithms. By improving sparse canonical correlation analysis (SCCA) with various strategies, Du et al. (2020) proposed several innovative SCCA-based algorithms to analyze brain imaging genetics data. For example, they proposed a dirty multi-task sparse canonical correlation analysis (dirty MT-SCCA) to study imaging genetic problems involving multimodal brain imaging. This method used multi-task learning and parameter decomposition to simultaneously identify pattern-consistent and pattern-specific brain regions and SNP loci. Furthermore, they proposed a parametric decomposition-based sparse multi-view canonical correlation analysis (PDSMCCA) method to identify modality-sharing and modality-specific information from multimodal data to gain insight into the complex pathology of brain diseases (Zhang et al., 2022). However, the above two algorithms only supported the analysis of multimodal brain imaging and single-modal genetic data. The sparse multi-view canonical correlation analysis (SMCCA) algorithm can simultaneously perform correlation analysis on data from multiple modalities. However, the direct fusion of multiple SCCA objectives can cause gradient domination problems, making SMCCA a suboptimal model. Therefore, they proposed two adaptive multi-view canonical correlation analysis algorithms to solve the problem of gradient domination, integrating the underlying relationships between protein expression, SNPs, and neuroimaging (Du et al., 2021).

On the other hand, multi-objective optimization algorithms are also emerging in the field of imaging genetics. Bi et al. (2022) integrated the fMRI and SNP data of Parkinson’s patients through multiple optimization methods for multimodal analysis. They used a correlation analysis method to construct fused features from the sequence information and SNPs of regions of interest (ROIs). Then, a weighted evolution strategy was introduced into ensemble learning, and a new weighted evolutionary random forest (WERF) model was constructed to eliminate inefficient features. In addition, they also proposed a clustered evolutionary random forest (CERF) method to detect discriminative genes and brain regions and found some interesting associations between brain regions and genes in Parkinson’s patients (Bi et al., 2021).

However, the above algorithms cannot interpret the results from the perspective of network regulation. Therefore, this paper used a hypergraph-based netNMF (HG-netNMF) algorithm to explore the relationship between genes and sMRI from both gene network and brain network construction. The algorithm exploited hypergraph regularization to mine higher-order features of both modal data. Further, the key genes and ROIs were explored from the critical network modules. Bioinformatics analysis and diagnostic model construction were performed to provide new insights into the imaging genetic association mechanism of AD and MCI.

2. Methods

2.1. Nonnegative matrix factorization

The nonnegative matrix factorization (NMF) algorithm is a classic dimensionality reduction algorithm that is widely used in image processing, audio processing, biological data analysis, and other fields (Kim and Tidor, 2003; Zhang et al., 2012; Yang et al., 2021). It decomposes the matrix $X\in\mathbb{R}^{n\times p}$ into the basis matrix $W\in\mathbb{R}^{n\times k}$ and the coefficient matrix $H\in\mathbb{R}^{k\times p}$, and its objective function is as follows:

$$\min_{W,H}\ \|X-WH\|_F^2 \tag{1}$$
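
As an illustration of objective (1), the following is a minimal sketch that minimizes it with the classic multiplicative update rules. The choice of solver, the function name `nmf`, the iteration count, and the toy matrix are assumptions made for this example; the paper does not state how (1) is solved.

```python
import numpy as np

def nmf(X, k, n_iter=200, eps=1e-10, seed=0):
    """Factor a non-negative X (n x p) into W (n x k) and H (k x p)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    W = rng.random((n, k))
    H = rng.random((k, p))
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative at every step.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy data (assumption): an exactly rank-3 non-negative matrix.
rng = np.random.default_rng(1)
X = rng.random((20, 3)) @ rng.random((3, 15))
W, H = nmf(X, k=3)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

The relative error `rel_err` plays the same role as the objective value used later for parameter selection.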

2.2. Network non-negative matrix factorization

The traditional NMF algorithm can only decompose single-modal data and cannot analyze the network control module composed of multi-modal data. For the case where the input is a symmetric similarity matrix, Chen and Zhang (2018) proposed the netNMF algorithm to decompose multiple matrices simultaneously. It is worth noting that the algorithm can be used to analyze the network modules of various data. For example, for two expression matrices $X_1\in\mathbb{R}^{n\times p}$ and $X_2\in\mathbb{R}^{n\times q}$, their respective correlation networks and their mutual correlation network can be calculated and analyzed. The objective function of the netNMF algorithm is as follows:

$$\min_{G_1,G_2,S_{11},S_{22}}\ \|R_{11}-G_1S_{11}G_1^T\|_F^2+\|R_{12}-G_1G_2^T\|_F^2+\|R_{22}-G_2S_{22}G_2^T\|_F^2\quad \text{s.t. } G_1,G_2,S_{11},S_{22}\ge 0 \tag{2}$$

Among them, $R_{11}\in\mathbb{R}^{p\times p}$ and $R_{22}\in\mathbb{R}^{q\times q}$ are the respective association networks of $X_1$ and $X_2$, and $R_{12}\in\mathbb{R}^{p\times q}$ is the cross-correlation network of $X_1$ and $X_2$. $G_1\in\mathbb{R}^{p\times k}$, $G_2\in\mathbb{R}^{q\times k}$, $S_{11}\in\mathbb{R}^{k\times k}$, and $S_{22}\in\mathbb{R}^{k\times k}$ are non-negative factor matrices. The algorithm works at the level of the network modules constructed from the data, and the term $\|R_{12}-G_1G_2^T\|_F^2$ is used to identify the one-to-one relationship between the two types of modules.

In fact, $R_{11}$ and $R_{22}$ are the symmetric similarity matrices corresponding to the two types of features, respectively, and $R_{12}$ is the nonnegative similarity matrix between the two types of data. $S_{11}$ and $S_{22}$ are the matrices describing the similarity of the two networks obtained from the decomposition of $R_{11}$ and $R_{22}$, respectively. Off-diagonal elements in $S_{11}$ and $S_{22}$ indicate the importance of relationships between ROIs and between genes.
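
The three input networks can be sketched as follows. This is a hedged example: the function name `correlation_networks`, the random toy matrices, and in particular the use of the absolute Pearson correlation to guarantee non-negative entries are assumptions, not details stated in the paper.

```python
import numpy as np

def correlation_networks(X1, X2):
    """Return (R11, R22, R12) from X1 (n x p) and X2 (n x q)."""
    p = X1.shape[1]
    # Pearson correlations between all columns of [X1, X2];
    # the absolute value is an assumption to keep entries non-negative.
    C = np.abs(np.corrcoef(np.hstack([X1, X2]), rowvar=False))
    return C[:p, :p], C[p:, p:], C[:p, p:]

rng = np.random.default_rng(0)
X1 = rng.normal(size=(50, 6))   # hypothetical: 50 subjects x 6 ROIs
X2 = rng.normal(size=(50, 9))   # hypothetical: 50 subjects x 9 genes
R11, R22, R12 = correlation_networks(X1, X2)
```

With this construction R11 and R22 are symmetric with unit diagonal, and R12 is a rectangular ROI-by-gene matrix.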

2.3. Hypergraph learning

In graph learning, vertices and edges describe the relationships between objects. However, a simple graph, in which each edge connects exactly two vertices, may not capture complex relationships in practical situations. Therefore, hypergraph theory is widely used to mine higher-order correlations between complex objects (Xiao et al., 2020). A hypergraph extends a simple graph by allowing each edge, called a hyperedge, to connect more than two vertices. Let $G(V,E,w)$ denote a hypergraph, where $V$, $E$, and $w$ represent the vertices, the hyperedges, and the weights of the hyperedges, respectively. $V=\{v_1,v_2,\ldots,v_N\}$ is the set of vertices, $E=\{e_1,e_2,\ldots,e_M\}$ is the set of hyperedges, and $w=(w(e_1),w(e_2),\ldots,w(e_M))^T\in\mathbb{R}^{M}$ holds the weight of each hyperedge. In addition, the relationship between hyperedges and vertices can be represented by an association matrix $H$, shown on the right side of Figure 1. The $(i,j)$ element of $H$ indicates whether the $j$th hyperedge contains the $i$th vertex, and is defined as follows:

$$H_{ij}=\begin{cases}1, & \text{if } v_i\in e_j\\ 0, & \text{if } v_i\notin e_j\end{cases} \tag{3}$$

Among them, $H=[H_{ij}]\in\mathbb{R}^{N\times M}$. Furthermore, the degree $d(v_i)$ of the $i$th vertex and the degree $\delta(e_j)$ of the $j$th hyperedge are defined as follows.

$$d(v_i)=\sum_{e_j\in E}w(e_j)H_{ij}\quad \text{for } 1\le i\le N \tag{4}$$
$$\delta(e_j)=\sum_{v_i\in V}H_{ij}\quad \text{for } 1\le j\le M \tag{5}$$

Figure 1. Line graph of the relationship between the number of neighbors and the value of the relative error when building a hypergraph in the two groups. (A,B) Are the cases of AD group and MCI group, respectively.

Further, define the diagonal degree matrices $D_v$ and $D_e$, where $D_v=\mathrm{diag}(d(v_1),d(v_2),\ldots,d(v_N))\in\mathbb{R}^{N\times N}$ and $D_e=\mathrm{diag}(\delta(e_1),\delta(e_2),\ldots,\delta(e_M))\in\mathbb{R}^{M\times M}$. Furthermore, let $W$ represent the diagonal matrix of hyperedge weights, $W=\mathrm{diag}(w)=\mathrm{diag}(w(e_1),w(e_2),\ldots,w(e_M))\in\mathbb{R}^{M\times M}$. Then, the similarity matrix $S$ of the hypergraph $G$ is defined as follows.

$$S=HWD_e^{-1}H^T\in\mathbb{R}^{N\times N} \tag{6}$$

Similar to the Laplacian matrix definition for simple graphs, the Laplacian matrix for hypergraphs is defined as follows.

$$L=D_v-S \tag{7}$$
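
Definitions (3) to (7) can be sketched in a few lines. In this example each vertex spawns one hyperedge containing itself and its nearest neighbours; the Euclidean distance, the unit hyperedge weights, and the function name `knn_hypergraph_laplacian` are assumptions for illustration. For an ROI hypergraph, one plausible (assumed) choice is to use each ROI's values across subjects as its feature vector.

```python
import numpy as np

def knn_hypergraph_laplacian(F, n_neighbors):
    """F: (N x d) feature vectors, one row per vertex. Returns (H, L)."""
    N = F.shape[0]
    # Pairwise Euclidean distances between vertices (assumed metric).
    D = np.linalg.norm(F[:, None, :] - F[None, :, :], axis=2)
    H = np.zeros((N, N))                       # one hyperedge per vertex, so M = N
    for j in range(N):
        members = np.argsort(D[j])[: n_neighbors + 1]   # vertex j plus its neighbours
        H[members, j] = 1.0                    # Eq. (3)
    w = np.ones(N)                             # unit hyperedge weights (assumption)
    d_v = H @ w                                # Eq. (4): vertex degrees
    delta_e = H.sum(axis=0)                    # Eq. (5): hyperedge degrees
    S = H @ np.diag(w / delta_e) @ H.T         # Eq. (6): S = H W De^{-1} H^T
    L = np.diag(d_v) - S                       # Eq. (7): L = Dv - S
    return H, L

rng = np.random.default_rng(0)
F = rng.normal(size=(10, 4))                   # hypothetical: 10 vertices, 4 features
H, L = knn_hypergraph_laplacian(F, n_neighbors=3)
```

A useful sanity check is that every row of L sums to zero, as for a simple-graph Laplacian.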

2.4. Hypergraph-based network non-negative matrix factorization

The previously introduced netNMF algorithm does not consider the correlation within different networks, while the hypergraph Laplacian matrix can integrate high-order correlation information of different modal data. It helps the algorithm identify more biologically meaningful network modules and improves its performance to a certain extent. This paper defines hypergraph regularization as follows.

$$\Omega=\mathrm{tr}(G^TL_hG) \tag{8}$$

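Here $L_1$ and $L_2$ are the hypergraph Laplacian matrices of $X_1$ and $X_2$, respectively. Further, let $\psi_1$, $\psi_2$, $\Phi_1$, and $\Phi_2$ be Lagrangian multipliers; the Lagrangian function can then be written as (9).

$$\begin{aligned}\mathcal{L}={}&\mathrm{tr}\big((R_{11}-G_1S_{11}G_1^T)^T(R_{11}-G_1S_{11}G_1^T)\big)+\mathrm{tr}\big((R_{12}-G_1G_2^T)^T(R_{12}-G_1G_2^T)\big)\\&+\mathrm{tr}\big((R_{22}-G_2S_{22}G_2^T)^T(R_{22}-G_2S_{22}G_2^T)\big)+\lambda_1\mathrm{tr}(G_1^TL_1G_1)+\lambda_2\mathrm{tr}(G_2^TL_2G_2)\\&+\mathrm{tr}(\psi_1^TS_{11})+\mathrm{tr}(\psi_2^TS_{22})+\mathrm{tr}(\Phi_1^TG_1)+\mathrm{tr}(\Phi_2^TG_2)\end{aligned} \tag{9}$$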

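Taking the partial derivatives of (9) with respect to $S_{11}$, $S_{22}$, $G_1$, and $G_2$, respectively, gives (10).

$$\begin{aligned}\frac{\partial\mathcal{L}}{\partial S_{11}}&=-2G_1^TR_{11}G_1+2G_1^TG_1S_{11}G_1^TG_1+\psi_1\\\frac{\partial\mathcal{L}}{\partial S_{22}}&=-2G_2^TR_{22}G_2+2G_2^TG_2S_{22}G_2^TG_2+\psi_2\\\frac{\partial\mathcal{L}}{\partial G_1}&=4(G_1S_{11}G_1^TG_1S_{11}-R_{11}G_1S_{11})+2(G_1G_2^TG_2-R_{12}G_2)+2\lambda_1L_1G_1+\Phi_1\\\frac{\partial\mathcal{L}}{\partial G_2}&=4(G_2S_{22}G_2^TG_2S_{22}-R_{22}G_2S_{22})+2(G_2G_1^TG_1-R_{12}^TG_1)+2\lambda_2L_2G_2+\Phi_2\end{aligned} \tag{10}$$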

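Further, through the Karush-Kuhn-Tucker (KKT) conditions, the iteration rules of $S_{11}$, $S_{22}$, $G_1$, and $G_2$ can be obtained, as shown below.

$$\begin{aligned}(s_{11})_{ij}&\leftarrow(s_{11})_{ij}\frac{(G_1^TR_{11}G_1)_{ij}}{(G_1^TG_1S_{11}G_1^TG_1)_{ij}},\\(s_{22})_{ij}&\leftarrow(s_{22})_{ij}\frac{(G_2^TR_{22}G_2)_{ij}}{(G_2^TG_2S_{22}G_2^TG_2)_{ij}},\\(g_1)_{ij}&\leftarrow(g_1)_{ij}\frac{(R_{12}G_2+2R_{11}G_1S_{11})_{ij}}{(2G_1S_{11}G_1^TG_1S_{11}+G_1G_2^TG_2+\lambda_1L_1G_1)_{ij}},\\(g_2)_{ij}&\leftarrow(g_2)_{ij}\frac{(R_{12}^TG_1+2R_{22}G_2S_{22})_{ij}}{(2G_2S_{22}G_2^TG_2S_{22}+G_2G_1^TG_1+\lambda_2L_2G_2)_{ij}}.\end{aligned} \tag{11}$$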
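
The iteration can be sketched in code as follows. This is a hedged illustration in the spirit of (11), not the authors' implementation: because a Laplacian has negative off-diagonal entries, the sketch splits each Laplacian into its positive and negative parts and moves the negative part to the numerator so that all factors stay non-negative. That split, the function name `hg_netnmf`, the random initialization, and the toy inputs (which use a simple-graph Laplacian as a stand-in for the hypergraph one) are all assumptions.

```python
import numpy as np

def hg_netnmf(R11, R12, R22, L1, L2, k, lam1=0.01, lam2=0.01,
              n_iter=100, eps=1e-10, seed=0):
    """Multiplicative updates for G1 (p x k), G2 (q x k), S11, S22 (k x k)."""
    rng = np.random.default_rng(seed)
    p, q = R12.shape
    G1, G2 = rng.random((p, k)), rng.random((q, k))
    S11, S22 = rng.random((k, k)), rng.random((k, k))
    S11, S22 = (S11 + S11.T) / 2, (S22 + S22.T) / 2   # symmetric start
    # Assumed split L = Lp - Ln with Lp, Ln >= 0 (keeps updates non-negative).
    L1p, L1n = np.maximum(L1, 0), np.maximum(-L1, 0)
    L2p, L2n = np.maximum(L2, 0), np.maximum(-L2, 0)
    for _ in range(n_iter):
        S11 *= (G1.T @ R11 @ G1) / (G1.T @ G1 @ S11 @ G1.T @ G1 + eps)
        S22 *= (G2.T @ R22 @ G2) / (G2.T @ G2 @ S22 @ G2.T @ G2 + eps)
        G1 *= (R12 @ G2 + 2 * R11 @ G1 @ S11 + lam1 * L1n @ G1) / (
            2 * G1 @ S11 @ G1.T @ G1 @ S11 + G1 @ G2.T @ G2 + lam1 * L1p @ G1 + eps)
        G2 *= (R12.T @ G1 + 2 * R22 @ G2 @ S22 + lam2 * L2n @ G2) / (
            2 * G2 @ S22 @ G2.T @ G2 @ S22 + G2 @ G1.T @ G1 + lam2 * L2p @ G2 + eps)
    return G1, G2, S11, S22

# Toy inputs (assumption): networks generated from planted non-negative factors.
rng = np.random.default_rng(1)
A, B = rng.random((12, 2)), rng.random((15, 2))
R11, R22, R12 = A @ A.T, B @ B.T, A @ B.T
L1 = np.diag(R11.sum(axis=1)) - R11   # simple-graph Laplacian as a stand-in
L2 = np.diag(R22.sum(axis=1)) - R22
G1, G2, S11, S22 = hg_netnmf(R11, R12, R22, L1, L2, k=2)
```

The columns of G1 and G2 are then read as ROI modules and gene modules, respectively.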

2.5. Network module selection method

Two types of modules, given by the columns of $G_1$ and $G_2$, can be identified from the ROI network matrix $R_{11}$ and the gene network matrix $R_{22}$ constructed in this paper. Specifically, the z-score is used to calculate the weights of genes and ROIs, and genes and ROIs with weights above the threshold are treated as members of the corresponding module.

$$x_{ij}=\frac{(G_l)_{ij}-u(G_l)_{\cdot j}}{\sigma(G_l)_{\cdot j}}\quad(l=1,2) \tag{12}$$

Among them, $u(G_l)_{\cdot j}=\frac{1}{N}\sum_i(G_l)_{ij}$ and $\sigma(G_l)_{\cdot j}=\sqrt{\frac{1}{N}\sum_i\big((G_l)_{ij}-u(G_l)_{\cdot j}\big)^2}$. In this paper, the threshold was set to 1.
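
The selection rule (12) amounts to a column-wise z-score followed by a threshold. A minimal sketch, with a hypothetical function name `module_members` and a made-up factor matrix:

```python
import numpy as np

def module_members(G, threshold=1.0):
    """For each column (module) of G, return the row indices with z-score > threshold."""
    mu = G.mean(axis=0)
    sigma = G.std(axis=0)            # population standard deviation (1/N)
    Z = (G - mu) / sigma
    return [np.flatnonzero(Z[:, j] > threshold) for j in range(G.shape[1])]

# Made-up factor matrix: 5 features (rows) and 2 modules (columns).
G = np.array([[0.1, 0.9],
              [0.2, 0.1],
              [0.1, 0.2],
              [0.9, 0.1],
              [0.2, 0.2]])
members = module_members(G)
```

Here feature 3 is the only member of the first module and feature 0 the only member of the second, since each is the single entry more than one standard deviation above its column mean.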

2.6. Evaluation indicators for regression analysis and diagnostic model construction

In this paper, regression analysis and diagnostic model construction were carried out on the Top elements in the network module obtained by the HG-netNMF algorithm. For regression analysis, this paper introduced mean absolute error (MAE) and root mean square error (RMSE) to evaluate the regression performance, and their definitions are as follows:

$$\mathrm{RMSE}=\sqrt{\frac{\sum_{i=1}^{n}(\hat{y}_i-y_i)^2}{n}} \tag{13}$$
$$\mathrm{MAE}=\frac{\sum_{i=1}^{n}|\hat{y}_i-y_i|}{n} \tag{14}$$

where $\hat{y}_i$ represents the predicted value, and $y_i$ represents the true value. In addition, when constructing the diagnostic model, this paper introduced the Receiver Operating Characteristic (ROC) curve to evaluate the classification performance of the algorithm. Its abscissa is the false positive rate (FPR), and its ordinate is the true positive rate (TPR). The AUC value is the area under the ROC curve and measures the classification performance of the classifier.
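
These three metrics can be computed directly, as in the sketch below. The function names and the toy inputs are illustrative assumptions; the AUC is computed through its equivalent rank interpretation, namely the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error, Eq. (13)."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error, Eq. (14)."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)

def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])   # 3 of 4 pairs ranked correctly
```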

3. Results

3.1. Data acquisition and preprocessing

The real data used in this paper are all from the ADNI database. We collected sMRI imaging and gene expression data from 306 subjects in ADNI. Table 1 gives specific information of the samples.

Table 1. Information about the samples included in the analysis in this paper.

For sMRI data, this paper realized the segmentation of sMRI based on the CAT toolkit of MATLAB software. Specifically, the CAT toolkit provided a voxel-based morphometric measurement (VBM) function (Veres et al., 2009). Finally, the gray matter volumes of 140 ROIs were extracted as imaging features. Differential expression analysis was performed using the Limma package for gene expression data. Specifically, we used the AD group and the MCI group as the diseased group and the HC group as the control group to experiment. Differentially expressed genes with p values less than 0.01 in the AD and MCI groups were retained (510 genes were retained in the AD group, and 314 genes were retained in the MCI group).

In addition, this paper first divided the AD group and the MCI group into training and test sets at a ratio of 8:2. The HG-netNMF algorithm was used to perform network module association analysis on the training sets of the two groups, and the results were then validated on the test sets, for example by using the Top 10 genes to regress the Top 10 ROIs and by building diagnostic models.

3.2. The influence of neighbor size

In this paper, the K-nearest neighbors (KNN) method was used to construct the hyperedge, and the neighbor size of the KNN algorithm needs to be selected. The size of the neighbors was selected based on the size of the objective function value. Based on the selection experience of previous papers, we divided the parameters of the two groups. Figures 1A,B below give the line graphs of the relationship between the neighbor size of the AD group and the MCI group and the objective function value, respectively.

The AD group’s smallest relative error was 0.4496 (corresponding to 31 and 5). The MCI group’s smallest relative error was 0.4848 (corresponding to 5 and 47).

3.3. Parameter selection

Two parameters need to be selected in this paper, namely λ1 and λ2. The objective function value was used as the selection criterion, and the parameters were selected from the candidate values {0.0001, 0.001, 0.01, 0.1}. The parameter selection is shown in Figure 2.

Figure 2. The line graph of parameter selection and objective function value of AD group and MCI group. (A,B) Are the cases of AD group and MCI group, respectively.

In addition, according to previous literature, the number k of gene-ROI network modules generally does not exceed one-tenth of the smaller of the number of samples and the number of features. Therefore, the k value in this paper was set to 7 in the AD group and 22 in the MCI group.

3.4. Algorithm comparison

To confirm the superiority of the algorithm, it was compared with the netNMF algorithm on simulated data and on the real data. First, on simulated data, this paper compared the objective function values of the two algorithms at different noise levels. The simulated data were generated in a way similar to Kim et al. (2019). Specifically, this paper simulated the sMRI numerical matrix X and the gene numerical matrix Y for 300 random samples. Using a normal distribution $N(0,\sigma_\epsilon^2)$, an ROI weight vector u with 200 elements and a gene weight vector v with 2000 elements were generated. In addition, noise e was drawn from a normal distribution $N(0,\sigma_e^2)$. Next, correlated and uncorrelated ROI and gene variables were generated similarly: $X=u\epsilon+e$ for correlated and $X=e$ for uncorrelated ROI variables, and $Y=v\epsilon+e$ for correlated and $Y=e$ for uncorrelated gene variables. The respective and mutual Pearson correlation coefficients were then calculated for the two variables as inputs to the algorithm (R11, R22, and R12). Next, the objective function values of the proposed method and the netNMF algorithm were compared under the same experimental conditions and noise levels (Table 2). In addition, we simulated the anti-noise performance of the two algorithms when the sample size is large. Specifically, we set the total number of samples to 1,000 and kept the remaining parameters as above to obtain the objective function values of the two algorithms under different noise levels (Table 3).
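
One possible reading of this simulation design is sketched below. Several choices are assumptions made only for illustration, since the text does not fix them: ε is treated as a per-sample latent variable drawn from N(0, σε²), the weights u and v are drawn from a standard normal distribution, and the number of correlated features (`n_corr`) and the function name `simulate` are hypothetical.

```python
import numpy as np

def simulate(n=300, p=200, q=2000, n_corr=(20, 200),
             sigma_eps=1.0, sigma_e=0.5, seed=0):
    """Return X (n x p) and Y (n x q) sharing one latent signal on some columns."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma_eps, size=n)     # shared latent variable per sample
    u = rng.normal(0.0, 1.0, size=n_corr[0])     # weights of the correlated ROIs
    v = rng.normal(0.0, 1.0, size=n_corr[1])     # weights of the correlated genes
    X = rng.normal(0.0, sigma_e, size=(n, p))    # X = e for every column ...
    Y = rng.normal(0.0, sigma_e, size=(n, q))    # Y = e for every column ...
    X[:, : n_corr[0]] += np.outer(eps, u)        # ... plus u * eps on correlated ROIs
    Y[:, : n_corr[1]] += np.outer(eps, v)        # ... plus v * eps on correlated genes
    return X, Y

X, Y = simulate()                                # 300 samples, 200 ROIs, 2000 genes
X_small, Y_small = simulate(n=50, p=10, q=20, n_corr=(3, 5))
```

Raising `sigma_e` relative to `sigma_eps` corresponds to a higher noise level in the comparison.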

$$\hat{R}_{11}=G_1S_{11}G_1^T,\qquad \hat{R}_{12}=G_1G_2^T,\qquad \hat{R}_{22}=G_2S_{22}G_2^T \tag{15}$$

Table 2. The objective function values of the two algorithms were compared on a simulation dataset with a small sample size.

Table 3. The objective function values of the two algorithms were compared on a simulation dataset with a large sample size.

On real data, this paper compared the results of the proposed HG-netNMF algorithm with those of the netNMF algorithm for the AD group and the MCI group. Specifically, we calculated the Pearson correlation coefficients between the reconstructed matrices $\hat{R}_{11}$, $\hat{R}_{22}$, and $\hat{R}_{12}$ and the three original matrices $R_{11}$, $R_{22}$, and $R_{12}$, as shown in Table 4. The formulas for the reconstructed matrices are given in (15).

Table 4. Pearson correlation coefficients between the three original matrices and the three reconstructed matrices obtained by the two algorithms in the AD group and the MCI group.

corr(X,Y) represents the Pearson correlation coefficient between X and Y. It can be seen from Table 4 that the proposed algorithm outperforms the netNMF algorithm in the reconstruction of R22 and R12 in the AD group. In the MCI group, the proposed algorithm outperformed the netNMF algorithm in reconstructing all three original matrices. This confirmed that hypergraph regularization contributes to the improvement of reconstruction performance. In addition, this paper also drew scatter plots of the three original matrices against the three reconstructed matrices for the two algorithms in the AD group and the MCI group, as shown in Figure 3.
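
The reconstruction score can be sketched as the Pearson correlation between the entries of an original matrix and of its reconstruction. Flattening both matrices into vectors before correlating is an assumption about how corr(X,Y) is applied to matrices, and the function name and toy data are illustrative.

```python
import numpy as np

def reconstruction_corr(R, R_hat):
    """Pearson correlation between the entries of R and of its reconstruction."""
    return float(np.corrcoef(R.ravel(), R_hat.ravel())[0, 1])

# Toy example (assumption): a factorized matrix observed with small noise.
rng = np.random.default_rng(0)
G = rng.random((8, 3))
S = np.eye(3)
R_hat = G @ S @ G.T                                  # reconstruction G S G^T
R = R_hat + 0.01 * rng.normal(size=R_hat.shape)      # noisy "original"
r = reconstruction_corr(R, R_hat)
```

Values of `r` close to 1 indicate that the factor matrices reproduce the original network well.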

Figure 3. Correlation scatter plots of R11, R12, and R22 of the two algorithms and their reconstruction matrices R11, R12, and R22 in the AD and MCI groups. (A,E,I) Are three sets of scatter plots obtained by the netNMF algorithm in the AD group. (B,F,J) Are three groups of scatter plots obtained by the HG-netNMF algorithm in the AD group. (C,G,K) Are three sets of scatter plots obtained by the netNMF algorithm in the MCI group. (D,H,L) Are three groups of scatter plots obtained by the HG-netNMF algorithm in the MCI group.

As seen from Figure 3, the two algorithms had comparable reconstruction performance when reconstructing R11 and R12. However, the proposed algorithm performed better when reconstructing R22 in the MCI group.

3.5. Network module selection

Fourteen network modules were obtained in the AD group, of which modules 1 and 10 contained fewer than five genes, so they were eliminated. In the MCI group, 14 modules were obtained, of which modules 1, 3, 6, 9, and 11 contained fewer than five ROIs, and modules 2, 3, 4, 7, and 9 contained fewer than five genes. Therefore, these modules were also eliminated. Figure 4 shows the number of elements retained in the AD and MCI groups, together with the respective and total reconstruction errors of the two modal features.

Figure 4. Histogram of network module selection in the AD group and MCI group. (A,B) Are the element numbers of ROIs and genes in the retained modules in the AD and MCI groups, respectively. (C,D) Are the ROIs, genes, and relative errors of the two in the retained modules in AD and MCI groups, respectively.

As shown in Figure 4, module 3 of the AD group and module 10 of the MCI group had the smallest relative errors, so these two modules were selected for subsequent analysis.

3.6. Significant network module analysis

First, this paper constructed an ROI-ROI interaction network and a PPI network using the ROIs and genes in network module 3 of the AD group, respectively. Specifically, this paper selected the elements of module 3 from R11 and R22 to form PCC pairs with a one-to-one correspondence between two components. In the AD group, the mean PCC of the gene–gene and ROI-ROI pairs was 0.902. Therefore, the PCC threshold for ROIs and genes was set to 0.9, and the pairs with PCC > 0.9 were used to construct the ROI-ROI interaction network and the gene–gene interaction network. Furthermore, the interaction network model was visualized using Cytoscape (version 3.9.1; Shannon et al., 2003). The Matthews Correlation Coefficient metric (MCC) algorithm (Boughorbel et al., 2017) is widely used as a performance metric in bioinformatics. This paper used the MCC algorithm to score each ROI and gene in relation to the other network nodes and then selected the top 10 ROIs and genes as the risk ROIs and key genes. In the MCI group, the mean PCC of the gene–gene and ROI-ROI pairs was 0.604. Therefore, the PCC threshold for ROIs and genes was set to 0.6, and the pairs with PCC > 0.6 were used to construct the ROI-ROI interaction network and the gene–gene interaction network. The networks of the two groups are shown in Figure 5.
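
The thresholding step can be sketched as follows: every pair of nodes whose PCC exceeds the threshold becomes an edge. The function name, the node names, and the small correlation matrix are made up for illustration; the resulting edge list is the kind of table that could be imported into a network visualization tool.

```python
import numpy as np

def threshold_edges(R, names, threshold):
    """List (name_i, name_j, pcc) for every unordered pair with PCC > threshold."""
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if R[i, j] > threshold:
                edges.append((names[i], names[j], float(R[i, j])))
    return edges

# Made-up correlation matrix for three hypothetical ROIs.
R = np.array([[1.00, 0.95, 0.40],
              [0.95, 1.00, 0.92],
              [0.40, 0.92, 1.00]])
edges = threshold_edges(R, ["roi_a", "roi_b", "roi_c"], threshold=0.9)
```

With a threshold of 0.9, only the pairs (roi_a, roi_b) and (roi_b, roi_c) are kept.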

Figure 5. Visualization of ROI-ROI and gene–gene interaction networks in the AD and MCI groups. (A,B) are the network diagrams of the ROIs and genes of the AD group, respectively. (C,D) Are the network maps of the ROIs and genes of the MCI group, respectively.

3.7. Key regions of interests and genes selection

In this paper, the MCC algorithm was used to obtain the scores of the Top nodes in the four network graphs of the AD and MCI groups, as shown in Tables 5, 6. A higher score in the tables indicates a more critical role for the ROI or gene in MCI or AD, so these ROIs and genes are discussed in detail in the Discussion section. The top brain regions in both groups were also visualized (Figures 6, 7). We also drew Venn diagrams for the ROI and gene nodes in the AD group and MCI group networks (Figure 8).

Table 5. Top 10 ROIs and genes and their MCC score information in AD group.

Table 6. Top 10 ROIs and genes and their MCC score information in MCI group.

Figure 6. Top 10 ROI visualization of AD group.

Figure 7. Top 10 ROI visualization of MCI group.

Figure 8. ROI and gene intersection of Top 10 in AD and MCI groups. (A) Is the intersection of ROIs selected by AD group and MCI group. (B) Is the intersection of genes selected by AD group and MCI group.

4. Discussion

4.1. Analysis of the biological significance of key regions of interests and genes

We first analyzed the biological significance of the Top 10 ROIs in the AD group. A pilot study showed that in AD, Aβ deposition in the inferior temporal gyrus was strongly associated with gray matter atrophy in brain region BF-227 (Maeno, 2019). The left lingual gyrus is associated with changes in functional connectivity at the local network level in AD (Chang et al., 2020). The left angular gyrus is associated with increased brain metabolism in AD patients (Weissberger et al., 2017). Studies on rs-fMRI have shown that the left angular gyrus is the brain functional connectivity region showing the most significant discrimination (Wang et al., 2021). The left inferior frontal gyrus is associated with genetic variants of cortical atrophy in AD, contributing to further understanding AD’s genetic basis (Kim et al., 2021). Maeno (2019) found that Aβ deposition in the anterior cuneiform of AD patients was associated with atrophy of the right occipitotemporal region. The left medial orbital age has a strong negative correlation, which is valid in the age adjustment of AD subjects (Wang et al., 2021).

We also analyzed the biological significance of the Top ROIs in the MCI group. Decreased left entorhinal cortex volume is associated with cognitive and motor dysfunction in older adults with mild cognitive impairment (Sakurai et al., 2019). Regional cerebral blood flow (rCBF) in the left lateral orbital gyrus was decreased in the dizzy MCI group compared with the non-dizzy MCI group (Na et al., 2020). In an ALFF-based fMRI study, researchers found significantly increased ALFF values in the right lingual gyrus of MCI patients compared with HC (Lai et al., 2022). Occipital gray matter volume correlates with neuropsychological performance in patients with amnestic MCI or mild AD (Arlt et al., 2013). The left superior parietal lobule is associated with modulating the amplitude of low-frequency fluctuations (ALFF) in patients with MCI (Zhuang et al., 2020). Chen et al. (2020) found atrophy of the parahippocampal gyrus in MCI. In addition, it can be found from Tables 5, 6 that the left inferior frontal gyrus and left inferior frontal gyrus are the intersection ROIs of the two groups, that is, they were selected in both the AD and MCI groups. Using the left inferior frontal gyrus as a seed, Pistono et al. (2021) found that the functional connectivity of the language network could better discriminate the MCI and AD participants than the executive control network, revealing an increase in connectivity during the MCI phase. To study the relationship between MCI and apathy, researchers divided MCI patients into those with and without “SPECT images suggestive of AD.” They found that apathy negatively correlated with regional cerebral blood flow in the bilateral fusiform gyrus (Kazui et al., 2016).

Next, this paper also performed the same analysis on the Top 10 genes. Two significant features of AD are transcriptome dysregulation and altered RNA-binding protein (RBP) function (Chen et al., 2020). TNS1-related Gene Ontology annotations included RNA binding and actin binding. Variations in actin-involved endocytosis pathways are significant contributors to the overall regulation of genetic risk for AD (Pistono et al., 2021). TRIM58 gene induces E3 ubiquitin ligase in late erythropoiesis (Kazui et al., 2016). Experiments have confirmed that α-synuclein in red blood cells may help differentiate AD from HC. Erythropoiesis is associated with FECH. The protein encoded by FECH is localized to mitochondria (Kazui et al., 2016). GLRX5 encodes a mitochondrial protein (Rybak-Wolf and Plass, 2021). E2F2 is involved in the control of the cell cycle, and Zhou et al. (2021) identified vital cell cycle regulators, helping to develop potential pathways for optimal AD treatment. BCL2L1 was identified as a core target of tau pathogenesis (Tesi et al., 2020).

TXN is associated with cellular senescence, and Gene Ontology annotations associated with this gene include RNA binding and oxidoreductase activity. Chico et al. (2013) found that superoxide dismutase activity was reduced in MCI compared to controls. SLC15A3 is involved in the innate immune response, and Fiala et al. (2017) found that the innate immune system in MCI patients is highly increased or decreased through the transcription of inflammatory genes. LST1 may regulate immune responses, and peripheral innate immune responses had the highest activation level in the MCI group compared with the subjective memory complaints (SMC) and AD groups (Munawara et al., 2021). The CCDC9 gene is a possible component of the exon junction complex (EJC), which is involved in mRNA translation, one of the common pathways of MCI (Fernández-Martínez et al., 2020). LRP3 is involved in regulating gene expression, and analysis of gene expression data from blood may help differentiate MCI from AD (Bottero and Potashkin, 2019).

4.2. Analysis of MMSE based on top features

In this paper, the Top 10 ROIs and 10 genes of the AD and MCI groups were used to predict the MMSE (Mini-Mental State Examination) score to confirm the correlation between the Top features and the clinical score. Specifically, this paper first used support vector regression (SVR), random forest (RF), and K-Nearest Neighbors (KNN) algorithms to regress the MMSE on the training sets of the two groups, respectively. The regression effect was evaluated using MAE and RMSE, as shown in Figure 9.

Figure 9. Histograms obtained by using three regression methods (RF, SVR and KNN) to regress MMSE and using two regression evaluation indicators (MAE and RMSE). (A,B) Are the regression results using the Top ROIs and genes of the AD group, respectively. (C,D) Are the regression results using the Top ROIs and genes of the MCI group, respectively.

It can be found from Figures 9A,B that the KNN regression algorithm gives smaller MAE and RMSE in most cases in the AD group. From Figures 9C,D, it can be found that the RF algorithm gives smaller MAE and RMSE in most cases in the MCI group.

4.3. Correlation analysis between top regions of interests and genes

This paper drew a correlation heat map for the Top 10 ROIs and genes on the test set in the two groups obtained above, as shown in Figure 10.

Figure 10. Heatmap of the correlation of Top ROIs and genes on the test set. (A,B) Are the correlation heatmaps of AD group and healthy control group, respectively. (C,D) Are the correlation heatmaps of the MCI group and the healthy control group, respectively.

As seen in Figure 10, there is a strong correlation between the Top features of the AD group data. This paper takes the absolute value of PCC and then calculates the average correlation. Among them, the average correlation for the AD group was 0.5411, and the correlation of the control group was 0.1045. The mean correlation was 0.1039 in the MCI group and 0.2619 in the control group.
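
The averaging described above can be sketched as follows. Restricting the average to ROI-gene pairs (rather than also including ROI-ROI and gene-gene pairs) is an assumption, as are the function name and the toy data.

```python
import numpy as np

def mean_abs_cross_corr(rois, genes):
    """rois: (n x a), genes: (n x b). Mean |PCC| over all ROI-gene pairs."""
    a = rois.shape[1]
    C = np.corrcoef(np.hstack([rois, genes]), rowvar=False)
    return float(np.abs(C[:a, a:]).mean())

# Toy check (assumption): a perfectly anti-correlated pair has |PCC| = 1.
rois = np.arange(5, dtype=float).reshape(-1, 1)
genes = -2.0 * rois
value = mean_abs_cross_corr(rois, genes)
```

Taking the absolute value first means that strong negative correlations contribute as much as strong positive ones.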

4.4. Regression analysis between top features

In this section, this paper again used three regression algorithms: SVR, RF, and KNN. Specifically, the Top 10 genes selected by the HG-netNMF algorithm were used to regress each of the Top 10 ROIs. The three models were trained on the training set, and MAE and RMSE were used to evaluate the regression effect on the test set, as shown in Figure 11.

Figure 11. Histograms of errors (MAE and RMSE) obtained by regression prediction of Top 10 ROIs using Top 10 genes in the AD and MCI groups, respectively. (A,B) Are the regression results of the two groups, respectively.

Figure 11 shows that, in the AD group, the left lingual gyrus is the ROI predicted with the smallest RMSE across the regression algorithms. In the MCI group, the right frontal operculum is the ROI predicted with the smallest RMSE.
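
The per-ROI procedure (one regression per ROI, then a comparison of test errors) can be sketched as follows. This is an illustration under stated assumptions, not the code used in this study: it uses a hand-written KNN regressor only, the helper names are hypothetical, and the data are synthetic.

```python
import numpy as np

def knn_predict(x_train, y_train, x_test, k=3):
    """Predict each test target as the mean target of its k nearest training rows."""
    # Pairwise Euclidean distances, shape (n_test, n_train).
    dists = np.linalg.norm(x_test[:, None, :] - x_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)

def rmse_per_roi(genes_train, rois_train, genes_test, rois_test, k=3):
    """Fit one regression per ROI column and return the test RMSE of each."""
    errors = []
    for j in range(rois_train.shape[1]):
        pred = knn_predict(genes_train, rois_train[:, j], genes_test, k)
        errors.append(np.sqrt(np.mean((rois_test[:, j] - pred) ** 2)))
    return np.array(errors)

# Synthetic stand-in data: 40 subjects, 10 genes, 3 ROIs (not ADNI data).
rng = np.random.default_rng(0)
genes = rng.normal(size=(40, 10))
rois = np.column_stack([
    genes[:, 0] + 0.1 * rng.normal(size=40),  # ROI 0 depends on gene 0
    rng.normal(size=40),                      # ROI 1 is independent noise
    rng.normal(size=40),                      # ROI 2 is independent noise
])
# First 30 subjects form the training set, the last 10 the test set.
errors = rmse_per_roi(genes[:30], rois[:30], genes[30:], rois[30:])
best_roi = int(np.argmin(errors))  # index of the ROI with the smallest test RMSE
print(errors, best_roi)
```

The ROI with the smallest test error is the one whose values are best explained by the selected genes under the chosen model.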

4.5. Classification analysis

ROC curves were drawn using the Top 10 ROIs and genes of the AD and MCI groups, respectively. In addition, logistic regression in IBM SPSS Statistics was used to build a joint diagnostic model, as shown in Figure 12. The detailed information of the four diagnostic models is summarized in Table 7.
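
The idea behind a joint diagnostic model is that logistic regression combines several features into a single predicted probability per subject, which then serves as the score for ROC analysis. The sketch below illustrates that idea with plain gradient descent on synthetic data; it is not the SPSS procedure used in this study, and the function names, learning rate, and iteration count are assumptions.

```python
import numpy as np

def fit_logistic(x, y, lr=0.1, n_iter=2000):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # predicted probability of the positive class
        grad = p - y                             # gradient of the log-loss with respect to the logit
        w -= lr * (x.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def joint_score(x, w, b):
    """Combine several features into one predicted probability per subject."""
    return 1.0 / (1.0 + np.exp(-(np.asarray(x, float) @ w + b)))

# Synthetic data: 50 controls (label 0) and 50 patients (label 1), two features
# whose means are shifted by 2 in the patient group (not ADNI data).
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 50)
x = rng.normal(size=(100, 2)) + 2.0 * y[:, None]
w, b = fit_logistic(x, y)
scores = joint_score(x, w, b)
print(scores[y == 0].mean(), scores[y == 1].mean())
```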

Figure 12. ROC curves obtained from the diagnostic models of AD and MCI constructed using Top ROIs and genes of AD and MCI groups, respectively. (A,B) Are the diagnostic models of AD constructed using the Top features of the AD group, respectively. (C,D) Are the diagnostic models of MCI constructed using the Top features of the MCI group, respectively.

Table 7. Specific information of ROC curve.

Next, to determine whether the individual ROIs and genes involved in the construction of the diagnostic models also have diagnostic significance for AD and MCI, ROC curves were drawn for each of the ROIs and genes used in the diagnostic models in Figure 12 to verify them one by one, as shown in Figure 13. In addition, the ROC curve information of each single ROI and gene in the two groups was calculated separately, as shown in Tables 8, 9.

Table 8. ROC curve information of single Top features in the AD group.

Table 9. ROC curve information of single Top features in the MCI group.

It can be seen from Figure 13 that most of the AUCs of the ROIs/genes involved in the construction of the diagnostic models are greater than 0.5, which indicates diagnostic significance for AD and MCI and further supports the effectiveness of the algorithm.
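
The AUC of a single feature can be computed directly from its rank-based definition: the probability that a randomly chosen patient has a higher feature value than a randomly chosen control. The following is a minimal sketch with made-up values; the function name is hypothetical and this is not the software used in this study.

```python
import numpy as np

def auc(scores, labels):
    """AUC as the fraction of (patient, control) pairs ranked correctly.

    Ties count as one half (the Mann-Whitney U formulation).
    """
    scores = np.asarray(scores, float)
    labels = np.asarray(labels, bool)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]  # every patient-control difference
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))

# Made-up feature values: two controls (label 0) and two patients (label 1).
feature = [0.1, 0.4, 0.35, 0.8]
group = [0, 0, 1, 1]
print(auc(feature, group))  # 3 of the 4 patient-control pairs are ordered correctly
```

An AUC of 0.5 corresponds to chance-level ranking. A feature whose value decreases with disease produces an AUC below 0.5 under this definition, and negating the feature gives the complementary value 1 - AUC.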

Figure 13. Single validation of Top 10 ROIs and genes for the two groups. (A–D) Are the ROC curves of the first five and last five ROIs of the AD group, respectively. (E–H) Are the ROC curves of the first five and last five ROIs of the MCI group.

5. Conclusion

This paper proposed a hypergraph-based netNMF (HG-netNMF) method to integrate sMRI and genetic data of AD and MCI patients and to mine their respective regulatory networks and interaction information, with the aim of exploring the risk ROIs and genes associated with AD and MCI. Finally, robust diagnostic models for AD and MCI were constructed, respectively. In future work, we will integrate more data types to analyze the disease-related regulatory mechanisms more comprehensively.

Data availability statement

Publicly available datasets were analyzed in this study. These data can be found at: https://adni.loni.usc.edu.

Author contributions

JT contributed to the conception of the study. JZ and JT performed the experiment. XX and TL contributed significantly to analysis and manuscript preparation. ZC and RC performed the data analyses and wrote the manuscript. JC and XL helped to perform the analysis with constructive discussions. All authors have read and approved the manuscript.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Arlt, S., Buchert, R., Spies, L., Eichenlaub, M., Lehmbeck, J. T., and Jahn, H. (2013). Association between fully automated MRI-based volumetry of different brain regions and neuropsychological test performance in patients with amnestic mild cognitive impairment and Alzheimer's disease. Eur. Arch. Psychiatry Clin. Neurosci. 263, 335–344. doi: 10.1007/s00406-012-0350-7

PubMed Abstract | CrossRef Full Text | Google Scholar

Bi, X.-A., Wu, H., Xie, Y., Zhang, L., Luo, X., Fu, Y., et al. (2021). The exploration of Parkinson's disease: a multi-modal data analysis of resting functional magnetic resonance imaging and gene data. Brain Imaging Behav. 15, 1986–1996. doi: 10.1007/s11682-020-00392-6

PubMed Abstract | CrossRef Full Text | Google Scholar

Bi, X.-a., Xing, Z.-X., Zhou, W., Li, L., and Xu, L. (2022). Pathogeny detection for mild cognitive impairment via weighted evolutionary random forest with brain imaging and genetic data. IEEE J. Biomed. Health Inf. 26, 3068–3079. doi: 10.1109/JBHI.2022.3151084

CrossRef Full Text | Google Scholar

Bottero, V., and Potashkin, J. A. (2019). Meta-analysis of gene expression changes in the blood of patients with mild cognitive impairment and Alzheimer's disease dementia. Int. J. Mol. Sci. 20:5403. doi: 10.3390/ijms20215403

PubMed Abstract | CrossRef Full Text | Google Scholar

Boughorbel, S., Jarray, F., and el-Anbari, M. (2017). Optimal classifier for imbalanced data using Matthews correlation coefficient metric. PLoS One 12:e0177678. doi: 10.1371/journal.pone.0177678

PubMed Abstract | CrossRef Full Text | Google Scholar

Carlson, G. A., and Prusiner, S. B. (2021). How an infection of sheep revealed prion mechanisms in Alzheimer's disease and other neurodegenerative disorders. Int. J. Mol. Sci. 22:4861. doi: 10.3390/ijms22094861

PubMed Abstract | CrossRef Full Text | Google Scholar

Chang, Y. T., Hsu, J. L., Huang, S. H., Hsu, S. W., Lee, C. C., and Chang, C. C. (2020). Functional connectome and neuropsychiatric symptom clusters of Alzheimer's disease? J. Affect. Disord. 273, 48–54. doi: 10.1016/j.jad.2020.04.054

CrossRef Full Text | Google Scholar

Chen, S., Xu, W., Xue, C., Hu, G., Ma, W., Qi, W., et al. (2020). Voxelwise meta-analysis of gray matter abnormalities in mild cognitive impairment and subjective cognitive decline using activation likelihood estimation. J. Alzheimers Dis. 77, 1495–1512. doi: 10.3233/JAD-200659

PubMed Abstract | CrossRef Full Text | Google Scholar

Chen, J., and Zhang, S. (2018). Discovery of two-level modular organization from matched genomic data via joint matrix tri-factorization. Nucleic Acids Res. 46, 5967–5976. doi: 10.1093/nar/gky440

PubMed Abstract | CrossRef Full Text | Google Scholar

Chico, L., Simoncini, C., Lo Gerfo, A., Rocchi, A., Petrozzi, L., Carlesi, C., et al. (2013). Oxidative stress and APO E polymorphisms in Alzheimer's disease and in mild cognitive impairment. Free Radic. Res. 47, 569–576. doi: 10.3109/10715762.2013.804622

PubMed Abstract | CrossRef Full Text | Google Scholar

Du, L., Liu, F., Liu, K., Yao, X., Risacher, S. L., Han, J., et al. (2020). Associating multi-modal brain imaging phenotypes and genetic risk factors via a dirty multi-task learning method. IEEE Trans. Med. Imaging 39, 3416–3428. doi: 10.1109/TMI.2020.2995510

PubMed Abstract | CrossRef Full Text | Google Scholar

Du, L., Zhang, J., Liu, F., Wang, H., Guo, L., Han, J., et al. (2021). Identifying associations among genomic, proteomic and imaging biomarkers via adaptive sparse multi-view canonical correlation analysis. Med. Image Anal. 70:102003:102003. doi: 10.1016/j.media.2021.102003

PubMed Abstract | CrossRef Full Text | Google Scholar

Fernández-Martínez, J. L., Álvarez-Machancoses, Ó., de Andrés-Galiana, E. J., Bea, G., and Kloczkowski, A. (2020). Robust sampling of defective pathways in Alzheimer's disease. Implications in drug repositioning. Int. J. Mol. Sci. 21:3594. doi: 10.3390/ijms21103594

PubMed Abstract | CrossRef Full Text | Google Scholar

Fiala, M., Kooij, G., Wagner, K., Hammock, B., and Pellegrini, M. (2017). Modulation of innate immunity of patients with Alzheimer's disease by omega-3 fatty acids. FASEB J. 31, 3229–3239. doi: 10.1096/fj.201700065R

PubMed Abstract | CrossRef Full Text | Google Scholar

Jiang, J., Sun, Y., Zhou, H., Li, S., Huang, Z., Wu, P., et al. (2018). Study of the influence of age in (18)F-FDG PET images using a data-driven approach and its evaluation in Alzheimer's disease. Contrast Media Mol. Imaging 2018:3786083. doi: 10.1155/2018/3786083

PubMed Abstract | CrossRef Full Text | Google Scholar

Kazui, H., Takahashi, R., Yamamoto, Y., Yoshiyama, K., Kanemoto, H., Suzuki, Y., et al. (2016). Neural basis of apathy in patients with amnestic mild cognitive impairment. J. Alzheimers Dis. 55, 1403–1416. doi: 10.3233/JAD-160223

PubMed Abstract | CrossRef Full Text | Google Scholar

Kim, M., Ji, H. W., Youn, J., and Park, H. (2019). Joint-connectivity-based sparse canonical correlation analysis of imaging genetics for detecting biomarkers of parkinson's disease. IEEE Trans. Med. Imaging 99:1. doi: 10.1109/TMI.2019.2918839

CrossRef Full Text | Google Scholar

Kim, B. H., Nho, K., and Lee, J. M. (2021). Genome-wide association study identifies susceptibility loci of brain atrophy to NFIA and ST18 in Alzheimer's disease. Neurobiol. Aging 102, 200.e1–200.e11. doi: 10.1016/j.neurobiolaging.2021.01.021

PubMed Abstract | CrossRef Full Text | Google Scholar

Kim, P. M., and Tidor, B. (2003). Subsystem identification through dimensionality reduction of large-scale gene expression data. Genome Res. 13, 1706–1718. doi: 10.1101/gr.903503

PubMed Abstract | CrossRef Full Text | Google Scholar

Lai, Z., Zhang, Q., Liang, L., Wei, Y., Duan, G., Mai, W., et al. (2022). Efficacy and mechanism of Moxibustion treatment on mild cognitive impairment patients: an fMRI study using ALFF. Front. Mol. Neurosci. 15:852882. doi: 10.3389/fnmol.2022.852882

PubMed Abstract | CrossRef Full Text | Google Scholar

Maeno, N. (2019). Correlation between β-amyloid deposits revealed by BF-227-PET imaging and brain atrophy detected by voxel-based morphometry-MR imaging: a pilot study. Nucl. Med. Commun. 40, 905–912. doi: 10.1097/MNM.0000000000001042

PubMed Abstract | CrossRef Full Text | Google Scholar

Munawara, U., Catanzaro, M., Xu, W., Tan, C., Hirokawa, K., Bosco, N., et al. (2021). Hyperactivation of monocytes and macrophages in MCI patients contributes to the progression of Alzheimer's disease. Immun. Ageing 18:29. doi: 10.1186/s12979-021-00236-x

PubMed Abstract | CrossRef Full Text | Google Scholar

Na, S., Im, J. J., Jeong, H., Lee, E. S., Lee, T. K., Chung, Y. A., et al. (2020). Altered regional cerebral blood perfusion in mild cognitive impairment patients with dizziness. Diagnostics (Basel) 10:777. doi: 10.3390/diagnostics10100777

PubMed Abstract | CrossRef Full Text | Google Scholar

Pistono, A., Senoussi, M., Guerrier, L., Rafiq, M., Giméno, M., Péran, P., et al. (2021). Language network connectivity increases in early Alzheimer's disease. J. Alzheimers Dis. 82, 447–460. doi: 10.3233/JAD-201584

PubMed Abstract | CrossRef Full Text | Google Scholar

Rybak-Wolf, A., and Plass, M. (2021). RNA dynamics in Alzheimer's disease. Molecules 26:5113. doi: 10.3390/molecules26175113

PubMed Abstract | CrossRef Full Text | Google Scholar

Sakurai, R., Bartha, R., and Montero-Odasso, M. (2019). Entorhinal cortex volume is associated with dual-task gait cost among older adults with MCI: results from the gait and brain study. J. Gerontol. A Biol. Sci. Med. Sci. 74, 698–704. doi: 10.1093/gerona/gly084

PubMed Abstract | CrossRef Full Text | Google Scholar

Scheltens, P., De Strooper, B., Kivipelto, M., Holstege, H., Chételat, G., Teunissen, C. E., et al. (2021). Alzheimer's disease. Lancet 397, 1577–1590. doi: 10.1016/S0140-6736(20)32205-4

PubMed Abstract | CrossRef Full Text | Google Scholar

Shannon, P., Markiel, A., Ozier, O., Baliga, N. S., Wang, J. T., Ramage, D., et al. (2003). Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504. doi: 10.1101/gr.1239303

PubMed Abstract | CrossRef Full Text | Google Scholar

Simrén, J., Leuzy, A., Karikari, T. K., Hye, A., Benedet, A. L., Lantero-Rodriguez, J., et al. (2021). The diagnostic and prognostic capabilities of plasma biomarkers in Alzheimer's disease. Alzheimers Dement. 17, 1145–1156. doi: 10.1002/alz.12283

PubMed Abstract | CrossRef Full Text | Google Scholar

Tesi, N., van der Lee, S. J., Hulsman, M., Jansen, I. E., Stringa, N., van Schoor, N. M., et al. (2020). Immune response and endocytosis pathways are associated with the resilience against Alzheimer's disease. Transl. Psychiatry 10:332. doi: 10.1038/s41398-020-01018-7

PubMed Abstract | CrossRef Full Text | Google Scholar

Veres, S M, Molnar, L, and Lincoln, N K. The Cognitive Agents Toolbox (CAT)–Programmingautonomous Vehicles. South Yorkshire: Sysbrain Ltd. (2009).

Google Scholar

Wang, S. M., Kim, N. Y., Kang, D. W., Um, Y. H., Na, H. R., Woo, Y. S., et al. (2021). A comparative study on the predictive value of different resting-state functional magnetic resonance imaging parameters in preclinical Alzheimer's disease. Front. Psych. 12:626332. doi: 10.3389/fpsyt.2021.626332

PubMed Abstract | CrossRef Full Text | Google Scholar

Weissberger, G. H., Melrose, R. J., Fanale, C. M., Veliz, J. V., and Sultzer, D. L. (2017). Cortical metabolic and cognitive correlates of disorientation in Alzheimer's disease. J. Alzheimers Dis. 60, 707–719. doi: 10.3233/JAD-170420

PubMed Abstract | CrossRef Full Text | Google Scholar

Xiao, L., Wang, J., Kassani, P. H., Zhang, Y., Bai, Y., Stephen, J. M., et al. (2020). Multi-Hypergraph learning-based brain functional connectivity analysis in fMRI data. IEEE Trans. Med. Imaging 39, 1746–1758. doi: 10.1109/TMI.2019.2957097

PubMed Abstract | CrossRef Full Text | Google Scholar

Yang, B., Zhang, X., Nie, F., Wang, F., Yu, W., and Wang, R. (2021). Fast multi-view clustering via nonnegative and orthogonal factorization. IEEE Trans. Image Process. 30, 2575–2586. doi: 10.1109/TIP.2020.3045631

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhang, S., Liu, C.-C., Li, W., Shen, H., Laird, P. W., and Zhou, X. J. (2012). Discovery of multi-dimensional modules by integrative analysis of cancer genomic data. Nucleic Acids Res. 40, 9379–9391. doi: 10.1093/nar/gks725

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhang, J., Wang, H., Zhao, Y., Guo, L., and Du, L., Alzheimer’s Disease Neuroimaging Initiative (2022). Identification of multimodal brain imaging association via a parameter decomposition based sparse multi-view canonical correlation analysis method. BMC Bioinformatics 23:128:128. doi: 10.1186/s12859-022-04669-z

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhou, Z., Bai, J., Zhong, S., Zhang, R., Kang, K., Zhang, X., et al. (2021). Integrative functional genomic analysis of molecular signatures and mechanistic pathways in the cell cycle underlying Alzheimer's disease. Oxidative Med. Cell. Longev. 2021:5552623. doi: 10.1155/2021/5552623

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhuang, L., Ni, H., Wang, J., Liu, X., Lin, Y., Su, Y., et al. (2020). Aggregation of vascular risk factors modulates the amplitude of low-frequency fluctuation in mild cognitive impairment patients. Front. Aging Neurosci. 12:604246. doi: 10.3389/fnagi.2020.604246

PubMed Abstract | CrossRef Full Text | Google Scholar

Keywords: non-negative matrix factorization, Alzheimer’s disease, mild cognitive impairment, hypergraph learning, biomarkers

Citation: Zhuang J, Tian J, Xiong X, Li T, Chen Z, Chen R, Chen J and Li X (2023) Associating brain imaging phenotypes and genetic risk factors via a hypergraph based netNMF method. Front. Aging Neurosci. 15:1052783. doi: 10.3389/fnagi.2023.1052783

Received: 24 September 2022; Accepted: 08 February 2023;
Published: 02 March 2023.

Edited by:

Balaji Krishnan, University of Texas Medical Branch at Galveston, United States

Reviewed by:

Xianglian Meng, Changzhou Institute of Technology, China
Chao Huang, Florida State University, United States

Copyright © 2023 Zhuang, Tian, Xiong, Li, Chen, Chen, Chen and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Xiaoxing Xiong, xiaoxingxiong@whu.edu.cn; Taihan Li, lith17@lzu.edu.cn
