- 1School of Mathematics Science, Liaocheng University, Liaocheng, China
- 2School of Science and Technology, University of Camerino, Camerino, Italy
- 3Department of Radiology and BRIC, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
Functional brain network (FBN) analysis has shown great potential in identifying brain diseases, such as Alzheimer's disease (AD) and its prodromal stage, namely mild cognitive impairment (MCI). It is essential to identify discriminative and interpretable features from functional brain networks, so as to improve classification performance and help us understand the pathological mechanism of AD-related brain disorders. Previous studies usually extract node statistics or edge weights from FBNs to represent each subject. However, these methods generally ignore the topological structure (such as modularity) of FBNs. To address this issue, we propose a modular-LASSO feature selection (MLFS) framework that can explicitly model the modularity information to identify discriminative and interpretable features from FBNs for automated AD/MCI classification. Specifically, the proposed MLFS method first searches the modular structure of FBNs through a signed spectral clustering algorithm, and then selects discriminative features via a modularity-induced group LASSO method, followed by a support vector machine (SVM) for classification. To evaluate the effectiveness of the proposed method, extensive experiments are performed on 563 resting-state functional MRI scans from the public ADNI database to identify subjects with AD/MCI from normal controls and predict the future progress of MCI subjects. Experimental results demonstrate that our method is superior to previous methods in both tasks of AD/MCI identification and MCI conversion prediction, and also helps discover discriminative brain regions and functional connectivities associated with AD.
1. Introduction
Resting-state functional magnetic resonance imaging (rs-fMRI) provides a non-invasive measure of brain activity and has attracted considerable attention for understanding brain organization (Bijsterbosch et al., 2017; Zhang et al., 2020). Functional brain networks (FBNs) derived from rs-fMRI scans have been increasingly employed for computer-aided diagnosis of brain disorders, such as autism spectrum disorder (Jie et al., 2018a; Wang et al., 2019a,c; Wen et al., 2019; He et al., 2020), Alzheimer's disease (AD) and its prodromal stage (i.e., mild cognitive impairment, MCI) (Stam, 2014; Fornito et al., 2015; Liu M. et al., 2015; Jie et al., 2018b).
Extracting effective features from FBNs is a critical step to improve classification performance and interpretability of brain functional networks (Kim et al., 2019; Qiu et al., 2019). As shown in Figure 1 (1), three kinds of feature representations have been employed for FBN-based disease identification based on different granularities, including global-level topology features, node-level features, and edge-level features. The first one is the global topological statistics of the whole FBN, such as sparsity and efficiency (Hamilton, 2020). Despite its simplicity, the global statistics may lack specificity. That is, due to their global characteristics, the global measures cannot help identify the disease-affected brain regions (i.e., nodes) and functional connections (i.e., edges) in a brain network. The second category focuses on node-based graph statistics (e.g., local clustering coefficients; Wee et al., 2012). They can specifically locate disease-related regions on the node level, but usually fail to recognize the contributions of different edges/connections in a network. Besides, both global- and node-level statistics extracted from FBNs tend to capture different network properties, which requires prior knowledge and thus makes the feature design an intractable problem (Hamilton, 2020).
Figure 1. (1) Feature representations at different granularities. In clockwise order: the global-level topology feature, the node-level topology feature, and the edge-level topology feature. (2) The mechanism of traditional edge feature extraction in FBN. The network adjacency matrix from each subject is first mapped onto a vector by removing the redundant part if the matrix is symmetric, and then the vectors from all subjects are stacked together as the input of the subsequent feature selection methods.
The third strategy uses edge-level features (e.g., edge weights) to represent a network (Qiao et al., 2017; Xue et al., 2020), which is simple and naturally localizes effects at the granularity of edges. In practice, the adjacency matrix of the FBN from each subject is generally flattened into an edge vector (removing the redundant part if the adjacency matrix is symmetric), and then the edge vectors from all subjects are piled up, as shown in Figure 1 (2). In this case, the edge features associated with all subjects are stacked into a matrix for further selection (e.g., through t-test and LASSO). However, these methods ignore network topologies such as modularity, which provides valuable information for understanding the pathological mechanism of AD-related brain disorders.
Modularity plays an important role in FBN modeling and analysis, and can help us understand the operating mechanisms of the brain (Shen et al., 2010; Gallen et al., 2016; Wen et al., 2019). Meunier et al. (2009b) conducted FBN analysis and found that FBNs have a hierarchical modular organization with a fair degree of similarity between subjects. Motivated by the fact that the brain exhibits a modular organization, we propose a modular-LASSO feature selection (MLFS) framework that consists of a two-step learning scheme. Specifically, the proposed MLFS first searches the modular structure of FBNs through a signed spectral clustering algorithm, and then selects discriminative features using group LASSO based on modularity information, followed by a support vector machine (SVM) for brain disease classification. Our proposed method is validated on the public ADNI dataset (Jack et al., 2008) with 563 rs-fMRI scans to identify AD/MCI subjects from normal controls and perform MCI conversion prediction, with experimental results demonstrating its superiority over conventional methods.
The rest of the paper is organized as follows. In section 2, we review the most relevant studies on fMRI-based FBN analysis. In section 3, we introduce the data used in the study and present our method. In section 4, we conduct experiments and provide a comparative evaluation of the involved methods. In section 5, we discuss the impact of parameters, the number of modules, the different node-level features on classification performance and the effect of connectivity variations in FBN, visualize the disease-related features (functional connections) and modules identified by our proposed method, and present limitations of this work as well as future research directions. Finally, we conclude the paper in section 6.
2. Related Work
In this section, we briefly review the most relevant studies on feature representation of functional brain networks (FBNs) and existing methods on modularity analysis of FBNs.
2.1. Feature Representation of FBNs
As the basis of subsequent classification/regression tasks, feature representation of brain networks is essential for FBN analysis. Currently, three categories of features based on different granularities (global/network-level, node-level, and edge-level) have been employed for representing FBNs.
The first two categories (i.e., global-level and node-level representation) use topological measures to represent the whole brain or brain regions for identifying patients from healthy controls. For example, Feng et al. (2020) extracted spatial and temporal eigenvalue features from high-order dynamic FBNs as feature representations of each subject for AD classification. Jie et al. (2016) extracted local clustering coefficients from hyper-connectivity networks as features to identify subjects with MCI. Although these studies have achieved good results, the topological measures involved in these methods need to be designed manually, which is cumbersome, time-consuming, and also subjective. In the third category, numerous studies represent FBNs by edge-level features (e.g., edge weights) for each subject, followed by edge vector-based feature selection for classification. For example, Sun et al. (2021) extracted edge weight features from sparse FBNs to identify patients with MCI and autism spectrum disorder (ASD). Liu F. et al. (2015) extracted connectivity strengths from FBNs as features for social anxiety disorder classification. However, these studies usually ignore the overall topology of functional brain networks (e.g., modularity), and the edge features are generally high-dimensional, possibly resulting in problems such as the curse of dimensionality and multiple-comparison errors (Garcia et al., 2017).
2.2. Modularity Analysis of FBNs
Previous studies have shown that FBNs exhibit a modular organization, such that they are comprised of a group of sub-networks (Gallen et al., 2016). Research on network modularity helps us to understand the organizational principles of the brain, which has important theoretical significance and practical value in FBN analysis.
Many studies have focused on finding modules in brain networks. For example, Meunier et al. (2009a) studied the modular partitions of resting-state networks in the human brain, and investigated the influence of normal aging on the modular structure. Valencia et al. (2009) investigated modular organization in resting-state networks at the voxel level, and showed modules at a finer grain level. Although these studies on the partition of modules distinguished the different roles and status of nodes, they did not apply the modular structure to the analysis of FBNs (e.g., FBN construction, feature learning, and classification). Recently, many studies applied modularity prior to FBN construction. For example, Qiao et al. (2016) estimated FBNs by incorporating modularity prior, and achieved higher classification accuracy based on the modularized FBNs. Zhou et al. (2018) learned an optimal neighborhood high-order network with sparsity and modularity priors for MCI conversion prediction. However, these existing studies cannot explicitly employ the modular structure to guide the feature selection of brain networks to improve the diagnostic performance of early-stage dementia.
3. Materials and Methods
In this section, we first introduce the overall pipeline of FBN-based brain disease classification with the proposed MLFS method. As shown in Figure 2, this framework contains three major components, including (1) fMRI pre-processing and FBN construction; (2) feature selection based on MLFS; and (3) SVM-based classification.
Figure 2. Illustration of the proposed framework for brain disease classification, including three major parts: (1) image pre-processing and FBN construction; (2) feature selection based on MLFS; and (3) classification based on support vector machine (SVM).
3.1. Image Preprocessing and FBN Construction
In this paper, we evaluate our proposed method based on the dataset from the Alzheimer's Disease Neuroimaging Initiative (ADNI)1, which is used in a recent study (Wang et al., 2019b). The dataset contains 563 resting-state fMRI scans from 174 subjects, including 154 normal control (NC) cases, 165 early MCI (eMCI) cases, 145 late MCI (lMCI) cases, and 99 AD cases. Note that each participant may have more than one scan (with the time interval of at least 6 months between two scans). For independent evaluation, a subject-level cross-validation strategy will be used in our experiments. The scanning parameters of fMRI data are as follows: in-plane image resolution = 2.29 ~ 3.31 mm, slice thickness = 3.31 mm, echo time (TE) = 30 ms, repetition time (TR) = 2.2 ~ 3.1 s, and the scanning time for each subject is 7 min (resulting in 140 volumes). The demographic information of the studied subjects is summarized in Table 1.
We process the rs-fMRI scans involved in this study by using a standard pipeline in the FSL FEAT software (Jenkinson et al., 2012). To ensure signal stabilization, the first three volumes of each subject are discarded. The remaining volumes are corrected for differences in slice acquisition time and for head motion. Specifically, subjects with a maximal head-motion translation larger than 2.0 mm or a maximal rotation larger than 2° are excluded. Besides, structural skull stripping is performed based on T1-weighted MRI. Then, the skull-stripped images are aligned onto the Montreal Neurological Institute (MNI) space. After all subjects are registered to the common “standard” space, band-pass filtering is performed within a frequency interval of [0.015, 0.15] Hz. Next, nuisance signals, including white matter, cerebrospinal fluid, and motion parameters, are regressed out. Then, the fMRI data are further spatially smoothed by a Gaussian kernel with full-width-at-half-maximum (FWHM) of 6 mm. Note that we do not perform scrubbing, since this would introduce additional artifacts. Finally, the brain space of fMRI scans is partitioned into 116 pre-defined regions-of-interest (ROIs) using the Automated Anatomical Labeling (AAL) template (Tzourio-Mazoyer et al., 2002) via a deformable registration method (Vercauteren et al., 2009). The BOLD signals from the gray matter tissue are extracted, and the mean time series of each ROI is calculated.
After image preprocessing, we use the pairwise Pearson's correlation (PC) of the extracted BOLD signals to measure the functional connectivity between each pair of ROIs. As a result, we can obtain the estimated FBN for each subject, where each node corresponds to a specific ROI and each edge weight denotes the Pearson's correlation coefficient between BOLD signals associated with a pair of ROIs. Also, we apply Fisher's r-to-z transformation to normalize the edge weights in each FBN. Note that each FBN is a signed graph, where positive edge weights may indicate mutual promotion and negative edge weights may indicate mutual inhibition (Parente et al., 2018).
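The FBN construction step above can be sketched as follows. This is a minimal illustration rather than the pipeline actually used in the study: the function name `build_fbn`, the random stand-in time series, the zeroing of the diagonal, and the clipping constant are all assumptions introduced here.

```python
import numpy as np

def build_fbn(ts):
    """Estimate a signed FBN from ROI time series.

    ts: array of shape (T, m) holding T time points for m ROIs.
    Returns an m x m matrix of Fisher z-transformed Pearson correlations.
    """
    r = np.corrcoef(ts, rowvar=False)        # m x m Pearson correlation matrix
    np.fill_diagonal(r, 0.0)                 # drop self-connections (assumption)
    r = np.clip(r, -0.999999, 0.999999)      # keep arctanh finite
    return np.arctanh(r)                     # Fisher r-to-z transformation

# Hypothetical input: 137 volumes (140 minus the first three) for 116 AAL ROIs.
rng = np.random.default_rng(0)
ts = rng.standard_normal((137, 116))
W = build_fbn(ts)
```

The resulting matrix is symmetric and contains both positive and negative weights, which is why a signed clustering method is needed in the next step.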
3.2. Modular-LASSO Feature Selection
In this section, we introduce the proposed modular-LASSO feature selection (MLFS) scheme for selecting features from the estimated FBNs. As shown in Figure 3, the MLFS contains three major parts: (1) modular structure extraction via a signed spectral clustering algorithm, (2) network rearrangement based on the extracted modular information, and (3) modular structure induced feature selection via group LASSO.
Figure 3. Illustration of the proposed MLFS framework that includes three major parts: (1) modular structure extraction based on signed spectral clustering, (2) adjacency matrix rearrangement based on the extracted modular structure, and (3) feature selection based on group LASSO.
3.2.1. Modular Structure Extraction
Nodes in an FBN tend to be organized with a modular structure, which means that nodes in the same module are densely connected with each other, and nodes of different modules are sparsely connected (Bechtel, 2003). In practice, one can employ spectral clustering algorithms to detect the modular structure in a network (Ng et al., 2002), but traditional spectral clustering methods require the adjacency matrix of a graph/network to be unsigned. Therefore, we cannot directly apply conventional spectral clustering algorithms to signed FBNs for modular structure discovery. To address this issue, a signed spectral clustering algorithm (Gallier, 2016) is used to search modular structures from signed FBNs in this work. Note that we only use the FBNs of normal controls to identify brain network modules, so as to make the identified modules more reasonable.
Denote m (m = 116 in this work) as the number of ROIs and K as the number of clusters (i.e., modules). An FBN is represented by an undirected weighted graph G(V, E, W), where V indicates the node set (i.e., ROIs), E indicates the edge set (i.e., functional connectivities between paired ROIs), and $W \in \mathbb{R}^{m \times m}$ is the graph adjacency matrix estimated by PC. For any i, j ∈ V (i, j = 1, ⋯ , m), $w_{ij}$ is the weight between a pair of nodes i and j. The signed degree of the node i is defined as follows:
$$d_i = \sum_{j=1}^{m} |w_{ij}|, \tag{1}$$
and the signed degree matrix $D \in \mathbb{R}^{m \times m}$ is defined as:
$$D = \mathrm{diag}(d_1, \cdots, d_m). \tag{2}$$
Accordingly, the signed normalized Laplacian L is defined as follows:
$$L = D^{-1/2} (D - W) D^{-1/2}. \tag{3}$$
Given a partition (A1, ⋯ , AK) of V (with K clusters), the signed normalized cut sNcut(A1, ⋯ , AK) (Gallier, 2016) is defined as follows:
$$\mathrm{sNcut}(A_1, \cdots, A_K) = \sum_{k=1}^{K} \frac{X_k^{T} (D - W) X_k}{X_k^{T} D X_k}, \tag{4}$$
where $X_k$ is an indicator vector for $A_k$ that encodes the partition, and each cluster is treated as a specific module. A relaxed version of minimizing the objective function in Equation (4) is equivalent to solving a generalized eigenvalue problem. The optimization algorithm for the spectral clustering of signed graphs (i.e., FBNs) is shown in Algorithm 1.
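To make the module-extraction idea concrete, the sketch below clusters a signed graph by embedding the nodes with the eigenvectors of a signed normalized Laplacian and then running k-means on the embedding. It is a simplified stand-in, not Algorithm 1: the k-means step with farthest-first initialization replaces the discretization procedure of Gallier (2016), and the function name and toy graph are assumptions made for illustration.

```python
import numpy as np

def signed_spectral_clustering(W, K, n_iter=100):
    """Cluster the nodes of a signed, symmetric graph into K modules."""
    W = np.asarray(W, dtype=float)
    d = np.abs(W).sum(axis=1)                       # signed degree: sum of |w_ij|
    s = 1.0 / np.sqrt(np.maximum(d, 1e-12))         # D^{-1/2} as a vector
    L = np.eye(len(W)) - s[:, None] * W * s[None, :]  # D^{-1/2} (D - W) D^{-1/2}
    L = (L + L.T) / 2.0                             # enforce exact symmetry
    _, vecs = np.linalg.eigh(L)                     # eigenvalues in ascending order
    U = vecs[:, :K]                                 # K smallest eigenvectors
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)

    # Deterministic farthest-first initialization of the K centers.
    centers = [U[0]]
    for _ in range(1, K):
        d2 = np.min([((U - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(U[d2.argmax()])
    centers = np.array(centers)

    # Plain Lloyd iterations on the spectral embedding.
    for _ in range(n_iter):
        dist = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            U[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(K)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Toy signed graph: two blocks of four nodes, positive within, negative between.
block = 0.8 * (np.ones((4, 4)) - np.eye(4))
W_toy = np.block([[block, -0.5 * np.ones((4, 4))],
                  [-0.5 * np.ones((4, 4)), block]])
labels = signed_spectral_clustering(W_toy, K=2)
```

On the toy graph the two blocks are recovered as two modules. In the setting described above, the input would instead be an FBN derived from normal controls only.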
3.2.2. Adjacency Matrix Rearrangement
Based on the modular structure identified by the signed spectral algorithm, we first rearrange the adjacency matrix W for each subject so that nodes belonging to the same module are adjacent to each other, as shown in Step (B) of Figure 3. We then reshape the rearranged adjacency matrix into an edge vector (removing the redundant part since the adjacency matrix is symmetric) to represent each subject. Finally, we pile up the edge vectors of all subjects into a data matrix (or design matrix) $X \in \mathbb{R}^{N \times d}$, where N is the number of subjects and d = dw + db represents the number of total edges (i.e., connectivities).
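The rearrangement and vectorization can be illustrated with a few lines of NumPy. This is only a sketch under our own naming (`rearrange_and_vectorize`, the toy matrix, and the toy labels are hypothetical); it sorts nodes by module label and keeps the strict upper triangle of the reordered matrix.

```python
import numpy as np

def rearrange_and_vectorize(W, labels):
    """Reorder nodes so that modules are contiguous, then flatten the edges."""
    order = np.argsort(labels, kind="stable")   # nodes grouped by module label
    W_sorted = W[np.ix_(order, order)]          # permute rows and columns together
    iu = np.triu_indices(len(W), k=1)           # strict upper triangle (no diagonal)
    return W_sorted, W_sorted[iu], order

# Toy example with four nodes and two modules.
A = np.arange(16, dtype=float).reshape(4, 4)
W_toy = (A + A.T) / 2.0                         # make the toy matrix symmetric
W_sorted, edge_vec, order = rearrange_and_vectorize(W_toy, np.array([1, 0, 1, 0]))
```

For m = 116 ROIs, the edge vector has 116 × 115 / 2 = 6,670 entries, and stacking one such vector per subject gives the N × d design matrix.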
This design matrix X consists of two parts: (1) an N × dw block that contains the dw within-module edges connecting nodes inside each of the K modules (with each module as a specific group), and (2) an N × db block that contains the db between-module edges connecting these K modules, which are divided into db groups (with each edge corresponding to an individual group). That is, these d-dimensional features can be divided into G = K + db groups. In this way, each subject can be represented by both the within-module edge-level features and the between-module edge-level features of its FBN.
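The grouping rule described above can be written down directly. The following sketch assigns a group index to every upper-triangular edge; the function name `edge_groups` and the assumption that module labels run from 0 to K - 1 are ours.

```python
import numpy as np

def edge_groups(labels):
    """Assign a group index to each upper-triangular edge.

    Within-module edges share the index of their module (0..K-1);
    every between-module edge forms its own singleton group (K, K+1, ...).
    """
    labels = np.asarray(labels)
    K = int(labels.max()) + 1                   # assumes labels are 0..K-1
    i, j = np.triu_indices(len(labels), k=1)
    within = labels[i] == labels[j]
    groups = np.empty(len(i), dtype=int)
    groups[within] = labels[i][within]          # one group per module
    d_b = int((~within).sum())
    groups[~within] = K + np.arange(d_b)        # one group per between-module edge
    return groups, int(within.sum()), d_b

# Toy example: five nodes, modules {0, 1, 2} and {3, 4}.
groups, d_w, d_b = edge_groups([0, 0, 0, 1, 1])
```

In the toy example there are d_w = 4 within-module edges, d_b = 6 between-module edges, and G = 2 + 6 = 8 groups. Note that a module with a single node contributes no within-module edge, so the number of non-empty groups can be smaller than K + d_b in that corner case.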
3.2.3. Modular Structure Induced Feature Selection
We further develop a modular structure induced feature selection method to select the most informative edge-level features from FBNs for AD-related disease identification based on the group LASSO algorithm (Jiang et al., 2019). As mentioned before, $X \in \mathbb{R}^{N \times d}$ is the new design matrix for N training samples, and its d features have been naturally divided into G groups. Denote $d_g$ as the number of elements in the gth (g = 1, ⋯ , G) group, and $y = [y_1, \cdots, y_N]^{T} \in \mathbb{R}^{N}$ as the response vector, where $y_i$ (i = 1, ⋯ , N) represents the class label of the ith subject. The proposed modularity-induced feature selection method can be formulated as
$$\min_{\omega} \; \frac{1}{2} \| y - X\omega \|_2^2 + \lambda \sum_{g=1}^{G} \sqrt{d_g} \, \| \omega_g \|_2, \tag{5}$$
where λ > 0 is the regularization parameter, and ω is the to-be-learned weight vector which is divided into G groups (with $\omega_g$ representing the coefficients corresponding to the gth group). The second term in Equation (5) can generate a sparse solution and encourage some groups of ω to be zeros, which helps us select those edge-level features with non-zero coefficients in ω. In this way, our extracted modular structure can be explicitly employed to help identify the most informative edges in FBNs. We use the SLEP toolbox2 to solve the optimization problem defined in Equation (5).
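For readers without access to SLEP, a group LASSO problem of this kind can be solved with a basic proximal gradient loop. The sketch below is not the SLEP solver; it assumes the standard objective (1/2)‖y − Xω‖² + λ Σ_g √d_g ‖ω_g‖₂, where the √d_g group weighting is an assumption on our part, and the function name and toy data are hypothetical.

```python
import numpy as np

def group_lasso(X, y, groups, lam, n_iter=500):
    """Proximal gradient (ISTA) for least squares with a group LASSO penalty."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 + 1e-12)   # 1 / Lipschitz constant
    ids = np.unique(groups)
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = w - step * (X.T @ (X @ w - y))             # gradient step on the loss
        for g in ids:                                  # block soft-thresholding
            idx = groups == g
            thr = step * lam * np.sqrt(idx.sum())
            norm = np.linalg.norm(z[idx])
            z[idx] = 0.0 if norm <= thr else (1.0 - thr / norm) * z[idx]
        w = z
    return w

# Toy design with orthonormal columns, so the solution has a closed form.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((60, 6)))
y_toy = Q @ np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
w_hat = group_lasso(Q, y_toy, groups=np.array([0, 0, 0, 1, 1, 1]), lam=0.5)
selected = np.flatnonzero(w_hat)                       # indices of kept features
```

In the toy problem the first group is shrunk to 0.5 per coefficient and the second group is set exactly to zero, which mirrors how whole modules or single between-module edges are kept or discarded.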
3.3. Classification
Based on the selected features, we use a linear SVM with the default parameter (i.e., C = 1) for AD/MCI identification and MCI conversion prediction due to the two following considerations.
(1) The main goal of our experiment is to verify the effectiveness of the proposed MLFS feature selection method. However, because every step in the classification pipeline influences the final results, it is difficult to conclude which step (FBN estimation, feature selection, or classifier) contributes more to the final accuracy. Therefore, we use the simplest and most popular classification method.
(2) It is challenging for some complicated deep learning methods, such as RCNN (Liang and Hu, 2015), BrainNetCNN (Kawahara et al., 2017), and GraphCNN (Defferrard et al., 2016), to tune hyper-parameters and train a good model without sufficient training samples (subjects). In practice, recent studies have shown that the classical machine learning algorithms tend to perform better than the deep neural networks (Dadi et al., 2019; Pervaiz et al., 2020).
4. Experiments
4.1. Competing Methods
In the experiments, we compare our proposed MLFS scheme with several traditional schemes for FBN-based classification. As shown in Figure 4, according to the different granularity, we first extract different commonly-used statistics of FBN as features, including global clustering coefficient, local clustering coefficient, and edge weights. Then, two popular feature selection algorithms, i.e., t-test and LASSO, are used to select discriminative features, followed by the SVM classifier. That is, the proposed MLFS is compared with five competing schemes, including (1) Global, (2) Node-t-test, (3) Node-LASSO, (4) Edge-t-test, and (5) Edge-LASSO. For a fair comparison, we employ the LIBSVM toolbox provided in Chang and Lin (2011) for SVM-based brain disease classification for all competing methods.
Figure 4. Different schemes for comparison. (1) The Global method extracts global clustering coefficients from FBNs as a one-dimensional vector. We extract local clustering coefficients from 116 nodes as a 116-dimensional vector and perform feature selection via (2) t-test (called Node-t-test) and (3) LASSO (called Node-LASSO). We extract 116 × 115/2 network edge weights from 116 nodes as a 6,670-dimensional vector, and perform feature selection via (4) t-test (called Edge-t-test) and (5) LASSO (called Edge-LASSO); (6) The proposed MLFS scheme for modularity-guided feature selection. For a fair comparison, the same SVM classifier is used for classification in all six methods.
4.2. Experimental Settings
Three classification tasks are performed to evaluate the performance of our proposed method and five competing methods, including (1) MCI conversion prediction (i.e., lMCI vs. eMCI classification), (2) eMCI vs. NC classification, and (3) AD vs. NC classification. Considering the fact that one subject may have multiple scans in the dataset, using scan-level cross-validation (CV) will cause potential bias in classification. Therefore, we employ a five-fold subject-level CV strategy to ensure that the training data and test data are independent. Specifically, we first divide the 174 subjects into five folds (with each fold containing roughly the same number of subjects). Then, we use four folds as training data to select features and train the classifier, and the remaining fold to validate classification performance.
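A subject-level split can be sketched as follows. The generator name `subject_level_folds` and the synthetic scan-to-subject assignment are assumptions for illustration; the point is that folds are formed over subject identifiers, so all scans of one subject fall on the same side of the split.

```python
import numpy as np

def subject_level_folds(subject_ids, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) over scans, splitting by subject."""
    subject_ids = np.asarray(subject_ids)
    rng = np.random.default_rng(seed)
    subjects = rng.permutation(np.unique(subject_ids))  # shuffle the subjects
    for fold_subjects in np.array_split(subjects, n_folds):
        test_mask = np.isin(subject_ids, fold_subjects)
        yield np.where(~test_mask)[0], np.where(test_mask)[0]

# Hypothetical assignment of 563 scans to 174 subjects.
scan_subject = np.arange(563) % 174
folds = list(subject_level_folds(scan_subject, n_folds=5, seed=0))
```

Repeating the procedure with different seeds gives the repeated random partitions used to average out partition effects.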
Besides, since the parameters involved in feature selection models may affect the number of selected features and the ultimate classification results, we conduct an inner five-fold CV on the training data to determine the optimal parameters for all competing methods, as shown in Figure 5 (1). For each parameter, we use 11 candidate values in [0.01, 0.1, 0.2, ⋯ , 0.9, 1]. Note that the optimal parameters may vary with different training sets. Therefore, we re-select features and re-train the classifier (also a linear SVM with C = 1) based on the current training set with the optimal parameters, as shown in Figure 5 (2). Finally, we classify the test samples using the selected features and the trained classifier. To avoid any bias introduced by random partition in CV, the process of data partition and five-fold CV is independently repeated 1,000 times, and the mean and standard deviation of classification results are reported. Besides, to assess statistical significance, we perform paired t-tests (with p < 0.05) on the results of the involved methods, and mark a result with “*” to denote that MLFS is significantly better than the five competing methods.
Figure 5. The mechanism of cross-validation in our experiment, including the Inner 5-CV to determine the optimal parameters and the outer 5-CV to get the classification results.
We evaluate the performance of different methods via four evaluation metrics, including (1) accuracy (ACC), which is the proportion of correctly classified samples among all samples, (2) sensitivity (SEN), which denotes the proportion of patients that are correctly classified, (3) specificity (SPE), which is the proportion of NCs that are correctly predicted, and (4) the area under the receiver operating characteristic (ROC) curve (AUC).
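The four metrics can be computed as in the following sketch, which assumes labels coded as +1 (patient) and -1 (control) and computes AUC with the pairwise rank formulation; the function name and the toy predictions are hypothetical.

```python
import numpy as np

def binary_metrics(y_true, y_pred, scores):
    """Return (ACC, SEN, SPE, AUC) for labels coded as +1 / -1."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    scores = np.asarray(scores, dtype=float)
    pos, neg = y_true == 1, y_true == -1
    acc = float(np.mean(y_pred == y_true))     # correct among all samples
    sen = float(np.mean(y_pred[pos] == 1))     # correct among patients
    spe = float(np.mean(y_pred[neg] == -1))    # correct among controls
    # AUC: fraction of (patient, control) pairs ranked correctly, ties count 0.5.
    diff = scores[pos][:, None] - scores[neg][None, :]
    auc = float(np.mean((diff > 0) + 0.5 * (diff == 0)))
    return acc, sen, spe, auc

acc, sen, spe, auc = binary_metrics(
    y_true=[1, 1, 1, -1, -1],
    y_pred=[1, 1, -1, -1, 1],
    scores=[0.9, 0.8, 0.3, 0.2, 0.4],
)
```

For this toy input, ACC = 3/5, SEN = 2/3, SPE = 1/2, and AUC = 5/6.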
4.3. Classification Results
Table 2 summarizes the results of six methods in three classification tasks, and Figure 6 plots the corresponding ROC curves. From Table 2 and Figure 6, we have the following interesting observations.
(1) The proposed MLFS method achieves significantly better performance than the five competing methods in all three classification tasks. Note that the five competing methods do not consider the modularity information in FBNs. These results imply that using modularity information to guide the feature selection (as we do in MLFS) helps boost the classification performance for AD and MCI.
(2) Regarding three different granularity features (i.e., global-level, node-level, and edge-level), we can see that the performance of the Global method (based on global feature) is the worst. Also, methods using edge-level features (i.e., Edge-t-test, Edge-LASSO) usually outperform two methods with node-level features (i.e., Node-t-test, Node-LASSO). The possible reason is that edge-level features may be able to capture more topological information of FBNs and tend to result in more stable performance.
(3) Regarding three feature selection algorithms, methods with LASSO generally achieve better performance than those with t-test in three tasks. This may be because the t-test only considers the category-level differences of features and does not fully consider the relationship between features and category labels.
(4) In the task of lMCI vs. eMCI classification, the six methods achieve worse performance when compared with the other two tasks (i.e., eMCI vs. NC and AD vs. NC classification). This implies that identifying late MCI subjects from early MCI subjects is very challenging, while identifying subjects with AD/eMCI from normal controls is relatively easier. The underlying reason is that the brain function degeneration in AD and late MCI subjects could be more serious than in the early stage of MCI and NC.
Table 2. Classification performance of six schemes in three classification tasks (mean ± standard deviation).
Figure 6. The ROC curves achieved by all six methods in three classification tasks: (1) lMCI vs. eMCI, (2) eMCI vs. NC, and (3) AD vs. NC.
5. Discussion
In this section, we first analyze the effect of several key hyperparameters in the proposed method, the impact of different node-level features on classification performance, and the effect of connection variations in FBN. We then visualize the most discriminative features (i.e., functional connections) and modules identified by our method in different classification tasks. We also present the limitations of this work as well as several future research directions.
5.1. Effect of Number of Modules
Previous studies have found that human FBNs have a hierarchical modular organization and have different numbers of modules in each hierarchy (He et al., 2009; Meunier et al., 2009a; Power et al., 2011; Rubinov and Sporns, 2011). In our proposed MLFS scheme, we extract a total of K modules by using a signed spectral clustering algorithm, and the number of modules would affect the selected features and further affect classification performance. In Figure 7, we show the accuracies achieved by our MLFS in three classification tasks with respect to different numbers of modules. It can be observed from Figure 7 that, for each specific task, the accuracy values achieved by MLFS slightly vary when using different numbers of modules. And the best results are achieved when using 16, 8, and 14 modules in the task of lMCI vs. eMCI, eMCI vs. NC, and AD vs. NC classification, respectively.
Figure 7. Classification accuracy achieved by the proposed method using different numbers of modules in three classification tasks.
5.2. Sensitivity to Model Parameters
In Equation (5), the parameter λ is involved in group LASSO, which may affect the number of selected features. With the optimal module numbers (i.e., 16 modules for lMCI vs. eMCI classification, 8 modules for eMCI vs. NC classification, and 14 modules for AD vs. NC classification), we calculate the classification accuracy of the proposed MLFS with different values of λ, with experimental results reported in Figure 8. As shown in Figure 8, the MLFS works well with overall stable performance in both tasks of eMCI vs. NC and AD vs. NC classification. In the task of lMCI vs. eMCI classification, the accuracy results slightly fluctuate with different values of λ. Thus, we propose to select the optimal parametric values via inner cross-validation on the training data.
Figure 8. Classification accuracy achieved by the proposed method using different values of λ in three classification tasks.
5.3. Effect of Different Node-Level Features
When representing FBNs, node-level features can specifically locate disease-related regions, so as to help us understand the pathological mechanism of brain disorders. However, different node-level statistics extracted from FBNs tend to capture different network properties. Therefore, it is essential to analyze the effect of different node-level statistics on the final classification results. In Figure 9, we calculate the classification accuracy of the node-t-test method and node-LASSO method with five different node statistics: (1) local clustering coefficient (LCC), (2) degree centrality (DC), (3) betweenness centrality (BC), (4) closeness centrality (CC), and (5) eigenvector centrality (EC). It can be observed that the performance of different node-level statistics may vary for different tasks or feature selection methods. The results based on DC and CC statistics are overall the best.
Figure 9. Classification accuracy achieved by the node-level methods using different node statistics in three tasks of (1) lMCI vs. eMCI, (2) eMCI vs. NC, and (3) AD vs. NC classification. LCC, local clustering coefficient; DC, degree centrality; BC, betweenness centrality; CC, closeness centrality; EC, eigenvector centrality.
5.4. Discriminative Connections and Brain Regions
With the empirically optimal module numbers (see Figure 7) and feature selection parameter (see Figure 8), we investigate which features are selected by the proposed MLFS scheme for AD-related disease classification. Since features selected in each fold of cross-validation could be different, we select those features that occur in all five folds as the most discriminative features for classification. Figure 10 shows the most discriminative connections selected by MLFS in three tasks. In Figure 10, the color of each arc is randomly assigned for better visualization, and the thickness of each arc represents the discriminating power of the corresponding connection (rather than the actual connectivity strength).
Figure 10. Most discriminative functional connections in three classification tasks: (1) lMCI vs. eMCI, (2) eMCI vs. NC, and (3) AD vs. NC classification.
In Figure 11, we visualize the modules identified by our method with the signed spectral clustering algorithm (see the 1st and 2nd rows) on the AAL template, and also visualize the most discriminative modules (see the 3rd row) based on the selected discriminative connections by our MLFS method. From this figure, we can observe that our identified discriminative modules contain several important brain regions, such as the middle temporal gyrus, hippocampus, parahippocampal gyrus, superior medial frontal gyrus, medial orbitofrontal gyrus, supramarginal gyrus, and the precuneus, which have been reported in previous AD-related studies (Zhou et al., 2008; Han et al., 2012; Liu et al., 2012). These results further support the reliability of our MLFS in identifying biomarkers for AD/MCI diagnosis.
Figure 11. Most discriminative modules identified by the signed spectral clustering algorithm (1st and 2nd rows) and our proposed MLFS method (3rd row) based on the selected discriminative connections in three tasks of (1) lMCI vs. eMCI, (2) eMCI vs. NC, and (3) AD vs. NC classification.
5.5. Effect of Connection Variations in FBN
Functional connectivity networks constructed via Pearson's correlation (PC) may be sensitive to noise. To investigate whether variations in the connections influence the proposed method, we conduct a group of experiments in which white Gaussian noise of varying degrees is added to the FBN estimated by PC, and present the results in Figure 12 (1). The classification results show only slight fluctuation when the noise degree (standard deviation) is below 0.1. However, the classification accuracy drops considerably as the noise degree increases further.
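The perturbation described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say whether the noise is symmetrized or whether the noisy weights are clipped to [-1, 1], so both choices here are ours, and the function name `add_connection_noise` and the toy signal dimensions (130 time points, 90 ROIs) are hypothetical.

```python
import numpy as np


def add_connection_noise(fbn, sigma, rng):
    """Add symmetric zero-mean Gaussian noise (std = sigma) to an FBN matrix."""
    n = fbn.shape[0]
    noise = rng.normal(0.0, sigma, size=(n, n))
    # Assumption: keep the strict upper triangle and mirror it so that the
    # noisy network stays symmetric and the diagonal receives no noise.
    upper = np.triu(noise, k=1)
    noisy = fbn + upper + upper.T
    # Assumption: clip to the valid correlation range [-1, 1].
    return np.clip(noisy, -1.0, 1.0)


rng = np.random.default_rng(0)
signals = rng.standard_normal((130, 90))   # hypothetical: 130 time points, 90 ROIs
fbn = np.corrcoef(signals, rowvar=False)   # Pearson's correlation FBN, shape (90, 90)
noisy = add_connection_noise(fbn, sigma=0.1, rng=rng)
```

Repeating the classification pipeline on `noisy` for a range of `sigma` values would give a robustness curve of the kind reported in Figure 12 (1).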
Figure 12. (1) Results achieved by the proposed method with varying degrees of FBN random noise in lMCI vs. eMCI classification. (2) Results achieved by the proposed MLFS-boot method, the MLFS methods, and other five competing methods in lMCI vs. eMCI classification.
To further investigate the robustness of our method, we use a standard bootstrapping process to create several training sets (each with the same size as the original training set). We then perform the training process on these pseudo-sets and build an ensemble of classifiers. Figure 12 (2) shows the results for lMCI vs. eMCI classification, involving the original MLFS method, MLFS with the bootstrapping process (called MLFS-boot), and the five competing methods. It can be observed from Figure 12 (2) that the proposed method outperforms the five competing methods. Notably, the MLFS-boot method achieves performance similar to that of the MLFS method, implying that the MLFS scheme is relatively robust.
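A bootstrap ensemble of this kind can be sketched as follows. To keep the example self-contained, a nearest-centroid classifier stands in for the full MLFS pipeline (feature selection followed by an SVM); the majority-vote rule, the number of ensemble members, and the toy data are likewise assumptions, since the text does not specify them.

```python
import numpy as np


def fit_centroids(X, y):
    """Stand-in base learner: one mean vector per class (labels 0 and 1)."""
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])


def predict_centroids(centroids, X):
    # Assign each sample to the nearest class centroid.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def bootstrap_ensemble(X, y, n_models, rng):
    """Train one model per bootstrap resample (same size as the training set)."""
    models = []
    n = len(y)
    while len(models) < n_models:
        idx = rng.integers(0, n, size=n)      # sample with replacement
        if len(np.unique(y[idx])) < 2:        # skip resamples missing a class
            continue
        models.append(fit_centroids(X[idx], y[idx]))
    return models


def predict_ensemble(models, X):
    # Majority vote over the ensemble members.
    votes = np.stack([predict_centroids(m, X) for m in models])
    return (votes.mean(axis=0) >= 0.5).astype(int)


rng = np.random.default_rng(0)
# Hypothetical toy data: two classes shifted apart along every feature.
X = np.vstack([rng.normal(-1.0, 1.0, (40, 5)), rng.normal(1.0, 1.0, (40, 5))])
y = np.array([0] * 40 + [1] * 40)
models = bootstrap_ensemble(X, y, n_models=11, rng=rng)
accuracy = (predict_ensemble(models, X) == y).mean()
```

An odd number of ensemble members avoids ties in the vote. In a real evaluation the ensemble would be scored on held-out test folds rather than on the training data used here.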
5.6. Effect of Different Network Construction Methods
In the previous experiments, we used only Pearson's correlation (PC) for estimating FBNs, since our main focus is on using the modularity information to select discriminative and interpretable features. To investigate how the proposed method is affected by the network construction method, we also use sparse inverse covariance (SIC) (Huang et al., 2010), a popular scheme for computing partial correlation, to estimate FBNs. Based on the FBNs estimated via SIC, we then conduct lMCI vs. eMCI classification and report the results of the proposed method and the five competing methods in Table 3.
Table 3. Classification performance of six schemes in MCI conversion prediction (i.e., lMCI vs. eMCI classification) using sparse inverse covariance to estimate the FBN (mean ± standard deviation).
From Table 3, we make several observations that are consistent with the previous experiments. First, the proposed MLFS method achieves the best performance in lMCI vs. eMCI classification, with a statistically significant improvement over the five competing methods. This suggests that the advantage of our method is not tied to a particular network estimation algorithm, at least for the two algorithms (PC and SIC) considered here. Second, the global method (based on global features) again performs worst. Third, the edge-based methods usually outperform the node-based methods. Finally, the methods with LASSO generally achieve better performance than those with the t-test.
Furthermore, from Tables 2 and 3, we can see that, with the same experimental settings, using SIC to estimate FBNs yields better classification performance than using PC. This implies that FBNs estimated by SIC may have several advantages. On the one hand, SIC can effectively reveal the partial correlation between brain regions. That is, the FBN estimated with SIC can factor out the contribution to the pairwise correlation that might be due to global or third-party effects. This may result in clearer modules in the FBN. On the other hand, SIC estimation imposes a “sparsity” constraint on the FBN, which is appropriate for modeling brain connectivity because many past studies based on anatomical brain databases have shown that the true brain network is sparse.
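The way partial correlation factors out third-party effects can be illustrated with a small simulation. The sketch below uses the plain (unregularized) inverse covariance matrix, not the L1-penalized SIC estimator of Huang et al. (2010), which would additionally shrink small entries to zero; the function name `partial_correlation` and the simulated signals are hypothetical.

```python
import numpy as np


def partial_correlation(X):
    """Partial correlations from the inverse covariance (precision) matrix."""
    precision = np.linalg.inv(np.cov(X, rowvar=False))
    d = np.sqrt(np.diag(precision))
    # Standard identity: rho_ij = -P_ij / sqrt(P_ii * P_jj) for i != j.
    pcorr = -precision / np.outer(d, d)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr


rng = np.random.default_rng(0)
z = rng.standard_normal(5000)                 # shared "third-party" signal
x = z + 0.5 * rng.standard_normal(5000)       # region driven by z
y = z + 0.5 * rng.standard_normal(5000)       # another region driven by z
X = np.column_stack([x, y, z])

pearson = np.corrcoef(X, rowvar=False)
partial = partial_correlation(X)
print(f"Pearson(x, y) = {pearson[0, 1]:.2f}, partial(x, y | z) = {partial[0, 1]:.2f}")
```

Here `x` and `y` are strongly correlated only because both follow `z`; once `z` is conditioned on, their partial correlation is close to zero, which is the effect the paragraph above attributes to SIC-based networks.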
5.7. Limitations and Future Work
There are several limitations in the current work. First, we perform modular structure search and feature selection in two separate steps, so the identified modular structures are not necessarily optimal for the subsequent classification task. In future work, we plan to explore a joint learning framework that performs modular structure search and feature selection together for FBN analysis. Second, only the ADNI dataset (with a limited number of fMRI scans) is used for performance evaluation in the current study. We will apply the proposed method to identify other types of brain disorders based on large-scale datasets such as ABCD (Bjork et al., 2017), ABIDE (Heinsfeld et al., 2018), and REST-meta-MDD (Yan et al., 2019). Third, when constructing functional brain networks, we ignore the temporal information in the time-series data. It would be interesting to employ data-driven methods (e.g., deep neural networks) to incorporate temporal dynamics into FBN construction (Wang et al., 2019b; Jie et al., 2020), which we also leave to future work.
6. Conclusion
In this paper, we propose a modularity-guided functional brain network (FBN) analysis method, namely MLFS, to identify discriminative and interpretable features from FBNs for automated AD/MCI classification. Specifically, we first identify the modular structure of the FBN with a signed spectral clustering algorithm and then select edge-level network features with a modularity-induced group LASSO method. Finally, we use the selected features to distinguish subjects at different stages of AD or MCI. Experimental results on 563 rs-fMRI scans from ADNI suggest the superiority of the proposed method in three classification tasks, compared with conventional methods for FBN-based brain disease diagnosis.
Data Availability Statement
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author/s.
Ethics Statement
The studies involving human participants were reviewed and approved by the open dataset of Alzheimer's Disease Neuroimaging Initiative. The patients/participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.
Author Contributions
YZ and LQ designed the study. YZ downloaded and analyzed the data, performed experiments, and drafted the manuscript. YZ, XJ, LQ, and ML revised the manuscript. All the authors read and approved the final manuscript.
Funding
YZ, XJ, and LQ were partly supported by National Natural Science Foundation of China (Nos. 61976110 and 11931008), Natural Science Foundation of Shandong Province (No. ZR2018MF020), and Taishan Scholar Program of Shandong Province.
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher's Note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Bechtel, W. (2003). “Modules, brain parts, and evolutionary psychology,” in Evolutionary Psychology (Boston, MA: Springer), 211–227. doi: 10.1007/978-1-4615-0267-8_10
Bijsterbosch, J., Smith, S. M., and Beckmann, C. F. (2017). An Introduction to Resting State fMRI Functional Connectivity. Oxford University Press.
Bjork, J. M., Straub, L. K., Provost, R. G., and Neale, M. C. (2017). The ABCD study of neurodevelopment: Identifying neurocircuit targets for prevention and treatment of adolescent substance abuse. Curr. Treat. Opt. Psychiatry 4, 196–209. doi: 10.1007/s40501-017-0108-y
Chang, C.-C., and Lin, C.-J. (2011). LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. 2, 1–27. doi: 10.1145/1961189.1961199
Dadi, K., Rahim, M., Abraham, A., Chyzhyk, D., Milham, M., Thirion, B., et al. (2019). Benchmarking functional connectome-based predictive models for resting-state fMRI. NeuroImage 192, 115–134. doi: 10.1016/j.neuroimage.2019.02.062
Defferrard, M., Bresson, X., and Vandergheynst, P. (2016). Convolutional neural networks on graphs with fast localized spectral filtering. Adv. Neural Inform. Process. Syst. 29, 3844–3852.
Feng, C., Jie, B., Ding, X., Zhang, D., and Liu, M. (2020). “Constructing high-order dynamic functional connectivity networks from resting-state fMRI for brain dementia identification,” in International Workshop on Machine Learning in Medical Imaging (Springer), 303–311. doi: 10.1007/978-3-030-59861-7_31
Fornito, A., Zalesky, A., and Breakspear, M. (2015). The connectomics of brain disorders. Nat. Rev. Neurosci. 16, 159–172. doi: 10.1038/nrn3901
Gallen, C. L., Baniqued, P. L., Chapman, S. B., Aslan, S., Keebler, M., Didehbani, N., et al. (2016). Modular brain network organization predicts response to cognitive training in older adults. PLoS ONE 11:e0169015. doi: 10.1371/journal.pone.0169015
Gallier, J. (2016). Spectral theory of unsigned and signed graphs. applications to graph clustering: a survey. arXiv preprint arXiv:1601.04692. doi: 10.13140/RG.2.1.5010.9606
Garcia, R., Paraiso, E. C., and Nievola, J. C. (2017). “Comparative study of dimensionality reduction methods using reliable features for multiple datasets obtained by rs-fMRI in ADHD prediction,” in Canadian Conference on Artificial Intelligence (Springer), 97–102. doi: 10.1007/978-3-319-57351-9_13
Hamilton, W. L. (2020). Graph representation learning. Synth. Lect. Artif. Intell. Mach. Learn. 14, 1–159. doi: 10.2200/S01045ED1V01Y202009AIM046
Han, S. D., Arfanakis, K., Fleischman, D. A., Leurgans, S. E., Tuminello, E. R., Edmonds, E. C., et al. (2012). Functional connectivity variations in mild cognitive impairment: associations with cognitive function. J. Int. Neuropsychol. Soc. 18:39. doi: 10.1017/S1355617711001299
He, Y., Byrge, L., and Kennedy, D. P. (2020). Nonreplication of functional connectivity differences in autism spectrum disorder across multiple sites and denoising strategies. Hum. Brain Mapp. 41, 1334–1350. doi: 10.1002/hbm.24879
He, Y., Wang, J., Wang, L., Chen, Z. J., Yan, C., Yang, H., et al. (2009). Uncovering intrinsic modular organization of spontaneous brain activity in humans. PLoS ONE 4:e5226. doi: 10.1371/journal.pone.0005226
Heinsfeld, A. S., Franco, A. R., Craddock, R. C., Buchweitz, A., and Meneguzzi, F. (2018). Identification of autism spectrum disorder using deep learning and the ABIDE dataset. NeuroImage Clin. 17, 16–23. doi: 10.1016/j.nicl.2017.08.017
Huang, S., Li, J., Sun, L., Ye, J., Fleisher, A., Wu, T., et al. (2010). Learning brain connectivity of Alzheimer's disease by sparse inverse covariance estimation. NeuroImage 50, 935–949. doi: 10.1016/j.neuroimage.2009.12.120
Jack C. R. Jr, Bernstein, M. A., Fox, N. C., Thompson, P., Alexander, G., Harvey, D., et al. (2008). The Alzheimer's disease neuroimaging initiative (ADNI): MRI methods. J. Magn. Reson. Imaging 27, 685–691. doi: 10.1002/jmri.21049
Jenkinson, M., Beckmann, C. F., Behrens, T. E., Woolrich, M. W., and Smith, S. M. (2012). FSL. NeuroImage 62, 782–790. doi: 10.1016/j.neuroimage.2011.09.015
Jiang, X., Zhang, L., Qiao, L., and Shen, D. (2019). Estimating functional connectivity networks via low-rank tensor approximation with applications to MCI identification. IEEE Trans. Biomed. Eng. 67, 1912–1920. doi: 10.1109/TBME.2019.2950712
Jie, B., Liu, M., Lian, C., Shi, F., and Shen, D. (2020). Designing weighted correlation kernels in convolutional neural networks for functional connectivity based brain disease diagnosis. Med. Image Anal. 63:101709. doi: 10.1016/j.media.2020.101709
Jie, B., Liu, M., and Shen, D. (2018a). Integration of temporal and spatial properties of dynamic connectivity networks for automatic diagnosis of brain disease. Med. Image Anal. 47, 81–94. doi: 10.1016/j.media.2018.03.013
Jie, B., Liu, M., Zhang, D., and Shen, D. (2018b). Sub-network kernels for measuring similarity of brain connectivity networks in disease diagnosis. IEEE Trans. Image Process. 27, 2340–2353. doi: 10.1109/TIP.2018.2799706
Jie, B., Wee, C.-Y., Shen, D., and Zhang, D. (2016). Hyper-connectivity of functional networks for brain disease diagnosis. Med. Image Anal. 32, 84–100. doi: 10.1016/j.media.2016.03.003
Kawahara, J., Brown, C. J., Miller, S. P., Booth, B. G., Chau, V., Grunau, R. E., et al. (2017). BrainnetCNN: Convolutional neural networks for brain networks; Towards predicting neurodevelopment. NeuroImage 146, 1038–1049. doi: 10.1016/j.neuroimage.2016.09.046
Kim, H.-C., Tegethoff, M., Meinlschmidt, G., Stalujanis, E., Belardi, A., Jo, S., et al. (2019). Mediation analysis of triple networks revealed functional feature of mindfulness from real-time fMRI neurofeedback. NeuroImage 195, 409–432. doi: 10.1016/j.neuroimage.2019.03.066
Liang, M., and Hu, X. (2015). “Recurrent convolutional neural network for object recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3367–3375. doi: 10.1109/CVPR.2015.7298958
Liu, F., Guo, W., Fouche, J.-P., Wang, Y., Wang, W., Ding, J., et al. (2015). Multivariate classification of social anxiety disorder using whole brain functional connectivity. Brain Struct. Funct. 220, 101–115. doi: 10.1007/s00429-013-0641-4
Liu, M., Zhang, D., Chen, S., and Xue, H. (2015). Joint binary classifier learning for ecoc-based multi-class classification. IEEE Trans. Pattern Anal. Mach. Intell. 38, 2335–2341. doi: 10.1109/TPAMI.2015.2430325
Liu, Z., Zhang, Y., Yan, H., Bai, L., Dai, R., Wei, W., et al. (2012). Altered topological patterns of brain networks in mild cognitive impairment and Alzheimer's disease: a resting-state fMRI study. Psychiatry Res. 202, 118–125. doi: 10.1016/j.pscychresns.2012.03.002
Meunier, D., Achard, S., Morcom, A., and Bullmore, E. (2009a). Age-related changes in modular organization of human brain functional networks. NeuroImage 44, 715–723. doi: 10.1016/j.neuroimage.2008.09.062
Meunier, D., Lambiotte, R., Fornito, A., Ersche, K., and Bullmore, E. T. (2009b). Hierarchical modularity in human brain functional networks. Front. Neuroinformatics 3:37. doi: 10.3389/neuro.11.037.2009
Ng, A. Y., Jordan, M. I., and Weiss, Y. (2002). “On spectral clustering: analysis and an algorithm,” in Advances in Neural Information Processing Systems, 849–856.
Parente, F., Frascarelli, M., Mirigliani, A., Di Fabio, F., Biondi, M., and Colosimo, A. (2018). Negative functional brain networks. Brain Imag. Behav. 12, 467–476. doi: 10.1007/s11682-017-9715-x
Pervaiz, U., Vidaurre, D., Woolrich, M. W., and Smith, S. M. (2020). Optimising network modelling methods for fMRI. NeuroImage 211:116604. doi: 10.1016/j.neuroimage.2020.116604
Power, J. D., Cohen, A. L., Nelson, S. M., Wig, G. S., Barnes, K. A., Church, J. A., et al. (2011). Functional network organization of the human brain. Neuron 72, 665–678. doi: 10.1016/j.neuron.2011.09.006
Qiao, L., Zhang, H., Kim, M., Teng, S., Zhang, L., and Shen, D. (2016). Estimating functional brain networks by incorporating a modularity prior. NeuroImage 141, 399–407. doi: 10.1016/j.neuroimage.2016.07.058
Qiao, L., Zhang, L., Sun, Z., and Liu, X. (2017). Selecting label-dependent features for multi-label classification. Neurocomputing 259, 112–118. doi: 10.1016/j.neucom.2016.08.122
Qiu, Y., Lin, Q.-H., Kuang, L.-D., Gong, X.-F., Cong, F., Wang, Y.-P., et al. (2019). Spatial source phase: a new feature for identifying spatial differences based on complex-valued resting-state fMRI data. Hum. Brain Mapp. 40, 2662–2676. doi: 10.1002/hbm.24551
Rubinov, M., and Sporns, O. (2011). Weight-conserving characterization of complex functional brain networks. NeuroImage 56, 2068–2079. doi: 10.1016/j.neuroimage.2011.03.069
Shen, X., Papademetris, X., and Constable, R. T. (2010). Graph-theory based parcellation of functional subunits in the brain from resting-state fMRI data. NeuroImage 50, 1027–1035. doi: 10.1016/j.neuroimage.2009.12.119
Stam, C. J. (2014). Modern network science of neurological disorders. Nat. Rev. Neurosci. 15, 683–695. doi: 10.1038/nrn3801
Sun, L., Xue, Y., Zhang, Y., Qiao, L., Zhang, L., and Liu, M. (2021). Estimating sparse functional connectivity networks via hyperparameter-free learning model. Artif. Intell. Med. 111:102004. doi: 10.1016/j.artmed.2020.102004
Tzourio-Mazoyer, N., Landeau, B., Papathanassiou, D., Crivello, F., Etard, O., Delcroix, N., et al. (2002). Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. NeuroImage 15, 273–289. doi: 10.1006/nimg.2001.0978
Valencia, M., Pastor, M., Fernández-Seara, M., Artieda, J., Martinerie, J., and Chavez, M. (2009). Complex modular structure of large-scale brain networks. Chaos 19:023119. doi: 10.1063/1.3129783
Vercauteren, T., Pennec, X., Perchant, A., and Ayache, N. (2009). Diffeomorphic demons: efficient non-parametric image registration. NeuroImage 45, S61–S72. doi: 10.1016/j.neuroimage.2008.10.040
Wang, M., Huang, J., Liu, M., and Zhang, D. (2019a). Functional connectivity network analysis with discriminative hub detection for brain disease identification. Proc. AAAI Conf. Artif. Intell. 33, 1198–1205. doi: 10.1609/aaai.v33i01.33011198
Wang, M., Lian, C., Yao, D., Zhang, D., Liu, M., and Shen, D. (2019b). Spatial-temporal dependency modeling and network hub detection for functional MRI analysis via convolutional-recurrent network. IEEE Trans. Biomed. Eng. 67, 2241–2252. doi: 10.1109/TBME.2019.2957921
Wang, M., Zhang, D., Huang, J., Yap, P.-T., Shen, D., and Liu, M. (2019c). Identifying autism spectrum disorder with multi-site fMRI via low-rank domain adaptation. IEEE Trans. Med. Imaging 39, 644–655. doi: 10.1109/TMI.2019.2933160
Wee, C.-Y., Yap, P.-T., Zhang, D., Denny, K., Browndyke, J. N., Potter, G. G., et al. (2012). Identification of MCI individuals using structural and functional connectivity networks. NeuroImage 59, 2045–2056. doi: 10.1016/j.neuroimage.2011.10.015
Wen, X., Zhang, H., Li, G., Liu, M., Yin, W., Lin, W., et al. (2019). First-year development of modules and hubs in infant brain functional networks. NeuroImage 185, 222–235. doi: 10.1016/j.neuroimage.2018.10.019
Xue, Y., Zhang, L., Qiao, L., and Shen, D. (2020). Estimating sparse functional brain networks with spatial constraints for MCI identification. PLoS ONE 15:e0235039. doi: 10.1371/journal.pone.0235039
Yan, C.-G., Chen, X., Li, L., Castellanos, F. X., Bai, T.-J., Bo, Q.-J., et al. (2019). Reduced default mode network functional connectivity in patients with recurrent major depressive disorder. Proc. Natl. Acad. Sci. U.S.A. 116, 9078–9083. doi: 10.1073/pnas.1900390116
Zhang, L., Wang, M., Liu, M., and Zhang, D. (2020). A survey on deep learning for neuroimaging-based brain disorder analysis. Front. Neurosci. 14:779. doi: 10.3389/fnins.2020.00779
Zhou, Y., Dougherty, J. H. Jr, Hubner, K. F., Bai, B., Cannon, R. L., and Hutson, R. K. (2008). Abnormal connectivity in the posterior cingulate and hippocampus in early Alzheimer's disease and mild cognitive impairment. Alzheimer's Dement. 4, 265–270. doi: 10.1016/j.jalz.2008.04.006
Keywords: functional brain network, modularity, feature selection, signed spectral clustering, classification
Citation: Zhang Y, Jiang X, Qiao L and Liu M (2021) Modularity-Guided Functional Brain Network Analysis for Early-Stage Dementia Identification. Front. Neurosci. 15:720909. doi: 10.3389/fnins.2021.720909
Received: 05 June 2021; Accepted: 09 July 2021; Published: 05 August 2021.
Edited by:
Jun Shi, Shanghai University, China
Reviewed by:
Pingkun Yan, Rensselaer Polytechnic Institute, United States
Kuangyu Shi, University of Bern, Switzerland
Copyright © 2021 Zhang, Jiang, Qiao and Liu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Lishan Qiao, qlishan@163.com; Mingxia Liu, mxliu1226@gmail.com