Identification and discovery of imaging genetic patterns using fusion self-expressive network in major depressive disorder

Pang, Mengqian; Liu, Xiaoyun; Hao, Xiaoke; Wang, Meiling; Xie, Chunming; Zhang, Li; Yuan, Yonggui

doi:10.3389/fnins.2023.1297155

ORIGINAL RESEARCH article

Front. Neurosci., 21 November 2023

Sec. Brain Imaging Methods

Volume 17 - 2023 | https://doi.org/10.3389/fnins.2023.1297155

Identification and discovery of imaging genetic patterns using fusion self-expressive network in major depressive disorder

$\r\nMengqian Pang$ Mengqian Pang¹

Xiaoyun Liu²^*

Xiaoke Hao³

Meiling Wang⁴

Chunming Xie⁵

Li Zhang¹^*

Yonggui Yuan²^*

¹College of Information Science and Technology, Nanjing Forestry University, Nanjing, China
²Department of Psychosomatic and Psychiatry, Zhongda Hospital, School of Medicine, Southeast University, Nanjing, China
³School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
⁴School of Computer Science and Engineering, Nanjing University of Posts and Telecommunications, Nanjing, China
⁵Department of Neurology, Zhongda Hospital, School of Medicine, Southeast University, Nanjing, China

Introduction: Major depressive disorder (MDD) is a prevalent mental illness, with severe symptoms that can significantly impair daily routines, social interactions, and professional pursuits. Recently, imaging genetics has received considerable attention for understanding the pathogenesis of human brain disorders. However, identifying and discovering the imaging genetic patterns between genetic variations, such as single nucleotide polymorphisms (SNPs), and brain imaging data still present an arduous challenge. Most of the existing MDD research focuses on single-modality brain imaging data and neglects the complex structure of brain imaging data.

Methods: In this study, we present a novel association analysis model based on a self-expressive network to identify and discover imaging genetics patterns between SNPs and multi-modality imaging data. Specifically, we first build the multi-modality phenotype network, which comprises voxel node features and connectivity edge features from structural magnetic resonance imaging (sMRI) and resting-state functional magnetic resonance imaging (rs-fMRI), respectively. Then, we apply intra-class similarity information to construct self-expressive networks of multi-modality phenotype features via sparse representation. Subsequently, we design a fusion method guided by diagnosis information, which iteratively fuses the self-expressive networks of multi-modality phenotype features into a single new network. Finally, we propose an association analysis between MDD risk SNPs and the multi-modality phenotype network based on a fusion self-expressive network.

Results: Experimental results show that our method not only enhances the association between MDD risk SNP rs1799913 and the multi-modality phenotype network but also identifies some consistent and stable regions of interest (ROIs) multi-modality biological markers to guide the interpretation of MDD pathogenesis. Moreover, 15 new potential risk SNPs highly associated with MDD are discovered, which can further help interpret the MDD genetic mechanism.

Discussion: In this study, we discussed the discriminant and convergence performance of the fusion self-expressive network, parameters, and atlas selection.

1 Introduction

Major depressive disorder (MDD) is one of the most prevalent mental health disorders globally. It is characterized by continuous sadness, anhedonia, and changes in appetite or weight. Additionally, individuals with MDD may experience sleep disruptions, feelings of worthlessness, and thoughts of suicide. These symptoms can severely disrupt an individual's routine life and functioning (Abdoli et al., 2022). Due to its elevated incidence and recurrence rates, MDD is anticipated to become the most burdensome disease globally by 2030 (Zhang Y. et al., 2021). Nevertheless, the specific etiology of MDD remains elusive, with a non-unified pathogenic mechanism. Currently, the diagnosis of depression mainly depends on clinical symptoms and the scoring of the Hamilton Depression Rating Scale (Kennedy, 2022), which is not enough objective.

With the rapid development of imaging genetics, such as magnetic resonance imaging (MRI), functional magnetic resonance Imaging (fMRI), diffusion tensor imaging (DTI), and positron emission tomography (PET) are extensively employed for predicting and diagnosing brain and neurological disorders (Zhang et al., 2020; Li et al., 2023). Owing to its convenience and potential in assessing brain circuits, rs-fMRI become pivotal in MDD research (Liu et al., 2020b). Literature (Bondi et al., 2022; Dai et al., 2022) suggested that using machine learning with rs-fMRI could effectively diagnose MDD and pinpoint significant features. Moreover, combining functional connectivity (FC) with network attributes helped in distinguishing between MDDs and healthy controls (HCs). Despite their promise to identify dynamic FC changes, these methods often rely solely on a single imaging modality, limiting their capacity to find consistent biomarkers across multiple imaging data.

In recent years, imaging genetics provided vast opportunities for examining the influence of genetic variations on brain structure and function. Building on this concept, the primary objective in this field was to measure the association between genetic variations, such as SNPs, and neuroimaging biomarkers derived from various imaging modes. These insights deepened our understanding of the intricate pathogenesis of diseases (Hao et al., 2016). Given that the Alzheimer's Disease Neuroimaging Initiative (ADNI) database encompassed multi-modality imaging and genotype data, a significant portion of brain imaging genetic research was focused on Alzheimer's disease (Windon et al., 2022). However, few studies employed machine learning techniques to explore neurogenetic correlations specific to MDD. Zhang et al. (2023) proposed a methodology that integrates multi-stage diagnosis status to unearth associations between genetic risk variants and the multi-modality phenotype network in MDD. This approach successfully identified the risk SNP THP1 rs1799913 as being highly associated with MDD. Nevertheless, it overlooked the intricate spatial structure of multi-modality data, suggesting a potential for further refinement.

To address this, we introduced a self-expressive network (Elhamifar and Vidal, 2013; Ji et al., 2017). To theorize, each data point is distributed within a union of subspaces, and these subspaces can effectively represent the linear or affine combinations of samples in the dataset that belong to the same subspace. These subspaces were determined by clustering samples, where similar samples clustered in the same subspace, and dissimilar samples were distributed in different subspaces. First, we proposed a fusion self-expressive network that leveraged both participant diagnosis information and local network structures for correlation investigations. Then, we developed an imaging-genetics analysis framework, calculating relationships among subjects within the same class and constructing a self-expressive network of imaging-genetics data through sparse reconstruction. Finally, by incorporating participant diagnosis information, different self-expressive networks were iteratively merged using a fusion approach, forming a network capable of representing the underlying data structure. Our method was validated using the MDD genetic risk SNP TPH1 rs1799913. Experimental results indicated that our method not only improved the performance of correlation coefficient (CC) metrics but also identified a concise set of common ROIs shared between two brain network features closely associated with the MDD genetic risk SNP TPH1 rs1799913. Furthermore, we successfully identified 15 new potential risk SNPs associated with MDD, which also identified several MDD-related ROIs.

The contributions of this paper are as follows:

1. To effectively extract similar structures within the data, we introduced a self-expressive network for self-reconstruction of the original data.

2. We proposed a fusion self-expressive network that incorporated participant diagnosis information. This approach allowed for iterative fusion, resulting in a unified network that captured the comprehensive structure of the underlying data.

3. An association model based on the fusion self-expressive network was developed to mine the risk SNPs and multi-modality brain phenotype network. Using the risk SNP rs1799913 to validate our method, experimental results demonstrated its superiority over existing algorithms.

4. We also identified 15 new potential risk SNPs associated with MDD, providing a foundation for researchers to further investigate the mechanisms underlying MDD pathogenesis.

The subsequent sections of this paper are structured as follows: Section 2 presents the demographic statistics of participants and the pre-processing of multi-modality data. In Section 3, the proposed methodology for the identification of risk-associated SNPs and the construction of the multi-modal phenotype network is introduced. The introduction and analysis of experimental results are found in Section 4. Lastly, discussions and conclusions are respectively presented in Sections 5 and 6.

2 Data source and preprocessing

2.1 Participants

This study utilized datasets obtained from two hospitals, namely the Affiliated Zhongda Hospital of Southeast University and the Second Affiliated Hospital of Xinxiang Medical University. Patients were recruited from the inpatient and outpatient departments of psychiatry in these hospitals, while HCs were recruited through media advertising and community posting. The research procedures followed the guidelines outlined in the Declaration of Helsinki. All patients provided informed consent and met the following inclusion criteria: (1) met the diagnostic criteria outlined in the Diagnostic and Statistical Manual of Mental Disorders (Fourth Edition); (2) experienced their first depressive episode and were above 18 years old at onset; (3) obtained Hamilton Depression Scale-24 (HAMD-24) scores ≥20; (4) had no history of major psychiatric illness other than depression; (5) had no primary neurodegenerative disorders such as dementia or stroke; (6) had no history of substance abuse or dependence (including drugs, caffeine, nicotine, alcohol, or others), head trauma, or loss of consciousness; (7) had no cardiac or pulmonary diseases that could influence the MRI scan. The HC subjects met criteria (4) to (7) and were required to have a HAMD-24 score ≤ 8.

After excluding low-quality images affected by head motion or ghost intensity artifacts, this study included a total of 26 HCs and 45 patients with MDD from the Affiliated ZhongDa Hospital of Southeast University, as well as 38 HCs and 62 MDD patients from the Second Affiliated Hospital of XinXiang Medical University. The severity of depression was assessed using the Hamilton Depression Scale-24 (HAMD-24) scores. Among the 107 MDD patients included in the study (with HAMD-24 scores ≥20), they were further categorized into two subgroups based on their HAMD-24 scores: moderate depression (MD) for scores ranging from 20 to 34, and severe depression (SD) for scores ≥35 (Tolentino and Schmidt, 2018). The demographic characteristics of the participants are presented in Table 1.

TABLE 1

Table 1. Demographic statics of subjects.

2.2 Magnetic resonance imaging data acquisition and preprocessing

All participants underwent baseline magnetic resonance imaging (MRI) scans using a 3.0 T Siemens scanner (Siemens, Erlangen, Germany) with a 12-channel head coil. Pads were used to immobilize the heads of all subjects and minimize head movements. High-resolution 3D T1-weighted scans were acquired using a magnetization-prepared fast gradient echo (MPRAGE) sequence with the following parameters: repetition time (TR) = 1,900 ms, echo time (TE) = 2.48 ms, flip angle (FA) = 9°, acquisition matrix = 256 times 256, field of view (FOV) = 250 × 250 m², slice thickness = 1.0 mm, no gap between slices, scan time = 4 min 18 s, and a total of 176 volumes.

For the rs-fMRI scans, the following parameters were used: TR = 2,000 ms, TE = 25 ms, FA = 90°, acquisition matrix = 64 × 64, FOV = 240 × 240 m², slice thickness = 3.0 mm, no gap between slices, axial slice orientation with 36 slices, a total of 240 volumes, in-plane resolution parallel to the anterior-posterior conjunction = 3.75 × 3.75 m², and an acquisition time of 8 min. During the scans, subjects were instructed to lie on their backs with their hands naturally resting on their sides. Head movement was minimized by using pads, and ear plugs were provided to reduce scanner noise. Subjects were asked to relax their bodies, keep their eyes open, stay awake, and avoid focusing on any specific thoughts to prevent falling asleep. Image quality was checked immediately after scanning, and repeat scans were performed if necessary.

For quality control, two experienced radiologists examined all image data. The rs-fMRI images were preprocessed using the Resting State Functional Data Processing Assistant (DPARSF 2.3 Advanced) MRI toolkit, which integrates the Resting State Functional MRI Toolkit (REST) and the Statistical Parametric Mapping Package (SPM) programs (Yan and Zang, 2010). The first 10 time points were excluded to ensure stable longitudinal magnetization and accommodate scanner noise. The remaining 230 images underwent sequential processing steps: (1) correction for time differences and head motion using the 36th slice as the reference slice for slice time correction (participants with maximum head motion displacement >1.5 mm in any direction or angular motion >1.5° were excluded); (2) co-alignment of the T1-weighted image with the functional image and subsequent reorientation; (3) spatial normalization, where the T1-weighted anatomical images were segmented into white matter, gray matter, and cerebrospinal fluid, and then normalized to the Montreal Neurological Institute (MNI) space using a uniform segmentation algorithm. The transform parameters from the segmentation were applied to the functional images, which were resampled with 3 mm isotropic voxels; (4) spatial smoothing with a 4 mm full-width at half-peak (FWHM) isotropic Gaussian kernel; removal of linear trends within each voxel time series; regression of interference signals (white matter, cerebrospinal fluid signals, and head motion parameters calculated using rigid body six corrections); removal of spiked regression volumes; and application of a temporal bandpass filter (0.01–0.08 Hz) to reduce low-frequency drift and filter out high-frequency noise.

2.3 SNP genotype data sequencing and processing

DNA genotyping was conducted by Tianhao Biotechnology (Shanghai, China), following the standard protocol for DNA extraction from blood samples. Illumina Next sequencing and array technology (Illumina Inc., San Diego, CA, USA) were employed to determine the genotypes of single nucleotide polymorphisms (SNPs) in the genes of interest. Subsequently, we utilized the PLINK (v1.9) software to perform the Hardy-Weinberg equilibrium (HWE) test, assess linkage disequilibrium statistics, and calculate allele and genotype frequencies (Purcell et al., 2007). After excluding missing or erroneous values, a total of 5,897 SNPs were retained for further analysis in this study.

Genetic risk variants can help researchers understand the biological mechanisms of related diseases and provide valid hypotheses for drug design. In this study, we focused on thoroughly exploring the relationship between a given risk SNP and quantitative traits at the level of brain structure and function. Some studies have implicated a large number of genes associated with depression, including HTR2A, CACNA1C, BDNF, CRHR1, GSK3β, TPH1, among others, as detailed in a systematic review (Flint and Kendler, 2014). However, unlike the well-known risk SNP APOE rs429358 in Alzheimer's disease, there was no consistent genetic hypothesis for the pathogenesis of MDD. Tryptophan hydroxylase (TPH) was the rate-limiting enzyme in the biosynthesis of serotonin (5-HT). Haplotype analysis suggested a connection between TPH-1 and MDD. In this study (Zhang et al., 2023), TPH1 rs1799913 was found to have a strong correlation with MDD. Gizatullin et al. (2006) discovered six SNPs that were in linkage disequilibrium in both the patient and control groups. However, only one SNP (rs1799913) was significantly associated with MDD in their single-marker association analysis. This SNP was also referenced as SNP A779C in other research. It was linked to the cerebrospinal fluid (CSF) 5-hydroxyindoleacetic acid (5-HIAA) concentration, and multiple studies had indicated its association with suicidal behavior (Jarienė et al., 2018; Nielsen et al., 2020).

Therefore, we first used TPH1 rs1799913 to validate our proposed model. The values of TPH1 rs1799913 were encoded additively as 0, 1, 2, where the alleles were divided into major and minor alleles based on genotype frequencies, with the major allele encoded as “0,” the minor allele as “2” and the heterozygotes as “1” (Purcell et al., 2007). After validation, the remaining 5,896 SNPs were encoded using the same method and subsequently analyzed to determine their correlations with MDD.

3 Method

To simultaneously focus on brain structural imaging and functional connectivity information between different brain regions while considering the complex internal data structure, we proposed the FSN-MM model by diagnosis information to mine the relationship between MDD risk SNPs and the multi-modality phenotype network. The overview of our proposed model is depicted in Figure 1. We utilized the Automated Anatomical Labeling (AAL) atlas to extract voxel node features from sMRI data and network connectivity edge features from rs-fMRI data, respectively. Subsequently, we employed intra-class similarity information to construct novel self-expressive networks for the multi-modality phenotype by sparse representation. These self-expressive networks were then iteratively merged into a single network guided by diagnosis information. Leveraging this fusion self-expressive network model, we obtained a new restructured multi-modality phenotype network and applied a multi-modality association model to investigate the relationship between MDD risk SNPs and the multi-modality phenotype network.

FIGURE 1

Figure 1. The overview of FSN-MM model.

3.1 Extracting the features of multi-modality phenotype network

The node and edge features of each subject were integrated to form a multi-modality brain phenotype network. The node features were extracted from sMRI data, while the edge features were extracted from rs-fMRI data.

After preprocessing the sMRI data, voxel-based morphometry (VBM) analysis was conducted on each subject. This analysis utilized normalized gray matter density masks with voxel dimensions of 2 × 2 × 2 mm³ in the MNI space. The resulting VBM outputs were registered to the respective patient scans for each participant, and the mean gray matter volume (GMV) measurements were extracted for 116 ROIs based on the AAL template. Each ROI was considered as an individual node within the multi-modality brain network, enabling the derivation of a set of node features for each subject.

A functional connectivity network was established for each individual using rs-fMRI data, where nodes represented predetermined ROIs and edges signified the functional connections between these ROIs. The average time series of each ROI, normalized to have zero mean and unit variance, were extracted based on the AAL template. The functional connectivity networks were derived using the Pearson correlation coefficient, which captured the correlation between BOLD signals of paired ROIs (Li et al., 2017). Graph theory-based methods played a crucial role in the analysis of human brain disorders due to the abnormal topological properties of brain networks in psychiatric disorders. One of the most commonly used methods was clustering coefficients, which reflected the local clustering properties of the brain network (Liu J. et al., 2020; Li et al., 2022). The clustering coefficient (CC) is a threshold-free metric used to measure the degree of node aggregation in network graph theory and is one of the most commonly used methods. The CC for each ROI is determined by the functional connectivity between that ROI and other ROIs.

3.2 Self-expressive network

The concept of self-expressiveness postulates that each data point is distributed across multiple subspaces, and each subspace can be accurately represented by a linear or affine combination of samples from the same subspace. Within the multi-modality phenotype network, a complex structure is formed, encompassing several subspaces. Our methodology leverages this inherent self-expressiveness property, which captures the similarity structure of the data, to reconstruct the original input before performing correlation analysis. By introducing a self-expressive network for self-reconstruction, we can comprehensively describe the similarities present within the data. To construct such a self-expressive network, we utilize the self-expressiveness property to create a subject-subject similarity relationship matrix. This matrix serves as an equivalent representation of the self-expressive network, where the weighted edges signify pairwise similarity relationships.

In a self-expressive network SN = {V, E, W}, V represents a collection of nodes where each node corresponds to a subject. E represents a set of edges, and W represents a weight matrix for the edges. Specifically, given a set of data points {_{x_i}i = 1, ⋯, N} derived from multiple linear subspaces {_{R_i}i = 1, ⋯, K}, it is feasible to express a point in a specific subspace as a linear combination of other points in the same subspace. For the data matrix X, the property of self-expressiveness can be represented as X = W^xX, with X = W^xX being the self-expressive coefficient matrix. As shown in Ji et al. (2017), under the assumption of independence of the subspaces, the minimization of certain norms of W^x ensures a block-diagonal structure of W^x (up to some permutations). That is, $W_{i j}^{x} \neq 0$ only if points x_i and x_j reside in the same subspace. In mathematical terms, this can be formulated as an optimization problem:

\begin{array}{l} \begin{matrix} \underset{W^{x}}{m i n} | | W^{x} | |_{p} & s . t . X = W^{x} X, (d i a g (W^{x}) = 0) \end{matrix} & (1) \end{array}

The weight matrix in the sparse representation reflects the inherent geometric properties of the data and W^x should have a block-diagonal structure under a certain permutation, where each block corresponds to the data samples from the same subspace.

The aforementioned problem can also be approached as identifying clustering structures or block structures within the weight matrix W^x that correspond to the subspaces of the data. This approach is equivalent to leveraging the self-expressive property, which captures the similarity structure of the data, for reconstructing the original data. In this study, we utilize diagnosis information (i.e., HC, MD, SD) to perform the data reconstruction. Specifically, given c classes of samples $\tilde{X} = [x_{1}^{(1)}, \dots, x_{N_{1}}^{(1)}, \dots, x_{1}^{(c)}, \dots, x_{N_{c}}^{(c)}] \in R^{p \times N}$ , we reconstruct x_i using samples with the same label. Here, p represents the number of ROIs, $x_{i}^{(c)}$ represents the i-th sample in the c-th class, and it can be expressed as a combination of samples with the same diagnostic label, as shown in Equation 2:

\begin{array}{l} \underset{{(W_{i}^{x})}^{(c)} \geq 0}{m i n} \frac{1}{2} | | x_{i}^{(c)} - {(\tilde{X})}^{(c)} {(W_{i}^{x})}^{(c)} | |_{2}^{2} + λ | | {(W_{i}^{x})}^{(c)} | |_{1} & (2) \end{array}

where λ denotes a regularization parameter. ${(W_{i}^{x})}^{(c)}$ is a vector in which the element corresponding to $x_{i}^{(c)}$ is zero, and we discard negative solutions. Specifically, ${(W_{i}^{x})}^{(c)} = {[0, \dots, 0, W_{i, 1}, \dots, W_{i, i - 1}, 0, W_{i, i + 1}, c d o t s, W_{i, N_{j}}, 0, \dots, 0]}^{T}$ . Subsequently, we utilize the SLEP toolbox (Liu et al., 2009) to optimize Equation 2. Once we obtain ${(W_{i}^{x})}^{(c)}$ , the optimal solution of Equation 2, we can express the sparse reconstructive weight matrix W^x as follows:

\begin{array}{l} W^{x} = [{({\tilde{W}}_{1}^{x})}^{(c)}, . . ., {({\tilde{W}}_{N}^{x})}^{(c)}] + {[{({\tilde{W}}_{1}^{x})}^{(c)}, . . ., {({\tilde{W}}_{N}^{x})}^{(c)}]}^{T} & (3) \end{array}

The reconstructive weight matrix W^x is an N×N symmetric matrix, where each element represents the contribution of each x_j to the reconstruction of x_i. The diagonal elements of the constructed matrix W^x are set to zero. As shown in Figure 2, the self-reconstruction data demonstrates some clustering effects compared to the original data.

FIGURE 2

Figure 2. The process of data self-reconstruction by self-expressive method.

3.3 Fusion self-expressive network

The utilization of multi-modality brain imaging phenotypes is widely accepted as it provides more comprehensive information compared to single-modality biomarkers. With the availability of multi-modality data, fusion methods based on sample similarity have been extensively employed in various tasks such as classification, clustering, and prediction (Wang et al., 2014). Drawing inspiration from these previous studies and enhancing the clustering accuracy by bringing together data diagnosed within the same class in the multi-modality phenotype network, we introduce a fusion self-expressive network. This network effectively leverages the local structure of the network and diagnosis information between networks to facilitate efficient data integration.

The fusion self-expressive network FSN = {V^f, E^f, W^f} is defined as the network that incorporates fusion self-expressiveness, which includes high-weight edges, from one or more of these networks. In this context, the self-expressiveness property can be represented as X^m = W^fX^m, where X^m denotes the multi-modality brain imaging phenotypes and W^f represents the fusion self-expressive coefficient matrix. Specifically, given M self-expressive networks SN^m = {V^m, E^m, W^m} derived from M multi-modality brain imaging phenotypes $X^{m} = {[x_{1}^{m}, \dots, x_{n}^{m}, \dots, x_{N}^{m}]}^{T} \in R^{N \times r} (m = 1, 2, \dots, M)$ , r represents the number of ROIs. We aim to obtain the fusion self-expressive coefficient matrix W^f from the self-expressive network matrices W^m. To accomplish this objective, we compute the local affinity of the self-expressive network matrix based on diagnosis information. This operation involves setting the similarity (calculated as pairwise similarity values) between points of different classes to zero. Essentially, we assume that pairwise similarities with high weight values are within the same class more reliable than similarities between different classes. Therefore, we assign similarities to points of different classes through graph diffusion on the network. The calculation of the local affinity (A_L) can be expressed as follows:

\begin{array}{l} A_{L}^{m} (i, j) = {\begin{matrix} \frac{W^{m} (i, j)}{\sum_{k \in C_{i}} W^{m} (i, k)} & j \in C_{i} \\ 0 & o t h e r w i s e \end{matrix} & (4) \end{array}

where C_i represents the set of points in the same class as x_i in SN^m. Subsequently, for M self-expressive networks, the updated self-expressive matrix for the multi-modality data can be defined as:

\begin{array}{l} W_{t + 1}^{m} (i, j) = \sum_{k \in C_{i}} \sum_{l \in C_{i}} A_{L}^{m} (i, k) A_{L}^{m} (j, l) \frac{\sum_{q \neq m} W_{t}^{q} (k, l)}{M - 1} & (5) \end{array}

where $W_{t + 1}^{m}$ represents the fusion self-expressive network matrix after t iterations. The matrix W contains the complete information about the similarity of each subject to all others in the dataset. In contrast, the matrix A_L focuses solely on encoding the similarity between a given patient and others within the same class. Our algorithm always starts with W as the initial state and utilizes A_L as the kernel matrix in the fusion iteration process to capture the local structure of networks (Wang et al., 2021). The self-expressive information between subjects of the same class is only propagated through the common class. Furthermore, it is worth mentioning that if x_i and x_j are dissimilar in one modality, the self-expressive information can be expressed in other modalities. Thus, the self-expressive information among subjects within the same class can be obtained through fusion. Ultimately, after t iterations, the overall fusion self-expressive network matrix can be formulated as:

\begin{array}{l} W^{f} = {\begin{matrix} \frac{1}{M} \sum_{m = 1}^{M} W_{t}^{m} (i, j) & i \neq j \\ 0 & i = j \end{matrix} & (6) \end{array}

Equation 6 demonstrates that the fusion self-expressive network effectively leverages the local structure within networks and captures both common and complementary information across multiple modalities. The self-expressiveness property is expressed as X^m = W^fX^m, where W^f represents the fusion self-expressive coefficient matrix.

Figure 3 illustrates the process of self-expressive network fusion and the data fusion self-reconstruction process. Specifically, the model iteratively integrates self-expressive networks from the multi-modality phenotypes network into a single network, guided by diagnosis information. During the fusion process, weak similarities (low-weight edges) are eliminated, while strong similarities (high-weight edges) emerge within one or more networks and are incorporated into other networks. Additionally, low-weight edges supported by all networks are preserved, depending on their connectivity strength across classes spanning multiple networks. In the fusion self-expressive matrix, the intra-class similarity (high-weight edges) is enhanced, while the inter-class similarity (low-weight edges) is significantly reduced. By comparing with Figure 2, it is evident that the fusion self-reconstruction data exhibits better clustering performance than the self-reconstruction data.

FIGURE 3

Figure 3. The process of data fusion self-reconstruction and fusion of self-expressive matrices.

3.4 Introduction fusion self-expressive network into association model

Once the multi-modality phenotype network is constructed for each subject, pathological changes can be identified as abnormal alterations within the phenotype networks. These changes are closely associated with relevant ROIs and significant connectivity edges, as the features of the network nodes and edges are derived from GMV data obtained from sMRI and clustering coefficients from rs-fMRI. Here, let's consider a scenario with N subjects, and let each subject represent itself with a multi-modality phenotype network. The input consists of M modalities of phenotypes, denoted as $X^{m} = {[X_{1}^{m}, \dots, X_{n}^{m}, \dots, X_{N}^{m}]}^{T} \in R^{N \times d}$ , where d represents the dimensionality of node and edge features. The corresponding output is represented by $y = {[y_{1}, \dots, y_{n}, \dots, y_{N}]}^{T} \in R^{N}$ , where y represents the response value of SNP. In this context, let w^m∈R^d denote the linear discriminant function corresponding to the m-th modality. The multi-modality network phenotype association model can be formulated as:

\begin{array}{l} \underset{W}{m i n} \frac{1}{2} \sum_{m = 1}^{M} | | y - X^{m} w^{m} | |_{2}^{2} + λ | | W | |_{2, 1} & (7) \end{array}

where W = [w¹, w², ⋯, w^M]∈R^d×M is the weight matrix, and each row w_j represents the vector of coefficients assigned to thej-th feature across multi-modality. It is important to note that Equation 7 introduces the L_{2, 1}-norm regularization term, $| | W | |_{2, 1} = \sum_{j = 1}^{d} | | w_{j} | |_{2}$ , which serves as a “group-sparsity” regularizer. This regularization term penalizes all coefficients in the same row of the matrix W, encouraging joint feature selection. In other words, our proposed model aims to select only a small number of features across multi-modality. The regularization parameter λ controls the balance between the two terms in Equation 7. A larger value of λ leads to the selection of fewer features.

Then, we incorporate the fusion self-expressive network into the multi-modality association model by incorporating Equation 6 into Equation 7. The objective function of our proposed association model, referred to as FSN-MM, can be formulated as follows:

\begin{array}{l} \underset{W}{m i n} \frac{1}{2} \sum_{m = 1}^{M} | | y - W^{f} X^{m} w^{m} | |_{2}^{2} + λ | | W | |_{2, 1} & (8) \end{array}

where the parameter λ controls the regularization term, and its value can be determined through inner cross-validation on the training data. In the objective function (Equation 8), we replace the original data with the fusion self-reconstructed data. With this formulation, the FSN-MM model can jointly select a sparse subset of common features from the multi-modality data while fully leveraging the prior diagnosis information among subjects.

4 Experimental results and analysis

4.1 Experimental settings

In our experiments, we utilized the CC as an evaluation metric to measure the association analysis between the predicted and actual response values and employed the five-fold cross-validation strategy to validate the effectiveness of our proposed method. To determine the parameter λ in Equation 8, we tuned its values from 10⁻⁵, 3 × 10⁻⁵, 10⁻⁴, 3 × 10⁻⁴, ⋯, 3 and selected the optimal value through nested five-fold cross-validation on the training dataset.

In Table 2, we compared the performance of the single-modality (SM) method, concatenate-modality (CM) method, and multi-modality (MM) method with/without fusion self-expressive network. The SM, CM, and MM methods represented conventional approaches without incorporating the fusion self-expressive network. We presented the FSN-SM and FSN-CM methods, which were the improved versions of the SM and CM methods incorporating the fusion self-expressive network, respectively. The FSN-MM method simultaneously considered the multi-modality images and fusion self-expressive network. Additionally, we presented the SN-SM, SN-CM, and SN-MM methods, which were the improved versions of the SM, CM, and MM methods incorporating the self-expressive network, respectively. Descriptions of the various comparisons were presented in Table 2.

TABLE 2

Table 2. The detailed description of various comparisons.

4.2 Identification of imaging-genetic patterns in MDD using FSN-MM

4.2.1 Association between risk SNP and multi-modality network phenotype

We validated the model using the MDD risk SNP rs1799913. In our comparison, we evaluated the performance of our proposed FSN-MM method and compared it with conventional methods (including SM, CM, and MM) as well as improved methods that incorporate the fusion self-expressive network (including FSN-SM and FSN-CM). To ensure unbiased results, we performed five independent and non-repetitive five-fold cross-validations. We calculated the average CC results on the training and testing data separately for the node and edge modalities. Table 3 presented the average results.

TABLE 3

Table 3. Comparison of regression performance on risk SNP TPH1 rs1799913 by different methods.

As shown in Table 3, FSN-SM achieves CC values of 0.1934 and 0.2031 on the node and edge features, respectively, outperforming the conventional SM method. FSN-CM achieves a CC value of 0.2354, which is an improvement over the CM method. FSN-MM demonstrates the best performance with CC values of 0.3845 and 0.4037 on the two different features. These results highlight several important findings:

1) MM-type methods, such as FSN-MM, outperform SM-type methods by jointly selecting node and edge features, leading to enhanced performance in association analysis.

2) The introduction of the fusion self-expressive network, incorporating diagnosis information, consistently improves the CC performance compared to conventional methods across different feature types.

3) Functional connectivity edge features between different brain regions, compared to voxel-based morphometry node features, provide more insights for understanding the mechanisms of MDD.

Moreover, CM-type and MM-type methods both utilize node and edge features, but they employ different strategies to combine these features, resulting in distinct performance. CM-type methods directly concatenate node and edge features, which may lead to the loss of relationship information between the modalities and introduce more noise in the expanded feature space. On the other hand, MM-type methods utilize the multi-task strategy with L_{2, 1}-norm constraint to jointly select node and edge features, improving the robustness of ROI detection.

In summary, the FSN-MM method demonstrates the best performance in terms of CC measure. This indicates that considering multi-modality imaging data and incorporating the fusion self-expressive network with diagnosis information can effectively enhance the performance of association analysis between imaging phenotypes and genotypes.

4.2.2 Comparison with state-of-the-art approach

Based on existing MDD research, this study utilizes multi-modality data and compares the FSN-MM method with the state-of-the-art imaging genetics algorithm for MDD, known as MSD-MM (Zhang et al., 2023), as shown in Table 4. To ensure the validity and reliability of our proposed approach, the datasets, risk gene SNP (TPH1 rs1799913), and processing framework (including templates) used in this study are consistent with those in the research (Zhang et al., 2023).

TABLE 4

Table 4. Performance comparison with state-of-the-art approach on SNP rs1799913.

The CC values for FSN-MM on node and edge features are 0.3845 and 0.4037, respectively. In contrast, the CC values for MSD-MM on these features are 0.2432 and 0.2697, respectively. The FSN-MM method outperforms the MSD-MM algorithm in imaging genetics association. This study provides a practical solution for constructing and utilizing the fusion self-expressive network to build an imaging genetics analysis framework.

4.2.3 Identification of the consistent ROIs from multi-modality imaging data

To identify ROIs related to MDD, the FSN-MM method identifies consistent ROIs in both node and edge features in our experiments. Figure 4 shows the weight maps for the 116 ROIs associated with the risk SNP TPH1 rs1799913 across multiple modalities. The values of the weights are normalized, where the depth of color in the colorbar represents the magnitude of the ROI weight. In Figure 4, the SM-based and CM-based methods select numerous ROIs, but these ROIs are inconsistent in terms of the node and edge features. Such findings suggest that it might be difficult for researchers to conduct further investigations using these selected ROIs. However, the FSN-MM method can identify sparse and consistent ROIs associated with THP1 rs1799913 by multi-modality imaging data. These identified ROIs, such as the left temporal, right hippocampus and right precuneus regions are consistent with existing research findings and are highly correlated with MDD (Li et al., 2018; Dvorak et al., 2019; Roddy et al., 2019; Wang et al., 2020; Brosch et al., 2022).

FIGURE 4

Figure 4. Weight maps for the multi-modalities on 116 ROI associations with THP1 SNP rs1799913 respect to different methods.

In summary, the FSN-MM method tends to select consistent ROIs associated with the risk SNP in multi-modality imaging data, in contrast to methods like SM and CM. This is of significant value for further exploration into the mechanisms of MDD.

4.3 Discovery of imaging-genetic patterns in MDD using FSN-MM

4.3.1 Discovery of new potential risk SNPs

The etiology of MDD arises from multiple genetic risk SNPs rather than a single SNP. In the experiment, we examine the relationship between two brain network features and the MDD risk SNP (TPH1 rs1799913). The results show that the FSN-MM method is an effective tool for uncovering novel risk SNPs associated with MDD. In this study, the genotype data contain 5,897 SNPs, and we use the FSN-MM method for each SNP in the entire genotype dataset. Table 5 presents the results. Besides TPH1 rs1799913, 15 other SNPs exhibit a strong relationship with both node and edge features, suggesting that these 15 SNPs might be risk genetic SNPs for MDD.

TABLE 5

Table 5. Regression performance on multiple SNPs.

Among the 15 SNPs, rs3730089, rs3738401, rs4680, rs6311, rs6265, and rs11832738 are potential candidates that might be associated with MDD. Specifically, the literature (Huang et al., 2020) showed that PIK3R1 rs3730089 was associated with schizophrenia and bipolar affective disorder in the Han Chinese population. Schizophrenia, bipolar affective disorder, and major depressive disorder (MDD) often exhibit shared symptoms such as anhedonia and lack of motivation. Systematic reports (Chen et al., 2017) highlighted the familial aggregation of these three psychiatric disorders, with co-occurrence of any two or even all three disorders in some families. Moreover, evidence from symptomatology and psychopharmacology suggested an intrinsic connection between these three psychiatric disorders. Therefore, PIK3R1 rs3730089 might have served as a risk SNP marker in MDD research. Similarly, Zhang et al. (2006) indicated an association between DISC1 rs3738401 and schizophrenia in the Scottish population; Stein et al. (2006) demonstrated an association between COMT rs4680 and schizophrenia; Smith et al. (2013) suggested that HTR2A rs6311 moderately influenced the severity of depression; Lisiecka et al. (2015) revealed that the allelic variation of BDNF rs6265 led to specific neuro correlates of MDD, which might have been related to different mechanisms of MDD in the two allelic groups and might have had potential implications for patient screening and treatment. Liu et al. (2020a) suggested that the CACNA1C rs11832738 gene variant affected the severity of depression in MDD patients. The six SNPs are substantiated in the literature to be associated with depression, schizophrenia, and bipolar disorder. Consequently, these six SNP markers are pertinent to MDD research and hold referential value (Stein et al., 2006; Zhang et al., 2006; Smith et al., 2013; Lisiecka et al., 2015; Huang et al., 2020; Liu et al., 2020a).

Furthermore, although CAMK2B rs11542227, GALC rs733112836, NRG1 rs77493513, MAPK10 rs1201, KDSR rs1138488, DRD2 rs6279, LAMA2 rs2229848, GRIA3 rs550640, and THY1 rs3138094 exhibit good correlation coefficients through the FSN-MM method, there is no direct medical or biological research supporting the association of these nine SNPs with MDD at this time. We anticipate future studies will validate these findings. We hope that these novel MDD risk SNPs will be confirmed in future research, providing further insights into the etiology of MDD. In total, we have identified 15 new potential SNPs associated with MDD, which may contribute to the study of MDD.

4.3.2 ROI markers identification from sMRI data

Beyond enhancing the performance of correlation analyses, a primary objective of this study is to identify significant imaging phenotypes. For the sMRI data, we analyze six new SNPs with literature support from Table 5. Using five-fold cross-validation performed five times, we obtain average weight values. We select the top five ROIs with the highest weights for each SNP as significant ROI markers. Since the top five ROIs for each SNP are not the same, a combination results in seven unique ROIs. Table 6 displays these seven ROIs and their average weight values.

TABLE 6

Table 6. The top five ROIs chosen based on the node features from sMRI data for the six SNPs.

Figure 5 displays the visualization of these seven selected ROIs mapped onto the human brain in all three planes (sagittal, coronal, and axial). In each plane, the color of the labeled brain regions reflects the location of a selected ROI. Notably, most of the selected ROIs are consistent with earlier findings that focus on structural images and identify several diagnostic markers of MDD. Roddy et al. (2019) had shown that MDD patients exhibited bilateral volume reduction in the major hippocampal substructures and had identified the core hippocampal region as a potential marker of MDD progression. Moreover, researchers had found that reduced hippocampal gray matter volume was a common characteristic of MDD patients (Brosch et al., 2022). The bilateral middle frontal gyrus had been shown to exhibit significantly increased amplitude of low-frequency fluctuation (ALFF) in patients with subthreshold depression (Liu et al., 2020a; Zhang B. et al., 2021). As described in the literature (Hagan et al., 2015; Lu et al., 2016), abnormalities in the thalamic structure might have been potential feature markers of early-stage MDD. Compared to healthy controls, MDD patients exhibited reduced gray matter density in the bilateral temporal pole and right superior temporal gyrus (Peng et al., 2011).

FIGURE 5

Figure 5. Visualization of the seven ROIs selected by node features in three planes. HIP.R, Hippocampus.R; MTG.L, Middle Temporal Gyrus.L; PCUN.R, Precuneus.R; MFG.R, Middle Frontal Gyrus.R; DCG.L, Middle Cingulate Gyrus.L; AMYG.L, Amygdala.L; INS.R, Insula.R. All volumes of each brain figure are plotted by BrainNet (Xia et al., 2013). (A) Axial plane. (B) Sagittal plane. (C) Coronal plane.

4.3.3 ROI markers identification from rs-fMRI data

The functional connectivity between different regions of the brain is represented using a brain network model, where nodes and edges are defined as brain regions and their connections, respectively. In this study, a brain functional connectivity network is constructed using rs-fMRI data. Clustering coefficients are then extracted as edge features and incorporated into each brain region. Consequently, the dimensionality of the obtained edge features is the same as the number of brain regions, with each dimension corresponding to a specific brain region. This approach allows for the identification of relevant ROI markers from rs-fMRI data.

For the edge features in rs-fMRI data, the average weight values are also obtained through five-fold cross-validation with five repetitions, and the ROI with the highest weight among the six SNPs with literature references, is selected as a significant ROI marker, as shown in Table 7, focusing on the prefrontal cortex and hippocampus. Previous research using rs-fMRI data identified key brain region markers for MDD, such as the prefrontal cortex, hippocampus, and temporal gyrus. These markers coincide with the relevant ROIs we've highlighted. MDD patients exhibited abnormal patterns in the prefrontal cortex during rest before starting treatment, suggesting this region could be a potential diagnostic marker for MDD (Li et al., 2018). According to the literature (Sambataro et al., 2014), MDD patients showed enhanced connectivity in the default mode network, especially in the right hippocampus.

TABLE 7

Table 7. The top one ROI chosen based on the edge features from rs-fMRI data for the six SNPs.

Moreover, to analyze the functional connectivity of the selected brain regions and visualize the differences between MDD and HC on the functional connectivity network, the ROI with the highest weight (right prefrontal cortex) and an ROI with a lower weight (right insula) are selected from the highest weight ROIs corresponding to the six SNPs in Table 7. The average edge values of the functional connectivity networks for the MDD and HC groups are computed separately. Specifically, functional connectivity networks for each participant in the MDD and HC groups are constructed, followed by the computation of the average functional connectivity networks for each group. Lastly, the seven edges with the highest connection values are selected from all the edges within the given ROIs (Wang et al., 2019). Figure 6 graphically displays the first seven edges with the highest average connection values on the highest weight ROI and lower weight ROI. As depicted in Figure 6, the MDD group exhibits significant variations in the highest weight ROI edges compared to the HC group, where the edge connecting the right prefrontal cortex with the left hippocampus is replaced by the edge connecting the right prefrontal cortex with the left cingulate gyrus. The edges of the lower-weight ROI in the MDD group are similar to those in the HC group. These results indicate a strong association between the identified critical brain regions (highest weight ROI) and the pathogenesis of MDD.

FIGURE 6

Figure 6. The edges of the maximum weight ROI (A) and smaller weight ROI (B) on the MDD (left) and HC (right) group. The centroid red node represents the selected ROI, and the blue node denotes the corresponding ROI linked by top seven average connection value edges. (A) PCUN.R, Precuneus.R; SFGdor.R, superior frontal gyrus.R; DCG.L, Median Cingulate and Paracingulate Gyri.L; DCG.R, Median Cingulate and Paracingulate Gyri.R; PCG.R, Posterior Cingulate gyrus.R; CUN.R, Cuneus.R; HIP.L, Hippocampus.L; PCUN.L, Precuneus.L; MOG.L, Middle Occipital Gyrus.L; (B) INS.R, Insula.R; INS.L, Insula.L; HES.R, Heschl.R; ROL.L, Rolandic Oper.L; ROL.R, Rolandic Oper.R; STG.L, Temporal Sup.L; STG.R, Temporal Sup.R; TPOsup.R, Temporal Pole Sup.R. In the figure, all edges of each brain figure are plotted by BrainNet.

4.3.4 Consistent ROI identification from multi-modality imaging data

Figure 7 illustrates the weight maps of the multi-modality data for the 116 ROIs associated with six SNPs. The FSN-MM method successfully identifies sparse and consistent ROIs associated with the six SNPs. ROIs highly correlated with MDD are predominantly located in the left temporal, right hippocampus, and right precuneus regions. These identified ROIs align with previous findings, further validating the accuracy of our research (Li et al., 2018; Dvorak et al., 2019; Roddy et al., 2019; Wang et al., 2020; Brosch et al., 2022).

FIGURE 7

Figure 7. Weight maps for the multi-modalities on 116 ROI associations with six SNPs using FSN-MM. (A) rs3730089. (B) rs6311. (C) rs6265. (D) rs11832738. (E) rs3738401. (F) rs4680.

4.3.5 Consistent ROI identification from multi-modality imaging data for SNPs without literature support

We select an SNP (LAMA2 rs2229848) without literature support from Table 5, presenting its weight maps (Figure 8) for the multi-modalities on 116 ROIs. As depicted in Figure 8, the FSN-MM method adeptly identifies sparse consistent ROIs associated with rs2229848. ROIs showing a strong correlation with MDD are still chiefly located in the left temporal, right hippocampus, and right precuneus. This result is consistent with SNPs supported by the literature. While direct medical or biological research supporting the association of these 9 SNPs with MDD is currently lacking, it is anticipated that forthcoming studies will validate these findings, providing valuable insights into the etiology of MDD.

FIGURE 8

Figure 8. Weight maps for the multi-modalities on 116 ROI associations with LAMA2 SNP rs2229848 using FSN-MM.

5 Discussion

5.1 Discriminant and convergence performance of fusion self-expressive network

When comparing the fusion self-reconstructed data with the original data and the self-reconstructed data in Figures 2, 3, we observe a more scattered distribution among different classes and a more concentrated distribution within the same class. This indicates that the fusion self-expressive network generates more discriminative features for the diagnosis information (i.e., HC, MD, or SD).

To illustrate the convergence performance during the fusion process, we monitor the relative change of parameters in each iteration using the formula $\frac{| | W_{t + 1}^{f} - W_{t}^{f} | |}{| | W_{t}^{f} | |}$ , where $W_{t}^{f}$ represents the output of the fusion process after t steps. The initial iteration t is set to 25, and a simple stopping criterion is applied by setting a threshold of 10⁻⁶. If the relative change falls below the threshold, the optimization procedure terminates. The convergence performance of FSN-MM is illustrated in Figure 9. The relative change value decreases rapidly within the first 10 iterations, signifying the fast convergence of our proposed optimization algorithm. Thus, we set the number of iterations to 10 in our study.

FIGURE 9

Figure 9. Number of iterations.

5.2 Parameters selection

The proposed FSN-MM method has only one regularization parameter: the sparsity parameter λ. This parameter is used to balance the relative contributions of the two terms in Equation 8. To investigate the impact of the regularization, the parameter λ was set in the range of 10⁻⁴, 3 × 10⁻⁴, 10⁻³, 3 × 10⁻³, …, 1, 3. Figure 10 shows the correlation coefficients of the parameter λ on node and edge features. The performance in the region where 0.03 < λ < 0.3 is pretty good for both node and edge features. This region can be helpful for quickly selecting the optimal value of the parameter λ in future studies.

FIGURE 10

Figure 10. Bar and line graph of correlation coefficients for parameter λ on node and edge features.

5.3 Atlas selection

The AAL atlas is widely used in many neuroimaging studies due to its comprehensive coverage and clear demarcation of brain regions. To maintain consistency with prior related studies and to make our results comparable, we opted for the AAL atlas. However, choosing different atlases might have a significant impact on the results. Consequently, we also utilized the Brodmann Areas (BA) and the Harvard-Oxford Atlas (HOA) to delineate brain regions and conduct experiments.

Three atlases were tested with SNP rs1799913 respectively. We selected consistent ROIs in both node and edge features. As depicted in Figure 11, in the AAL atlas, the left temporal, right hippocampus, and right precuneus regions are highly correlated with MDD. In the BA, Brodmann areas 7 and 21 are highly correlated with MDD. These two areas largely correspond to the right precuneus and left temporal in the AAL atlas. The hippocampus is not delineated based on Brodmann's divisions since Brodmann's categorizations are primarily based on the cellular structure of the cerebral cortex. However, the hippocampus is a subcortical structure, so it does not directly correspond to a specific Brodmann area. In the cortical and subcortical sections of the Harvard-Oxford Atlas, the regions of the left temporal, right hippocampus, and right precuneus are again shown to be closely correlated with MDD. These three brain regions align closely with the locations identified in the AAL atlas. When experimenting with three atlases, we observed a significant level of consistency in our conclusions.

FIGURE 11

Figure 11. Visualization of ROIs related to MDD in different atlases (all volumes of each brain figure are plotted by BrainNet). (A) AAL. (B) BA. (C) HOA (cortical). (D) HOA (subcortical).

6 Conclusion

In this study, we proposed the FSN-MM method to optimally utilize the internal structure of brain imaging data and diagnosis information for correlation analysis. When we validated using the SNP THP1 rs1799913, our method outperformed other approaches in CC evaluations. Furthermore, we identified consistent and stable ROI biomarkers from the multi-modality phenotype network's voxel node and connectivity edge features. We also identified 15 MDD-associated risk SNPs, of which, six including CACNA1C rs11832738 were supported by the literature. The remaining nine SNPs await future validation.

This study probed the association between a single MDD genetic risk SNP and multi-modality neuroimaging data (sMRI and rs-fMRI). However, this study had limitations. The samples came from only two hospitals, and the sample size was not large enough. Moreover, much of the SNP data was not provided to us for research. We hoped that in the future, more MDD data would be made publicly available for research purposes. Additionally, future research could incorporate other brain imaging modalities like DTI and assess the relationship with multi-locus risk SNPs. Given the advancements in deep learning for biology and medicine, such techniques could tackle brain imaging-genetic associations in depression, potentially enhancing our understanding and treatment of the condition.

Data availability statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/supplementary material.

Ethics statement

The studies involving humans were approved by Zhongda Hospital and Xinxiang Hospital. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s), and minor(s)' legal guardian/next of kin, for the publication of any potentially identifiable images or data included in this article.

Author contributions

MP: Conceptualization, Data curation, Formal analysis, Methodology, Software, Validation, Visualization, Writing—original draft, Writing—review & editing. XL: Writing—review & editing. XH: Writing—review & editing. MW: Writing—review & editing. CX: Writing—review & editing. LZ: Writing—review & editing, Supervision, Writing—original draft. YY: Writing—review & editing.

Funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This study was supported by the National Natural Science Foundation of China (Nos. 62276088 and 62106104).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Abdoli, N., Salari, N., Darvishi, N., Jafarpour, S., Solaymani, M., Mohammadi, M., et al. (2022). The global prevalence of major depressive disorder (MDD) among the elderly: a systematic review and meta-analysis. Neurosci. Biobehav. Rev. 132, 1067–1073. doi: 10.1016/j.neubiorev.2021.10.041

PubMed Abstract | CrossRef Full Text | Google Scholar

Bondi, E., Maggioni, E., Brambilla, P., and Delvecchio, G. (2022). A systematic review on the potential use of machine learning to classify major depressive disorder from healthy controls using resting state fmri measures. Neurosci. Biobehav. Rev. 144, 104972. doi: 10.1016/j.neubiorev.2022.104972

PubMed Abstract | CrossRef Full Text | Google Scholar

Brosch, K., Stein, F., Schmitt, S., Pfarr, J-. K., Ringwald, K. G., Thomas-Odenthal, F., et al. (2022). Reduced hippocampal gray matter volume is a common feature of patients with major depression, bipolar disorder, and schizophrenia spectrum disorders. Mol. Psychiatry 27, 4234–4243. doi: 10.1038/s41380-022-01687-4

PubMed Abstract | CrossRef Full Text | Google Scholar

Chen, X., Long, F., Cai, B., Chen, X., Qin, L., Chen, G., et al. (2017). A novel relationship for schizophrenia, bipolar, and major depressive disorder. part 8: a hint from chromosome 8 high density association screen. Mol. Neurobiol. 54, 5868–5882. doi: 10.1007/s12035-016-0102-1

PubMed Abstract | CrossRef Full Text | Google Scholar

Dai, P., Xiong, T., Zhou, X., Ou, Y., Li, Y., Kui, X., et al. (2022). The alterations of brain functional connectivity networks in major depressive disorder detected by machine learning through multisite rs-fmri data. Behav. Brain Res. 435, 114058. doi: 10.1016/j.bbr.2022.114058

PubMed Abstract | CrossRef Full Text | Google Scholar

Dvorak, J., Hilke, M., Trettin, M., Wenzler, S., Hagen, M., Ghirmai, N., et al. (2019). Aberrant brain network topology in fronto-limbic circuitry differentiates euthymic bipolar disorder from recurrent major depressive disorder. Brain Behav. 9, e01257. doi: 10.1002/brb3.1257

PubMed Abstract | CrossRef Full Text | Google Scholar

Elhamifar, E., and Vidal, R. (2013). Sparse subspace clustering: algorithm, theory, and applications. IEEE Trans. Pattern Anal. Mach. Intell. 35, 2765–2781. doi: 10.1109/TPAMI.2013.57

PubMed Abstract | CrossRef Full Text | Google Scholar

Flint, J., and Kendler, K. S. (2014). The genetics of major depression. Neuron 81, 484–503. doi: 10.1016/j.neuron.2014.01.027

CrossRef Full Text | Google Scholar

Gizatullin, R., Zaboli, G., Jönsson, E. G., Åsberg, M., and Leopardi, R. (2006). Haplotype analysis reveals tryptophan hydroxylase (TPH) 1 gene variants associated with major depression. Biol. Psychiatry 59, 295–300. doi: 10.1016/j.biopsych.2005.07.034

PubMed Abstract | CrossRef Full Text | Google Scholar

Hagan, C. C., Graham, J. M., Tait, R., Widmer, B., van Nieuwenhuizen, A. O., Ooi, C., et al. (2015). Adolescents with current major depressive disorder show dissimilar patterns of age-related differences in acc and thalamus. Neuroimage Clin. 7, 391–399. doi: 10.1016/j.nicl.2014.12.019

PubMed Abstract | CrossRef Full Text | Google Scholar

Hao, X., Yao, X., Yan, J., Risacher, S. L., Saykin, A. J., Zhang, D., et al. (2016). Identifying multimodal intermediate phenotypes between genetic risk factors and disease status in alzheimer's disease. Neuroinformatics 14, 439–452. doi: 10.1007/s12021-016-9307-8

PubMed Abstract | CrossRef Full Text | Google Scholar

Huang, J., Chen, Z., Zhu, L., Wu, X., Guo, X., Yang, J., et al. (2020). Phosphoinositide-3-kinase regulatory subunit 1 gene polymorphisms are associated with schizophrenia and bipolar disorder in the han chinese population. Metab. Brain Dis. 35, 785–792. doi: 10.1007/s11011-020-00552-z

PubMed Abstract | CrossRef Full Text | Google Scholar

Jarienė, G., Adomaitienė, V., Dunderienė, L., Dambrauskienė, K., Vilkas, V., Rybakova, I., et al. (2018). “The influence of tph1 and htr1a gene polymorphisms for a person's suicide risk,” in European Journal of Human Genetics: EJHG: Abstracts from the 50th European-Society-of-Human-Genetics (ESHG) Conference: Electronic Posters: Copenhagen, Denmark, May 27-30, 2017. European Society of Human Genetics, Vol. 26, suppl 2018 (London: Nature Publishing Group). doi: 10.1038/s41431-018-0248-6

CrossRef Full Text | Google Scholar

Ji, P., Zhang, T., Li, H., Salzmann, M., and Reid, I. (2017). “Deep subspace clustering networks,” in 31st Conference on Neural Information Processing Systems (NIPS 2017), Vol. 30 (Long Beach, CA). doi: 10.48550/arXiv.1709.02508

CrossRef Full Text | Google Scholar

Kennedy, S. H. (2022). Core symptoms of major depressive disorder: relevance to diagnosis and treatment. Dialogues Clin. Neurosci. 10, 271–277. doi: 10.31887/DCNS.2008.10.3/shkennedy

PubMed Abstract | CrossRef Full Text | Google Scholar

Li, G., Rossbach, K., Zhang, A., Liu, P., and Zhang, K. (2018). Resting-state functional changes in the precuneus within first-episode drug-naive patients with mdd. Neuropsychiatr. Dis. Treat. 14, 1991–1998. doi: 10.2147/NDT.S168060

PubMed Abstract | CrossRef Full Text | Google Scholar

Li, W., Tang, Y., Peng, L., Wang, Z., Hu, S., Gao, X., et al. (2023). The reconfiguration pattern of individual brain metabolic connectome for parkinson's disease identification. MedComm 4, e305. doi: 10.1002/mco2.305

PubMed Abstract | CrossRef Full Text | Google Scholar

Li, W., Wang, Z., Zhang, L., Qiao, L., and Shen, D. (2017). Remodeling pearson's correlation for functional brain network estimation and autism spectrum disorder identification. Front. Neuroinform. 11, 55. doi: 10.3389/fninf.2017.00055

PubMed Abstract | CrossRef Full Text | Google Scholar

Li, W-. K., Chen, Y-. C., Xu, X.-W., Wang, X., and Gao, X. (2022). Human-guided functional connectivity network estimation for chronic tinnitus identification: a modularity view. IEEE J. Biomed. Health Inform. 26, 4849–4858. doi: 10.1109/JBHI.2022.3190277

PubMed Abstract | CrossRef Full Text | Google Scholar

Lisiecka, D. M., O'Hanlon, E., Fagan, A. J., Carballedo, A., Morris, D., Suckling, J., et al. (2015). Bdnf val66met polymorphism in patterns of neural activation in individuals with mdd and healthy controls. J. Affect. Disord. 184, 239–244. doi: 10.1016/j.jad.2015.06.002

PubMed Abstract | CrossRef Full Text | Google Scholar

Liu, J., Pan, Y., Wu, F.-X., and Wang, J. (2020). Enhancing the feature representation of multi-modal MRI data by combining multi-view information for MCI classification. Neurocomputing 400, 322–332. doi: 10.1016/j.neucom.2020.03.006

CrossRef Full Text | Google Scholar

Liu, J., Shuiwang, J., and Jieping, Y. (2009). SLEP: Sparse learning with efficient projections. Arizona State Univ. 6, 7. Available online at: http://www.public.asu.edu/~jye02/Software/SLEP

Google Scholar

Liu, X., Hou, Z., Yin, Y., Xie, C., Zhang, H., Zhang, H., et al. (2020a). Cacna1c gene rs11832738 polymorphism influences depression severity by modulating spontaneous activity in the right middle frontal gyrus in patients with major depressive disorder. Front. Psychiatry 11, 73. doi: 10.3389/fpsyt.2020.00073

PubMed Abstract | CrossRef Full Text | Google Scholar

Liu, X., Song, D., Yin, Y., Xie, C., Zhang, H., Zhang, H., et al. (2020b). Altered brain entropy as a predictor of antidepressant response in major depressive disorder. J. Affect. Disord. 260, 716–721. doi: 10.1016/j.jad.2019.09.067

PubMed Abstract | CrossRef Full Text | Google Scholar

Lu, Y., Liang, H., Han, D., Mo, Y., Li, Z., Cheng, Y., et al. (2016). The volumetric and shape changes of the putamen and thalamus in first episode, untreated major depressive disorder. Neuroimage Clin. 11, 658–666. doi: 10.1016/j.nicl.2016.04.008

PubMed Abstract | CrossRef Full Text | Google Scholar

Nielsen, D. A., Deng, H., Patriquin, M. A., Harding, M. J., Oldham, J., Salas, R., et al. (2020). Association of tph1 and serotonin transporter genotypes with treatment response for suicidal ideation: a preliminary study. Eur. Arch. Psychiatry Clin. Neurosci. 270, 633–642. doi: 10.1007/s00406-019-01009-w

PubMed Abstract | CrossRef Full Text | Google Scholar

Peng, J., Liu, J., Nie, B., Li, Y., Shan, B., Wang, G., et al. (2011). Cerebral and cerebellar gray matter reduction in first-episode patients with major depressive disorder: a voxel-based morphometry study. Eur. J. Radiol. 80, 395–399. doi: 10.1016/j.ejrad.2010.04.006

PubMed Abstract | CrossRef Full Text | Google Scholar

Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M. A., Bender, D., et al. (2007). Plink: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575. doi: 10.1086/519795

PubMed Abstract | CrossRef Full Text | Google Scholar

Roddy, D. W., Farrell, C., Doolin, K., Roman, E., Tozzi, L., Frodl, T., et al. (2019). The hippocampus in depression: more than the sum of its parts? advanced hippocampal substructure segmentation in depression. Biol. Psychiatry 85, 487–497. doi: 10.1016/j.biopsych.2018.08.021

PubMed Abstract | CrossRef Full Text | Google Scholar

Sambataro, F., Wolf, N., Pennuto, M., Vasic, N., and Wolf, R. (2014). Revisiting default mode network function in major depression: evidence for disrupted subsystem connectivity. Psychol. Med. 44, 2041–2051. doi: 10.1017/S0033291713002596

PubMed Abstract | CrossRef Full Text | Google Scholar

Smith, R. M., Papp, A. C., Webb, A., Ruble, C. L., Munsie, L. M., Nisenbaum, L. K., et al. (2013). Multiple regulatory variants modulate expression of 5-hydroxytryptamine 2a receptors in human cortex. Biol. Psychiatry 73, 546–554. doi: 10.1016/j.biopsych.2012.09.028

PubMed Abstract | CrossRef Full Text | Google Scholar

Stein, D. J., Newman, T. K., Savitz, J., and Ramesar, R. (2006). Warriors versus worriers: the role of comt gene variants. CNS Spectr. 11, 745–748. doi: 10.1017/S1092852900014863

PubMed Abstract | CrossRef Full Text | Google Scholar

Tolentino, J. C., and Schmidt, S. L. (2018). Dsm-5 criteria and depression severity: implications for clinical practice. Front. Psychiatry 9, 450. doi: 10.3389/fpsyt.2018.00450

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, B., Mezlini, A. M., Demir, F., Fiume, M., Tu, Z., Brudno, M., et al. (2014). Similarity network fusion for aggregating data types on a genomic scale. Nat. Methods 11, 333–337. doi: 10.1038/nmeth.2810

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, M., Hao, X., Huang, J., Shao, W., and Zhang, D. (2019). Discovering network phenotype between genetic risk factors and disease status via diagnosis-aligned multi-modality regression method in alzheimer's disease. Bioinformatics 35, 1948–1957. doi: 10.1093/bioinformatics/bty911

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, M., Ju, Y., Lu, X., Sun, J., Dong, Q., Liu, J., et al. (2020). Longitudinal changes of amplitude of low-frequency fluctuations in mdd patients: a 6-month follow-up resting-state functional magnetic resonance imaging study. J. Affect. Disord. 276, 411–417. doi: 10.1016/j.jad.2020.07.067

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, M., Shao, W., Hao, X., and Zhang, D. (2021). Identify complex imaging genetic patterns via fusion self-expressive network analysis. IEEE Trans. Med. Imaging 40, 1673–1686. doi: 10.1109/TMI.2021.3063785

PubMed Abstract | CrossRef Full Text | Google Scholar

Windon, C., Iaccarino, L., Mundada, N., Allen, I., Boxer, A. L., Byrd, D. ADNI, et al. (2022). Comparison of plasma and csf biomarkers across ethnoracial groups in the adni. Alzheimers Dement. 14, e12315. doi: 10.1002/dad2.12315

PubMed Abstract | CrossRef Full Text | Google Scholar

Xia, M., Wang, J., and He, Y. (2013). Brainnet viewer: a network visualization tool for human brain connectomics. PLoS ONE 8, e68910. doi: 10.1371/journal.pone.0068910

PubMed Abstract | CrossRef Full Text | Google Scholar

Yan, C., and Zang, Y. (2010). Dparsf: a matlab toolbox for “pipeline” data analysis of resting-state fmri. Front. Syst. Neurosci. 4, 1377. doi: 10.3389/fnsys.2010.00013

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhang, B., Qi, S., Liu, S., Liu, X., Wei, X., Ming, D., et al. (2021). Altered spontaneous neural activity in the precuneus, middle and superior frontal gyri, and hippocampus in college students with subclinical depression. BMC Psychiatry 21, 1–10. doi: 10.1186/s12888-021-03292-1

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhang, F., Sarginson, J., Crombie, C., Walker, N., StClair, D., and Shaw, D. (2006). Genetic association between schizophrenia and the disc1 gene in the scottish population. Am. J. Med. Genet. B Neuropsychiatr. Genet. 141, 155–159. doi: 10.1002/ajmg.b.30274

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhang, L., Pang, M., Liu, X., Hao, X., Wang, M., Xie, C., et al. (2023). Incorporating multi-stage diagnosis status to mine associations between genetic risk variants and the multi-modality phenotype network in major depressive disorder. Front. Psychiatry 14, 1139451. doi: 10.3389/fpsyt.2023.1139451

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhang, L., Wang, M., Liu, M., and Zhang, D. (2020). A survey on deep learning for neuroimaging-based brain disorder analysis. Front. Neurosci. 14, 779. doi: 10.3389/fnins.2020.00779

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhang, Y., Liu, X., Hou, Z., Yin, Y., Xie, C., Zhang, H., et al. (2021). Global topology alteration of the brain functional network affects the 8-week antidepressant response in major depressive disorder. J. Affect. Disord. 294, 491–496. doi: 10.1016/j.jad.2021.07.078

PubMed Abstract | CrossRef Full Text | Google Scholar

Keywords: imaging genetics, major depressive disorder, multi-modality, self-expressive network, single-nucleotide polymorphisms

Citation: Pang M, Liu X, Hao X, Wang M, Xie C, Zhang L and Yuan Y (2023) Identification and discovery of imaging genetic patterns using fusion self-expressive network in major depressive disorder. Front. Neurosci. 17:1297155. doi: 10.3389/fnins.2023.1297155

Received: 19 September 2023; Accepted: 31 October 2023;
Published: 21 November 2023.

Edited by:

Mingrui Xia, Beijing Normal University, China

Reviewed by:

Shao-Wei Xue, Affiliated Hospital of Hangzhou Normal University, China
Weikai Li, Chongqing Jiaotong University, China

Copyright © 2023 Pang, Liu, Hao, Wang, Xie, Zhang and Yuan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Xiaoyun Liu, MTM2MDgwOTc4OCYjeDAwMDQwO3FxLmNvbQ==; Li Zhang, bGl6aGFuZyYjeDAwMDQwO25qZnUuZWR1LmNu; Yonggui Yuan, eXlneWxoMjAwMCYjeDAwMDQwO3NpbmEuY29t

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.