
ORIGINAL RESEARCH article

Front. Neurosci., 30 May 2024
Sec. Brain Imaging Methods

Adaptive spatial-temporal neural network for ADHD identification using functional fMRI

Bo Qiu1, Qianqian Wang2, Xizhi Li1, Wenyang Li1, Wei Shao3 and Mingliang Wang1,4*
  • 1School of Computer Science, Nanjing University of Information Science and Technology, Nanjing, China
  • 2Department of Radiology and BRIC, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
  • 3College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
  • 4Nanjing Xinda Institute of Safety and Emergency Management, Nanjing, China

Computer aided diagnosis methods play an important role in Attention Deficit Hyperactivity Disorder (ADHD) identification. Dynamic functional connectivity (dFC) analysis has been widely used for ADHD diagnosis based on resting-state functional magnetic resonance imaging (rs-fMRI), which can help capture abnormalities of brain activity. However, most existing dFC-based methods only focus on dependencies between two adjacent timestamps, ignoring global dynamic evolution patterns. Furthermore, the majority of these methods fail to adaptively learn dFCs. In this paper, we propose an adaptive spatial-temporal neural network (ASTNet) comprising three modules for ADHD identification based on rs-fMRI time series. Specifically, we first partition rs-fMRI time series into multiple segments using non-overlapping sliding windows. Then, adaptive functional connectivity generation (AFCG) is used to model spatial relationships among regions-of-interest (ROIs) with adaptive dFCs as input. Finally, we employ a temporal dependency mining (TDM) module which combines local and global branches to capture global temporal dependencies from the spatially-dependent pattern sequences. Experimental results on the ADHD-200 dataset demonstrate the superiority of the proposed ASTNet over competing approaches in automated ADHD classification.

1 Introduction

Attention Deficit Hyperactivity Disorder (ADHD), with a prevalence of 7.2% (Thomas et al., 2015), has been the most prevalent psychiatric disorder among adolescents. Individuals affected by ADHD commonly encounter difficulties in behavior management, hyperactivity, and maintaining attention or focus. However, due to the complex pathological mechanisms of ADHD, most current diagnostic approaches rely primarily on clinical behavioral observations, which may be subjective. Computer-aided diagnosis methods therefore provide a more objective and comprehensive assessment, helping to enhance the accuracy and efficiency of ADHD diagnosis.

Resting-state functional magnetic resonance imaging (rs-fMRI), which can capture changes in blood flow in response to stimulation, has emerged as a valuable tool for diagnosing diverse psychiatric diseases (Damoiseaux, 2012; Jie et al., 2014a; Wang et al., 2019a; Wang M. et al., 2022). Functional connectivities (FCs), derived from rs-fMRI, quantify the temporal correlation of functional activation across different brain regions. FCs are usually defined by the correlation (i.e., Pearson correlation) between blood-oxygen-level-dependent (BOLD) signals. In recent years, researchers have designed various learning-based computer-aided diagnostic methods for ADHD analysis and have observed that ADHD patients exhibit abnormal FCs between regions-of-interest (ROIs). These abnormal FCs can serve as potential biomarkers for the clinical diagnosis of ADHD. Previous FC-based methods were usually conducted under the assumption that FC remains constant during fMRI recording. Recently, more and more studies have confirmed that brain activity is actually dynamic (Arieli et al., 1996; Makeig et al., 2004; Onton et al., 2006), and analysis based on this view can reveal changes in FCs over time (Du et al., 2018; Zhang et al., 2021). These changes can help us understand how cognitive states evolve over time, which is critical for better understanding the pathology of brain diseases. For this reason, there has been a shift toward dynamic connectivity analysis in recent efforts (Bahrami et al., 2021; Wang Z. et al., 2022; Yang et al., 2022).

Specifically, most dFC-based methods can be roughly categorized into two groups: (1) conventional machine learning methods (Wang et al., 2017; Vergara et al., 2018; Feng et al., 2022) and (2) deep learning methods (Wang et al., 2019b; Yan et al., 2019; Lin et al., 2022). Conventional machine learning methods first extract features manually and then feed them into subsequent prediction models. These approaches treat fMRI feature learning and downstream model training as independent processes, possibly leading to suboptimal model performance. In contrast, deep learning methods usually perform feature learning and downstream prediction tasks in an end-to-end manner, which can learn task-oriented discriminative features to facilitate ADHD identification. By automatically learning features from the dFC network, deep learning methods provide a cohesive framework for feature learning and classification. Considering that the functional connectivity network can be mathematically modeled as a graph, graph convolutional networks (GCNs), renowned for their effectiveness in processing graph data, have been widely used in FC analysis. However, it is worth noting that many GCN methods are built on predefined FCs, which hinders the adaptive learning of interactions between different brain ROIs. Furthermore, most dFC-based methods only focus on temporal dependencies between adjacent timestamps, ignoring important global dynamic evolution patterns.

To solve this issue, we propose a novel adaptive spatial-temporal neural network (ASTNet) that can not only adaptively learn functional connectivities between brain ROIs but also mine global temporal dependencies in dFCs. As illustrated in Figure 1, the proposed ASTNet consists of three components, i.e., the partition of rs-fMRI time series, adaptive functional connectivity generation, and temporal dependency mining. Specifically, we first divide the rs-fMRI time series into multiple segments using non-overlapping sliding windows to characterize the temporal variability of fMRI time series. After that, for each time-series segment, we design an adaptive functional connectivity generation (AFCG) module that first adaptively learns FCs between ROIs and then uses a GCN to capture the topological information of the brain network. Finally, a temporal dependency mining (TDM) module, which integrates local and global branches, is proposed to capture temporal dependencies from the spatially-dependent pattern sequences. Within the TDM module, the global branch investigates variations in individual points within the FC structure, such as the emergence or disappearance and the strengthening or weakening of connections, which is referred to as spatial variation. Meanwhile, the local branch examines temporal changes between adjacent dFCs, known as temporal variation. To further obtain a subject-level representation, we concatenate the features generated from these two branches, followed by a fully connected layer for disease classification. Experimental results on 620 subjects from the ADHD-200 dataset demonstrate the effectiveness of our ASTNet in adaptive graph learning and temporal dependency mining, highlighting the importance and potential of our model for ADHD identification in practical applications.


Figure 1. Overview of the proposed adaptive spatial-temporal neural network (ASTNet), including three components: (A) partitioning rs-fMRI time series into T segments via non-overlapping sliding windows, (B) adaptive functional connectivity generation (AFCG), where the adjacency matrix is first learned via the adaptive graph learning (AGL) module for each time window and then fed into a graph convolutional network (GCN), and (C) a temporal dependency mining (TDM) module to capture temporal dynamics across all time windows. Based on the output of the TDM, a fully-connected layer is further used for disease classification.

2 Related work

2.1 Static FC-based method

Conventional FC-based methods usually first extract handcrafted features from functional connectivity networks and then train a classifier (e.g., support vector machine, SVM) for disease prediction (Bai et al., 2009; Jie et al., 2014a,b; Plis et al., 2014; Bi et al., 2018). For example, Bi et al. (2018) designed a random SVM cluster method for Alzheimer's disease (AD) identification. This method first randomly selected samples and FC features to establish multiple SVMs and then employed an ensemble strategy for the final prediction. Jie et al. (2014a) extracted and integrated multiple properties of static FC networks (e.g., connectivity strength and local clustering coefficient) for diagnosing brain diseases and achieved better performance compared with single network measures. Even so, these methods usually rely on handcrafted feature representations for classification models, thereby possibly producing sub-optimal classification performance.

More recently, deep learning methods have been proposed to automatically learn data-driven features from FC networks. These methods offer a unified framework for fMRI feature learning and brain disorder prediction, ultimately achieving better performance. For example, Liang et al. (2021) proposed a convolutional neural network combined with a prototype learning (CNNPL) framework to classify brain functional networks for the diagnosis of autism spectrum disorder (ASD). Specifically, it used traditional convolutional neural networks to extract high-level features from pre-defined FCs and further designed a prototype learning strategy to automatically learn prototypes of each category for ASD classification. Eslami et al. (2019) proposed to extract a lower-dimensional feature representation of FCs using an autoencoder, followed by a single layer perceptron (SLP) for ASD identification. Kawahara et al. (2017) developed three distinct convolutional layers—edge-to-edge (E2E), edge-to-node (E2N), and node-to-graph (N2G)—to capture the spatial characteristics of structural brain connectivity for cognitive and motor developmental score prediction in premature infants. Due to the graph-structured nature of brain functional networks, graph neural networks (GNNs), which can learn expressive graph representations, have shown significant potential in FC-based brain disease diagnosis. For example, Ktena et al. (2018) proposed learning a graph similarity metric using a siamese graph convolutional neural network for ASD classification. Yao et al. (2021) developed a mutual multi-scale triplet graph convolutional network for brain disorder diagnosis using functional or structural connectivity. Li et al. (2020) designed an ROI-aware graph convolutional layer that leveraged fMRI's topological and functional information for ASD diagnosis.

These deep learning methods greatly improve the efficiency and classification/regression performance of FC-based analysis due to their end-to-end architecture. However, they mainly study static patterns of brain networks, thereby ignoring the dynamic characteristics of brain FCs. Besides, GNN-based methods generally take a fixed graph structure as input, whose reliability remains questionable.

2.2 Dynamic FC-based method

Several dynamic functional analysis methods have recently been proposed for brain disease classification (Wang et al., 2019b, 2023; Yan et al., 2019; Gadgil et al., 2020; Lin et al., 2022; Liu et al., 2022; Liang et al., 2023). For example, Wang et al. (2019b) proposed a spatial-temporal convolutional-recurrent neural network (STNet) for Alzheimer's disease progression prediction using rs-fMRI time series. Specifically, a convolutional component was first employed to construct the FC within each time-series segment. Then, long short-term memory (LSTM) units were used to model the temporal dynamic patterns of these successive FCs. Finally, a fully connected layer was used to perform disease progression prediction. Lin et al. (2022) developed a convolutional recurrent neural network (CRNN) for dynamic FC analysis and automated brain disease diagnosis. In this method, a sequence of pre-constructed FC networks was input into three convolutional layers to extract temporal features, and an LSTM layer was used to capture temporal information from multiple time segments, followed by three fully connected layers for brain disease classification. To take advantage of the spatio-temporal information of fMRI data, Yan et al. (2019) designed a multi-scale RNN framework to classify schizophrenia. Specifically, stacked convolution layers were used to extract features at different scales, followed by a two-layer stacked Gated Recurrent Unit (GRU) to mine the dynamic information conveyed in fMRI series. Gadgil et al. (2020) trained a spatio-temporal graph convolutional network (ST-GCN) on each segment of the BOLD time series to predict gender and age. In this method, a positive and symmetric “edge importance” matrix was first integrated to determine the importance of spatial graph edges. Then, three layers of ST-GC units were used to perform spatial graph convolution, followed by a fully connected layer for final prediction. Liang et al. (2023) proposed a self-supervised multi-task learning model for detecting AD progression, in which a masked map auto-encoder and temporal contrastive learning were jointly pre-trained to capture the structural and evolutionary features of longitudinal brain networks. Liu et al. (2022) proposed a method based on a nested residual convolutional denoising autoencoder (NRCDAE) and a convolutional gated recurrent unit (GRU) for ADHD diagnosis. Specifically, the NRCDAE was used to reduce the spatial dimension of rs-fMRI and extract 3D spatial features. Then, the 3D convolutional GRU was adopted to extract spatial and temporal features simultaneously for classification. Although existing dynamic FC-based methods consider temporal dynamics in the prediction of disease progression, they fail to capture the global temporal changing patterns of the whole brain (i.e., the longitudinal network-level patterns).

3 Materials and method

In this section, we introduce the materials used in this work, the proposed method, as well as implementation details.

3.1 Material

3.1.1 Data acquisition

We use the ADHD-200 dataset to validate the effectiveness of the proposed method. The ADHD-200 dataset includes 973 subjects collected from eight different imaging sites. Specifically, the dataset contains 362 ADHD patients, 585 normal controls (NC), and 26 undiagnosed subjects, and can be accessed from the NeuroImaging Tools & Resource Collaboratory (NITRC) website.1 Each participant's data in the ADHD-200 dataset consists of a resting-state functional MRI scan, a structural MRI scan, and the corresponding phenotypic information. Note that ADHD patients in the ADHD-200 dataset are further categorized into three subtypes: ADHD-Combined, ADHD-Hyperactive/Impulsive, and ADHD-Inattentive. To simplify the binary classification task, all subtypes in the ADHD-200 dataset are uniformly labeled as 1. For the ADHD-200 Global Competition, the dataset was divided into a training set and a test set, each with corresponding phenotypic information; the numbers of subjects are 768 and 197, respectively. In this paper, we follow this division in our experiments for a fair comparison. Note that in our performance evaluation, we exclude 26 subjects without released labels in the test set. Furthermore, we also discard subjects from the Pitt and WashU imaging sites because their training sets only contain NC subjects. Thus, a total of 620 subjects are used in this study, including 340 ADHD patients and 280 NCs. The detailed demographic information of the involved subjects and the data partition for our experiments are provided in Table 1.


Table 1. Demographic information and data partition of the studied subjects from the ADHD-200 dataset.

3.1.2 Data pre-processing

All resting-state fMRI data used in our study were preprocessed using the C-PAC pipeline.2 This pipeline includes several processing steps such as skull stripping, slice timing correction, head motion realignment, intensity normalization, band-pass filtering (0.01–0.1 Hz), and the regression of white matter, cerebrospinal fluid, and motion parameters. To minimize the impact of head motion on our results, we first removed fMRI data from participants whose heads moved more than 2.0 mm in any direction or 2° in any rotation. After that, we performed structural skull stripping and then mapped the remaining fMRI data to the Montreal Neurological Institute (MNI) space. A 6 mm Gaussian kernel was used to spatially smooth the rs-fMRI data. Note that our analysis further excluded subjects with frame displacement exceeding 2.5 min (FD > 0.5). Finally, the automated anatomical labeling (AAL) template was used to extract the mean rs-fMRI time series for a set of 116 pre-defined ROIs.
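
For readers who want a self-contained way to obtain comparable ROI time series, the following minimal sketch uses nilearn to extract mean AAL signals from an already preprocessed rs-fMRI volume; it is not the authors' C-PAC pipeline, and the file name, TR value, and filtering settings are illustrative assumptions.

from nilearn import datasets
from nilearn.maskers import NiftiLabelsMasker

# Fetch the 116-ROI AAL atlas and build a masker that averages voxels within each ROI.
aal = datasets.fetch_atlas_aal()
masker = NiftiLabelsMasker(
    labels_img=aal.maps,
    standardize=True,        # z-score each ROI signal
    detrend=True,
    low_pass=0.1,            # band-pass 0.01-0.1 Hz, as in the pipeline above
    high_pass=0.01,
    t_r=2.0,                 # TR is site-specific; 2.0 s is an assumed example
)

# "subject_func_mni.nii.gz" is a hypothetical preprocessed rs-fMRI file in MNI space.
roi_series = masker.fit_transform("subject_func_mni.nii.gz")   # shape: (time_points, 116)
roi_series = roi_series.T                                      # (116, time_points), ROIs x time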

3.2 Method

As illustrated in Figure 1, the proposed ASTNet includes (1) partition of rs-fMRI time series, (2) adaptive functional connectivity generation, and (3) temporal dependency mining.

3.2.1 Partition of rs-fMRI time series

To characterize the temporal variability of fMRI series, we first employ a sliding window strategy to partition each rs-fMRI time series into T non-overlapping windows, each with a fixed window size L. Specifically, for each window, we represent the segmented time series as G_t = (v_1^t, v_2^t, \ldots, v_N^t) \in \mathbb{R}^{N \times L} (t = 1, \ldots, T), where t indexes the t-th segment and N denotes the number of nodes. In our paper, the window size L is set to 20. For the PKU, NYU, OHSU, NI, and KKI sites, the lengths of the extracted fMRI time series are 231, 171, 72, 256, and 119 time points, with corresponding TRs of 2.5, 1.96, 2, 2.5, and 2 s, respectively. Since each site has a different scanning time (i.e., length of fMRI time series), we obtain T = 11, T = 8, T = 3, T = 12, and T = 5 segments for these five sites, respectively. The reason for choosing this window length is that window sizes of around 30–60 s can provide a robust estimation of the dynamic fluctuations in rs-fMRI data (Wang et al., 2021); with a window of 20 time points, each segment spans roughly 39–50 s given these TRs. For each subject, the resulting sequence of time-series segments is taken as the input of the proposed network.
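
As a concrete illustration of this step, the sketch below partitions an ROI-by-time matrix into non-overlapping windows of L = 20 time points; the function name, array shapes, and the choice to drop an incomplete trailing window are our assumptions, consistent with T = 11 segments for a 231-point PKU series.

import numpy as np

def partition_time_series(roi_series: np.ndarray, window: int = 20) -> np.ndarray:
    """Split an (N, time_points) ROI signal matrix into non-overlapping segments.

    Returns an array of shape (T, N, window) with T = time_points // window;
    trailing time points that do not fill a complete window are discarded.
    """
    n_rois, n_tp = roi_series.shape
    T = n_tp // window
    segments = roi_series[:, : T * window].reshape(n_rois, T, window)
    return segments.transpose(1, 0, 2)   # (T, N, window), one G_t per window

# Example: a PKU-like series with 116 ROIs and 231 time points yields T = 11 segments.
segments = partition_time_series(np.random.randn(116, 231), window=20)
print(segments.shape)   # (11, 116, 20)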

3.2.2 Adaptive functional connectivity generation

In order to better explore the spatial relationships between brain regions, we employ an adaptive graph learning (AGL) strategy to learn functional connectivities, instead of relying on prior knowledge or manual labor. Based on the graph data G_t of each window, we define a non-negative function Equation (1) with a learnable weight vector \omega = (\omega_1, \omega_2, \ldots, \omega_{F_{de}})^T \in \mathbb{R}^{F_{de} \times 1} to represent the connection between any two brain nodes x_m and x_n:

A_{mn} = g(x_m, x_n) = \frac{\exp(\mathrm{ReLU}(\omega^T |x_m - x_n|))}{\sum_{n=1}^{N} \exp(\mathrm{ReLU}(\omega^T |x_m - x_n|))},    (1)

where x_i represents the fMRI data of the i-th brain ROI, the nonlinear activation function ReLU guarantees that A_{mn} is non-negative, and the softmax operation normalizes each row of A. To introduce prior knowledge, we incorporate the following regularization loss Equation (2):

\mathcal{L}_{graph\_learning} = \sum_{m,n=1}^{N} \|x_m - x_n\|_2^2 A_{mn} + \lambda \|A\|_F^2.    (2)

That is, the smaller the distance \|x_m - x_n\|_2 between x_m and x_n, the larger A_{mn} is. This regularization allows nodes/ROIs with similar features to have greater connection weights. Furthermore, considering the sparse nature of the brain functional network (i.e., brain graph), the second term is introduced, where \lambda \geq 0 is a regularization parameter. Through the proposed graph learning mechanism, we obtain an adaptively learned adjacency matrix A used for the subsequent graph convolution operation Equation (3):

H^{l+1} = \sigma(A H^l W^l),    (3)

where H^l denotes the node representations (i.e., the time-series signal characteristics of the brain network nodes) at layer l, A represents the learned adjacency matrix, W^l denotes a learnable weight matrix, and \sigma is the activation function. The fundamental principle underlying graph convolution is the iterative aggregation of neighboring node information to update the feature representation of the central node. Finally, we obtain the functional connectivity matrix by Equation (4):

S_t = H^l (H^l)^T,    (4)

where H^l denotes the final node features generated by the GCN and S_t measures the degree of second-order dependency between ROIs.
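
To make Equations (1)-(4) concrete, the following PyTorch sketch implements one plausible reading of the AFCG module for a single time-series segment; the hidden dimension, the number of GCN layers, and the value of λ are our assumptions rather than the authors' released configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AFCG(nn.Module):
    """Adaptive functional connectivity generation for one window (a sketch)."""
    def __init__(self, in_dim: int, hid_dim: int = 32, lam: float = 1e-3):
        super().__init__()
        self.omega = nn.Parameter(torch.randn(in_dim))            # learnable weight vector in Eq. (1)
        self.gcn = nn.ModuleList([nn.Linear(in_dim, hid_dim),
                                  nn.Linear(hid_dim, hid_dim),
                                  nn.Linear(hid_dim, hid_dim)])   # W^l of three GCN layers, Eq. (3)
        self.lam = lam

    def forward(self, x):
        # x: (N, L) node features, i.e., one segment G_t with N ROIs and window length L.
        diff = (x.unsqueeze(1) - x.unsqueeze(0)).abs()            # (N, N, L), |x_m - x_n|
        logits = F.relu(diff @ self.omega)                        # ReLU(omega^T |x_m - x_n|), Eq. (1)
        A = F.softmax(logits, dim=1)                              # row-normalized adjacency matrix

        # Graph-learning regularizer, Eq. (2).
        dist = (diff ** 2).sum(dim=-1)                            # ||x_m - x_n||_2^2
        reg = (dist * A).sum() + self.lam * (A ** 2).sum()

        # Graph convolution, Eq. (3): H^{l+1} = sigma(A H^l W^l).
        h = x
        for layer in self.gcn:
            h = F.relu(A @ layer(h))

        S = h @ h.t()                                             # connectivity matrix S_t, Eq. (4)
        return S, reg

# Example: one window of 116 ROIs with L = 20 time points.
afcg = AFCG(in_dim=20)
S_t, reg_loss = afcg(torch.randn(116, 20))                        # S_t: (116, 116)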

3.2.3 Temporal dependency mining

To capture temporal dynamic information within fMRI series across the temporal dimension, we design the temporal dependency mining (TDM) module, which includes two parallel branches, i.e., a local branch and a global branch. The local branch is designed to explore the temporal evolution of adjacent sliding windows, providing insights into fine-grained changes. The global branch is used to capture the evolutionary patterns across all timestamps. Details are introduced below.

3.2.3.1 Local temporal dependency mining branch

To capture the local temporal dependency of dFCs, we propose the use of a bi-directional Gated Recurrent Unit (BiGRU), which is a type of recurrent neural network (RNN). Different from a unidirectional GRU, the BiGRU consists of two GRUs, where one GRU scans the sequence from the beginning to the end, while the other scans the sequence from the end to the beginning. This bidirectional structure enables the model to consider both past and future information simultaneously, thereby enhancing the accuracy of feature information captured in sequential data. Mathematically, the BiGRU can be represented as Equation (5):

y_t = \overrightarrow{\mathrm{GRU}}(x_t, \overrightarrow{h}_{t-1}) \oplus \overleftarrow{\mathrm{GRU}}(x_t, \overleftarrow{h}_{t-1}),    (5)

where x_t denotes the input temporal state at step t, h_{t-1} represents the corresponding hidden state, \oplus denotes the concatenation of the two outputs, and the arrows represent the two operation directions. The GRU is composed of two gating mechanisms, namely the reset gate and the update gate. The calculation formulas for the GRU unit are as follows:

z_t = \sigma(W_z \cdot [h_{t-1}, x_t]),
r_t = \sigma(W_r \cdot [h_{t-1}, x_t]),
\tilde{h}_t = \tanh(W \cdot [r_t \odot h_{t-1}, x_t]),
h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t.    (6)

In Equation (6), the reset gate r_t controls the fusion of new input information with the previous “memory,” and the update gate z_t determines the amount of information to be forgotten from the previous moment. In the second formula, W_r denotes a weight matrix, and r_t is obtained by linearly transforming the concatenation of x_t and h_{t-1}. This value is subsequently utilized in the third formula to update the candidate hidden state. For ease of understanding:

\tilde{h}_t = \tanh(x_t W_{xh} + (r_t \odot h_{t-1}) W_{hh} + b_h).    (7)

In Equation (7), the value of r_t from the reset gate determines how much information from the previous moment is discarded, as indicated by the Hadamard product with h_{t-1}. The first equation in Equation (6) represents the update gate, while the fourth equation controls the extent to which previous information is incorporated into the current state. In the fourth equation, the closer z_t is to 1, the more information it retains or “remembers.” The term (1 - z_t) \odot h_{t-1} selectively forgets parts of the previous hidden state, while z_t \odot \tilde{h}_t selectively incorporates the candidate hidden state. In summary, the fourth equation combines forgetting some information passed down from h_{t-1} with incorporating relevant information from the current node, resulting in the final memory representation h_t. In this way, we can obtain the final local temporal dependency information between adjacent time-sliced fMRI data by recursively transmitting hidden state information.
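
A minimal PyTorch sketch of the local branch is given below; it assumes that each window is summarized by a fixed-length embedding of its connectivity matrix before entering a two-layer BiGRU, and the embedding size and number of GRU units (4) are illustrative assumptions.

import torch
import torch.nn as nn

class LocalBranch(nn.Module):
    """Local temporal dependency mining with a BiGRU (a sketch with assumed sizes)."""
    def __init__(self, feat_dim: int, hidden: int = 4):
        super().__init__()
        self.bigru = nn.GRU(input_size=feat_dim, hidden_size=hidden,
                            num_layers=2, batch_first=True, bidirectional=True)

    def forward(self, seq):
        # seq: (batch, T, feat_dim), one flattened/embedded connectivity feature per window.
        out, _ = self.bigru(seq)       # (batch, T, 2 * hidden): forward and backward states
        return out[:, -1, :]           # last time step as the local-branch representation

# Example: 11 windows, each described by a 32-dimensional embedding of S_t.
local = LocalBranch(feat_dim=32)
z_local = local(torch.randn(8, 11, 32))   # (8, 8), i.e., batch of 8 with 2 * 4 units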

3.2.3.2 Global temporal dependency mining branch

We employ global attention to capture the global temporal dependency of dFCs. For each segment, we first employ an MLP to extract potential hidden abnormal connection information, ensuring that spatial information at different stages remains intact despite the temporal interactions. Specifically, the upper-triangular part of the symmetric matrix is first converted to a one-dimensional vector x and then fed into the MLP to obtain the characteristic information F_t, represented as Equation (8):

F_t = g\left(\sum_{i=0}^{M} w_i x_i\right),    (8)

where w_i is a learnable weight and g is an activation function. Then, we incorporate a global attention mechanism to capture temporal dependencies between dynamic FCs. The global attention is defined as Equation (9):

M_c(F) = \sigma(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))),    (9)

where F represents the functional connectivity information produced by the preceding MLP over the T segments and \sigma denotes the sigmoid function. Note that the MLP weights, together with the ReLU activation function, are shared for both pooled inputs. The obtained attention weights (i.e., M_c(F)) are used to combine information from multiple windows, resulting in a final feature representation expressed as a one-dimensional vector \bar{d}. Finally, we concatenate the outputs of the global and local branches into a one-dimensional vector, which is then fed into three fully connected layers to obtain the final classification result.
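
The sketch below gives one plausible PyTorch reading of the global branch: the upper triangle of each S_t is embedded by an MLP (Eq. 8), and a sigmoid attention over pooled window descriptors (Eq. 9) produces one weight per window before the weighted combination. The pooling axes, layer sizes, and handling of the reduction ratio are assumptions, since the paper does not fully specify them.

import torch
import torch.nn as nn

class GlobalBranch(nn.Module):
    """Global temporal dependency mining with window-level attention (a sketch)."""
    def __init__(self, n_rois: int = 116, n_windows: int = 11, hid: int = 32, ratio: int = 3):
        super().__init__()
        tri_dim = n_rois * (n_rois - 1) // 2                       # length of the upper triangle of S_t
        self.embed = nn.Sequential(nn.Linear(tri_dim, 256), nn.ReLU(),
                                   nn.Linear(256, hid))            # per-window MLP, Eq. (8)
        mid = max(n_windows // ratio, 1)
        self.attn = nn.Sequential(nn.Linear(n_windows, mid), nn.ReLU(),
                                  nn.Linear(mid, n_windows))       # shared MLP in Eq. (9)

    def forward(self, S_seq):
        # S_seq: (batch, T, N, N), the learned connectivity matrices from the AFCG module.
        n = S_seq.shape[-1]
        iu = torch.triu_indices(n, n, offset=1)
        x = S_seq[:, :, iu[0], iu[1]]                              # (batch, T, tri_dim)
        F_t = self.embed(x)                                        # (batch, T, hid)

        avg = self.attn(F_t.mean(dim=2))                           # average-pooled window descriptors
        mx = self.attn(F_t.max(dim=2).values)                      # max-pooled window descriptors
        w = torch.sigmoid(avg + mx)                                # Eq. (9): one weight per window
        return (F_t * w.unsqueeze(-1)).sum(dim=1)                  # weighted summary d_bar, (batch, hid)

# Example: a batch of 8 subjects with 11 windows of 116 x 116 matrices.
glob = GlobalBranch(n_rois=116, n_windows=11)
d_bar = glob(torch.randn(8, 11, 116, 116))                         # (8, 32)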

It is worth noting that, to avoid the trivial solution (i.e., \omega = (0, 0, \ldots, 0)) that would result from minimizing the loss function \mathcal{L}_{graph\_learning} on its own, we utilize it as a regularization term to form the final loss function Equation (10):

\mathcal{L}_{loss} = \mathcal{L}_{cross\_entropy} + \mathcal{L}_{graph\_learning},    (10)

where \mathcal{L}_{cross\_entropy} denotes the cross-entropy loss of the classification task.

3.3 Implementation

We implement the framework using Python 3.7 and the PyTorch library. For each subject, the adjacency matrix is constructed via our designed AGL strategy, where the vector \omega is randomly initialized from a normal distribution. Subsequently, the graph convolution process comprises three GCN layers, each followed by batch normalization, ReLU activation, and a dropout rate of 0.5. We then perform a dot-product operation on the representations generated from the GCN layers to construct symmetric matrices describing the degree of correlation between nodes. To reduce dimensionality, we flatten the upper-triangular portion of each matrix into a vector. This vector is subsequently fed into an MLP consisting of three fully connected layers. Additionally, we incorporate two dropout layers to mitigate overfitting.

Subsequently, we employ an attention mechanism to obtain attention weights and compute the weighted sum of the vector data. This process yields a 32-dimensional vector, which serves as the final output of the global branch. For local analysis using the BiGRU, we perform experiments with different numbers of units, specifically 4, 16, and 64, for training data from various sites.

4 Experiment and result analysis

4.1 Methods for comparison

In the experiments, we compare our ASTNet model with the following eight methods, including baseline methods and variants of the proposed method.

1. MLP (Tolstikhin et al., 2021): in this method, the static FC matrix for each subject is directly used as the input of the MLP model. Specifically, the MLP model comprises three fully-connected layers with hidden neurons of 1,024, 256, and 32, respectively.

2. AE (Wang et al., 2014): auto-encoder (AE) is an unsupervised learning model that can learn a mapping supervised by input X itself. Specifically, AE extracts useful features from brain networks through bottleneck-like fully connected layers. The hidden layer dimension is determined by the data length in different sites.

3. GCN (Kipf and Welling, 2016): this method first uses two layers of GCN based on Pearson correlation to update the spatial correlation between ROIs, where the dimensions are determined by the data length at different sites. Then, the FCs, calculated from the dot product of node features, are used as the input to a three-layer MLP model with hidden neuron numbers of 1,024, 256, and 32, respectively.

4. AGL_s: the static adaptive graph learning (AGL_s) method learns FCs adaptively, replacing the Pearson correlation matrices with adaptively learned brain networks as the input. The network structure is the same as that of the previous MLP method.

5. BiGRU (Chung et al., 2014): this method partitions the fMRI data into segments with a constant length of 20 time points. For each segment, we build a functional connectivity matrix. Then, these matrices are fed into the BiGRU model to learn their temporal change information, with different numbers of units (4, 16, and 64) for the training data of different sites.

6. AGL_d: the dynamic adaptive graph learning (AGL_d) method applies adaptive learning to construct adjacency matrices for each segment. Then, the TDM module is used to analyze the time-varying information between different sliding-window data. Specifically, the local branch adopts a two-layer BiGRU with 4 units to make the final classification, and the global branch explores temporal and spatial variability using global attention, where the MLP has three hidden layers with 1,024, 256, and 32 neurons, respectively, and the reduction ratio in the attention mechanism is set to 3.

Note that we conduct both static and dynamic experiments for the MLP and GCN methods, denoted as MLP_s, MLP_d, GCN_s, and GCN_d. We apply the global branch of the TDM to mine temporal dependencies for MLP_d and adopt the whole TDM module for GCN_d classification.

4.2 Experiment settings

We evaluate the proposed method on five different sites (i.e., PKU, NYU, OHSU, NI, and KKI) of the ADHD-200 database based on rs-fMRI data. We divide the data of each site into training data and test data, following the ADHD-200 Global Competition division. The test set is unseen during the training stage.

To evaluate classification performance, three metrics are used, including accuracy (ACC), sensitivity (SEN), and specificity (SPE). These metrics are defined as follows: ACC = (TP + TN)/(TP + TN + FP + FN), SEN = TP/(TP + FN), SPE = TN/(TN + FP). Here, TP, TN, FP, and FN represent true positive, true negative, false positive, and false negative values, respectively. Higher values for these metrics indicate better classification performance.
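
For completeness, these metrics can be computed directly from the binary predictions; the sketch below assumes label 1 denotes ADHD (positive class) and 0 denotes NC.

import numpy as np

def classification_metrics(y_true, y_pred):
    """ACC, SEN, and SPE from binary labels (1 = ADHD, 0 = NC); a minimal sketch."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    acc = (tp + tn) / (tp + tn + fp + fn)
    sen = tp / (tp + fn)     # sensitivity: recall for ADHD
    spe = tn / (tn + fp)     # specificity: recall for NC
    return acc, sen, spe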

4.3 Classification performance

The quantitative results achieved by different methods in the binary classification task are reported in Table 2. From Table 2, three main observations can be made.


Table 2. Accuracy (ACC) values achieved by our proposed ASTNet and eight competing methods on five sites (i.e., PKU, NYU, OHSU, NI, and KKI) of ADHD-200 dataset.

First, our proposed method and its variants (i.e., AGL_d and GCN_d) generally achieve better performance compared with the baseline methods (i.e., MLP, BiGRU, and Auto-Encoder) in the classification task. For example, in terms of ACC values, ASTNet achieves an improvement of 13.9% compared with the best baseline method (with 66.6%) in ADHD classification. This demonstrates that our designed adaptive functional connectivity learning strategy and temporal dependency mining module can help extract more discriminative fMRI features, thus enhancing classification performance. Second, our proposed ASTNet and its variants outperform the methods that do not consider temporal dynamics (e.g., GCN_s and AGL_s) in terms of most metrics. In particular, the SEN values produced by our ASTNet on the PKU and NYU sites are 85.1 and 75.0%, respectively, which are higher than those of the other methods. These results suggest that our TDM module can effectively capture dynamic changes in rs-fMRI time series. Finally, our ASTNet is superior to its variants (i.e., AGL_d and GCN_d). This result implies that the adaptive functional connectivity learning strategy and TDM module together help boost the learning performance of ASTNet.

4.4 Interpretable analysis of the learned FCs

The proposed ASTNet can automatically learn dFCs in a data-driven manner, which differs from previous studies that rely on predefined FC networks (e.g., via Pearson's correlation). We now further analyze the FC networks learned by the proposed adaptive method. Specifically, as introduced in Section 3.2.2, the AFCG can learn new connectivity strengths between each central node/ROI and all the remaining N − 1 ROIs. Therefore, we can generate a fully connected FC network based on the learned connectivity vector. Given the size of the sliding window, different data lengths result in different numbers of segments. Taking the PKU site as an example, we can construct T = 11 dynamic FC networks for each subject, with each network corresponding to a segment. Finally, using the standard t-test, we measure the group difference between ADHD and NC via p-values, with the group difference matrices visualized in Figure 2. For comparison, Figure 2 also reports the group difference of the stationary FC network constructed by measuring Pearson correlation coefficients between the fMRI time series of pairwise brain ROIs. Note that the obtained p-values were binarized (i.e., setting p-values greater than 0.05 to 1 and 0 otherwise) for clarity in Figure 2. From Figure 2, we have the following observations.
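
The group-difference analysis described above can be reproduced in outline with an edge-wise two-sample t-test; the array names and shapes below are illustrative assumptions, and the binarization follows the Figure 2 convention.

import numpy as np
from scipy.stats import ttest_ind

def group_difference_mask(fc_adhd, fc_nc, alpha=0.05):
    """Edge-wise two-sample t-test between learned FC matrices (a sketch).

    fc_adhd: (n_adhd, N, N) learned FC matrices of ADHD subjects for one segment.
    fc_nc:   (n_nc, N, N)   learned FC matrices of NC subjects for the same segment.
    Returns a binary (N, N) matrix with 1 where p > alpha and 0 where p <= alpha,
    matching the binarization used in Figure 2.
    """
    _, p = ttest_ind(fc_adhd, fc_nc, axis=0)   # per-connection p-values
    return (p > alpha).astype(int)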


Figure 2. Visualization of group difference matrices generated by our learned dynamic FC networks and the static network. Note that p-values > 0.05 are set to 1 (shown in yellow), while those ≤ 0.05 are set to 0 (shown in green). F_i (i = 1, …, 11) represents the group difference matrix generated by the i-th segment.

First, from F_i (i = 1, …, 11) in Figure 2, we can observe that the group difference matrices generated by different segments exhibit significant differences, which further validates the temporal variability of brain networks. Second, by comparing our learned F_i and the static network in Figure 2, it can be found that the dynamic FC networks learned by our ASTNet show superiority over the pre-defined static network in identifying disease-related functional connectivities and ROIs. For example, several ROIs, such as the anterior cingulate and paracingulate gyri node (ACG.L) in F_1, the superior parietal gyrus node (SPG.L) in F_6, and the cerebellum nodes in F_5, are detected by our dynamic FC networks in ADHD vs. NC classification. These findings align with previous ADHD-related studies, which further demonstrates that the FCs learned by our ASTNet have good interpretability.

4.5 Comparison with state-of-the-art methods

We further compare the proposed ASTNet with seven state-of-the-art (SOTA) methods designed for ADHD analysis, including PCA-LDA (Dey et al., 2012), EM-MI (Dou et al., 2020), 3D CNN (Zou et al., 2017), SASNI (Zhang et al., 2017), SPAE (Cao et al., 2023), STAAE (Dong et al., 2020), and KD-Transformer (Zhang et al., 2022). Note that all the methods use the standard training/test set division provided by the dataset. The classification results achieved by different methods are reported in Table 3, with the best results highlighted in bold. From Table 3, we have the following findings.


Table 3. Quoted results from literature on ADHD-200 dataset.

First, the proposed ASTNet outperforms the seven SOTA methods in the ADHD classification task on the ADHD-200 dataset, which implies that our ASTNet can learn more discriminative features for ADHD identification. Second, compared with the static methods (i.e., PCA-LDA, 3D CNN, SASNI, and KD-Transformer), the methods that consider temporal dynamics in fMRI series (i.e., SPAE, STAAE, and KD-Transformer) achieve relatively better performance. This suggests that the temporal information conveyed in fMRI series plays an important role in distinguishing ADHD patients from normal controls.

Third, the ACC of our ASTNet achieves an improvement of 5% compared with STAAE, which designs a spatiotemporal attention auto-encoder to capture long-distance dependencies in time. This finding further demonstrates the superiority of our ASTNet in dynamic brain network learning and brain disorder classification.

5 Discussion

In this section, we explore the influence of different sliding window sizes, compare the proposed method with its degraded variants, and discuss several limitations of the current work and future work.

5.1 Influence of sliding window size

In the main experiments, we divide the fMRI series using a sliding window strategy with a window size of 20. To investigate the influence of different sliding window sizes on the results, we vary the window size within [10, 15, ⋯ , 30]. The ADHD classification results on the five sites are reported in Figure 3. As shown in Figure 3, the classification accuracy of our ASTNet fluctuates to a certain extent as the window size increases. When the window size is 20, our method achieves its peak performance across different sites, which validates that the selected window size is reasonable.


Figure 3. Results of the proposed ASTNet method with respect to different sliding window lengths for ADHD vs. NC classification on different sites.

5.2 Ablation study

To demonstrate the effectiveness of each module in the proposed ASTNet, we further compare our ASTNet with its degenerated variants, including (1) GCN_d, which does not incorporate the adaptive graph learning (AGL) module; (2) AGL_d, which does not incorporate the GCN module; (3) ASTNet_G, which removes the global branch; and (4) ASTNet_L, which removes the local branch. The experimental results of our ASTNet and its variants are reported in Table 4.


Table 4. Ablation results on ADHD-200 dataset.

It can be found from Table 4 that our ASTNet consistently outperforms GCN_d, which fails to adaptively learn FC strengths. This implies that our designed adaptive graph learning strategy can automatically generate more reliable FC networks for subsequent analysis, thus boosting model performance. In addition, we can observe that our ASTNet is superior to ASTNet_G, which does not model long-term dependencies among dynamic functional connectivities (dFCs). Besides, our ASTNet achieves better performance than ASTNet_L, which cannot capture local temporal dependencies in dFCs. These observations further demonstrate the advantage of our ASTNet, which simultaneously uses global and local branches in fMRI temporal feature learning.

5.3 Limitations and future work

While our work achieves good results in automatically identifying ADHD using fMRI data, several issues still need to be considered in the future to further improve the performance of the proposed method. First, considering the small-sample-size issue of fMRI data, we will employ transfer learning and pretraining strategies to further enhance model generalization. Second, different brain image modalities, such as structural MRI and Positron Emission Tomography (PET), can provide complementary information for ADHD diagnosis. Integrating multimodal neuroimages would be an interesting avenue to pursue, which will be our future work. Finally, in this work we only construct the functional connectivity matrix based on the AAL atlas with 116 pre-defined ROIs. In the future, we will explore multi-scale functional connectivity networks defined by multiple brain atlases to capture complementary topological information.

6 Conclusion

In this paper, we propose an end-to-end adaptive spatial-temporal neural network for ADHD classification using rs-fMRI time-series data. Specifically, we first divide the fMRI data into non-overlapping segments to characterize its temporal variability. Then, an adaptive functional connectivity generation (AFCG) module is used to model spatial dependencies between brain ROIs for each segment. In particular, within the AFCG, an adaptive graph learning strategy is designed to learn functional connectivity strengths in a data-driven manner. Finally, we develop a temporal dependency mining (TDM) module that integrates global and local branches to capture the temporal dynamics across multiple time segments. Extensive experiments on the ADHD-200 dataset demonstrate the superiority of our ASTNet over several state-of-the-art methods, indicating its potential in identifying ADHD.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author.

Author contributions

BQ: Writing – original draft. QW: Writing – review & editing. XL: Validation, Writing – original draft. WL: Writing – original draft. WS: Writing – review & editing. MW: Writing – original draft, Writing – review & editing.

Funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. BQ, XL, WL, and MW were supported in part by the National Natural Science Foundation of China (No. 62102188), the Natural Science Foundation of Jiangsu Province of China (No. BK20210647), the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (No. 21KJB520013), and the Project funded by China Postdoctoral Science Foundation (No. 2021M700076).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Footnotes

References

Arieli, A., Sterkin, A., Grinvald, A., and Aertsen, A. (1996). Dynamics of ongoing activity: explanation of the large variability in evoked cortical responses. Science 273, 1868–1871. doi: 10.1126/science.273.5283.1868


Bahrami, M., Laurienti, P. J., Shappell, H., and Simpson, S. L. (2021). A mixed-modeling framework for whole-brain dynamic network analysis. Netw. Neurosci. 6, 591–613. doi: 10.1162/netn_a_00238


Bai, F., Watson, D. R., Yu, H. J., Mei Shi, Y., Yuan, Y., Zhang, Z., et al. (2009). Abnormal resting-state functional connectivity of posterior cingulate cortex in amnestic type mild cognitive impairment. Brain Res. 1302, 167–174. doi: 10.1016/j.brainres.2009.09.028


Bi, X., Shu, Q., Sun, Q., and Xu, Q. (2018). Random support vector machine cluster analysis of resting-state fMRI in Alzheimer's disease. PLoS ONE 13:e0194479. doi: 10.1371/journal.pone.0194479


Cao, C., Li, G., Fu, H., Li, X., and Gao, X. (2023). “SPAE: spatial preservation-based autoencoder for ADHD functional brain networks modelling,” in Proceedings of the 2023 ACM International Conference on Multimedia Retrieval (Association for Computeing Machinery), 370–377. doi: 10.1145/3591106.3592213


Chung, J., Gülçehre, Ç., Cho, K., and Bengio, Y. (2014). Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv [Preprint]. arXiv/1412.3555. doi: 10.48550/arXiv.1412.3555


Damoiseaux, J. S. (2012). Resting-state fMRI as a biomarker for Alzheimer's disease? Alzheimers Res. Ther. 4, 1–2. doi: 10.1186/alzrt106


Dey, S., Rao, A. R., Shah, M., and Fair, D. A. (2012). Exploiting the brain's network structure in identifying ADHD subjects. Front. Syst. Neurosci. 6:75. doi: 10.3389/fnsys.2012.00075


Dong, Q., Qiang, N., Lv, J., Li, X., Liu, T., and Li, Q. (2020). “Spatiotemporal attention autoencoder (STAAE) for ADHD classification,” in Medical Image Computing and Computer Assisted Intervention – MICCAI 2020. MICCAI 2020. Lecture Notes in Computer Science, Vol. 12267, eds. Martel, A.L., et al. (Cham: Springer). doi: 10.1007/978-3-030-59728-3_50


Dou, C., Zhang, S., Wang, H., Sun, L., Huang, Y., and Yue, W. (2020). ADHD fMRI short-time analysis method for edge computing based on multi-instance learning. J. Syst. Archit. 111:101834. doi: 10.1016/j.sysarc.2020.101834


Du, Y., Fu, Z., and Calhoun, V. D. (2018). Classification and prediction of brain disorders using functional connectivity: promising but challenging. Front. Neurosci. 12:525. doi: 10.3389/fnins.2018.00525


Eslami, T., Mirjalili, V., Fong, A., Laird, A. R., and Saeed, F. (2019). ASD-DiagNet: a hybrid learning approach for detection of autism spectrum disorder using fMRI data. Front. Neuroinform. 13:70. doi: 10.3389/fninf.2019.00070


Feng, Y., Jia, J., and Zhang, R. (2022). “Classification of Alzheimer's disease by combining dynamic and static brain network features,” in Proceedings of the 2022 6th International Conference on Computer Science and Artificial Intelligence (Association for Computering Machinery), 41–48. doi: 10.1145/3577530.3577537


Gadgil, S., Zhao, Q., Pfefferbaum, A., Sullivan, E. V., Adeli, E., Pohl, K. M., et al. (2020). “Spatio-temporal graph convolution for resting-state fMRI analysis,” in Medical Image Computing and computer-assisted intervention: MICCAI International Conference on Medical Image Computing and Computer-Assisted Intervention, Vol. 12267 (Cham: Springer), 528–538. doi: 10.1007/978-3-030-59728-3_52


Jie, B., Zhang, D., Gao, W., Wang, Q., Wee, C. Y., Shen, D., et al. (2014). Integration of network topological and connectivity properties for neuroimaging classification. IEEE Trans. Biomed. Eng. 61, 576–589. doi: 10.1109/TBME.2013.2284195


Jie, B., Zhang, D., Wee, C. Y., and Shen, D. (2014). Topological graph kernel on multiple thresholded functional connectivity networks for mild cognitive impairment classification. Hum. Brain Mapp. 35, 2876–2897. doi: 10.1002/hbm.22353


Kawahara, J., Brown, C. J., Miller, S. P., Booth, B. G., Chau, V., Grunau, R. E., et al. (2017). BrainNetCNN: convolutional neural networks for brain networks; towards predicting neurodevelopment. Neuroimage 146, 1038–1049. doi: 10.1016/j.neuroimage.2016.09.046


Kipf, T., and Welling, M. (2016). Semi-supervised classification with graph convolutional networks. arXiv [Preprint]. arXiv/1609.02907. doi: 10.48550/arXiv.1609.02907


Ktena, S. I., Parisot, S., Ferrante, E., Rajchl, M., Lee, M., Glocker, B., et al. (2018). Metric learning with spectral graph convolutions on brain connectivity networks. Neuroimage 169, 431–442. doi: 10.1016/j.neuroimage.2017.12.052


Li, X., Zhou, Y., Gao, S., Dvornek, N. C., Zhang, M., Zhuang, J., et al. (2020). BrainGNN: interpretable brain graph neural network for fMRI analysis. bioRxiv. doi: 10.1101/2020.05.16.100057


Liang, W., Zhang, K., Cao, P., Zhao, P., Liu, X., Yang, J., et al. (2023). “Modeling Alzheimers' disease progression from multi-task and self-supervised learning perspective with brain networks,” in International Conference on Medical Image Computing and Computer-Assisted Intervention (Cham: Springer). doi: 10.1007/978-3-031-43907-0_30


Liang, Y., Liu, B., and Zhang, H. (2021). A convolutional neural network combined with prototype learning framework for brain functional network classification of autism spectrum disorder. IEEE Trans. Neural Syst. Rehabil. Eng. 29, 2193–2202. doi: 10.1109/TNSRE.2021.3120024


Lin, K., Jie, B., Dong, P., Ding, X., Bian, W., Liu, M., et al. (2022). Convolutional recurrent neural network for dynamic functional MRI analysis and brain disease identification. Front. Neurosci. 16:933660. doi: 10.3389/fnins.2022.933660


Liu, S., Zhao, L., Zhao, J., Li, B., and Wang, S. H. (2022). Attention deficit/hyperactivity disorder classification based on deep spatio-temporal features of functional magnetic resonance imaging. Biomed. Signal Process. Control. 71:103239. doi: 10.1016/j.bspc.2021.103239


Makeig, S., Debener, S., Onton, J., and Delorme, A. (2004). Mining event-related brain dynamics. Trends Cogn. Sci. 8, 204–210. doi: 10.1016/j.tics.2004.03.008


Onton, J., Westerfield, M., Townsend, J., and Makeig, S. (2006). Imaging human EEG dynamics using independent component analysis. Neurosci. Biobehav. Rev. 30, 808–822. doi: 10.1016/j.neubiorev.2006.06.007


Plis, S. M., Hjelm, D. R., Salakhutdinov, R., Allen, E. A., Bockholt, H. J., Long, J. D., et al. (2014). Deep learning for neuroimaging: a validation study. Front. Neurosci. 8:229. doi: 10.3389/fnins.2014.00229


Thomas, R., Sanders, S., Doust, J., Beller, E., and Glasziou, P. (2015). Prevalence of attention-deficit/hyperactivity disorder: a systematic review and meta-analysis. Pediatrics 135, e994–e1001. doi: 10.1542/peds.2014-3482


Tolstikhin, I. O., Houlsby, N., Kolesnikov, A., Beyer, L., Zhai, X., Unterthiner, T., et al. (2021). MLP-mixer: an all-MLP architecture for vision. arXiv [Preprint]. arXiv:2105.01601.


Vergara, V. M., Mayer, A. R., Kiehl, K. A., and Calhoun, V. D. (2018). Dynamic functional network connectivity discriminates mild traumatic brain injury through machine learning. Neuroimage 19, 30–37. doi: 10.1016/j.nicl.2018.03.017


Wang, M., Huang, J., Liu, M., and Zhang, D. (2021). Modeling dynamic characteristics of brain functional connectivity networks using resting-state functional mri. Med. Image Anal. 71:102063. doi: 10.1016/j.media.2021.102063


Wang, M., Lian, C., Yao, D., Zhang, D., Liu, M., Shen, D., et al. (2019b). Spatial-temporal dependency modeling and network hub detection for functional MRI analysis via convolutional-recurrent network. IEEE Trans. Biomed. Eng. 67, 2241–2252. doi: 10.1109/TBME.2019.2957921


Wang, M., Zhang, D., Huang, J., Liu, M., and Liu, Q. (2022a). Consistent connectome landscape mining for cross-site brain disease identification using functional MRI. Med. Image Anal. 82:102591. doi: 10.1016/j.media.2022.102591


Wang, M., Zhang, D., Huang, J., Yap, P. T., Shen, D., Liu, M., et al. (2019a). Identifying autism spectrum disorder with multi-site fMRI via low-rank domain adaptation. IEEE Trans. Med. Imaging 39, 644–655. doi: 10.1109/TMI.2019.2933160


Wang, M., Zhu, L., Li, X., Pan, Y., and Li, L. (2023). Dynamic functional connectivity analysis with temporal convolutional network for attention deficit/hyperactivity disorder identification. Front. Neurosci. 17:1322967. doi: 10.3389/fnins.2023.1322967


Wang, W., Huang, Y., Wang, Y., and Wang, L. (2014). “Generalized autoencoder: a neural network framework for dimensionality reduction,” in 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 496–503. doi: 10.1109/CVPRW.2014.79


Wang, X., Ren, Y., and Zhang, W. (2017). Multi-task fused lasso method for constructing dynamic functional brain network of resting-state fMRI. J. Image Graph. 22, 978–987.


Wang, Z., Xin, J., Chen, Q., Wang, Z., and Wang, X. (2022). NDCN-brain: an extensible dynamic functional brain network model. Diagnostics 12:1298. doi: 10.3390/diagnostics12051298


Yan, W., Calhoun, V., Song, M., Cui, Y., Yan, H., Liu, S., et al. (2019). Discriminating schizophrenia using recurrent neural network applied on time courses of multi-site fMRI data. EBioMedicine 47, 543–552. doi: 10.1016/j.ebiom.2019.08.023


Yang, W., Xu, X., Wang, C., Mei Cheng, Y., Li, Y., Xu, S., et al. (2022). Alterations of dynamic functional connectivity between visual and executive-control networks in schizophrenia. Brain Imaging Behav. 16, 1294–1302. doi: 10.1007/s11682-021-00592-8


Yao, D., Sui, J., Wang, M., Yang, E., Jiaerken, Y., Luo, N., et al. (2021). A mutual multi-scale triplet graph convolutional network for classification of brain disorders using functional or structural connectivity. IEEE Trans. Med. Imaging 40, 1279–1289. doi: 10.1109/TMI.2021.3051604


Zhang, J., Zhou, L., and Wang, L. (2017). Subject-adaptive integration of multiple SICE brain networks with different sparsity. Pattern Recognit. 63, 642–652. doi: 10.1016/j.patcog.2016.09.024


Zhang, J., Zhou, L., Wang, L., Liu, M., and Shen, D. (2022). Diffusion kernel attention network for brain disorder classification. IEEE Trans. Med. Imaging 41, 2814–2827. doi: 10.1109/TMI.2022.3170701


Zhang, X., Liu, J., Yang, Y., Zhao, S., Guo, L., Han, J., et al. (2021). Test-retest reliability of dynamic functional connectivity in naturalistic paradigm functional magnetic resonance imaging. Hum. Brain Mapp. 43, 1463–1476. doi: 10.1002/hbm.25736


Zou, L., Zheng, J., Miao, C., McKeown, M. J., and Wang, Z. J. (2017). 3D CNN based automatic diagnosis of attention deficit hyperactivity disorder using functional and structural MRI. IEEE Access 5, 23626–23636. doi: 10.1109/ACCESS.2017.2762703


Keywords: dynamic functional connectivity, temporal dependency, local and global evolution patterns, adaptive learning, fMRI

Citation: Qiu B, Wang Q, Li X, Li W, Shao W and Wang M (2024) Adaptive spatial-temporal neural network for ADHD identification using functional fMRI. Front. Neurosci. 18:1394234. doi: 10.3389/fnins.2024.1394234

Received: 01 March 2024; Accepted: 15 May 2024;
Published: 30 May 2024.

Edited by:

Feng Liu, Tianjin Medical University General Hospital, China

Reviewed by:

Shijie Zhao, Northwestern Polytechnical University, China
Kangcheng Wang, Shandong Normal University, China
Zheyi Zhou, Beijing Normal University, China

Copyright © 2024 Qiu, Wang, Li, Li, Shao and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Mingliang Wang, wml489@nuist.edu.cn
