- 1Department of Biomedical Engineering, Yale University, New Haven, CT, United States
- 2Department of Neurology, Yale School of Medicine, New Haven, CT, United States
- 3Department of Radiology and Biomedical Imaging, Yale University, New Haven, CT, United States
Parkinson's disease (PD) is a common and complex neurodegenerative disorder with five stages on the Hoehn and Yahr scaling. Characterizing brain function alterations with progression of early stage disease would support accurate disease staging, development of new therapies, and objective monitoring of disease progression or treatment response. Functional magnetic resonance imaging (fMRI) is a promising tool in revealing functional connectivity (FC) differences and developing biomarkers in PD. While fMRI and FC data have been utilized for diagnosis of PD through application of machine learning approaches such as support vector machine and logistic regression, the characterization of FC changes in early-stage PD has not been investigated. Given the complexity and non-linearity of fMRI data, we propose the use of a long short-term memory (LSTM) network to distinguish the early stages of PD and understand related functional brain changes. The study included 84 subjects (56 in stage 2 and 28 in stage 1) from the Parkinson's Progression Markers Initiative (PPMI), the largest-available public PD dataset. Under a repeated 10-fold stratified cross-validation, the LSTM model reached an accuracy of 71.63%, 13.52% higher than the best traditional machine learning method and 11.56% higher than a CNN model, indicating significantly better robustness and accuracy compared with other machine learning classifiers. Finally, we used the learned LSTM model weights to select the top brain regions that contributed to model prediction and performed FC analyses to characterize functional changes with disease stage and motor impairment to gain better insight into the brain mechanisms of PD.
1. Introduction
Parkinson's disease (PD) is a common and complex neurodegenerative disorder (Bloem et al., 2021), affecting around 9.4 million people around the world in 2020 (Maserejian et al., 2020). According to the Hoehn and Yahr scaling, five stages of disease severity have been proposed for PD (Hoehn and Yahr, 1998). In the early stages, patients only have mild signs affecting one side (stage 1) or both sides (stage 2) of the body. Accurate disease staging is crucial for treatment planning, enrollment in clinical trials, and following disease progression.
In recent years, resting-state functional magnetic resonance imaging (rs-fMRI) has been increasingly used to investigate the brain basis of motor and non-motor symptoms, disease severity, and disease progression in PD (Tinaz, 2021). Many of these rs-fMRI studies used the functional connectivity (FC) within and between neural networks as a potential biomarker of PD pathophysiology (Prodoehl et al., 2014; Engels et al., 2018; Wang et al., 2018), but the results have been heterogeneous.
Some studies have investigated machine learning (ML) approaches in early diagnosis of PD using rs-fMRI data (Zhang, 2022). Model-based techniques such as logistic regression are strongly based on prior statistical assumptions, which may not be applicable to real data with variable dependencies (Gao et al., 2018). Model-free algorithms including support vector machine and random forest are able to adapt to inherent characteristics of the dataset with fewer assumptions (Gao et al., 2018) and outperform traditional model-based classifiers in real-world clinical applications. A support vector machine model trained on randomized logistic regression feature selection was implemented to discriminate cognitive status in PD from connection-wise FC patterns and reached an accuracy of 80.0% (Abós et al., 2017). The support vector machine analysis based on inter-group dynamic amplitude of low-frequency fluctuations in PD was found to have significantly higher classification accuracy in reference to controls (Zhang et al., 2019). The random forest algorithm has been successfully implemented for brain connectivity markers and depression and cognitive impairment in PD (Lin et al., 2020, 2021). A brain network graph analysis using rs-fMRI, which identified PD-associated brain network alterations and achieved an average accuracy of 95%, has also been proposed for diagnostic purposes in PD (Kazeminejad et al., 2017). To date, most current classification work using rs-fMRI data has mainly focused on distinguishing between PD patients and healthy control subjects (Dehsarvi and Smith, 2019; Haq et al., 2020; Vivar-Estudillo et al., 2021), but not on disease progression within early PD cohorts. We think that further characterization of the early stages of PD, e.g., understanding brain differences in stage 1 and 2, is necessary to better understand the mechanism and progression of PD.
Recently, deep learning has been successfully implemented in patient representation learning. Convolutional neural networks (CNNs; Kim, 2017) have been widely applied in medical image analysis cases (Anwar et al., 2018). A CNN analysis based on electroencephalogram (EEG) data showed high accuracy (88.25%) for detection of early PD (Oh et al., 2020). Similarly, a CNN model trained on time-frequency representation of EEG was proposed for detection of PD (Khare et al., 2021). A CNN model was trained on structural MRI data to classify PD and healthy controls by transfer learning, and achieved an accuracy of 88.9% (Sivaranjini and Sujatha, 2020). However, the huge time and computational consumption required for CNN training on 4D fMRI data presents an obstacle for its maturation in clinical practice. Furthermore, the black-box nature of CNN methods is a challenge for model interpretability, which is crucial for model utility beyond classification success. Finally, CNNs are well-suited for processing spatial information, but do not take advantage of the temporal sequence of fMRI volumes. On the other hand, the recurrent neural networks (RNN; Medsker and Jain, 2001) have the capacity to capture the temporal dynamics and execute sequential prediction efficiently. The long short-term memory (LSTM; Hochreiter and Schmidhuber, 1997) unit, a prominent variant of RNN with sophisticated gating mechanisms, is designed to overcome the vanishing gradient problem in long sequences. RNNs and LSTMs have first achieved great success in natural language processing (NLP) tasks (Young et al., 2018), and the medical application is also emerging rapidly. Using rs-fMRI datasets, the LSTM model has been investigated on autism identification (Dvornek et al., 2017) and Alzheimer's disease prediction (Hong et al., 2019). The LSTM network has been successfully implemented on voice samples (Rizvi et al., 2020) and walking patterns (Balaji et al., 2021) in PD, but these studies did not investigate the abnormalities in brain functional connectivity. The combination of CNN and RNN has also been investigated in rs-fMRI analysis for neurological disorders. For schizophrenia discrimination, a multi-scale RNN model combining CNN and RNN was proposed, reaching an accuracy of 83.2% (Yan et al., 2019). In Alzheimer's disease (AD) diagnosis, a spatiotemporal model that combines both convolutional and recurrent components was reported to improve AD classification compared with other state-of-the-art approaches (Wang et al., 2019). In major depressive disorder classification, a temporal adaptive graph convolutional network was proposed on rs-fMRI data and FC mapping, achieving a higher accuracy than other state-of-the-art methods of 73.5% (Yao et al., 2020). Nevertheless, in the field of deep learning, PD early-stage characterization has not been investigated with rs-fMRI datasets.
Currently, most rs-fMRI studies in PD enroll a relatively small number (<50) of patients. Due to the variations in data acquisition and pre-processing pipelines among different institutions, as well as the heterogeneity of PD, current research still lacks reproducibility and accuracy across independent datasets. This is certainly a concern for the data-driven approaches. However, some of these concerns can be alleviated with the landmark open data project in PD called the Parkinson's Progression Markers Initiative (PPMI; Marek et al., 2018). The PPMI has organized the first large-scale and the largest-size public multicenter clinical study to study PD progression, including advanced imaging, biologic sampling, and clinical and behavioral assessments.
In this work, we propose to characterize the early stages of PD using the rs-fMRI data obtained from the PPMI database by applying an LSTM-based model. To the best of our knowledge, this project would be the first use of LSTMs to distinguish the early stages of PD (stage 1 vs. stage 2) using rs-fMRI data. The LSTM model allows for the analysis of the raw time-series data from the brain regions of interest (ROIs), retaining more original imaging information compared to models that predict from the pre-processed FC data. We trained and validated the LSTM model under a 5-times repeated 10-fold stratified cross validation and compared the results with a CNN model and traditional machine learning classifiers that predict from FC measures. We also assessed model validity by measuring the association between model output scores and a continuous measure of motor severity in PD. Finally, we interpreted the LSTM model and highlighted the top brain regions and FC measures that contributed most to the classification of the early stages of PD.
2. Materials and Methods
2.1. Dataset and Pre-processing
All subject data were carefully selected and extracted from the public PPMI database, noting each subject's sex, age, disease onset side, and disease stage. The original PPMI study was conducted following the Declaration of Helsinki and the Good Clinical Practice (GCP) guidelines approved by the local ethics committees of the 24 participating sites in the US (18), Europe (5), and Australia (1) with informed consent obtained from all the enrolled subjects (Marek et al., 2018). Data from 90 subjects were originally curated, but six subjects were excluded due to poor image normalization or excessive head motion. Thus, a total of 84 age-matched (p>0.05, unpaired two-sample t-test) subjects who were also matched for sex and disease onset side (p>0.05, Pearson's chi-squared test) were selected, including 56 subjects at stage 2 and 28 subjects at stage 1. The demographic information of the cohort is summarized in Table 1. All rs-fMRI images were acquired for 8.5 min with TR = 2,400 ms, TE = 25 ms, flip angle = 80°, matrix = 68 ×66, and FOV = 222 mm. After the first four frames were discarded, each patient's raw rs-fMRI scan contained 206 frames in the 4D sequence (8.24 min), yielding a total of 17,304 frames in the dataset. Each single frame was saved in nifti format.
Pre-processing was performed using the CONN functional connectivity toolbox v17 (Whitfield-Gabrieli and Nieto-Castanon, 2012). The pre-processing steps included motion correction, outlier detection, normalization to the MNI template, smoothing, ROI extraction, and sequence cropping. Outliers were defined as frame-wise displacement above 0.9 mm or global signal changes above five standard deviations. The head motion data of all the subjects are displayed in Table 2, and no significant differences in head motion measures between stages were found. Each frame was aligned to the AAL-116 atlas (Tzourio-Mazoyer et al., 2002) with functional and structural simultaneous gray matter (GM)/white matter (WM)/cerebrospinal fluid (CSF) segmentation and MNI normalization. A Gaussian smoothing at 8 mm full-width at half-maximum (FWHM) was implemented. De-noising steps included correction for physiological and other sources of noise by regressing out the principal components of the white matter and cerebrospinal fluid signal using the CompCor method (Chai et al., 2012), regression of motion artifacts and outliers, and linear detrending. Global signal was not removed. Finally, data were bandpass-filtered (0.008 Hz < f <0.09 Hz) to capture the fluctuations of the blood oxygenation level-dependent signal that typically occur within this frequency range at rest. The mean of the voxel values in each ROI was used as that region's signal. Along the time dimension, each mean ROI time-series was extracted and standardized by dividing by the standard deviation among time frames to represent the relative change.
Considering the relatively small number of early stage subjects for deep learning training, data augmentation was introduced to help the model generalize better and prevent overfitting. The input time sequences were cropped with a fixed sequence length w = 50 (representing 2 min of imaging) and stride length s = 1 to move along the time dimension of the rs-fMRI series. Thus, for each subject, 156 cropped sequences were acquired, which boosted the sample size to 13,104 in total. This augmentation was applied to all the subjects in both training and testing.
2.2. LSTM Model
We aimed to investigate an LSTM model for PD early stage characterization from rs-fMRI data. We hypothesized that the early stage classification performance would improve compared with traditional ML classifiers that predict from FC measures and CNN-based models that analyze local information. The overall workflow of our project is shown in Figure 1.
LSTMs are a special type of RNN that can address the vanishing gradient and limited long-term memory problems in a vanilla RNN model, taking the previous information and the current data input to update the cell state and hidden state. The key equations in an LSTM cell are:
where at time step t, is the vector of N ROI values, is the hidden state, is the cell state, with an input gate deciding what information from the current estimated cell state is updated, a forget gate deciding how much of the previous hidden state should be discarded, and an output gate filtering the cell state to update the hidden state. W∈ℝM×N is the matrix of weights applied to the input, U∈ℝM×M is the matrix of weights applied to the previous hidden state, b∈ℝM is the bias, and σ is the sigmoid activation function.
The proposed LSTM model structure is similar to the model in Dvornek et al. (2017) and shown in Figure 2. The model takes the average time-series of ROIs as the input and then utilizes the output of each time step, which aggregates the decoded hidden state of each cell as the input of the fully connected (dense) layer with 1 node. A dropout layer with dropout rate = 0.5 was integrated between the dense layer and the sigmoid activation layer to prevent overfitting, and the final output of the sigmoid layer would be interpreted as the probability of being assigned as each stage, where higher probabilities correspond to higher likelihood of stage 2. This architecture would directly take the signal at every time point into consideration, which would improve the network's robustness handling noisy rs-fMRI data.
To evaluate the effect of a temporal analysis model, we implemented a CNN model with similar structure as in Oh et al. (2020) on the average time-series ROIs. The CNN model contains three 1-D convolutional layers with stride equal to 1 and kernel size as 7, 5, and 3, respectively. The numbers of the output channels in each layer are 16, 32, and 32, respectively. Following each convolutional layer, a max pooling layer was applied with sliding window size as 2 and stride as 2. The final output was then given by three consecutive dense layers with hidden size as 64, 32, and 1. Similarly, a dropout layer with dropout rate = 0.5 was incorporated after each dense layer to prevent overfitting. The Rectified Linear Unit (ReLU) was applied as the activation function for each layer except the sigmoid activation for the last dense layer, as the final output could be interpreted as the likelihood of being classified as each stage.
We also conducted traditional ROI-based machine learning methods in rs-fMRI analysis as the baseline. A standard pipeline is calculating the FC matrix, representing the correlation or covariance between each ROI pair (connectome), and then using the FC matrices as predictors in a traditional machine learning classifier (Dadi et al., 2019). Here, we tested random forest (RF), linear support vector machine classifier (SVC) (as in Abós et al., 2017; Zhang et al., 2019) and logistic regression (LR). The FC matrices were calculated using the Ledoit Wolf estimator (Ledoit and Wolf, 2004) for large covariance matrices, and only the upper triangle of the matrices were used as the inputs of the ML classifiers to prevent redundancy since the FC matrices are symmetric.
2.3. Model Training and Evaluation
To comprehensively evaluate the model performances, five repeats of 10-fold stratified cross validation was implemented. For each repeated run, 10% of subjects are selected as test set, 10% are selected as validation set, and the remaining 80% are the training set. Note that the random split was carried out on both stage 1 and stage 2 in order to stratify the imbalanced dataset and keep the ratio of the two stages roughly the same in all the splits. To handle the class imbalance, sample weights are assigned inversely proportional to the stage ratio in the dataset. For the LSTM model and all the ML models, the hyperparameter tuning was first performed by using the validation set to evaluate the performance of the models trained on the training set under the different hyperparameters. After the best hyperparameters were selected, the final model was trained on both the training and the validation set and then evaluated on the test set to best utilize the relatively smaller dataset.
For the LSTM model setup, the initial hidden and cell states of the LSTM were set as all zeros. The input length was set to 50 time points based on initial testing on a random cross-validation fold (details in Supplementary Material). The hidden size M of the LSTM layer was tuned from the list of [16, 32, 64, 128]. To prevent overfitting on the LSTM model, L2 regularization and the early-stopping mechanism were introduced to enhance the model's generalization ability. For both deep learning models, training was terminated when the validation loss had not reduced in 10 epochs or when the maximum epoch number 50 was reached. Both LSTM and CNN networks were trained using the cross entropy loss function, Adam optimizer with learning rate = 1e-4, batch size = 200, dropout rate fixed at 0.5, the λ of the L2 regularization at 0.01, and other default parameter settings. Both the LSTM and CNN models were implemented using PyTorch and trained on a NVIDIA GeForce GTX 1080 GPU at the Farnam Cluster of the Yale Center for Research Computing.
All the other ML models were implemented via Scikit-learn. The regularization parameter inversely proportional to the regularization strength of LR was searched in the range of [1e-9, 10] and the search range of SVC was [0.01, 10], both with step size as 1e-1 in log scale. For RF, the number of trees were tuned from the list of [10, 50, 100, 200, 500]. The maximum depth of the tree was searched from 2 to 6. The minimum number of samples to split an internal node was selected from the list of [10, 20, 50, 100, 200, 500, 1,000, 2,000, 3,000] and the searching list of the minimum number of samples in a leaf node was [10, 50, 100, 200, 300, 500]. All the other hyperparameters were set as default.
All the evaluation metrics were based on subject-level outputs, which are given by the majority vote of all the sequences from one subject. Evaluation at the subject level better matches the real clinical diagnosis of one label per subject, and the ensembling also has improved performance compared to the sample-wise results. The subject-wise accuracy, precision, recall (sensitivity), specificity, and F1 score were reported. The equations for accuracy, precision, recall, specificity, and F1 score calculation are as below,
where positive and negative refer to stage 2 and stage 1, respectively, TP is the true positive, TN is the true negative, FP is the false positive, and FN is the false negative.
The corrected repeated k-fold cross validation test (Bouckaert and Frank, 2004) was conducted as the significance test for model performance comparison. For a r-times k-fold cross-validation, the following statistic is calculated,
where xij is the difference in the statistic of interest between two models being compared from the ith fold of the jth cross-validation run, n1 is the number of subjects used for training, n2 is the number of subjects used for testing, and is the estimated variance of the differences xij. This test corrects the estimate of the variance by taking the dependency between cross-validation samples into account. The significance level α was set at 0.05.
We also compared the model outputs to a separate measure of PD severity to further assess validity of model predictions for disease staging. The Movement Disorders Society-Unified Parkinson's Disease Rating Scale (MDS-UPDRS) is the standard assessment tool for disease severity and progression of PD (Goetz et al., 2008). The MDS-UPDRS part III motor exam score (range: 0–132) rates the severity of motor impairment. Higher scores indicate worse motor impairment. The MDS-UPDRS-III scores of stage 1 subjects are 14.00 ± 5.57, and that of stage 2 subjects are 23.69 ± 10.70. We computed the Pearson correlation between the model output scores and the MDS-UPDRS-III scores to evaluate whether the stage classification model output is associated with a more continuous measure of disease severity.
2.4. Model Interpretation and Connectivity Analysis
We interpreted the LSTM model by exploring the learnable input-hidden weights W of the LSTM cell (Dvornek et al., 2017). The ROIs associated with high magnitudes of the weights that are directly applied to the ROI time-series data play a significant role in giving the output, and thus will be given high importance in the model analysis. After z-score normalization, we highlighted the ROIs with the magnitudes of the associated weights above mean and one standard deviation of the weights. These ROIs were considered as the important ROIs for early stage PD classification.
Pairwise FC analysis was then conducted based on the selected top ROIs. The covariance matrices for all subjects were computed from the top ROIs. Then for each edge, the Welch's t-test (Ruxton, 2006) was used to compare the FC for the ROI pairs between the stage 1 and stage 2 subjects, taking into consideration the different number of subjects in each stage. The significance of the Welch's t-statistic was assessed using a permutation test with 10,000 random permutations of the subject stage labels to find whether there was a significant difference in the FC between the two stages. Correction for multiple comparisons was performed by controlling the false discovery rate (FDR; Benjamini and Hochberg, 1995).
Regression analysis was performed to analyze the association between the FC of top ROIs and the MDS-UPDRS-III scores. The elastic net model (Zou and Hastie, 2005) implemented using Scikit-learn was used to predict MDS-UPDRS-III scores from the pairwise FC between top ROIs. The elastic net regularization parameters were searched using the repeated cross-validation splitting strategy (3 runs, 10 folds), and the optimal hyperparameters were then applied to the model for the entire dataset. The searching range of both α and l1_ratio is [0, 1]. The selected regularization parameters for LSTM were α = 0.1936 and l1_ratio = 0.1000.
The permutation test with 10,000 runs was again used to assess significance of the regression coefficients, where the subject MDS-UPDRS-III scores were randomly shuffled and the p-values were calculated as the percentage of permutation results with a coefficient magnitude greater than the magnitude of the original observation. The top ROI analysis results were also compared with results of similar whole-brain FC analysis to see whether the top ROIs play a dominant part in disease stage progression. All the FC differences were visualized via BrainNet Viewer (Xia et al., 2013).
3. Results
3.1. Characterization Results
Table 3 summarizes the classification results of the LSTM model, the CNN model, and all the ML classifiers. All the models have been selected with the best hyperparameters. The proposed LSTM model yielded the highest values among all the quantitative metrics, outperforming the CNN model and all the other ML methods. Under the corrected repeated k-fold cross validation test (k = 10, r = 5, n2 = 8, n1 = 76, degree of freedom = 49, and tthre = −2.010), the results of the LSTM model showed significant improvement in accuracy, F1, and recall compared to all other models. In precision and specificity, the LSTM model showed significant improvement compared to SVC.
Table 4 presents the correlation between the model output scores and the MDS-UPDRS-III scores. The SVC model, which resulted in the second highest precision and specificity for stage classification (Table 3), did not produce significant correlation of model output and MDS-UPDRS-III scores (r = 0.1748, p = 0.12). The LSTM model showed the highest correlation between the stage prediction score with MDS-UPDRS-III scores (r = 0.4270, p = 0.00003).
Table 4. Correlation between the model output scores and the MDS-UPDRS-III scores (mean ± standard deviation of 5 runs).
3.2. Brain Abnormality Detection
We detected the top ROIs related to the brain abnormality in disease development by investigating the learnable weights in the LSTM model. Table 5 displays the top ROIs with the greatest absolute weights for the overall LSTM model in descending order. The level is the computed z-score of the absolute value of the extracted weights, showing how much the region magnitude is above the mean weight magnitude of all the ROIs.
Table 5. Top ROIs and the associated brain functions with the greatest weight magnitudes in the LSTM model.
3.3. Brain Connectivity Analysis
The brain connectivity analysis was carried out first by assessing FC differences between stages 1 and 2 for all the pairwise connections between the top ROIs by using the permutation Welch's t-test. The edges that showed significant differences between the two stages are listed in Table 6, and the nodes and edges are displayed in Figure 3. All listed edges were weaker in stage 2 than in stage 1, indicating that a more advanced disease stage in PD is associated with decreased functional connectivity.
Table 6. Significantly different edges in stage 1 and 2 by permutation Welch's t-test for the LSTM model.
Figure 3. Significantly different edges in stage 1 and 2 by permutation Welch's t-test for the LSTM model. All edges showed decreased connectivity in stage 2. PoCG, postcentral gyrus; SOG, superior occipital gyrus; CAL, calcarine sulcus; ORBsupmed, orbitofrontal cortex, superior medial part; IFGo, inferior frontal gyrus, orbital part.
For motor impairment-related connectivity analysis, the elastic net regression was used to regress the MDS-UPDRS-III score on the FC edges and permutation testing was conducted to assess the significance of regression coefficients. Table 7 summarizes the significant edges of the regression results and the related direction of association with motor score. The nodes and edges are displayed in Figure 4. After applying the FDR correction with a false discovery rate of 0.2, the top three edges remained significant.
Table 7. Edges with significant weights by permutation test of the elastic net regression of MDS-UPDRS-III scores for the LSTM model.
Figure 4. The edges with significant weights by permutation t-test of elastic net regression of MDS-UPDRS-III score for the LSTM model. Red: edges with positive coefficients. Blue: edges with negative coefficients. SMA, supplementary motor area; PoCG, postcentral gyrus; SOG, superior occipital gyrus; PCG, posterior cingulate gyrus; CAL, calcarine sulcus; ORBsupmed, orbitofrontal cortex, superior medial part; IFGo, inferior frontal gyrus, orbital part; VERM10, vermis_10; CRBL6, cerebellum_6.
A conventional whole-brain FC permutation Welch's t-test and MDS-UPDRS-III score regression analysis were conducted as a comparison with the sub-ROI group analysis (details in Supplementary Figures 1, 2, respectively). Note that none of the edges detected by whole-brain analysis survived the FDR correction, potentially indicating overfitting in the traditional whole-brain analysis results.
4. Discussion
In this work, we investigated an LSTM model for early-stage PD characterization using rs-fMRI data. Under a 5-run, 10-fold repeated stratified cross-validation, the proposed LSTM model performed significantly better than the CNN model and the other traditional ML methods for the classification of stage 1 and stage 2 PD subjects. The model output scores were also better correlated with the motor severity scale. The learnable weights in the well-trained LSTM model produced meaningful interpretations. The post-hoc FC analysis revealed edges that differed significantly between stages 1 and 2 in both classification and regression analyses. The findings of potentially influential top brain regions and abnormal FC among them could provide a deeper understanding of the neuroanatomical substrates of early disease stages in PD.
The classification results suggest the LSTM model has the potential to better extract and utilize temporal information from the rs-fMRI data. The conventional ML analysis methods are FC-based, i.e., the FC matrices are the input of the ML methods instead of the original rs-fMRI series. This requires an additional step of data pre-processing, which needs additional time and computation, while it may also cause some loss in the functional information. Our proposed method directly uses the rs-fMRI series as the input, which successfully preserved brain functional information while reducing noise and redundancy. While both the LSTM and CNN models used the rs-fMRI series as input, the LSTM model performed better than the CNN model. The input gate, forget gate, and output gate in the LSTM cell are designed to handle temporal dependencies in relatively long sequences, which could successfully extract temporal information along the time dimension in a time- and computation-efficient way.
Furthermore, the LSTM model outputs produced the highest correlation with the MDS-UPDRS-III scores. A high correlation is desired, as the degree of motor impairment plays an important part in early stage distinction. Thus, the proposed LSTM model not only resulted in the best stage classification performance, but also the confidence of the LSTM model's classification produced the highest correlation with a closely related continuous rating of disease severity.
In brain abnormality detection, a diverse set of ROIs were important for predicting disease severity. Most notably, the basal ganglia structures including the lenticular nucleus and caudate, and the supplementary motor area together with the postcentral gyrus are implicated in sensorimotor impairment in PD (Ji et al., 2018). The cerebellum is involved in tremor generation (Ma et al., 2015; Zhang et al., 2016). The amygdala and insula are limbic structures involved in emotional processing and play a role in anxiety and depression in PD (Tinaz et al., 2021). The calcarine sulcus and superior occipital gyrus are visual processing areas implicated in visual symptoms of PD such as hallucinations (Bejr-kasem et al., 2019). The posterior cingulate is the major hub in the default mode network that shows abnormal FC in PD and other neurodegenerative disorders (Tinaz, 2021). Finally, the frontal regions mediate higher cognitive and executive functions and are implicated in cognitive dysfunction in PD even in the early stages and in the absence of dementia (Tinaz, 2021). Thus, the top influential ROIs for the LSTM model that play an important role in distinguishing early-stage PD are also relevant brain regions linked to motor and non-motor functions that are affected in PD.
In brain connectivity analysis, interestingly, the influential edges are mostly between the nodes related to non-motor brain functions. This finding has important clinical implications suggesting that progression even in the early stages of the disease involves FC changes in non-motor networks, underlining the importance of evaluating the severity of not only motor but also non-motor impairment in clinical progression studies and prediction models (Tinaz et al., 2021). Current clinical works investigating non-motor neuropsychiatric symptoms in PD (Alzahrani et al., 2015) and the related FC analyses (Baggio et al., 2015) supports this finding as well.
In motor impairment-related connectivity analysis, higher MDS-UPDRS-III scores indicate worse motor impairment. Therefore, the “increased” direction in Table 7 denotes a positive relationship between edge strength and motor impairment, whereas “decreased” shows the opposite relationship. The edge strength between the cerebellar vermis and sensorimotor (supplementary motor area and postcentral gyrus) and visual areas (superior occipital gyrus) is associated with worse motor impairment suggesting an reorganization of brain circuits. Similar shifts from the defective basal ganglia circuits to the cerebellar circuits have been reported in PD (Tinaz, 2021).
The brain connectivity analysis of the LSTM model was based on the top ROIs with high magnitudes of the learned weights, which showed its superiority compared with the conventional whole-brain FC analysis. While the results of the LSTM ROI analysis showed similar trends as the traditional whole-brain FC analysis in terms of highlighting similar regions with increased or decreased connectivity, the very large number of connections in the whole-brain FC analysis not only hinders interpretation but also did not survive correction, thus, were not informative. Furthermore, given the relatively smaller number of subjects, the whole-brain FC analysis could result in overfitting. Thus, these analyses may not be robust in revealing FC changes that generalize to the greater PD population. The post hoc regression results of the LSTM model using the FC between the top ROIs also highlighted the connections that may be implicated in disease severity, whereas the traditional FC analysis did not show a significant relationship between disease severity and whole-brain FC across the entire group. In conclusion, the LSTM model was able to accurately classify the two early disease stages by identifying the specific brain regions and edges that contribute strongly to disease stage and motor impairment.
There are some limitations of this work that require further investigation. First, we applied a single pipeline for the pre-processing of the rs-fMRI data without global signal regression (GSR). Other pre-processing strategies including GSR could reveal different FC results (Murphy et al., 2017). As there is not a single right way of pre-processing rs-fMRI data (Murphy et al., 2017), a future direction of this study would be investigating the effect of other pre-processing strategies such as GSR on the FC results in PD early stage characterization. Also, the current work focuses on the usage of only rs-fMRI data in early stage PD classification. Future work should introduce multimodal data into the characterization, such as task-based fMRI data, cognitive and behavioral assessments, and other biological markers. This will fully utilize each patient's clinical profile and more comprehensively assess factors that may predict disease stage, potentially improving stage classification and thus producing more fruitful features for characterizing differences between the two early stages. Another promising future direction is investigating the usage of other advanced temporal deep learning models such as gated recurrent unit (Cho et al., 2014) and transformer (Vaswani et al., 2017) networks. By implementing various network backbone structures, the classification performance is expected to further improve with insightful biomarker findings.
5. Conclusion
We proposed the usage of an LSTM model for early stage PD characterization using rs-fMRI data from the majority of the PPMI dataset. Under the repeated stratified cross-validation, the LSTM model significantly outperformed the CNN model and other ML methods in accuracy, F1 score, and recall, with the highest correlation between model output scores and the MDS-UPDRS-III motor scores. The LSTM model interpretation results suggested the highly influential ROIs in early stage PD progression, and the brain connectivity analysis results identified prominent edges in brain FC changes and the disease severity. The propounded regions and edges are related to the symptoms of PD, which supports the validity of our proposed LSTM model for early stage characterization. Identification of brain regions and functional connections affected by early PD progression could potentially help unravel the mechanisms of PD and facilitate the development of new therapeutic targets.
Data Availability Statement
The data analyzed in this study was obtained from Parkinson's Progression Markers Initiative (PPMI), the following licenses/restrictions apply: Investigators seeking access to PPMI data must sign the Data Use Agreement, submit an Online Application and comply with the study Publications Policy. Requests to access these datasets should be directed to PPMI, https://www.ppmi-info.org/access-data-specimens/download-data.
Ethics Statement
The original PPMI study was conducted following the Declaration of Helsinki and the Good Clinical Practice (GCP) guidelines approved by the Local Ethics Committees of the 24 participating sites in the US (18), Europe (5), and Australia (1) with informed consent obtained from all the enrolled subjects. The patients/participants provided their written informed consent to participate in this study.
Author Contributions
XG: conceptualization, methodology, software, validation, formal analysis, visualization, and writing—original draft, review, and editing. ST: conceptualization, data curation, and writing—review and editing. ND: conceptualization, methodology, writing—review and editing, and supervision. All authors contributed to manuscript revision, read, and approved the submitted version.
Funding
XG was supported by the Biomedical Engineering Ph.D. fellowship at Yale University. ST was supported by the National Institute of Neurological Disorders and Stroke (Grant Number K23NS099478).
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher's Note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Acknowledgments
We thank the Yale Center for Research Computing for use of the research computing infrastructure, specifically the Farnam cluster. Data used in the preparation of this article were obtained from the Parkinson's Progression Markers Initiative (PPMI) database (https://www.ppmi-info.org/access-data-specimens/download-data). For up-to-date information on the study, visit www.ppmi-info.org. PPMI—a public-private partnership—was funded by the Michael J. Fox Foundation for Parkinson's Research funding partners 4D Pharma, Abbvie, Acurex Therapeutics, Allergan, Amathus Therapeutics, ASAP, Avid Radiopharmaceuticals, Bial Biotech, Biogen, BioLegend, Bristol-Myers Squibb, Calico, Celgene, Dacapo Brain Science, Denali, The Edmond J. Safra Foundaiton, GE Healthcare, Genentech, GlaxoSmithKline, Golub Capital, Handl Therapeutics, Insitro, Janssen Neuroscience, Lilly, Lundbeck, Merck, Meso Scale Discovery, Neurocrine Biosciences, Pfizer, Piramal, Prevail, Roche, Sanofi Genzyme, Servier, Takeda, Teva, UCB, Verily, and Voyager Therapeutics.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnimg.2022.952084/full#supplementary-material
References
Abós, A., Baggio, H. C., Segura, B., García-Díaz, A. I., Compta, Y., Martí, M. J., et al. (2017). Discriminating cognitive status in Parkinson's disease through functional connectomics and machine learning. Sci. Rep. 7:45347. doi: 10.1038/srep45347
Alzahrani, H., Venneri, A. (2015). Cognitive and neuroanatomical correlates of neuropsychiatric symptoms in Parkinson's disease: a systematic review. J. Neurol. Sci. 356, 32–44. doi: 10.1016/j.jns.2015.06.037
Anwar, S. M., Majid, M., Qayyum, A., Awais, M., Alnowami, M., Khan, M. K. (2018). Medical image analysis using convolutional neural networks: a review. J. Med. Syst. 42:226. doi: 10.1007/s10916-018-1088-1
Baggio, H. C., Segura, B., Garrido-Millan, J. L., Marti, M.-J., Compta, Y., Valldeoriola, F., et al. (2015). Resting-state frontostriatal functional connectivity in Parkinson's disease–related apathy. Mov. Disord. 30, 671–679. doi: 10.1002/mds.26137
Balaji, E., Brindha, D., Elumalai, V. K., Vikrama, R. (2021). Automatic and non-invasive Parkinson's disease diagnosis and severity rating using LSTM network. Appl. Soft Comput. 108:107463.
Bejr-kasem, H., Pagonabarraga, J., Martínez-Horta, S., Sampedro, F., Marín-Lahoz, J., Horta-Barba, A., et al. (2019). Disruption of the default mode network and its intrinsic functional connectivity underlies minor hallucinations in Parkinson's disease. Mov. Disord. 34, 78–86. doi: 10.1002/mds.27557
Benjamini, Y., Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B 57, 289–300.
Bloem, B. R., Okun, M. S., Klein, C. (2021). Parkinson's disease. Lancet 397, 2284–2303. doi: 10.1016/S0140-6736(21)00218-X
Bouckaert, R. R., Frank, E. (2004). “Evaluating the replicability of significance tests for comparing learning algorithms,” in Pacific-Asia Conference on Knowledge Discovery and Data Mining (Sydney, NSW: Springer), 3–12.
Chai, X., Castañón, A., Öngür, D., Whitfield-Gabrieli, S. (2012). Anticorrelations in resting state networks without global signal regression. Neuroimage 59, 1420–1428. doi: 10.1016/j.neuroimage.2011.08.048
Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., et al. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078. doi: 10.48550/arXiv.1406.1078
Dadi, K., Rahim, M., Abraham, A., Chyzhyk, D., Milham, M., Thirion, B., et al. (2019). Benchmarking functional connectome-based predictive models for resting-state fMRI. NeuroImage 192, 115–134. doi: 10.1016/j.neuroimage.2019.02.062
Dehsarvi, A., Smith, S. (2019). Classification of resting-state fMRI using evolutionary algorithms: towards a brain imaging biomarker for Parkinson's disease. arXiv preprint arXiv:1910.05378. doi: 10.48550/arXiv.1910.05378
Dvornek, N. C., Ventola, P., Pelphrey, K. A., Duncan, J. S. (2017). “Identifying autism from resting-state fMRI using long short-term memory networks,” in International Workshop on Machine Learning in Medical Imaging (Quebec City, QC: Springer), 362–370.
Engels, G., Vlaar, A., McCoy, B., Scherder, E., Douw, L. (2018). Dynamic functional connectivity and symptoms of Parkinson's disease: a resting-state fMRI study. Front. Aging Neurosci. 10:388. doi: 10.3389/fnagi.2018.00388
Gao, C., Sun, H., Wang, T., Tang, M., Bohnen, N. I., Müller, M. L. T. M., et al. (2018). Model-based and model-free machine learning techniques for diagnostic prediction and classification of clinical outcomes in Parkinson's disease. Sci. Rep. 8:7129. doi: 10.1038/s41598-018-24783-4
Goetz, C. G., Tilley, B. C., Shaftman, S. R., Stebbins, G. T., Fahn, S., Martinez-Martin, P., et al. (2008). Movement Disorder Society-sponsored revision of the Unified Parkinson's Disease Rating Scale (MDS-UPDRS): scale presentation and clinimetric testing results. Mov. Disord. 23, 2129–2170. doi: 10.1002/mds.22340
Haq, N., Cai, J., Yu, T., McKeown, M. J., Wang, Z. J. (2020). “Parkinson's disease detection from fMRI-derived brainstem regional functional connectivity networks,” in International Conference on Medical Image Computing and Computer-Assisted Intervention (Lima), 33–43.
Hong, X., Lin, R., Yang, C., Zeng, N. (2019). Predicting Alzheimer's disease using LSTM. IEEE Access 7, 80893–80901. doi: 10.1109/ACCESS.2019.2919385
Ji, G.-J., Hu, P., Liu, T.-T., Li, Y., Chen, X., Zhu, C., et al. (2018). Functional connectivity of the corticobasal ganglia–thalamocortical network in Parkinson disease: a systematic review and meta-analysis with cross-validation. Radiology 287, 973–982. doi: 10.1148/radiol.2018172183
Kazeminejad, A., Golbabaei, S., Soltanian-Zadeh, H. (2017). “Graph theoretical metrics and machine learning for diagnosis of Parkinson's disease using rs-fMRI,” in 2017 Artificial Intelligence and Signal Processing Conference (AISP), 2017 (Shiraz), 134–139.
Khare, S. K., Bajaj, V., Acharya, U. R. (2021). PDCNNet: an automatic framework for the detection of Parkinson's disease using EEG signals. IEEE Sensors J. 21, 17017–17024. doi: 10.1109/JSEN.2021.3080135
Kim, P (2017). “Convolutional neural network,” in MATLAB Deep Learning (Berkeley, CA: Springer; Apress), 121–147. doi: 10.1007/978-1-4842-2845-6_6
Ledoit, O., Wolf, M. (2004). A well-conditioned estimator for large-dimensional covariance matrices. J. Multivar. Anal. 88, 365–411. doi: 10.1016/S0047-259X(03)00096-4
Lin, H., Cai, X., Zhang, D., Liu, J., Na, P., Li, W. (2020). Functional connectivity markers of depression in advanced Parkinson's disease. NeuroImage 25:102130. doi: 10.1016/j.nicl.2019.102130
Lin, H., Liu, Z., Yan, W., Zhang, D., Liu, J., Xu, B., et al. (2021). Brain connectivity markers in advanced Parkinson's disease for predicting mild cognitive impairment. Eur. Radiol. 31, 9324–9334. doi: 10.1007/s00330-021-08086-3
Ma, H., Chen, H., Fang, J., Gao, L., Ma, L., Wu, T., et al. (2015). Resting-state functional connectivity of dentate nucleus is associated with tremor in Parkinson's disease. J. Neurol. 262, 2247–2256. doi: 10.1007/s00415-015-7835-z
Marek, K., Chowdhury, S., Siderowf, A., Lasch, S., Coffey, C. S., Caspell-Garcia, C., et al. (2018). The Parkinson's progression markers initiative (PPMI)-establishing a PD biomarker cohort. Ann. Clin. Transl. Neurol. 5, 1460–1477. doi: 10.1002/acn3.644
Maserejian, N., Vinikoor-Imler, L., Dilley, A. (2020). Estimation of the 2020 global population of Parkinson's disease (PD). Mov. Disord. 35(Suppl. 1).
Murphy, K., Fox, M. D. (2017). Towards a consensus regarding global signal regression for resting state functional connectivity MRI. Neuroimage 154, 169–173. doi: 10.1016/j.neuroimage.2016.11.052
Oh, S. L., Hagiwara, Y., Raghavendra, U., Yuvaraj, R., Arunkumar, N., Murugappan, M., et al. (2020). A deep learning approach for Parkinson's disease diagnosis from EEG signals. Neural Comput. Appl. 32, 10927–10933. doi: 10.1007/s00521-018-3689-5
Prodoehl, J., Burciu, R., Vaillancourt, D. (2014). Resting state functional magnetic resonance imaging in Parkinson's disease. Curr. Neurol. Neurosci. Rep. 14:448. doi: 10.1007/s11910-014-0448-6
Rizvi, D. R., Nissar, I., Masood, S., Ahmed, M., Ahmad, F. (2020). An LSTM based deep learning model for voice-based detection of Parkinson's disease. Int. J. Adv. Sci. Technol. 29, 337–343. Retrieved from: http://sersc.org/journals/index.php/IJAST/article/view/7166
Ruxton, G. D (2006). The unequal variance t-test is an underused alternative to Student's t-test and the Mann-Whitney U test. Behav. Ecol. 17, 688–690. doi: 10.1093/beheco/ark016
Sivaranjini, S., Sujatha, C. (2020). Deep learning based diagnosis of Parkinson's disease using convolutional neural network. Multim. Tools Appl. 79, 15467–15479. doi: 10.1007/s11042-019-7469-8
Tinaz, S (2021). Functional connectome in Parkinson's disease and parkinsonism. Curr. Neurol. Neurosci. Rep. 21, 1–14. doi: 10.1007/s11910-021-01111-4
Tinaz, S., Kamel, S., Aravala, S. S., Sezgin, M., Elfil, M., Sinha, R. (2021). Distinct neural circuits are associated with subclinical neuropsychiatric symptoms in Parkinson's disease. J. Neurol. Sci. 423:117365. doi: 10.1016/j.jns.2021.117365
Tzourio-Mazoyer, N., Landeau, B., Papathanassiou, D., Crivello, F., Etard, O., Delcroix, N., et al. (2002). Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. NeuroImage 15, 273–289. doi: 10.1006/nimg.2001.0978
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., et al. (2017). “Attention is all you need,” in Advances in Neural Information Processing Systems (Long Beach, CA), 5998–6608.
Vivar-Estudillo, G., Hajari, N., Ibarra-Manzano, M., Cheng, I. (2021). “Parkinson's disease detection and diagnosis from fMRI: a literature review,” in International Conference on Human-Computer Interaction (Washington, DC), 630–638.
Wang, M., Lian, C., Yao, D., Zhang, D., Liu, M., Shen, D. (2019). Spatial-temporal dependency modeling and network hub detection for functional MRI analysis via convolutional-recurrent network. IEEE Trans. Biomed. Eng. 67, 2241–2252. doi: 10.1109/TBME.2019.2957921
Wang, Z., Jia, X., Chen, H., Feng, T., Wang, H. (2018). Abnormal spontaneous brain activity in early Parkinson's disease with mild cognitive impairment: a resting-state FMRI study. Front. Physiol. 9:1093. doi: 10.3389/fphys.2018.01093
Whitfield-Gabrieli, S., Nieto-Castanon, A. (2012). Conn: a functional connectivity toolbox for correlated and anticorrelated brain networks. Brain Connect. 2, 125–141. doi: 10.1089/brain.2012.0073
Xia, M., Wang, J., He, Y. (2013). BrainNet viewer: a network visualization tool for human brain connectomics. PLoS ONE 8:e68910. doi: 10.1371/journal.pone.0068910
Yan, W., Calhoun, V., Song, M., Cui, Y., Yan, H., Liu, S., et al. (2019). Discriminating schizophrenia using recurrent neural network applied on time courses of multi-site FMRI data. EBioMedicine 47, 543–552. doi: 10.1016/j.ebiom.2019.08.023
Yao, D., Sui, J., Yang, E., Yap, P.-T., Shen, D., Liu, M. (2020). “Temporal-adaptive graph convolutional network for automated identification of major depressive disorder using resting-state fMRI,” in International Workshop on Machine Learning in Medical Imaging (Hangzhou), 1–10.
Young, T., Hazarika, D., Poria, S., Cambria, E. (2018). Recent trends in deep learning based natural language processing. IEEE Comput. Intell. Mag. 13, 55–75. doi: 10.48550/arXiv.1708.02709
Zhang, C., Dou, B., Wang, J., Xu, K., Zhang, H., Sami, Z. U., et al. (2019). Dynamic alterations of spontaneous neural activity in Parkinson's disease: a resting-state fMRI study. Front. Neurol. 10:1052. doi: 10.3389/fneur.2019.01052
Zhang, J (2022). Mining imaging and clinical data with machine learning approaches for the diagnosis and early detection of Parkinson's disease. NPJ Parkinson's Dis. 8, 1–15. doi: 10.1038/s41531-021-00266-8
Zhang, J.-R., Feng, T., Hou, Y.-N., Chan, P., Wu, T. (2016). Functional connectivity of the corticobasal ganglia–thalamocortical network in Parkinson disease: a systematic review and meta-analysis with cross-validation. CNS Neurosci. Therap. 22, 378–386. doi: 10.1111/cns.12512
Keywords: Parkinson's disease, early-stage characterization, fMRI, long short-term memory, deep learning, functional connectivity
Citation: Guo X, Tinaz S and Dvornek NC (2022) Characterization of Early Stage Parkinson's Disease From Resting-State fMRI Data Using a Long Short-Term Memory Network. Front. Neuroimaging 1:952084. doi: 10.3389/fnimg.2022.952084
Received: 24 May 2022; Accepted: 22 June 2022;
Published: 13 July 2022.
Edited by:
Qingyu Zhao, Stanford University, United StatesReviewed by:
Jiahong Ouyang, Stanford University, United StatesDongren Yao, Massachusetts General Hospital and Harvard Medical School, United States
Weizheng Yan, Georgia State University, United States
Copyright © 2022 Guo, Tinaz and Dvornek. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Nicha C. Dvornek, nicha.dvornek@yale.edu