Bridging structural MRI with cognitive function for individual level classification of early psychosis via deep learning

Wen, Yang; Zhou, Chuan; Chen, Leiting; Deng, Yu; Cleusix, Martine; Jenni, Raoul; Conus, Philippe; Do, Kim Q.; Xin, Lijing

doi:10.3389/fpsyt.2022.1075564

ORIGINAL RESEARCH article

Front. Psychiatry, 10 January 2023

Sec. Schizophrenia

Volume 13 - 2022 | https://doi.org/10.3389/fpsyt.2022.1075564

Yang Wen¹,²,³†

Chuan Zhou¹,⁴†

Leiting Chen¹,⁴

Yu Deng⁵

Martine Cleusix⁶

Raoul Jenni⁶

Philippe Conus⁷

Kim Q. Do⁶

Lijing Xin²*

¹Key Laboratory of Digital Media Technology of Sichuan Province, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
²Animal Imaging and Technology Core, Center for Biomedical Imaging, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
³Laboratory for Functional and Metabolic Imaging, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
⁴Institute of Electronic and Information Engineering of UESTC in Guangdong, Dongguan, Guangdong, China
⁵Department of Biomedical Engineering, King's College London, London, United Kingdom
⁶Department of Psychiatry, Center for Psychiatric Neuroscience, Centre Hospitalier Universitaire Vaudois and University of Lausanne, Lausanne, Switzerland
⁷Service of General Psychiatry, Department of Psychiatry, Centre Hospitalier Universitaire Vaudois and University of Lausanne, Lausanne, Switzerland

Introduction: Recent efforts have been made to apply machine learning and deep learning approaches to the automated classification of schizophrenia using structural magnetic resonance imaging (sMRI) at the individual level. However, these approaches are less accurate in early psychosis (EP) because structural brain changes are mild at the early stage. As cognitive impairment is a main feature of psychosis, in this study we apply a multi-task deep learning framework using sMRI with inclusion of cognitive assessment to facilitate the classification of patients with EP from healthy individuals.

Method: Unlike previous studies, we used sMRI as the direct input to perform EP classification and cognitive estimation. The proposed deep learning model does not require time-consuming volumetric or surface-based analysis and can additionally provide cognition predictions. Experiments were conducted on an in-house data set with 77 subjects and a public ABCD HCP-EP data set with 164 subjects.

Results: We achieved 74.9 ± 4.3% five-fold cross-validated accuracy and an area under the curve of 71.1 ± 4.1% on EP classification with the inclusion of cognitive estimations.

Discussion: We reveal the feasibility of automated cognitive estimation using sMRI by deep learning models, and also demonstrate the implicit adoption of cognitive measures as additional information to facilitate EP classifications from healthy controls.

1. Introduction

Artificial intelligence (AI) approaches, particularly machine learning (ML) and deep learning (DL), have been extensively studied to accelerate medical data analysis and assist clinical interventions in many pathological contexts (1, 2). Many applications in psychiatric disorders have combined neuroimaging measures [e.g., sMRI (3)] as input with AI models (e.g., support vector machines and artificial neural networks) to establish automated diagnostic workflows at the single-subject level (4, 5). Previous machine learning works in schizophrenia have used handcrafted features extracted from sMRI data to distinguish patients from healthy individuals (6), but such feature extraction usually involves a long computational time. To reduce computational cost, recent efforts have focused on directly using sMRI images as input, and promising results have been achieved with the help of the latest AI models (e.g., convolutional neural networks, CNNs) (7, 8). However, these studies have mainly focused on patients at the chronic stage. The classification of early psychosis (EP) patients from healthy controls (HCs) is considered to be more challenging (9–12) because the brain structural changes in patients with EP are mild, making computer-aided classification methods less robust and accurate.

Furthermore, progressive cognitive deficit is one major feature of schizophrenia (13–15), inspiring the possibility of using individual cognition levels, in addition to sMRI images, to facilitate automated classification of patients with EP from HCs. Several recent studies have used the DL framework to incorporate cognitive estimation into the workflow to facilitate the diagnosis of Alzheimer's disease by explicitly including cognitive measures as secondary inputs (16, 17). However, this approach requires additional cognitive assessment that is not part of routine neuropsychiatric clinical examinations. Moreover, although several studies have been done using sMRI images to identify individual cognitive impairments via DL (18, 19), to the best of our knowledge, no study has been done to incorporate cognition estimation for classifying patients with EP and controls.

Therefore, in this study, we aim to apply a multi-task DL model that uses sMRI as input to classify patients with EP from healthy controls and to simultaneously predict cognition levels at the single-subject level. We further investigated whether the inclusion of cognitive level estimation could facilitate the classification of patients with EP and controls. Specifically, as shown in Figure 1B, a three-dimensional convolutional neural network (3D-CNN) is used to learn discriminative structural features directly from sMRI arrays. Then, three multilayer perceptron (MLP) subbranches are used to perform EP/HC classification and cognition estimation. We evaluate the proposed model on an in-house data set consisting of 77 sMRI 3D arrays (38 patients with EP, 39 HCs).

Figure 1. Illustrations of (A) our workflow for classification of patients with EP from HCs and cognitive estimation on six dimensions, and (B) the deep learning architecture with a 3D-CNN feature encoder and three independent MLP subbranches for different subtasks of EP classification and cognitive estimations.

2. Materials and methods

2.1. Problem setup

As shown in Figure 1A, given the sMRI image, we seek to estimate each participant's cognitive level and classify patients with EP from healthy individuals in a fully automated manner. Unlike previous studies (16, 20–22), we directly utilized sMRI images as input without additional imaging analysis (e.g., voxel-based morphometry), which allowed us to examine more directly how brain structure itself contributes to EP classification and cognition estimation.

2.2. Materials and data set

2.2.1. Participants

sMRI data and corresponding neurocognitive scores were obtained from the Department of Psychiatry at the Lausanne University Hospital (CHUV). The data set consists of 38 patients with EP and 39 healthy controls (HC). Detailed demographic information of all participants is shown in Table 1. Specifically, the Positive and Negative Syndrome Scale (PANSS) was provided as the sum of positive, negative and general PANSS values. The patients with EP were recruited from the TIPP Program (Treatment and Early Intervention in Psychosis Program, University Hospital, Lausanne, Switzerland) (23). All the participants provided informed written consent for this study, and the procedure was approved by the local Ethics Committee (Commission cantonale d'éthique de la recherche sur l'être humain - CER-VD), in accordance with the Declaration of Helsinki. Detailed recruitment criteria for participants can be found in Supplementary material A.

Table 1. Demographic information and neurocognition performance of 77 subjects.

2.2.2. Structural MRI acquisition

Patients and controls underwent magnetic resonance imaging on a 7 Tesla/68 cm MR scanner (Siemens Medical Solutions, Erlangen, Germany). A 32-channel receive coil (NOVA Medical Inc., MA) with a single channel volume transmit coil was used. 3D T1-weighted MR images were acquired using MP2RAGE (TE/TR = 1.87/5,500 ms, TI1/TI2 = 750/2,350 ms, α1/α2 = 4°/5°, slice thickness = 1 mm, FOV = 240 × 256 × 160 mm³, voxel size = 1 mm³ isotropic, bandwidth = 240 Hz/Px) (24). The original dimensions of the acquired sMRI data array were 240 × 256 × 160.

2.2.3. Preprocessing

To generate appropriate inputs, we preprocessed the sMRI data using the CAT12 toolkit to estimate the probability maps of white matter (WM) and gray matter (GM). Skull stripping and registration to standard space with the MNI152 template were performed. Then, probability maps of WM and GM were generated after tissue segmentation and bias correction. The resulting WM and GM probability maps were down-sampled to 120 × 120 × 120 for computational efficiency.

2.2.4. Neurocognitive measures

The MATRICS Consensus Cognitive Battery (MCCB) (25, 26) was administered to both EP and HC groups, excluding the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT), which does not “translate” well into French as an index of social cognition. The neurocognitive measures include six dimensions, i.e., processing speed (PSp), vigilance (Vig), working memory (WMe), verbal learning (VeL), visual learning (ViL) and problem solving (PSo). The cognitive assessment data contained some missing entries, so we replaced all missing data with the average values to generate proper training data (17). The numbers of missing entries were: PSp 5, Vig 3, WMe 4, VeL 1, ViL 1, and PSo 1. There was at most one missing cognitive dimension per subject. The distribution of scores for all cognitive dimensions is shown in Table 1. A two-tailed Student's t-test was performed between the two groups, and significant differences were found in PSp, VeL and ViL (p < 0.05).
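The mean-imputation step described above can be sketched as follows. This is a minimal illustration, assuming that missing entries are represented as `None` and that each cognitive dimension is filled with the mean of its observed values across all subjects (the text does not state whether a pooled or per-group mean was used). The function name `impute_with_mean` and the toy scores are hypothetical.

```python
from statistics import mean
from typing import List, Optional


def impute_with_mean(scores: List[Optional[float]]) -> List[float]:
    """Replace None entries with the mean of the observed entries."""
    observed = [s for s in scores if s is not None]
    if not observed:
        raise ValueError("cannot impute a dimension with no observed scores")
    fill = mean(observed)
    return [fill if s is None else s for s in scores]


# Toy scores for one cognitive dimension (hypothetical values, not study data).
processing_speed = [42.0, None, 55.0, 47.0]
imputed = impute_with_mean(processing_speed)
print(imputed)  # the missing entry becomes the mean of 42, 55 and 47
```

Because at most one dimension is missing per subject, imputing each dimension independently, as above, never has to fill an entire subject record.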

As pointed out by previous studies (19, 27), the estimation of cognitive level can be done by either classification or regression. In the classification task, continuous scores are manually divided into discrete categories and the model predicts the probability that each case belongs to each category, whereas regression is a direct prediction of scores. In this study, for the classification task, we evenly divided the range between the minimum and maximum scores into n equal parts, i.e., n categories. It is worth noting that since the maximum and minimum values are different for each cognitive assessment, the width of the category intervals also differs between assessments. Normally, a larger n represents a finer separation of cognitive levels and greater difficulty in prediction.
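The equal-width categorization described above can be sketched as follows. This is an illustrative sketch only: the function name `bin_scores`, the zero-based category labels, and the choice to place the maximum score in the top category are assumptions, and the toy scores are hypothetical.

```python
from typing import List


def bin_scores(scores: List[float], n: int) -> List[int]:
    """Map continuous scores to n equal-width categories labeled 0..n-1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores identical: every subject falls into the same category.
        return [0 for _ in scores]
    width = (hi - lo) / n
    # The maximum score lies on the upper edge, so clamp it into the last bin.
    return [min(int((s - lo) / width), n - 1) for s in scores]


# Hypothetical scores for one cognitive dimension.
scores = [20.0, 35.0, 50.0, 65.0, 80.0]
print(bin_scores(scores, 2))  # two-categorized task
print(bin_scores(scores, 3))  # three-categorized task
```

Since the bin edges depend on each dimension's own minimum and maximum, the same raw score can fall into different categories for different cognitive dimensions.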

2.2.5. External data set

Besides the in-house data set, we also performed experiments on a second data set, which comes from the HCP-Early Psychosis (HCP-EP) Release 1.1 project of the Human Connectome Projects¹ and the Adolescent Brain Cognitive Development℠ (ABCD) Study, held in the NIMH Data Archive (NDA). Detailed recruiting criteria can be found on this website.² Detailed cognition measuring methods can be found on this website.³ Only sMRI data were used as input. For the estimation of cognition, we chose six dimensions of cognitive measures that were closest to our in-house dataset, which are the age-adjusted scores from the NIH Toolbox Dimensional Change Card Sort Test, NIH Toolbox Flanker Inhibitory Control and Attention Test, NIH Toolbox List Sorting Working Memory Test, Pattern Comparison Processing Speed Test, Seidman Auditory CPT test and NIH Toolbox Picture Vocabulary Test. After filtering, a total of 164 subjects had both sMRI data and cognitive scores.

2.3. Proposed method

2.3.1. 3D-CNN multi-task learning framework

In this study, 3D sMRI arrays were directly used as input for classification, so we applied 3D-CNN models as the deep learning architecture to encode visual features, similar to previous studies (8, 28, 29). Instead of dividing the sMRI array into 2D images and using a 2D-CNN (18, 30) for feature encoding, a 3D-CNN can consider all inputs at once to better capture local features in 3D space and contribute to the final classification.

To predict both the cognitive level and the probability of EP for each participant, we further introduced a multi-task learning framework. Based on the same visual features extracted by the 3D-CNN, three independent MLP networks were used as individual subbranches for different tasks, including EP classification, cognitive level classification (CLC) and cognitive level regression (CLR). The complete architecture of our 3D-CNN encoder and multi-task learning framework is depicted in Figure 1B and corresponding details are provided in Table S7 in Supplementary material D. The sequential structure of our 3D-CNN encoder was inspired by the previous study on schizophrenia classification (8).

2.3.2. Multi-channel 3D array input

We treat the GM and WM probability maps as two feature channels and concatenate them along the channel dimension to generate a single 3D array as the input to our model. Unlike a previous study (8), where different segmentation components were used as multiple inputs and fed into a model in parallel, our multi-channel 3D array helps to reduce the number of trainable parameters while retaining all the information from GM and WM. In this case, the dimension of the input 3D array is H × W × D × 2, where H, W and D denote height, width and depth, and 2 is the number of channels. The full volume of size 120 × 120 × 120 × 2, rather than smaller volume patches, was used for training and testing. Furthermore, in experiments where only GM or WM is used for training, the single probability map is duplicated to preserve the dimensionality of the input 3D array.
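The channel layout described above can be illustrated on a toy volume. This is a minimal sketch using nested Python lists so that it runs without dependencies; in practice an array library would be used. The helper names `stack_channels` and `shape_of` and all values are hypothetical.

```python
from typing import List, Tuple

Volume = List[List[List[float]]]  # indexed as [h][w][d]


def stack_channels(first: Volume, second: Volume) -> list:
    """Combine two H x W x D volumes into one H x W x D x 2 nested list."""
    return [
        [[[a, b] for a, b in zip(row_a, row_b)] for row_a, row_b in zip(plane_a, plane_b)]
        for plane_a, plane_b in zip(first, second)
    ]


def shape_of(array: list) -> Tuple[int, ...]:
    """Return the nested-list dimensions, assuming a regular (non-ragged) array."""
    dims = []
    while isinstance(array, list):
        dims.append(len(array))
        array = array[0]
    return tuple(dims)


# Toy 2 x 2 x 2 probability maps (hypothetical); the study uses 120 x 120 x 120.
gm = [[[0.9, 0.8], [0.7, 0.6]], [[0.5, 0.4], [0.3, 0.2]]]
wm = [[[0.1, 0.2], [0.3, 0.4]], [[0.5, 0.6], [0.7, 0.8]]]

two_channel = stack_channels(gm, wm)  # GM and WM as separate channels
gm_only = stack_channels(gm, gm)      # single map duplicated to keep two channels

print(shape_of(two_channel))
print(two_channel[0][0][0])
```

Duplicating the single map keeps the input shape identical across the GM-only, WM-only, and GM + WM experiments, so the same network definition can be reused.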

2.3.3. End-to-end training

Our framework is an end-to-end deep learning system and thus several loss functions were used to train the proposed model for parameter updating. Specifically, for classification tasks (i.e., EP and cognitive level classification), the conventional cross entropy (CE) loss is used, which is defined as

$$L_{CE} = -\sum_{i=1}^{c} s^{i} \log(\hat{s}^{i}), \qquad (1)$$

where s is the true label, ŝ is the prediction, and c is the number of classes. For the task of cognition regression, the mean square error (MSE) loss is used, which is defined as

$$L_{MSE} = \| g - \hat{g} \|_{2}^{2}, \qquad (2)$$

where g and ĝ denote ground truth label and prediction, respectively. The final loss function is defined as:

$$L_{loss} = L_{CE-SZ} + L_{CE-C} + L_{MSE} + L_{reg}, \qquad (3)$$

where $L_{CE-SZ}$ denotes the CE loss for EP classification, $L_{CE-C}$ denotes the CE loss for cognitive level classification, $L_{MSE}$ denotes the MSE loss for cognitive level regression, and $L_{reg}$ represents the regularization loss [or weight decay (31)] used to avoid overfitting. As an end-to-end framework, training losses are back-propagated from the three multi-task subbranches to the 3D-CNN, updating the parameters of the entire network with an optimization algorithm [e.g., Adam (32)]. Finally, by minimizing $L_{loss}$, the network can learn a nonlinear mapping from the input 3D sMRI array to EP and cognitive state, enabling EP classification and cognitive estimation for unseen individuals.
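To make Equations (1)–(3) concrete, the following sketch evaluates the combined loss for a single subject in plain Python. It is not the authors' implementation: the L2 form of the regularization term, the small `eps` guard inside the logarithm, the use of a single cognitive dimension, and all numerical values are assumptions made for illustration.

```python
import math
from typing import Sequence


def cross_entropy(target: Sequence[float], predicted: Sequence[float]) -> float:
    """Eq. (1): -sum_i s^i * log(s_hat^i), with a one-hot target."""
    eps = 1e-12  # avoids log(0); a numerical guard, not part of Eq. (1)
    return -sum(s * math.log(max(p, eps)) for s, p in zip(target, predicted))


def squared_error(target: Sequence[float], predicted: Sequence[float]) -> float:
    """Eq. (2): squared L2 norm of (g - g_hat)."""
    return sum((g - p) ** 2 for g, p in zip(target, predicted))


def l2_penalty(weights: Sequence[float], rate: float) -> float:
    """Assumed form of L_reg: rate times the sum of squared weights."""
    return rate * sum(w * w for w in weights)


def total_loss(ep_true, ep_pred, cog_true, cog_pred, score_true, score_pred, weights, rate):
    """Eq. (3): unweighted sum of the four loss terms."""
    return (
        cross_entropy(ep_true, ep_pred)        # L_{CE-SZ}
        + cross_entropy(cog_true, cog_pred)    # L_{CE-C}
        + squared_error(score_true, score_pred)  # L_{MSE}
        + l2_penalty(weights, rate)            # L_{reg}
    )


# Hypothetical values for one subject and one cognitive dimension.
loss = total_loss(
    ep_true=[0.0, 1.0], ep_pred=[0.2, 0.8],
    cog_true=[1.0, 0.0], cog_pred=[0.6, 0.4],
    score_true=[50.0], score_pred=[47.0],
    weights=[0.5, -0.5], rate=0.02,
)
print(round(loss, 4))
```

In this toy case the squared-error term dominates the sum because raw cognitive scores are on a much larger scale than the log-probabilities, which illustrates why the relative scale of the terms in an unweighted sum matters.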

2.3.4. Gender influence

Since gender differences were found to be important in WM and GM of psychosis (33–35), and due to the uneven gender distribution of the in-house dataset, two experiments were designed to assess how gender difference affects the performance of the DL-based model on cognitive estimation and EP classification. First, gender information was encoded as an orthogonal embedding and explicitly fed into the model along with the sMRI scan. Second, subjects were divided into two gender subgroups, and experiments were conducted separately for each subgroup.

2.4. Competing methods

2.4.1. Deep learning-based model

Apart from the 3D-CNN, we also used a 2D-CNN framework, similar to the models of Jiang et al. (18) and Li et al. (5), for comparison. A recent lightweight 2D convolutional architecture, MNasNet (36), and a heavier model, ResNet-18 (37), were used as the feature encoders since they have been commonly used in previous studies (3, 5, 8, 38). In a 2D-CNN framework, for each participant, image features are extracted slice by slice and concatenated for final classification, which introduces more computational cost than the 3D-CNN model. Furthermore, since 3D-CNNs do not have pre-trained weights like 2D-CNNs, all 3D-CNN models were trained from scratch. Nevertheless, results are reported for 2D-CNNs with and without pre-trained weights.⁴

2.4.2. Handcrafted feature-based machine learning

To compare with the proposed DL workflow, we also performed the classification tasks with several recent ML methods. The GM and WM probability maps were flattened into feature vectors, and principal component analysis (PCA) was used for dimensionality reduction to produce proper training inputs for ML models. Besides the WM and GM maps, volumetric and surface analyses were also performed with the CAT12 toolkit to calculate region of interest (ROI) volumes and cortical surface thickness as handcrafted features for comparison. We adopted the analysis with default settings and obtained 388 ROI volume features and 219 cortical thickness features after filtering out the null values. The Cobra⁵ and Neuromorphometrics⁶ atlases were used as ROI atlases. Dimensionality reduction was also performed on handcrafted features to make them the same size as GM/WM-based features. We selected several popular ML models for comparison, including random forest (RF), support vector machine (SVM) and gradient boosting machine (GBM).

2.5. Implementation details

All models were implemented in Python (version 3.7) with several free Python-based packages. For ML models, the GBM was implemented with the LightGBM⁷ framework and the other models were implemented using the scikit-learn toolkit (39). The number of estimators in the RF model was set to 500, and a radial basis function kernel was used in the SVM model.

We used PyTorch (version 1.6 stable) as the DL framework to implement all DL-based models. Adam (32) was used as the optimizer with a starting learning rate of 1e-4, and the learning rate was decayed by a factor of 0.7 after every 60 epochs to help reach an optimum. Data augmentation (random rotation and flipping) and weight decay of the optimizer (at a rate of 0.02) were used as data set expansion and regularization, respectively, to help prevent overfitting. The batch size was set to 10, and 300 epochs were used. All experiments were conducted on an Ubuntu 18.04 system with two NVIDIA GeForce RTX 2080 Ti graphics processing units (GPUs) and 22 gigabytes of memory. The versions of Compute Unified Device Architecture (CUDA) and the GPU driver were 10.2 and 460.73.01, respectively. We used a grid search strategy to determine the hyperparameters, with learning rates in the set [1e-3, 1e-4, 1e-5], batch sizes in the set [4, 8, 10, 12], and weight decay in the set [0.0, 0.1, 0.2, 0.3, 0.4].
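The learning-rate schedule and the size of the search grid can be sketched in a few lines of plain Python. This is an illustration under assumptions: the decay is read as multiplying the rate by 0.7 at epochs 60, 120, 180 and 240 with zero-based epoch indices, and the function name `step_decay_lr` is hypothetical. The grid values are copied from the text.

```python
from itertools import product


def step_decay_lr(epoch: int, base_lr: float = 1e-4, gamma: float = 0.7, step: int = 60) -> float:
    """Learning rate at a zero-based epoch under step decay."""
    return base_lr * gamma ** (epoch // step)


# Learning rate at the start of each 60-epoch stage of a 300-epoch run.
for epoch in (0, 60, 120, 180, 240):
    print(epoch, step_decay_lr(epoch))

# Hyperparameter grid listed in the text: every combination is one candidate.
learning_rates = [1e-3, 1e-4, 1e-5]
batch_sizes = [4, 8, 10, 12]
weight_decays = [0.0, 0.1, 0.2, 0.3, 0.4]
grid = list(product(learning_rates, batch_sizes, weight_decays))
print(len(grid), "candidate configurations")
```

Under this reading, the rate in the final stage of training is about a quarter of its starting value, and an exhaustive search over the listed values requires one full cross-validated training run per grid entry.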

2.6. Evaluation metrics

We used accuracy, F₁-score, specificity and area under the curve (AUC) of the receiver operating characteristic (ROC) as the metrics to evaluate classification performance. Specifically, the F₁-score is the harmonic mean of recall (sensitivity) and precision. The accuracy, F₁-score and specificity are respectively defined as $\mathrm{Accuracy}\ (acc) = \frac{tp + tn}{tp + fn + fp + tn},$ $F_{1}\text{-score}\ (F_{1}) = \frac{2 \times tp}{2 \times tp + fp + fn}$ and $\mathrm{Specificity}\ (spe) = \frac{tn}{fp + tn},$ where tp, fp, tn and fn refer to true positive, false positive, true negative, and false negative, respectively. While the F₁-score mainly focuses on evaluating prediction performance on positive targets (i.e., the EP cases), the specificity focuses on evaluating the negative ones (i.e., the healthy cases). All these metrics range from 0 to 1, with higher values indicating better predictive performance. In addition, we adopted mean absolute error (MAE) and the coefficient of determination (R²) as metrics to evaluate regression performance, which are defined as $\mathrm{MAE} = \frac{1}{m} \sum_{i=1}^{m} |y_{i} - \hat{y}_{i}|$ and $R^{2} = 1 - \frac{\sum_{i=1}^{m} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{m} (y_{i} - \bar{y})^{2}},$ where m denotes the number of samples, y and $\hat{y}$ denote the ground truth and prediction, respectively, and $\bar{y}$ is the mean of the ground-truth values.
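The metric definitions above translate directly into code. The following is a minimal sketch with hypothetical function names and toy numbers (not study results); it omits guards against empty classes or zero variance for brevity.

```python
from typing import Dict, Sequence


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Accuracy, F1-score and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "specificity": tn / (fp + tn),
    }


def mean_absolute_error(y: Sequence[float], y_hat: Sequence[float]) -> float:
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def r_squared(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """1 - SS_res / SS_tot, where SS_tot uses the mean of the ground truth."""
    y_bar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1 - ss_res / ss_tot


# Toy confusion matrix and regression targets (hypothetical values).
print(classification_metrics(tp=12, fp=3, tn=13, fn=4))

y_true = [50.0, 60.0, 70.0]
print(mean_absolute_error(y_true, [52.0, 58.0, 75.0]))
print(r_squared(y_true, [52.0, 58.0, 75.0]))
print(r_squared(y_true, [60.0, 60.0, 60.0]))  # always predicting the mean
```

The last call shows that a model which always predicts the mean score obtains R² = 0, so unlike the classification metrics, R² is not bounded below by 0 and negative values indicate predictions worse than that constant baseline.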

2.7. Reduce evaluation bias via cross validation

Since the size of the data set is relatively small for a deep learning model, we applied a five-fold cross-validation strategy in this study to evaluate the models thoroughly and to reduce the risk of overfitting to a single split. There were 77 3D sMRI arrays after pre-processing. These samples were divided into five approximately equal parts, and each part was used in turn as the test set with the remaining parts as the training set. Stratified sampling was used to ensure that the gender ratio in the training/test groups was the same. All metrics are presented as the mean and standard deviation over the five experiments. In this work, the multi-task deep learning framework accomplished two tasks: cognitive estimation and EP classification.
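A gender-stratified five-fold split of 77 samples can be sketched as follows. This is a dependency-free illustration, not the authors' code: the function name `stratified_folds`, the fixed seed, and the 50/27 gender split are hypothetical (the actual gender counts are not given in this section).

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def stratified_folds(labels: Sequence[Hashable], k: int = 5, seed: int = 0) -> List[List[int]]:
    """Deal sample indices into k folds so each stratum is spread evenly."""
    rng = random.Random(seed)
    by_label: Dict[Hashable, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    folds: List[List[int]] = [[] for _ in range(k)]
    offset = 0  # continue the round-robin across strata to balance fold sizes
    for label in sorted(by_label, key=str):
        indices = by_label[label]
        rng.shuffle(indices)
        for j, idx in enumerate(indices):
            folds[(offset + j) % k].append(idx)
        offset += len(indices)
    return folds


# Hypothetical stratification labels for 77 subjects.
genders = ["M"] * 50 + ["F"] * 27
folds = stratified_folds(genders, k=5)

for test_fold, test_idx in enumerate(folds):
    train_idx = [i for f, fold in enumerate(folds) if f != test_fold for i in fold]
    n_female = sum(1 for i in test_idx if genders[i] == "F")
    print(test_fold, len(train_idx), len(test_idx), n_female)
```

Because 77 is not divisible by five, the folds contain 15 or 16 samples each; the per-fold metrics would then be averaged and reported as mean ± standard deviation.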

3. Results

3.1. Results for cognition estimation and EP classification

3.1.1. Cognition estimation

We first evaluate the cognitive estimation performance of the proposed method and competing methods on the CLC task, the results of which are shown in Table S1 in Supplementary material B. Specifically, the 3D-CNN model achieved F₁-scores of 70.1 ± 3.5%, 51.9 ± 8.1%, 31.9 ± 7.5%, 16.2 ± 3.7% in the two-, three-, five-, and 10-categorized CLC tasks, respectively. Furthermore, we present the classification accuracy for each cognition estimation dimension with n = 2 in Figure 2 and the regression results for cognitive estimation (i.e., CLR) in Table S1 in Supplementary material B. The 3D-CNN model achieved an R² of –0.878 ± 0.121 and an MAE of 8.567 ± 1.950, while the Volume + SVM combination achieved the best CLR results with an R² of –0.086 ± 0.139 and an MAE of 7.299 ± 1.735. We also present results on an external data set (Table S4 in Supplementary material B), discuss the effects of using Huber loss on the CLR in Tables S2, S5 in Supplementary material B, and discuss the effects of gender differences on the cognition estimation (Table S8 in Supplementary material D).

Figure 2. Accuracy of our model in two-categorized CLC task compared with different ML and DL counterparts in six cognitive estimation dimensions. All the DL models shown were trained from scratch.

3.1.2. EP classification

For the second task of EP classification, our model was compared with several recent counterparts (5, 40–43), whose models were re-implemented based on the settings of the original publications. The results of EP classification are shown in Figure 3 and Table 2. Specifically, the Thickness features + SVM combination achieved the best results among ML methods with an accuracy of 58.4 ± 9.0%, an F₁-score of 60.8 ± 10.6%, and a specificity of 60.4 ± 10.1%, while the proposed method achieved an accuracy of 74.9 ± 4.3%, an F₁-score of 74.5 ± 4.2%, and a specificity of 82.3 ± 6.3% with the inclusion of cognitive estimation. We also present results on an external data set (Table S4 in Supplementary material B), discuss the computational costs of models in Table S6 in Supplementary material C, and discuss the effects of gender differences on the EP classification (Table S9 in Supplementary material D).

Figure 3. ROC curves for EP/HC classification with five-fold cross-validation. The proposed model used the GM map as input.

Table 2. Comparison on sMRI-based studies for EP classification.

Although we hypothesized that the introduction of a cognitive classification task could bring features about individual brain structure to the DL model, it remained unclear whether more classification categories would lead to more discriminative features for EP classification. Therefore, we divided cognitive scores into different numbers of categories in the CLC subtask and assessed how this would affect classification performance for EP; the results are shown in Figure S1 in Supplementary material B.

3.1.3. Validation on ABCD HCP-EP data set

The experiments of cognition estimation and EP classification were also performed on the external ABCD HCP-EP data set. For cognition estimation (Table S3 in Supplementary material B), the proposed method obtained F₁-scores of 81.6 ± 1.8%, 61.4 ± 5.9%, 40.3 ± 6.8% on the two-, three-, and five-categorized CLC tasks, respectively. An R² of 0.074 ± 0.499 was achieved by the proposed method on the CLR task. For EP classification (Table S4 in Supplementary material B), the proposed method achieved an accuracy of 75.9 ± 5.3% and an F₁-score of 84.1 ± 5.2% when using WM and GM as inputs, and achieved an accuracy of 75.8 ± 6.1% and an F₁-score of 84.3 ± 5.1% when using only GM as input. After the inclusion of cognition estimation, the accuracy was improved by 2.2% and the F₁-score was improved by 3.1% when using WM and GM as inputs, and the accuracy was improved by 2.9% and the F₁-score was improved by 2.8% when using only GM as input.

3.1.4. Gender influence study

Then, the influence of gender difference was examined via two experiments. For the first experiment, where gender information was fed into the model along with the sMRI scan (Table S8 in Supplementary material D), the 3D-CNN model achieved an R² of –0.885 ± 0.126 and an F₁-score of 70.0 ± 3.5 on the cognition estimation task after adding the gender embeddings. The metric difference is 0.007 (R²) and 0.1 (F₁-score) compared to the 3D-CNN model without the gender information. For other DL-based methods, the performance differences are: for Image + MNasNet, 0.002 (R²) and 0.1 (F₁-score); for Image + ResNet-18, 0.004 (R²) and 0.1 (F₁-score). Besides, the proposed method with cognition estimation and gender information achieved an accuracy of 74.8 ± 4.3 and an F₁-score of 74.5 ± 4.3 on the EP classification task. The metric difference is 0.1 (acc) and 0.0 (F₁-score) compared to the one without the gender information. For the second experiment, subjects were divided into two subgroups based on gender and results of the EP classification task were presented (Table S9 in Supplementary material D). In the male subgroup, the proposed method using GM input achieved an accuracy of 74.6 ± 6.7 and an F₁-score of 74.1 ± 6.0 with the inclusion of the cognition estimation subtask, whereas accuracy decreased to 70.8 ± 6.4 and F₁-score decreased to 70.0 ± 5.6 without the inclusion. In the female subgroup, the proposed method using GM input achieved an accuracy of 68.7 ± 10.2 and an F₁-score of 69.0 ± 8.9 with the inclusion of the cognition estimation subtask, whereas accuracy decreased to 66.4 ± 9.9 and F₁-score decreased to 66.1 ± 10.4 without the inclusion.

3.1.5. Ablation study

Furthermore, ablation studies on WM/GM inputs and CLC/CLR subtasks were conducted. First, we evaluated the effect of using different sMRI images (i.e., WM or GM images) as input on EP classification to assess how they contribute to the classification in the context of DL; the results are shown in Table 3. With both WM and GM as inputs, the proposed method achieved F₁-scores of 70.5 ± 4.1%, 72.5 ± 4.0%, 73.5 ± 4.0%, and 74.2 ± 3.0% for EP classification, EP classification with CLC subtask, EP classification with CLR subtask, and EP classification with CLC and CLR subtasks, respectively. The best F₁-score of 74.5 ± 4.2% was obtained when using GM as input with CLR as the subtask. We then evaluated how different ways of introducing the cognitive assessment subtask (i.e., CLC or CLR) contributed to the classification of EP; the results are shown in Table 4. Specifically, our model with the CLR subtask achieved the best EP classification results with an accuracy of 74.9 ± 4.3%, an F₁-score of 74.5 ± 4.2%, a specificity of 82.3 ± 6.3%, and an AUC of 71.1 ± 4.1%. With only the CLC subtask, the results decreased to an accuracy of 71.4 ± 3.7%, an F₁-score of 70.2 ± 5.4%, a specificity of 74.4 ± 5.4%, and an AUC of 67.4 ± 4.5%.

Table 3. Results of F₁-score (%) for EP classification.

Table 4. EP classification performance of our model when introducing different cognitive estimation subtasks, using GM images as input.

3.1.6. Qualitative illustration

Last but not least, our proposed framework could potentially identify brain regions that may be associated with psychosis, thus we present the attention maps using GradCam++ (44) and GradCam algorithms (45, 46) in Figure 4 to illustrate brain structures of importance. Besides structural biomarkers for psychosis, we also demonstrate the attention maps for CLC in Figures 4B–G, I–N.

Figure 4. Visualization of the discriminative positions identified by the proposed model on (A, H) EP classification and (B–G, I–N) CLC tasks with attentional weights. The results were shown as the mean of all cases in the data set. We used both WM and GM images as input and n was set to two for CLC.

4. Discussion

Despite the recognized brain structural alterations (47–49) and cognitive deficits (13–15) in schizophrenia, no studies have performed sMRI-based cognitive estimation in EP, nor have cognitive measures been incorporated into EP classification. In the present work, a multi-task deep learning framework was used to bridge sMRI and cognitive estimation in order to improve the classification performance for EP. The framework automatically captures structural features from 3D sMRI scans for EP classification and provides cognition estimates as supporting evidence at the individual level within a unified framework. While most ML-based classifiers rely on features from time-consuming volumetric or surface-based analysis, the proposed method performs EP/HC classification and cognition estimation using only sMRI as input. By comparison with the latest models and through ablation studies, we revealed the feasibility of automatic cognitive estimation at the individual level and demonstrated that the implicit adoption of cognitive measures as additional information could facilitate the classification of EP from healthy controls. Furthermore, the main structural contributors involved in the process of EP classification and cognitive estimation are identified.

4.1. Cognitive estimation performance

For the CLC task, our model achieves the best performance in most cases. For example, our model obtains the best F₁-score of 70.1% on the two-categorized (i.e., n = 2) CLC task. Similarly, on the three- and ten-categorized (n = 3 and n = 10) CLC tasks, our method outperforms all other counterparts by large margins. In the only case where our method did not rank first (n = 5), it still achieved the second-best performance. Based on these results, it can be seen that the proposed DL model was able to classify individuals' cognitive states into groups using sMRI and achieved promising performance on the two-categorized CLC task, with higher accuracy than chance.

Specifically for the two-categorized CLC task, all DL-based models achieved better performance than ML-based models in all cognitive dimensions. Although the DL-based models using sMRI images as input performed similarly across four cognitive dimensions (PSp, Vig, ViL, and PSo), it is noteworthy that our method achieved significant improvements in the WMe and VeL dimensions. Thus, our method performs most convincingly for the CLC task in all six dimensions and even achieves an accuracy of more than 80% in some dimensions (PSp, ViL, and PSo). Since our method uses sMRI images directly as input without further volumetric and cortical surface-based analysis, it achieves both the overall best classification performance and efficiency, both of which are crucial for clinical translation.

However, the CLR performance of all models was worse than expected, and no better than a constant prediction of the mean score (R² ≤ 0.0). One possible reason for the poor regression performance is the limited sample size of the dataset (50). Comparing the DL and ML models shows that DL models generally performed worse than ML models. This suggests that DL models may be more sensitive to a lack of samples (19), and that they may be better suited to classification than to regression in a sample-limited context.
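To make the R² threshold above concrete, the following minimal sketch computes the coefficient of determination by hand. The scores and predictions are made-up illustrative values, not study data, and the function is a plain textbook definition rather than the evaluation code used in the study. It shows that a model that always predicts the sample mean scores exactly R² = 0, and that a model whose predictions are further from the targets than the mean scores below zero.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical cognitive scores (illustrative values only).
scores = [40.0, 45.0, 50.0, 55.0, 60.0]
mean_score = sum(scores) / len(scores)

# Baseline: always predict the sample mean.
baseline = [mean_score] * len(scores)
# A poor model: plausible values, but assigned to the wrong subjects.
misordered = [60.0, 40.0, 55.0, 45.0, 50.0]

r2_baseline = r2_score(scores, baseline)
r2_misordered = r2_score(scores, misordered)

print(f"mean predictor: R^2 = {r2_baseline:.3f}")
print(f"misordered predictor: R^2 = {r2_misordered:.3f}")
```

A negative R² therefore does not indicate a computational error; it indicates that the residual error exceeds the variance of the targets.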

Furthermore, a considerable performance improvement was observed when comparing the results on the in-house dataset with those on the external ABCD HCP-EP dataset (Table S3 in Supplementary material B). First, for the two-categorized CLC task, our method achieved an F₁-score of 81.6 ± 1.8 on the external dataset, an improvement of 11.5 percentage points over the F₁-score on the in-house dataset (70.1 ± 3.5). Second, for the CLR task, most DL-based methods achieved R² > 0.0, i.e., better than a constant mean prediction, and our method achieved the best result (0.074 ± 0.499), an improvement of 0.952 over the R² on the in-house dataset (–0.878 ± 0.121). The improved performance is most likely because the external dataset (n = 164) contains more than twice as many subjects as the in-house dataset (n = 77); more data allow the model to better capture structural information from the input and to learn correlations between sMRI scans and cognitive levels. Given that the external data were acquired with a different imaging pipeline and came from cohorts with different age and gender distributions, the results again support the validity of individual-level cognitive estimation using DL-based models, especially when more data are available.
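The differences quoted in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the mean values reported in the text; the variable names are ours, and the relative F₁ gain is an additional derived quantity that the text does not report.

```python
# Mean values quoted in the text (F1 in %, R^2 unitless, n = subjects).
f1_inhouse, f1_external = 70.1, 81.6
r2_inhouse, r2_external = -0.878, 0.074
n_inhouse, n_external = 77, 164

# Absolute gain in F1, expressed in percentage points.
f1_gain_points = f1_external - f1_inhouse
# Relative gain in F1 (derived here for comparison, not reported in the text).
f1_gain_relative = 100 * f1_gain_points / f1_inhouse
# Absolute gain in R^2.
r2_gain = r2_external - r2_inhouse
# Ratio of external to in-house sample size.
size_ratio = n_external / n_inhouse

print(f"F1 gain: {f1_gain_points:.1f} points ({f1_gain_relative:.1f}% relative)")
print(f"R^2 gain: {r2_gain:.3f}")
print(f"Sample-size ratio: {size_ratio:.2f}")
```

The F₁ difference is an absolute difference between two percentages, whereas the relative gain over the in-house score is larger.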

4.2. Early psychosis classification performance

For EP classification, our proposed model generally outperformed the other five competing methods on all metrics. For instance, our model using solely sMRI images (with GM as input) achieved the best F₁-score (74.5%) compared with ML-based models using volumetric features [54.1% (40)] and cortical thickness features [55.5% (42) and 60.8% (43)]. In addition, our 3D-CNN model achieved better performance on all metrics than the 2D-CNN (5), indicating that features are extracted more efficiently from 3D sMRI arrays directly than from 2D slices. Finally, we compared the performance of our model with and without cognitive estimation as a subtask. Adding cognitive estimation improved the accuracy, F₁-score and specificity by 3.9, 4.4, and 8.5%, respectively, when GM was used as input. Similar improvements were seen when both WM and GM were used as inputs: 2.9, 3.7, and 4.8% in accuracy, F₁-score and specificity, respectively.

In terms of AUC, the cognitive estimation subtask brought a 4.8% improvement and also achieved the best classification performance (71.1%) of all models, further demonstrating its validity. This is consistent with the idea in previous studies that the association between brain abnormalities and cognitive symptoms may exist at a deep and abstract level and thus can be effectively captured by DL methods, leading to enhanced performance in EP classification (51, 52). These results demonstrate the effectiveness of using 3D-CNN and involving a cognitive estimation subtask for promising EP classification performance.

In addition, EP classification experiments were conducted on the external ABCD HCP-EP dataset to assess the robustness of the proposed method, and the results are presented in Table S4 in Supplementary material B. In general, the EP classification results of the proposed method on the external dataset were better than those on the in-house dataset, with an improvement of >4.6% in accuracy and >13% in F₁-score when using WM and GM as inputs, and >3.8% in accuracy and >12.6% in F₁-score when using only GM as input. Since the external dataset has more subjects than the in-house dataset, such an improvement suggests that even higher performance can be expected when more data are available. More importantly, the proposed method also achieved higher EP classification accuracy and F₁-score on the external dataset when cognitive estimation was included. Similar performance improvements from the inclusion of cognitive estimation were observed in both the in-house and external datasets, which again supports the effectiveness of the cognitive estimation subtask in facilitating EP classification.
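For readers less familiar with the metrics compared in this section, the sketch below derives accuracy, F₁-score and specificity from confusion-matrix counts. It assumes that EP is treated as the positive class and HC as the negative class, which the text does not state explicitly, and the counts are hypothetical rather than taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, F1-score, specificity) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called sensitivity
    f1 = 2 * precision * recall / (precision + recall)
    specificity = tn / (tn + fp)  # fraction of negatives correctly identified
    return accuracy, f1, specificity


# Hypothetical counts (assumption: EP = positive class, HC = negative class).
acc, f1, spec = classification_metrics(tp=30, fp=5, tn=25, fn=10)

print(f"accuracy = {acc:.3f}, F1 = {f1:.3f}, specificity = {spec:.3f}")
```

Specificity depends only on how the negative class is handled, which is why it can change by a different amount than accuracy or F₁-score when a subtask is added.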

4.3. Impact of cognition classification category quantity

The EP classification performance was largely unaffected in terms of F₁-score and accuracy, while specificity improved when n was set to ten. In general, therefore, introducing a more challenging CLC subtask did not bring more discriminative information to EP classification. This may be due to the limited sample size in our study: when n is large, some categories may contain no samples at all. However, the improvement in specificity when n = 10 suggests that a larger number of categories may lead to better EP classification performance when abundant data are available.
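The effect of the category count on a small sample can be illustrated with a short sketch. It assumes equal-width categories over a fixed score range, which is our simplification and not necessarily the discretization used in the study, and the scores are made-up values clustered around the middle of the range.

```python
def bin_scores(scores, n_categories, low, high):
    """Assign each score to one of n equal-width categories on [low, high]."""
    width = (high - low) / n_categories
    labels = []
    for s in scores:
        idx = int((s - low) / width)
        # Keep a score on the upper edge inside the last category.
        labels.append(min(idx, n_categories - 1))
    return labels


def empty_categories(labels, n_categories):
    """Count categories that received no sample."""
    return n_categories - len(set(labels))


# Hypothetical cognitive scores (illustrative values only).
scores = [38, 41, 44, 45, 47, 48, 50, 51, 52, 53, 55, 57, 58, 62]

empties = {
    n: empty_categories(bin_scores(scores, n, low=30, high=70), n)
    for n in (2, 3, 5, 10)
}
print(empties)
```

With these toy values every category is populated for n = 2 and n = 3, while one category is empty for n = 5 and three are empty for n = 10, so the classifier never sees examples of those labels.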

4.4. Influence of WM/GM inputs

The model with GM as input outperformed the model with WM as input, with an F₁-score improvement of ≥5.6. This is consistent with previous results showing that EP causes significant changes in GM (48), while our results further indicate that changes in GM are sufficiently pronounced in EP to significantly affect the performance of automated classification tools. Even so, the simultaneous use of WM and GM achieved the best performance in most tasks, confirming the presence of both WM and GM alterations in patients with EP. Therefore, although the best result was obtained when using only GM as input (i.e., 74.5% for EP + CLR), the inclusion of both GM and WM maps generally resulted in better classification performance for EP.

4.5. Influence of gender difference

In the first experiment on gender difference, in which gender information was used as an additional input, its explicit inclusion had little effect on either the cognitive estimation or the EP classification task. In the second experiment, in which subjects were divided into two subgroups, the overall performance decreased and the standard deviation increased sharply because of the small number of samples available for training the model in each subgroup. Model performance was generally better in the male subgroup than in the female subgroup, as the female subgroup had far fewer samples. Notably, in both subgroups, the EP classification performance still improved after the inclusion of cognitive estimation, suggesting that the benefit of including cognitive estimation for EP classification was consistent and independent of gender differences.

4.6. Influence of CLC and CLR subtasks

Both CLC and CLR improved EP classification, although CLR seems to be more effective than CLC. The model incorporating the CLR subtask achieved the best performance on all metrics, with a significant gap compared with the other models. However, performance degraded when CLC was involved in addition to CLR, suggesting that the two subtasks may be incompatible. One possible reason is that some discriminative brain regions for the cognitive estimation dimensions may differ from those for EP, thus introducing noisy features during training. In contrast, the regression task did not bring discriminative information, so the features of CLR were more compatible with EP classification than those of CLC. In short, at least for EP classification, the regression subtask is more informative than the classification subtask, but for other diseases the subtask needs to be selected on its merits (53, 54).
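One common way to attach an auxiliary subtask to a main classification task is a weighted sum of per-task losses. The sketch below illustrates that generic formulation for a main EP-versus-HC loss and a regression-style cognitive loss. The loss functions, the `cog_weight` hyperparameter and all numbers are our assumptions for illustration; this section does not specify the losses or weighting actually used in the study.

```python
import math


def binary_cross_entropy(p_ep, is_ep):
    """Main-task loss; p_ep is the predicted probability of the EP class."""
    return -math.log(p_ep) if is_ep else -math.log(1.0 - p_ep)


def mean_squared_error(pred, target):
    """Auxiliary loss for a regression-style cognitive subtask."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def multitask_loss(p_ep, is_ep, cog_pred, cog_target, cog_weight=0.5):
    """Weighted sum of main and auxiliary losses.

    cog_weight is a hypothetical hyperparameter; 0 disables the subtask.
    """
    main = binary_cross_entropy(p_ep, is_ep)
    aux = mean_squared_error(cog_pred, cog_target)
    return main + cog_weight * aux


# Toy example: an uncertain EP prediction and two cognitive-score estimates.
loss_with_subtask = multitask_loss(0.5, True, [0.0, 1.0], [0.0, 0.0])
loss_without_subtask = multitask_loss(0.5, True, [0.0, 1.0], [0.0, 0.0], cog_weight=0.0)
print(loss_with_subtask, loss_without_subtask)
```

Because both terms are backpropagated through shared layers, an auxiliary task whose useful features differ from those of the main task can pull the shared representation away from what the main task needs, which is one way to read the incompatibility discussed above.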

4.7. Interpretable sMRI biomarkers and clinical potentials

The entire GM structure contributes most to the EP classification, suggesting that more discriminative features are found in GM than in WM, which is in line with the better classification performance of the model using GM shown in the ablation study. Furthermore, among the regions highlighted by GradCam for EP, saliency appears in the frontal and temporal lobe regions, as well as in the putamen, head of caudate nucleus, and thalamus. These regions contributed the most to our model in classifying a subject as a patient with EP or a healthy subject, suggesting that structural features in these regions are most likely to be discriminative for psychosis. Indeed, all of the regions recognized by our model are highly consistent with those reported in previous volumetric and functional studies. Alterations in gray matter, the frontal lobe, putamen, head of caudate nucleus, and thalamus were observed in patients with schizophrenia (55–58) and cognitive deficits (59) in group-level volumetric analysis, as well as in fMRI studies (60–62).

Interestingly, these regions are also consistent with those implicated in parvalbumin-expressing interneuron dysfunction (63–69), which is a core element of schizophrenia pathophysiology, affecting neuronal synchronization and thalamocortical networks, and leading to cognitive deficits as well as hyperdopaminergia related to positive symptoms [reviewed in (70)]. Taken together, our interpretable results indicate the potential of identifying biomarkers from sMRI with DL methods.

Moreover, some specific regions are also recognized as discriminative for the estimation of cognition level in the CLC subtask. Taking working memory as an example, the thalamus and cerebellum were highlighted by the DL model with the highest significance using GradCam, and these regions have also been shown to be associated with working memory function in previous fMRI studies (71, 72). Similarly, in the GradCam result for ViL, the highlighted regions of the occipital lobe, thalamus, and cerebellum have also been associated with visual functions in fMRI studies (73–75). Furthermore, it is worth noting that the highlighted regions not only differ among the six cognitive estimation dimensions, but also differ significantly from those for EP classification. This could explain why CLC brings less improvement in EP classification than CLR: since the discriminative regions are different, the model may not be able to coordinate these features to accomplish both tasks simultaneously.
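The saliency maps discussed in this section come from gradient-based class activation mapping. As a rough illustration of the core Grad-CAM computation (channel weights obtained by averaging gradients, followed by a weighted sum of feature maps and a ReLU), the sketch below works on a toy two-channel 2 × 2 example in pure Python. It is a simplified 2D illustration with made-up numbers, not the 3D pipeline or library used in the study.

```python
def grad_cam(activations, gradients):
    """Grad-CAM map from per-channel feature maps and their gradients.

    activations, gradients: lists of channels, each a 2D list of equal shape.
    """
    n_rows, n_cols = len(activations[0]), len(activations[0][0])
    # Channel weight = global average of the class-score gradient on that channel.
    weights = [
        sum(sum(row) for row in grad) / (n_rows * n_cols) for grad in gradients
    ]
    cam = [[0.0] * n_cols for _ in range(n_rows)]
    for w, act in zip(weights, activations):
        for i in range(n_rows):
            for j in range(n_cols):
                cam[i][j] += w * act[i][j]
    # ReLU keeps only locations that increase the class score.
    return [[max(v, 0.0) for v in row] for row in cam]


# Toy feature maps and gradients (hypothetical values).
acts = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]

saliency = grad_cam(acts, grads)
print(saliency)
```

In this toy case the first channel supports the class and the second opposes it, so only the locations where the first channel is active survive the ReLU. Different tasks produce different gradients over the same feature maps, which is how the highlighted regions can differ between EP classification and each cognitive dimension.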

4.8. Limitation and future work

Although our proposed method achieves improved performance in EP classification and provides biomarkers with a high degree of interpretability, there are still some limitations that may affect the generalizability of our approach. First, the study was conducted at a single site and did not take into account differences in ethnic composition and sMRI scanning settings, so multi-site studies are needed for further validation. Second, the EP subjects in our study received medication, which may also lead to structural alterations in the brain; a non-medicated sample is therefore required in future studies to rule out medication interference. Third, the study of the impact of gender differences may require a larger female group to further validate the proposed method.

Despite these limitations, our results also point to many interesting directions for future research. For example, since only EP is studied in this work, future work could explore whether the cognitive estimation subtask also helps to improve classification performance for other psychiatric disorders. In addition, as we demonstrated that implicitly introducing cognitive features into the DL model helps EP classification, the question arises whether it is better to incorporate such additional features explicitly (i.e., as input) or implicitly (e.g., as output) into the workflow. Also, since deep learning and the implicit introduction of information can enhance classification with sMRI as the only input, other relevant features could be introduced into the model in the same way, with the aim of further improving classification performance and providing interpretable evidence to aid clinical translation. Moreover, if validated in larger cohorts of patients in the early phase of psychosis, this approach could open the way to the prediction of cognitive deficits in prospective longitudinal studies of patients in their prodromal phase.

5. Conclusion

In this study, we propose a multitask DL framework for EP classification based on sMRI images. By introducing cognitive estimation as a subtask, the proposed method is able to estimate the cognitive state of an individual and improve the classification performance of EP by an appreciable margin. Experimental results show that our method can not only achieve classification accuracy that exceeds that of the latest similar methods, but also identify discriminative regions in sMRI images as interpretable evidence.

Data availability statement

The in-house dataset presented in this article is not readily available; the data that support the findings of this study are available from the corresponding author upon reasonable request. Requests to access the datasets should be directed to LX, lijing.xin@epfl.ch. For the publicly available ABCD HCP-EP dataset, please request access as per the official NIMH requirements.

Ethics statement

The studies involving human participants were reviewed and approved by the Commission Cantonale d'éthique de la Recherche sur l'être Humain (CER-VD; Cantonal Ethics Commission for Research on Human Beings). The patients/participants provided their written informed consent to participate in this study.

Author contributions

YW: conceptualization, methodology, software, investigation, writing—original draft, and review. CZ: conceptualization, methodology, writing—review, editing, supervision, and funding acquisition. LC: supervision and funding acquisition. YD: methodology and investigation. MC and PC: data curation. RJ: data curation, writing—review, and editing. KD: data curation, writing—review, editing, resources, and funding acquisition. LX: conceptualization, writing—review, editing, supervision, resources, data curation, and funding acquisition. All authors contributed to the article and approved the submitted version.

Funding

This work was supported by the Sichuan Science and Technology Program (Nos. 2019YJ0176/2019YJ0177/2019YFQ-0005), China Scholarship Council, and the National Center of Competence in Research (NCCR) SYNAPSY-The Synaptic Bases of Mental Diseases from the Swiss National Science Foundation (n° 51AU40_185897 to KQD & PC). We acknowledge access to the facilities and expertise of the CIBM Center for Biomedical Imaging, a Swiss research center of excellence founded and supported by Lausanne University Hospital (CHUV), University of Lausanne (UNIL), EPFL, University of Geneva (UNIGE), and Geneva University Hospitals (HUG).

Acknowledgments

Data used in the preparation of this article were obtained from the Adolescent Brain Cognitive Development^SM (ABCD) Study (https://abcdstudy.org), held in the NIMH Data Archive (NDA). This is a multisite, longitudinal study designed to recruit more than 10,000 children age 9–10 and follow them over 10 years into early adulthood. The ABCD Study^® is supported by the National Institutes of Health and additional federal partners under award numbers U01DA041048, U01DA050989, U01DA051016, U01DA041022, U01DA051018, U01DA051037, U01DA050987, U01DA041174, U01DA041106, U01DA041117, U01DA041028, U01DA041134, U01DA050988, U01DA051039, U01DA041156, U01DA041025, U01DA041120, U01DA051038, U01DA041148, U01DA041093, U01DA041089, U24DA041123, and U24DA041147. A full list of supporters is available at https://abcdstudy.org/federal-partners.html. A listing of participating sites and a complete listing of the study investigators can be found at https://abcdstudy.org/consortium_members/. ABCD consortium investigators designed and implemented the study and/or provided data but did not necessarily participate in the analysis or writing of this report.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Author disclaimer

This manuscript reflects the views of the authors and may not reflect the opinions or views of the NIH or ABCD consortium investigators. The ABCD data repository grows and changes over time. The ABCD data used in this report came from HCP-Early Psychosis (HCP-EP) Release 1.1. Study DOIs can be found at [http://dx.doi.org/10.15154/1528333].

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyt.2022.1075564/full#supplementary-material

Footnotes

1. https://www.humanconnectome.org/study/human-connectome-project-for-early-psychosis

2. https://nda.nih.gov/ccf/

3. https://www.humanconnectome.org/storage/app/media/documentation/HCP-EP1.1/HCP-EP_Release_1.1_Manual.pdf

4. Pre-trained weights provided by the torchvision package (version 0.7.0).

5. http://cobralab.ca/atlases/

6. https://scalablebrainatlas.incf.org/human/NMM1103

7. https://github.com/microsoft/LightGBM

References

1. Shen D, Wu G, Suk HI. Deep learning in medical image analysis. Annu Rev Biomed Eng. (2017) 19:221–48. doi: 10.1146/annurev-bioeng-071516-044442

2. Schwalbe N, Wahl B. Artificial intelligence and the future of global health. Lancet. (2020) 395:1579–86. doi: 10.1016/S0140-6736(20)30226-9

3. Noor MBT, Zenia NZ, Kaiser MS, Al Mamun S, Mahmud M. Application of deep learning in detecting neurological disorders from magnetic resonance images: a survey on the detection of Alzheimer's disease, Parkinson's disease and schizophrenia. Brain Inf. (2020) 7:1–21. doi: 10.1186/s40708-020-00112-2

4. Noor MBT, Zenia NZ, Kaiser MS, Mahmud M, Al Mamun S. Detecting neurodegenerative disease from MRI: a brief review on a deep learning perspective. In: International Conference on Brain Informatics. Springer (2019). p. 115–25.

5. Li Z, Li W, Wei Y, Gui G, Zhang R, Liu H, et al. Deep learning based automatic diagnosis of first-episode psychosis, bipolar disorder and healthy controls. Comput Med Imaging Graph. (2021) 89:101882. doi: 10.1016/j.compmedimag.2021.101882

6. de Filippis R, Carbone EA, Gaetano R, Bruni A, Pugliese V, Segura-Garcia C, et al. Machine learning techniques in a structural and functional MRI diagnostic approach in schizophrenia: a systematic review. Neuropsychiatr Dis Treat. (2019) 15:1605. doi: 10.2147/NDT.S202418

7. Oh J, Oh BL, Lee KU, Chae JH, Yun K. Identifying schizophrenia using structural MRI with a deep learning algorithm. Front Psychiatry. (2020) 11:16. doi: 10.3389/fpsyt.2020.00016

8. Hu M, Sim K, Zhou JH, Jiang X, Guan C. Brain MRI-based 3D convolutional neural networks for classification of schizophrenia and controls. In: 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC). Montreal, QC: IEEE (2020). p. 1742–5.

9. Fusar-Poli P, McGuire P, Borgwardt S. Mapping prodromal psychosis: a critical review of neuroimaging studies. Eur Psychiatry. (2012) 27:181–91. doi: 10.1016/j.eurpsy.2011.06.006

10. Vieira S, Gong Qy, Pinaya WH, Scarpazza C, Tognin S, Crespo-Facorro B, et al. Using machine learning and structural neuroimaging to detect first episode psychosis: reconsidering the evidence. Schizophr Bull. (2020) 46:17–26. doi: 10.1093/schbul/sby189

11. Cortes-Briones JA, Tapia-Rivas NI, D'Souza DC, Estevez PA. Going deep into schizophrenia with artificial intelligence. Schizophr Res. (2021) 245:122–40. doi: 10.1016/j.schres.2021.05.018

12. Sadeghi D, Shoeibi A, Ghassemi N, Moridian P, Khadem A, Alizadehsani R, et al. An overview on artificial intelligence techniques for diagnosis of schizophrenia based on magnetic resonance imaging modalities: methods, challenges, and future works. arXiv preprint arXiv:210303081. (2021) doi: 10.1016/j.compbiomed.2022.105554

13. Mohs RC. Cognition in schizophrenia: natural history, assessment, and clinical importance. Neuropsychopharmacology. (1999) 21:S203–10. doi: 10.1016/S0893-133X(99)00120-7

14. Sommer IE, Bearden CE, Van Dellen E, Breetvelt EJ, Duijff SN, Maijer K, et al. Early interventions in risk groups for schizophrenia: what are we waiting for? npj Schizophrenia. (2016) 2:1–9. doi: 10.1038/npjschz.2016.3

15. Bora E, Yalincetin B, Akdede BB, Alptekin K. Duration of untreated psychosis and neurocognition in first-episode psychosis: a meta-analysis. Schizophr Res. (2018) 193:3–10. doi: 10.1016/j.schres.2017.06.021

16. Spasov S, Passamonti L, Duggento A, Liò P, Toschi N, Initiative ADN, et al. A parameter-efficient deep learning approach to predict conversion from mild cognitive impairment to Alzheimer's disease. Neuroimage. (2019) 189:276–87. doi: 10.1016/j.neuroimage.2019.01.031

17. Nguyen M, He T, An L, Alexander DC, Feng J, Yeo BT, et al. Predicting Alzheimer's disease progression using deep recurrent neural networks. Neuroimage. (2020) 222:117203. doi: 10.1016/j.neuroimage.2020.117203

18. Jiang J, Kang L, Huang J, Zhang T. Deep learning based mild cognitive impairment diagnosis using structure MR images. Neurosci Lett. (2020) 730:134971. doi: 10.1016/j.neulet.2020.134971

19. Sui J, Jiang R, Bustillo J, Calhoun V. Neuroimaging-based individualized prediction of cognition and behavior for mental disorders and health: methods and promises. Biol Psychiatry. (2020) 88:818–28. doi: 10.1101/2020.02.22.961136

20. De Marco M, Beltrachini L, Biancardi A, Frangi AF, Venneri A. Machine-learning support to individual diagnosis of mild cognitive impairment using multimodal MRI and cognitive assessments. Alzheimer Dis Assoc Disord. (2017) 31:278–86. doi: 10.1097/WAD.0000000000000208

21. Liu K, Chen K, Yao L, Guo X. Prediction of mild cognitive impairment conversion using a combination of independent component analysis and the cox model. Front Hum Neurosci. (2017) 11:33. doi: 10.3389/fnhum.2017.00033

22. Khatri U, Kwon GR. An efficient combination among sMRI, CSF, cognitive Score, and APOE ε4 biomarkers for classification of AD and MCI using extreme learning machine. Comput Intell Neurosci. (2020) 2020:8015156. doi: 10.1155/2020/8015156

23. Baumann PS, Crespi S, Marion-Veyron R, Solida A, Thonney J, Favrod J, et al. Treatment and Early Intervention in Psychosis Program (TIPP-Lausanne): implementation of an early intervention programme for psychosis in Switzerland. Early Interv Psychiatry. (2013) 7:322–8. doi: 10.1111/eip.12037

24. Marques JP, Kober T, Krueger G, van der Zwaag W, Van de Moortele PF, Gruetter R. MP2RAGE, a self bias-field corrected sequence for improved segmentation and T1-mapping at high field. Neuroimage. (2010) 49:1271–81. doi: 10.1016/j.neuroimage.2009.10.002

25. Kern RS, Nuechterlein KH, Green MF, Baade LE, Fenton WS, Gold JM, et al. The MATRICS Consensus Cognitive Battery, part 2: co-norming and standardization. Am J Psychiatry. (2008) 165:214–20. doi: 10.1176/appi.ajp.2007.07010043

26. Nuechterlein KH, Green MF, Kern RS, Baade LE, Barch DM, Cohen JD, et al. The MATRICS consensus cognitive battery, part 1: test selection, reliability, and validity. Am J Psychiatry. (2008) 165:203–13. doi: 10.1176/appi.ajp.2007.07010042

27. Rutherford S. The promise of machine learning for psychiatry. Biol Psychiatry. (2020) 88:e53–5. doi: 10.1016/j.biopsych.2020.08.024

28. Oh K, Kim W, Shen G, Piao Y, Kang NI, Oh IS, et al. Classification of schizophrenia and normal controls using 3D convolutional neural network and outcome visualization. Schizophr Res. (2019) 212:186–95. doi: 10.1016/j.schres.2019.07.034

29. Qureshi MNI, Oh J, Lee B. 3D-CNN based discrimination of schizophrenia using resting-state fMRI. Artif Intell Med. (2019) 98:10–17. doi: 10.1016/j.artmed.2019.06.003

30. Qiu Y, Lin QH, Kuang LD, Zhao WD, Gong XF, Cong F, et al. Classification of schizophrenia patients and healthy controls using ICA of complex-valued fMRI data and convolutional neural networks. In: International Symposium on Neural Networks. Springer (2019). p. 540–7.

31. Gu Z, Cheng J, Fu H, Zhou K, Hao H, Zhao Y, et al. CE-Net: context encoder network for 2D medical image segmentation. IEEE Trans Med Imaging. (2019) 38:2281–92. doi: 10.1109/TMI.2019.2903562

32. Kingma DP, Ba J. Adam, a method for stochastic optimization. In: Proceedings of the 3rd International Conference on Learning Representations (ICLR). vol. 1412 (2015). Available online at: https://dblp.org/rec/journals/corr/KingmaB14.bib

33. Lang XE, Zhu D, Zhang G, Du X, Jia Q, Yin G, et al. Sex difference in association of symptoms and white matter deficits in first-episode and drug-naive schizophrenia. Transl Psychiatry. (2018) 8:1–8. doi: 10.1038/s41398-018-0346-9

34. Guma E, Devenyi GA, Malla A, Shah J, Chakravarty MM, Pruessner M. Neuroanatomical and symptomatic sex differences in individuals at clinical high risk for psychosis. Front Psychiatry. (2017) 8:291. doi: 10.3389/fpsyt.2017.00291

35. Bora E, Fornito A, Yücel M, Pantelis C. The effects of gender on grey matter abnormalities in major psychoses: a comparative voxelwise meta-analysis of schizophrenia and bipolar disorder. Psychol Med. (2012) 42:295–307. doi: 10.1017/S0033291711001450

36. Tan M, Chen B, Pang R, Vasudevan V, Sandler M, Howard A, et al. Mnasnet: Platform-aware neural architecture search for mobile. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, CA: IEEE (2019). p. 2820–8.

37. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV: IEEE (2016). p. 770–8.

38. Hassantabar S, Zhang J, Yin H, Jha NK. MHDeep: mental health disorder detection system based on wearable sensors and artificial neural networks. ACM Trans Embed Comput Syst. (2022) 21:1–22. doi: 10.1145/3527170

39. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: machine learning in Python. J Mach Learn Res. (2011) 12:2825–30. Available online at: https://scikitlearn.org/stable/about.html#citing-scikit-learn

40. Lu X, Yang Y, Wu F, Gao M, Xu Y, Zhang Y, et al. Discriminative analysis of schizophrenia using support vector machine and recursive feature elimination on structural MRI images. Medicine. (2016) 95:3973. doi: 10.1097/MD.0000000000003973

41. Pinaya WH, Gadelha A, Doyle OM, Noto C, Zugman A, Cordeiro Q, et al. Using deep belief network modelling to characterize differences in brain morphometry in schizophrenia. Sci Rep. (2016) 6:1–9. doi: 10.1038/srep38897

42. Xiao Y, Yan Z, Zhao Y, Tao B, Sun H, Li F, et al. Support vector machine-based classification of first episode drug-naïve schizophrenia patients and healthy controls using structural MRI. Schizophr Res. (2019) 214:11–17. doi: 10.1016/j.schres.2017.11.037

43. Greenstein D, Weisinger B, Malley JD, Clasen L, Gogtay N. Using multivariate machine learning methods and structural MRI to classify childhood onset schizophrenia and healthy controls. Front Psychiatry. (2012) 3:53. doi: 10.3389/fpsyt.2012.00053

44. Chattopadhay A, Sarkar A, Howlader P, Balasubramanian VN. Grad-cam++: generalized gradient-based visual explanations for deep convolutional networks. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). Lake Tahoe, NV: IEEE (2018). p. 839–47.

45. Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D. Grad-cam: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision. Venice: IEEE (2017). p. 618–26.

46. Gotkowski K, Gonzalez C, Bucher A, Mukhopadhyay A. M3d-CAM: a PyTorch Library to Generate 3D Data Attention Maps for Medical Deep Learning. Springer Fachmedien Wiesbaden (2020). p. 217–22. doi: 10.1007/978-3-658-33198-6_52

47. Shepherd AM, Laurens KR, Matheson SL, Carr VJ, Green MJ. Systematic meta-review and quality assessment of the structural brain alterations in schizophrenia. Neurosci Biobehav Rev. (2012) 36:1342–56. doi: 10.1016/j.neubiorev.2011.12.015

48. Vita A, De Peri L, Deste G, Sacchetti E. Progressive loss of cortical gray matter in schizophrenia: a meta-analysis and meta-regression of longitudinal MRI studies. Transl Psychiatry. (2012) 2:e190–0. doi: 10.1038/tp.2012.116

49. Brent BK, Thermenos HW, Keshavan MS, Seidman LJ. Gray matter alterations in schizophrenia high-risk youth and early-onset schizophrenia: a review of structural MRI findings. Child Adolesc Psychiatr Clin. (2013) 22:689–714. doi: 10.1016/j.chc.2013.06.003

50. Chauhan S, Vig L, De Filippo De Grazia M, Corbetta M, Ahmad S, Zorzi M. A comparison of shallow and deep learning methods for predicting cognitive performance of stroke patients from MRI lesion images. Front Neuroinform. (2019) 13:53. doi: 10.3389/fninf.2019.00053

51. Plis SM, Hjelm DR, Salakhutdinov R, Allen EA, Bockholt HJ, Long JD, et al. Deep learning for neuroimaging: a validation study. Front Neurosci. (2014) 8:229. doi: 10.3389/fnins.2014.00229

52. Vieira S, Pinaya WH, Mechelli A. Using deep learning to investigate the neuroimaging correlates of psychiatric and neurological disorders: Methods and applications. Neurosci Biobehav Rev. (2017) 74:58–75. doi: 10.1016/j.neubiorev.2017.01.002

53. Liu M, Zhang J, Adeli E, Shen D. Deep multi-task multi-channel learning for joint classification and regression of brain status. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer (2017). p. 3–11.

54. Harutyunyan H, Khachatrian H, Kale DC, Ver Steeg G, Galstyan A. Multitask learning and benchmarking with clinical time series data. Sci Data. (2019) 6:1–18. doi: 10.1038/s41597-019-0103-9

55. Alemán-Gómez Y, Najdenovska E, Roine T, Fartaria MJ, Canales-Rodríguez EJ, Rovó Z, et al. Partial-volume modeling reveals reduced gray matter in specific thalamic nuclei early in the time course of psychosis and chronic schizophrenia. Hum Brain Mapp. (2020) 41:4041–61. doi: 10.1002/hbm.25108

56. Haijma SV, Van Haren N, Cahn W, Koolschijn PCM, Hulshoff Pol HE, Kahn RS. Brain volumes in schizophrenia: a meta-analysis in over 18 000 subjects. Schizophr Bull. (2013) 39:1129–38. doi: 10.1093/schbul/sbs118

57. Wright IC, Rabe-Hesketh S, Woodruff PW, David AS, Murray RM, Bullmore ET. Meta-analysis of regional brain volumes in schizophrenia. Am J Psychiatry. (2000) 157:16–25. doi: 10.1176/ajp.157.1.16

58. Tao H, Wong GH, Zhang H, Zhou Y, Xue Z, Shan B, et al. Grey matter morphological anomalies in the caudate head in first-episode psychosis patients with delusions of reference. Psychiatry Res Neuroimaging. (2015) 233:57–63. doi: 10.1016/j.pscychresns.2015.04.011

59. Antonova E, Kumari V, Morris R, Halari R, Anilkumar A, Mehrotra R, et al. The relationship of structural alterations to cognitive deficits in schizophrenia: a voxel-based morphometry study. Biol Psychiatry. (2005) 58:457–67. doi: 10.1016/j.biopsych.2005.04.036

60. Lord LD, Allen P, Expert P, Howes O, Broome M, Lambiotte R, et al. Functional brain networks before the onset of psychosis: a prospective fMRI study with graph theoretical analysis. Neuroimage Clin. (2012) 1:91–8. doi: 10.1016/j.nicl.2012.09.008

61. Argyelan M, Gallego JA, Robinson DG, Ikuta T, Sarpal D, John M, et al. Abnormal resting state FMRI activity predicts processing speed deficits in first-episode psychosis. Neuropsychopharmacology. (2015) 40:1631–9. doi: 10.1038/npp.2015.7

62. Giraldo-Chica M, Woodward ND. Review of thalamocortical resting-state fMRI studies in schizophrenia. Schizophr Res. (2017) 180:58–63. doi: 10.1016/j.schres.2016.08.005

63. Bernstein HG, Krause S, Krell D, Dobrowolny H, Wolter M, Stauch R, et al. Strongly reduced number of parvalbumin-immunoreactive projection neurons in the mammillary bodies in schizophrenia: further evidence for limbic neuropathology. Ann N Y Acad Sci. (2007) 1096:120–7. doi: 10.1196/annals.1397.077

64. Kilonzo VW, Sweet RA, Glausier JR, Pitts MW. Deficits in glutamic acid decarboxylase 67 immunoreactivity, parvalbumin interneurons, and perineuronal nets in the inferior colliculus of subjects with schizophrenia. Schizophr Bull. (2020) 46:1053–9. doi: 10.1093/schbul/sbaa082

65. Pantazopoulos H, Woo TUW, Lim MP, Lange N, Berretta S. Extracellular matrix-glial abnormalities in the amygdala and entorhinal cortex of subjects diagnosed with schizophrenia. Arch Gen Psychiatry. (2010) 67:155–66. doi: 10.1001/archgenpsychiatry.2009.196

66. Steullet P, Cabungcal JH, Bukhari SA, Ardelt MI, Pantazopoulos H, Hamati F, et al. The thalamic reticular nucleus in schizophrenia and bipolar disorder: role of parvalbumin-expressing neuron networks and oxidative stress. Mol Psychiatry. (2018) 23:2057–65. doi: 10.1038/mp.2017.230

67. Tsubomoto M, Kawabata R, Zhu X, Minabe Y, Chen K, Lewis DA, et al. Expression of transcripts selective for GABA neuron subpopulations across the cortical visuospatial working memory network in the healthy state and schizophrenia. Cereb Cortex. (2019) 29:3540–50. doi: 10.1093/cercor/bhy227

68. Hashimoto T, Bazmi HH, Mirnics K, Wu Q, Sampson AR, Lewis DA. Conserved regional patterns of GABA-related transcript expression in the neocortex of subjects with schizophrenia. Am J Psychiatry. (2008) 165:479–89. doi: 10.1176/appi.ajp.2007.07081223

69. Maloku E, Covelo IR, Hanbauer I, Guidotti A, Kadriu B, Hu Q, et al. Lower number of cerebellar Purkinje neurons in psychosis is associated with reduced reelin expression. Proc Nat Acad Sci USA. (2010) 107:4407–11. doi: 10.1073/pnas.0914483107

70. Cuenod M, Steullet P, Cabungcal JH, Dwir D, Khadimallah I, Klauser P, et al. Caught in vicious circles: a perspective on dynamic feed-forward loops driving oxidative stress in schizophrenia. Mol Psychiatry. (2022) 27:1886–97. doi: 10.1038/s41380-021-01374-w

71. Bor J, Brunelin J, Sappey-Marinier D, Ibarrola D, d'Amato T, Suaud-Chagny MF, et al. Thalamus abnormalities during working memory in schizophrenia. An fMRI study. Schizophr Res. (2011) 125:49–53. doi: 10.1016/j.schres.2010.10.018

72. Guell X, Gabrieli JD, Schmahmann JD. Triple representation of language, working memory, social and emotion processing in the cerebellum: convergent evidence from task and seed-based resting-state fMRI analyses in a single large cohort. Neuroimage. (2018) 172:437–49. doi: 10.1016/j.neuroimage.2018.01.082

73. Poldrack RA, Desmond JE, Glover GH, Gabrieli J. The neural basis of visual skill learning: an fMRI study of mirror reading. Cereb Cortex. (1998) 8:1–10. doi: 10.1093/cercor/8.1.1

74. de Bourbon-Teles J, Bentley P, Koshino S, Shah K, Dutta A, Malhotra P, et al. Thalamic control of human attention driven by memory and learning. Curr Biol. (2014) 24:993–9. doi: 10.1016/j.cub.2014.03.024

75. Olsson CJ, Jonsson B, Nyberg L. Learning by doing and learning by thinking: an fMRI study of combining motor and mental training. Front Hum Neurosci. (2008) 2:5. doi: 10.3389/neuro.09.005.2008

Keywords: early psychosis, cognition estimation, classification, deep learning, structural MRI (sMRI), schizophrenia, cognition function

Citation: Wen Y, Zhou C, Chen L, Deng Y, Cleusix M, Jenni R, Conus P, Do KQ and Xin L (2023) Bridging structural MRI with cognitive function for individual level classification of early psychosis via deep learning. Front. Psychiatry 13:1075564. doi: 10.3389/fpsyt.2022.1075564

Received: 20 October 2022; Accepted: 21 December 2022; Published: 10 January 2023.

Edited by:

Tianhong Zhang, Shanghai Jiao Tong University, China

Reviewed by:

Yingying Tang, Shanghai Jiao Tong University, China
Wenjun Su, New York University, United States

Copyright © 2023 Wen, Zhou, Chen, Deng, Cleusix, Jenni, Conus, Do and Xin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Lijing Xin, lijing.xin@epfl.ch

^†These authors have contributed equally to this work

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.