- 1 Aging and Health Research Center, National Yang Ming Chiao Tung University, Taipei, Taiwan
- 2 NVIDIA AI Technology Center, NVIDIA, Taipei, Taiwan
- 3 Institute of Neuroscience, National Yang Ming Chiao Tung University, Taipei, Taiwan
- 4 Center for Geriatrics and Gerontology, Taipei Veterans General Hospital, Taipei, Taiwan
- 5 Brain Research Center, National Yang Ming Chiao Tung University, Taipei, Taiwan
Brain age is an imaging-based biomarker with excellent feasibility for characterizing individual brain health and may serve as a single quantitative index for clinical and domain-specific usage. Brain age has been successfully estimated using extensive neuroimaging data from healthy participants with various feature extraction and conventional machine learning (ML) approaches. Recently, several end-to-end deep learning (DL) analytical frameworks have been proposed as alternative approaches to predict individual brain age with higher accuracy. However, the optimal approach to select and assemble appropriate input feature sets for DL analytical frameworks remains to be determined. In the Predictive Analytics Competition 2019, we proposed a hierarchical analytical framework which first used ML algorithms to investigate the potential contribution of different input features for predicting individual brain age. The obtained information then served as a priori knowledge for determining the input feature sets of the final ensemble DL prediction model. Systematic evaluation revealed that ML approaches with multiple concurrent input features, including tissue volume and density, achieved higher prediction accuracy when compared with approaches with a single input feature set [Ridge regression: mean absolute error (MAE) = 4.51 years, R2 = 0.88; support vector regression, MAE = 4.42 years, R2 = 0.88]. Based on this evaluation, a final ensemble DL brain age prediction model integrating multiple feature sets was constructed with reasonable computation capacity and achieved higher prediction accuracy when compared with ML approaches in the training dataset (MAE = 3.77 years; R2 = 0.90). Furthermore, the proposed ensemble DL brain age prediction model also demonstrated sufficient generalizability in the testing dataset (MAE = 3.33 years). 
In summary, this study provides initial evidence of how ML and advanced DL approaches can be efficiently integrated into a unified analytical framework for predicting individual brain age with higher accuracy. With the increase in large open multiple-modality neuroimaging datasets, ensemble DL strategies with appropriate input feature sets serve as a candidate approach for predicting individual brain age in the future.
Introduction
The trajectory of healthy brain aging is characterized by a complex dynamic process with progressive and regressive changes in brain structure and function (1–3). Previous group level neuroimaging studies have identified potential relationships between aging processes and regional characteristics of the brain (2, 4–6) and suggested that these aging-related alterations in the human brain may be associated with the incidence of several neurodegenerative diseases (7, 8). Therefore, the method of obtaining aging-related bio-signatures with higher reliability based on different brain characteristics is of particular importance.
In the last decade, the novel concept of “biological brain age” has emerged and served as a candidate quantitative index for assessing individual brain health throughout the entire lifespan (6, 9). Several studies have estimated brain age using extensive neuroimaging data from healthy participants with different machine learning (ML) approaches (10, 11). Furthermore, several public health-orientated studies have demonstrated the potential interrelationships between individual brain age and mortality risk, grip strength, and physical activity (9, 12). Several clinically oriented studies have supported the clinical relevance of brain age in several neurodevelopmental and neurodegenerative disorders, including Alzheimer's disease (10, 13), schizophrenia (14, 15), and traumatic brain injury (11). Additionally, an individual's biological brain age has been proposed as a prognostic indicator for treatments and interventions in several neurological diseases (16–18). Although the concept of biological brain age has been widely applied in neuroscience, public health, and clinical research, the optimal approach to construct predictive models of brain age with higher reliability and accuracy remains a challenge.
In addition to conventional ML approaches, several end-to-end deep learning (DL) analytical frameworks have recently been proposed as alternative approaches with significant potential for predicting individual brain age and disease classification with higher prediction accuracy (19, 20). Compared with previous conventional ML-based brain age estimators, these end-to-end DL approaches omit various image preprocessing steps and feature extraction procedures which are highly dependent on software package selection and image quality. Several DL-based studies have demonstrated the superior predictive performance of this approach using single imaging modalities as the input feature set for estimating individual brain age with minimal image preprocessing procedures and feature extraction steps (19, 21, 22). However, different imaging modalities of brain MRI are associated with distinct tissue properties and provide rich information for the characterization of individual brain changes across the entire lifespan (5, 23). Ensemble learning is an effective general-purpose ML paradigm that combines prediction of individual models to achieve better performance (24, 25). Using the ensemble learning approach, different brain MRI imaging modalities can be seamlessly unified into a single predictive model while reducing overfitting and improving predictive performance. Nevertheless, the optimal approach to select and assemble appropriate input feature sets for DL analytical frameworks remains to be determined.
In the Predictive Analytics Competition (PAC) 2019 which aimed to develop the best predictive brain age model from healthy subjects based on structural magnetic resonance imaging (sMRI) data, we explored the possibility of an ensemble DL-based framework for predicting individual brain age. The two objectives of the competition were: (1) to accomplish the smallest mean absolute error (MAE) for predicted brain age and (2) to accomplish the smallest MAE while maintaining the Spearman correlation between the predicted brain age difference (calculated as predicted brain age minus chronological age) and chronological age below 0.1. To achieve these two objectives, we first investigated the potential contribution of different input feature sets to predict individual brain age with two widely used conventional ML approaches. This empirical evidence served as a baseline comparison for the subsequent ensemble DL-based predictive model. We subsequently constructed two distinct ensemble DL-based brain age models with multiple input feature sets and objective-specific regularization functions to obtain acceptable simulation results in this timely competition.
Methods
Structural MRI Data and General Image Preprocessing
The dataset of the 2019 PAC contained original T1-weighted structural MRI brain images from 2,640 subjects with correct age labels (https://www.photon-ai.com/pac2019). All T1-weighted brain scans were visually assessed for scan quality. Images with apparent image artifacts or gross brain abnormalities including trauma, tumors, and hemorrhagic or infarct lesions were excluded by two experienced researchers. This quality screening procedure excluded 157 participants from subsequent image preprocessing. The final training data consisted of 2,483 participants that encompassed a wide age range from 17 to 90 years. To obtain an unbiased brain age estimator and evaluate its generalizability, we randomly allocated the provided training set (N = 2,483, age = 36.41 ± 16.37 years, age range = 17–90 years, 1,150 males) into a training sample (N = 2,198, age = 36.39 ± 16.33 years, age range = 17–90 years, 1,012 males) and a hold-out validation sample (N = 285; age = 36.52 ± 16.74 years, age range = 18–90 years, 138 males). An additional 660 subjects without age labels formed an independent external dataset for the final benchmarking. The results of this external dataset determined the final challenge scores from two distinct perspectives mentioned above.
To investigate the interrelationships between different input feature sets and prediction performance of the brain age estimator, we used the recently proposed enhanced Diffeomorphic Anatomical Registration Through Exponentiated Lie Algebra voxel-based morphometry (DARTEL-VBM) analytical pipeline to extract multiple input feature sets including gray matter volume (GMV, modulated gray matter segments), gray matter density (GMD, unmodulated gray matter segments), white matter volume (WMV, modulated white matter segments), and white matter density (WMD, unmodulated white matter segments) information from the original T1-weighted anatomical scans of each individual. This modified DARTEL-VBM approach, which integrated enhanced subcortical tissue probability maps, produced more accurate subcortical tissue segmentation results when compared with the original VBM approach (26). The detailed enhanced DARTEL-VBM analytical pipeline has been documented in our previous clinical study (27). The entire image processing pipeline was performed using Statistical Parametric Mapping software (SPM12, version 7487, Wellcome Institute of Neurology, University College London, UK) in MATLAB (R2016a, Mathworks, Natick, MA). Finally, the individual Montreal Neurological Institute (MNI) space GMV, GMD, WMV, and WMD as well as native space and MNI space T1-weighted images (only for the ensemble DL framework) were used as candidate input feature sets for subsequent brain age prediction analysis.
Additional Feature Extraction Strategies for Conventional Machine Learning
To reduce computation costs and avoid overfitting, we used two additional feature extraction strategies for ML-based brain age prediction models. First, following previous parcel-wise predictive analytical studies, we used a predefined composite brain atlas to extract the average GMV and GMD of each region of interest (ROI) from the preprocessed input feature sets. This composite brain atlas included 400 cortical regions based on Schaefer's functional parcellation (28) and 42 subcortical and cerebellar structures from the Harvard-Oxford subcortical atlas and spatially unbiased infratentorial template (29). This feature extraction strategy yielded 442 × 2 = 884 structural features from both GMV and GMD as input feature sets for the subsequent ML-based predictive analyses of brain age.
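The parcel-wise strategy described above amounts to averaging voxel values within each atlas label and concatenating the GMV and GMD averages. The following is a minimal sketch of that idea on flattened toy arrays; the function name `roi_means`, the toy values, and the convention that label 0 marks background are assumptions made for illustration and are not taken from the original pipeline.

```python
from collections import defaultdict

def roi_means(values, labels, background=0):
    """Average voxel values within each atlas label (ROI)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, lab in zip(values, labels):
        if lab == background:
            continue  # skip voxels that belong to no ROI (assumed label 0)
        sums[lab] += v
        counts[lab] += 1
    # Return ROI means in a fixed (sorted) label order.
    return {lab: sums[lab] / counts[lab] for lab in sorted(sums)}

# Toy flattened maps: 6 voxels, two ROIs (labels 1 and 2).
atlas = [0, 1, 1, 2, 2, 2]
gmv = [0.9, 0.2, 0.4, 0.6, 0.9, 0.3]
gmd = [0.5, 0.1, 0.3, 0.5, 0.5, 0.5]

# One subject's feature vector: GMV ROI means followed by GMD ROI means.
features = list(roi_means(gmv, atlas).values()) + list(roi_means(gmd, atlas).values())
```

With the 442-region composite atlas, the same concatenation would give the 884 features per subject reported above.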
In addition to the parcel-wise feature extraction strategy, we applied multivariate spatial independent component analysis (sICA) and spatial regression analysis as a secondary feature extraction strategy to obtain corresponding input feature sets across study participants. The details of the sICA-based feature extraction procedure have been described in our previous work (30, 31). Briefly, the preprocessed MNI space GMV, GMD, WMV, and WMD maps of the training dataset were concatenated as 4D datasets, respectively. For unbiased comparison with the aforementioned parcel-wise approach, the Multivariate Exploratory Linear Optimized Decomposition into Independent Components (MELODIC; FSL v5.0.9; http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/) tool was applied for each original input feature set to decompose concatenated 4D dataset into 400 spatially distinct components (voxel-by-component) with corresponding weighted parameters (component-by-subject) in the training sample. Subsequently, we applied spatial regression analysis of the 4D GMV, GMD, WMV, and WMD datasets against corresponding unthresholded 400 IC maps to calculate the final integrity scores (beta weights) of each IC (32). This sICA-based feature extraction strategy yielded 400 × 4 = 1,600 integrity scores from both GM and WM as input feature sets for the subsequent ML-based predictive analyses of brain age.
Construction of Brain Age Model From Conventional Machine Learning and Deep Learning Frameworks
Conventional Machine Learning Framework
Two widely used conventional ML algorithms, namely ridge regression and support vector regression (SVR), were first applied to investigate the interrelationships between different input feature sets and predictive performance of the constructed brain age estimators (33). Ridge regression is an L2-norm regularization linear regression approach which penalizes the magnitude of coefficients of input features and prevents overfitting during model fitting (34). In contrast, SVR is a kernel-based regression algorithm which transforms input data from the original space into a high-dimensional space with a specific kernel function (35, 36). In this study, we used the radial basis function (RBF) kernel, which is effective for modeling the nonlinear relationship between input features across training samples. Ridge regression and SVR were performed using the scikit-learn library (37). In the training sample, we applied a nested 10-fold cross-validation scheme to determine the optimal regularization parameter (λ for ridge regression) and hyperparameters (C and gamma parameters for SVR) of each ML algorithm in the inner loop and then evaluated the predictive performance of the constructed brain age estimators in the outer loop (38, 39). Specifically, in the inner 10-fold cross-validation loop, we selected the λ parameter from among five values (0.001, 0.01, 1, 10, and 100) for ridge regression and selected the C parameter and gamma parameter individually from among seven values (0.001, 0.01, 0.1, 1, 10, 100, and 1,000) for SVR to obtain the optimal parameter for each ML algorithm. For each inner 10-fold cross-validation loop, the parameters of each ML algorithm were optimized using the GridSearchCV function with the “neg_mean_absolute_error” scoring parameter in the scikit-learn package (40). In the outer 10-fold cross-validation loop, we constructed the brain age estimators with the optimal parameters to estimate individual brain age.
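The nested scheme can be illustrated with a small, dependency-free sketch: the inner loop picks λ by mean absolute error using only the outer-training data, and the outer loop scores the refitted model on the held-out fold. This is not the study's scikit-learn code; the single-feature closed-form ridge fit, the interleaved fold assignment, the synthetic data, and all function names are assumptions chosen to keep the example short.

```python
import random

def fit_ridge_1d(x, y, lam):
    """Ridge fit for one feature with an unpenalized intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / (sxx + lam)  # lam shrinks the slope toward zero
    return slope, my - slope * mx

def mae(x, y, model):
    slope, intercept = model
    return sum(abs(slope * a + intercept - b) for a, b in zip(x, y)) / len(x)

def kfold(indices, k):
    """Yield (train, test) index lists for k interleaved folds."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

def nested_cv(x, y, lambdas, k=10):
    outer_maes = []
    for train, test in kfold(list(range(len(x))), k):
        # Inner loop: choose lambda using the outer-training data only.
        def inner_score(lam):
            scores = []
            for tr, va in kfold(train, k):
                model = fit_ridge_1d([x[i] for i in tr], [y[i] for i in tr], lam)
                scores.append(mae([x[i] for i in va], [y[i] for i in va], model))
            return sum(scores) / len(scores)
        best_lam = min(lambdas, key=inner_score)
        # Outer loop: refit with the chosen lambda, score on the held-out fold.
        model = fit_ridge_1d([x[i] for i in train], [y[i] for i in train], best_lam)
        outer_maes.append(mae([x[i] for i in test], [y[i] for i in test], model))
    return sum(outer_maes) / len(outer_maes)

random.seed(0)
ages = [random.uniform(17, 90) for _ in range(200)]                 # synthetic ages
feature = [1.0 - 0.005 * a + random.gauss(0, 0.02) for a in ages]   # synthetic feature
cv_mae = nested_cv(feature, ages, lambdas=[0.001, 0.01, 1, 10, 100])
```

Keeping the λ search inside each outer-training split is what prevents the outer-fold error from being optimistically biased by hyperparameter tuning.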
In the proposed ML-based framework, we systematically evaluated the predictive performance of the constructed brain age estimators with different combinations ranging from two feature extraction methods (parcel-wise and sICA), four input feature sets (GMV, WMV, GMD, and WMD), and two ML algorithms (ridge regression and SVR). After selecting the optimal parameters for each potential combination for the training sample, the entire training sample was used to construct the final ML-based brain age estimators which were then applied to the hold-out validation sample to evaluate the generalizability of the constructed ML-based brain age predictive models. Notably, these regression-based approaches were subject to the phenomenon of “regression toward the mean” (41). To account for this phenomenon and meet the criteria of Objective 2, we performed an additional chronological age-brain age bias correction in the conventional ML framework to adjust the predicted brain age of each individual (9). These individual residualized brain ages were used as inputs for Objective 2.
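One commonly used form of linear age-bias correction regresses predicted age on chronological age and then rescales the predictions with the fitted slope and intercept. The sketch below shows that form on toy data; whether this exact variant matches the correction applied in the study is an assumption, and the function names and numbers are illustrative only.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx

def correct_bias(predicted, slope, intercept):
    """Invert the fitted age-dependence of the predictions."""
    return [(p - intercept) / slope for p in predicted]

# Toy example of regression toward the mean: young subjects are
# overestimated and old subjects are underestimated.
age = [20, 30, 40, 50, 60, 70, 80]
pred = [0.8 * a + 9 for a in age]

slope, intercept = fit_line(age, pred)  # in practice fitted on training data
corrected = correct_bias(pred, slope, intercept)

gap_before = [p - a for p, a in zip(pred, age)]      # decreases with age
gap_after = [c - a for c, a in zip(corrected, age)]  # no remaining age trend
```

In practice the slope and intercept would be estimated on the training sample and then applied unchanged to the validation sample, so that the correction does not use validation labels.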
Ensemble Deep Learning Framework
We proposed the ensemble DL framework using 26 layers of the 3D residual neural network (ResNet) composed of 3D convolution blocks by stacking a 3 × 3 × 3 convolution operation, ReLU activation function (42), batch normalization layer (43), and dropout technique (44) to construct the brain age predictive models (Figure 1). Previous studies have indicated that data augmentation might expand the diversity of data properties and further improve the prediction performance (45–47). Therefore, to achieve superior prediction performance of the constructed ensemble DL model, we also generated novel synthetic assisted T2-weighted fluid-attenuated inversion recovery (FLAIR) images as an additional input feature set for the proposed DL models. The assisted T2-weighted FLAIR images were synthesized using an in-house U-Net generator that was trained on the BraTS dataset (https://www.med.upenn.edu/cbica/brats2019/data.html). The detailed methods of the synthesized T2-weighted FLAIR images are presented in Supplementary Methods. During this competition, the pretrained generator was applied to synthesize assisted T2-weighted FLAIR images from the given raw T1-weighted images of each individual. Furthermore, we applied two regularization techniques as the loss function in the brain age model for the two objectives in the competition. For the first regularization method, we used covariance matrix minimization between the error of the predicted brain age Y′ and the ground truth Y given by chronological age:
where L_cov is the covariance loss function; Y′ ∈ ℝ^N is the N-dimensional column vector containing the individual predicted brain ages, where N is the number of samples in the mini-batch; Y ∈ ℝ^N is similar to Y′ but contains the individual chronological ages; ⊗ is the outer product; and T is the transpose operation on the vector.
Figure 1. An illustration of the proposed 26-layer residual architecture for the competition. The proposed DL-based prediction model is composed of 12 residual blocks followed by a global average pooling and a fully connected linear layer to map the latent space information to individual predictive brain age. Each residual block includes two 3D convolutional layers and a residual shortcut. The stride 2 blocks reduce the output resolution of width, height, and depth to half of its inputs. An additional 1 × 1 convolution is also deployed in the shortcut to match the behavior of stride 2 blocks. BatchNorm, batch normalization; Conv, convolutional; Globalavgpool, global average pooling.
The second regularization method for the loss function minimized the ranking relationship in each mini-batch, the minimum correlation bias of predicted brain age Y′, and chronological age Y:
where Lrank is the ranking relationship loss function; N, Y′, Y, and T follow the same definitions as those in Eq. (1).
Based on the systematic evaluation of the ML approaches, we separately deployed two ensemble DL models for the two objectives. During data preprocessing, individual raw T1-weighted images and corresponding MNI space T1-weighted images were rescaled to the intensity range [0, 1] by dividing by the maximum value within the whole training data. All multi-modality brain images were further reshaped to image dimensions of 121 × 145 × 121 with a voxel size of 1.5 mm and stacked into a 4D input structure using full-image processing. Furthermore, the DL-based brain age prediction models were trained using a stochastic gradient descent (SGD) (48) optimizer with the following parameters: learning rate of 0.1, weight decay of 0.0005, and momentum of 0.9. The learning rate was decayed by a factor of 10 at 50 and 75% of the training progress. All models for Objectives 1 and 2 were trained for a total of 300 epochs with a batch size of 8.
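The intensity rescaling and the step learning-rate schedule described above can be written out as a short sketch. The function names are hypothetical, and the choice that the decay takes effect at the start of epochs 150 and 225 (counting from 0) is an assumption about how "50 and 75% of the training progress" maps onto epoch indices.

```python
def rescale(image, global_max):
    """Scale intensities into [0, 1] using the training-set maximum."""
    return [v / global_max for v in image]

def step_lr(epoch, total_epochs=300, base_lr=0.1, milestones=(0.5, 0.75), factor=10.0):
    """Learning rate at a given 0-based epoch with step decay."""
    lr = base_lr
    for m in milestones:
        if epoch >= int(m * total_epochs):
            lr /= factor  # divide by 10 once per milestone passed
    return lr

# Toy usage: a flattened image and the schedule at a few epochs.
scaled = rescale([0.0, 5.0, 10.0], global_max=10.0)
schedule = {epoch: step_lr(epoch) for epoch in (0, 149, 150, 224, 225, 299)}
```

Under these assumptions the rate stays at 0.1 for the first 150 epochs, drops to 0.01 for the next 75, and to 0.001 for the final 75.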
For the ensemble DL model of Objective 1, we assembled five different models including: (1) three channel inputs (raw T1-weighted images, GMV, and WMV), (2) three channel inputs (raw T1-weighted images, GMV, and WMV) with second regularization method, (3) three channel inputs (raw T1-weighted images, GMV, and WMV) with additional sex information, (4) three channel inputs (MNI space T1-weighted images, GMD, WMD), and (5) four channel inputs (MNI space T1-weighted images, GMD, WMD, and assisted T2-weighted FLAIR images) on the last layer before the fully connected regressor. We preserved the model by using the checkpoint with the smallest MAE during training. The predicted brain age was assembled by median aggregation from five distinct DL models.
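Median aggregation takes, for each subject, the median of the ages predicted by the individual models. A minimal sketch with made-up predictions from five hypothetical models follows; the numbers are illustrative and do not come from the study.

```python
from statistics import median

def ensemble_median(predictions_per_model):
    """Per-subject median across models (inner lists are aligned by subject)."""
    return [median(subject_preds) for subject_preds in zip(*predictions_per_model)]

# Five models, three subjects (toy values).
model_preds = [
    [24.1, 61.0, 43.2],
    [25.3, 58.7, 44.0],
    [23.8, 75.0, 42.5],  # this model is far off for the second subject
    [24.9, 60.2, 43.9],
    [26.0, 59.5, 41.8],
]
ensemble = ensemble_median(model_preds)
```

Unlike a mean, the median is not pulled toward the single outlying prediction for the second subject, which is one reason it is a common choice for small ensembles.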
For the ensemble DL model of Objective 2, dropout was applied in the DL model to obtain relatively unbiased predictions. Consequently, the single 4D input structure features (raw T1-weighted images, GMV, and WMV), three different dropout rates (0.1, 0.15, and 0.2), and two different regularization methods within 300 epochs were used to construct the DL-based model. We further ranked the models from lowest to highest according to the MAE (below 3.8 years) and Spearman correlation (lower than 0.1) of 1,800 checkpoints (6 configurations × 300 epochs) and selected the top eight models as the final set of predictive models for the competition. Finally, the predicted brain age was assembled by median aggregation from the eight DL models. The whole training sample was used to construct these two DL-based brain age estimators, which were subsequently applied to the hold-out validation sample to evaluate the generalizability of the DL-based brain age predictive models.
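The checkpoint selection step can be sketched as a filter followed by a sort. The thresholds (MAE below 3.8 years, correlation below 0.1) and the top-eight cutoff come from the text; the use of the absolute value of the correlation, the tie-breaking order (MAE first, then correlation), the dictionary layout, and the toy checkpoints are assumptions made for illustration.

```python
def select_checkpoints(checkpoints, max_mae=3.8, max_abs_rho=0.1, top_k=8):
    """Keep checkpoints meeting both thresholds, then take the best top_k."""
    eligible = [
        c for c in checkpoints
        if c["mae"] < max_mae and abs(c["rho"]) < max_abs_rho
    ]
    # Assumed ranking: lowest MAE first, ties broken by smaller |rho|.
    eligible.sort(key=lambda c: (c["mae"], abs(c["rho"])))
    return eligible[:top_k]

# Toy checkpoints (hypothetical names and values).
checkpoints = [
    {"name": "a", "mae": 3.5, "rho": -0.05},
    {"name": "b", "mae": 3.2, "rho": -0.30},  # rejected: correlation too strong
    {"name": "c", "mae": 3.9, "rho": 0.01},   # rejected: MAE too high
    {"name": "d", "mae": 3.4, "rho": 0.08},
]
selected = [c["name"] for c in select_checkpoints(checkpoints, top_k=2)]
```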
Assessment of Prediction Performance
For Objective 1, the predictive performance of the constructed brain age models was evaluated using multiple quantitative indices including mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R2) between predicted brain age and chronological age of the hold-out validation sample. The Kullback–Leibler divergence (KLD) was calculated as a measure to quantify the difference between the probability distributions of chronological age and predicted brain age of the hold-out validation sample.
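The four indices can be computed as in the following sketch. MAE, RMSE, and R2 follow their standard definitions; for KLD, the use of shared histogram bins over the 17 to 90 year range, ten bins, and a small smoothing constant to avoid empty bins are assumptions, since the binning used in the study is not described here. The toy ages are illustrative.

```python
import math

def mae(y, p):
    return sum(abs(a - b) for a, b in zip(y, p)) / len(y)

def rmse(y, p):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, p)) / len(y))

def r2(y, p):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, p))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot

def histogram(values, lo, hi, n_bins, eps=1e-6):
    """Normalized histogram; eps keeps every bin strictly positive."""
    counts = [eps] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = min(max(int((v - lo) / width), 0), n_bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]

def kld(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions."""
    return sum(a * math.log(a / b) for a, b in zip(p, q))

age = [20, 30, 40, 50, 60]   # toy chronological ages
pred = [22, 29, 43, 48, 61]  # toy predicted ages

scores = {"mae": mae(age, pred), "rmse": rmse(age, pred), "r2": r2(age, pred)}
divergence = kld(histogram(age, 17, 90, 10), histogram(pred, 17, 90, 10))
```

Note that KLD is asymmetric, so the direction (chronological relative to predicted, or the reverse) needs to be fixed when comparing models.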
For Objective 2, the bias prediction was evaluated using the Spearman rank correlation (Spearman's rho) between the predicted brain age difference and the chronological age of the hold-out validation sample.
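Spearman's rho is the Pearson correlation of the rank-transformed variables. The sketch below computes it between the brain age difference (predicted minus chronological age) and chronological age; the helper names and toy data are illustrative assumptions, and tied values receive average ranks.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))

# Toy predictions showing regression toward the mean.
age = [20, 30, 40, 50, 60, 70, 80]
pred = [25, 33, 41, 49, 57, 65, 73]
gap = [p - a for p, a in zip(pred, age)]  # brain age difference
rho = spearman(gap, age)
```

In this toy case the difference falls monotonically with age, so rho reaches its extreme negative value; Objective 2 requires the magnitude of this correlation to stay below 0.1.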
Results
Predictive Performance of Brain Age Estimators Using Conventional Machine Learning Approaches in the Training Sample
In the conventional ML frameworks, we first demonstrated that the predictive performance of sICA-based brain age estimators was generally superior to that of parcel-based brain age estimators irrespective of tissue volume and density information as input feature sets. This result suggested that a data-driven feature extraction strategy was superior to a knowledge-driven approach for predicting individual brain age. Within each conventional ML framework, our results demonstrated that brain age estimators combining multiple input feature sets outperformed those with a single input feature set. When considering a single input feature set for predicting individual brain age, the brain age estimator which used individual GMD maps as the input feature set achieved better predictive performance than the estimator using GMV maps (Table 1). Using the sICA feature extraction strategy with four distinct feature sets (GMV, WMV, GMD, and WMD), the final constructed brain age estimator exhibited the best performance for predicting individual brain age in the training sample (ridge regression: MAE = 4.50 years, R2 = 0.88; SVR: MAE = 4.20 years, R2 = 0.94). These empirical results served as baseline conditions for comparing the results of ensemble DL approaches.
Table 1. Exploring the prediction accuracy of the cross-validation in the training sample using conventional machine learning frameworks.
Predictive Performance of Brain Age Estimators Using Ensemble Deep Learning Approaches in the Training Sample
Based on the results of the conventional ML-based brain age estimators, we used multiple input feature sets to construct five distinct DL-based brain age estimators with different combinations. The final individual predicted brain age was ensembled from the outputs of the five distinct DL-based brain age estimators using the median aggregation approach. The predictive performance of the proposed ensemble DL Model 1 was superior to that of the sICA-based ML approaches (DL: MAE = 2.81 years, R2 = 0.94; Table 2).
Table 2. Exploring the prediction accuracy of the cross-validation in the training sample using deep learning frameworks.
Generalizability of Brain Age Estimators to the Validation Sample
Compared with the results of conventional ML approaches that used multiple input feature sets with sICA feature extraction strategy (ridge regression: MAE = 4.51 years, R2 = 0.88; SVR: MAE = 4.42 years, R2 = 0.88), the results in the validation sample suggested that each single DL-based brain age estimator provided more accurate predictions when compared with the conventional ML approaches (Model 1–1, MAE = 3.42 years; Model 1–2, MAE = 3.54 years; Model 1–3, MAE = 3.86 years; Model 1–4, MAE = 3.38 years; Model 1–5, MAE = 3.38 years), and ensemble DL Model 1 exhibited satisfactory generalizability to the validation sample (MAE = 3.77 years and R2 = 0.90) (Figure 2). In addition, using KLD as a measure for quantifying the distance between the density distribution of individual predicted brain age and chronological age, ensemble DL Model 1 produced more precise one-to-one correspondence when compared with conventional ML approaches (ensemble DL Model 1: KLD = 0.0125; ridge regression: KLD = 0.037; SVR: KLD = 0.034), especially in the validation sample with limited middle-to-late adulthood data (Figure 3).
Figure 2. Prediction accuracy in validation sample for conventional machine learning and deep learning frameworks. Scatterplots depict the detailed data distribution of predicted brain age and chronological age for the validation sample. DL, deep learning; MAE, mean absolute error; R2, coefficient of determination; SVR, support vector regression.
Figure 3. The distribution of predicted age and chronological age. The raw data distribution of chronological age and three predicted ages is presented at the top. Each bin shows the number of subjects. The overlapping distribution between chronological age and three predicted ages is presented at the bottom. DL, deep learning; KLD, Kullback–Leibler divergence; SVR, support vector regression.
Brain Age Bias of Different Brain Age Estimators
The prediction accuracy with and without bias adjustment in the validation sample is presented in Table 3. Applying Cole's bias correction method to the predicted brain age weakened the association between predicted brain age differences and chronological age in conventional ML and DL approaches (ridge regression: −0.0247; SVR: −0.049; ensemble DL Model 1: −0.057). However, ensemble DL Model 2, which used the covariance loss function and ranking relationship loss function as regularization methods, provided accurate predictive performance for minimizing the correlation and achieving the smallest MAE (ensemble DL Model 2: −0.01).
Application of Brain Age Estimators to an External Testing Dataset
After systematic evaluations, we first trained two distinct ensemble DL models which targeted the two different objectives of the 2019 PAC using the whole training dataset and further applied the constructed brain age estimators to an external testing dataset. For Objective 1, which aimed for the smallest MAE between individual predicted brain age and chronological age, ensemble DL Model 1 achieved an MAE of 3.33 years with a Spearman's rank correlation of −0.39. For Objective 2, which aimed for the smallest MAE while concurrently maintaining the Spearman correlation below 0.1, ensemble DL Model 2 achieved an MAE of 3.94 years and a Spearman's rank correlation of −0.013. In conclusion, we ranked fourth among a total of 79 teams in both objectives in the 2019 PAC.
Discussion
In this study, we first provided empirical evidence of the possible relationship between different input feature sets and predictive performance of brain age estimation under the conventional ML-based framework. More specifically, we demonstrated that an sICA feature extraction strategy that integrated multiple features predicted individual brain age more accurately than the parcel-wise approach. On the other hand, compared with conventional ML approaches, ensemble DL frameworks which integrated multiple input feature sets and objective-specific regularization functions demonstrated superior predictive performance while concurrently minimizing the MAE and the correlation with chronological age in the same analytical framework.
In general, the basic analytical steps of constructing a brain age estimation model include image preprocessing, feature extraction, and algorithm selection. Each step may have a substantial influence on the predictive performance of the constructed brain age estimator. For T1-weighted images, multiple structural features, including tissue volume, tissue density, deformation field, cortical thickness, and surface area, can be derived using different image preprocessing pipelines. Although GMV and cortical thickness of the human brain are the two most common input features for constructing brain age prediction models (10, 11, 49), previous studies also indicated that changes in GMD play a specific role in both the developmental and aging periods (5, 50). This implies that GMD could serve as a potential candidate feature for predicting individual brain age. Our systematic evaluation also demonstrated that the prediction performance of the brain age prediction model using GMD was better than that using GMV. On the other hand, different imaging modalities of brain MRI can capture specific tissue properties of the human brain that are related to aging-associated patterns (51–53). In line with previous multimodal brain age studies, our results also support the notion that a brain age estimator which combines multiple input feature sets can outperform those with a single input feature set (53, 54).
Additionally, to reduce computation cost and overfitting in the conventional ML-based predictive framework, the feature extraction procedure was considered an important element in predicting individual brain age. The quality of the extracted features exerts a large impact on the performance of the prediction model (55). Compared with the use of a predefined atlas for feature extraction, the use of data-driven methods such as ICA enabled us to identify the large-scale network-wise structural covariance pattern of the structural MRI across study participants. Structural covariance is one way to measure large-scale brain morphometrical coordination profiles by estimating the similarity of tissue morphometrical features between different brain regions across participants (56). This approach is based on the notion that brain regions which interconnect with each other tend to mature in a synchronized way, possibly due to shared neurotrophic and genetic factors (57). Using this analytical approach, previous studies also demonstrated that these identified large-scale structural covariance patterns of the human brain are highly associated with different neuropsychiatric disorders, neurodegenerative diseases, and the healthy aging process (32, 58–60). In line with a previous brain age study which mainly focused on middle-to-late adulthood (31), the current study also demonstrated that the sICA-based feature extraction strategy could identify meaningful large-scale structural covariance patterns for estimating individual brain age with higher prediction accuracy.
Although the selection of the ML algorithm may affect the predictive performance of brain age estimation (51), the improvement in prediction performance attributable to algorithm choice was limited in this study. Potentially, an effective feature extraction method combined with multimodal input features could lead to superior prediction performance, even with a relatively simple ML algorithm. To sum up, understanding the data characteristics and selecting an appropriate ML strategy could improve brain age prediction. However, limitations of ML must be considered. For nonuniformly distributed data with insufficient data on middle-to-late adulthood, brain age estimation using the conventional ML-based approaches failed to achieve more accurate brain age prediction. Overcoming data bias stemming from insufficient data is an important issue for brain age estimation that should be addressed in the future. Increasing the size of datasets, modifying training strategies, or using domain adaptation are possible strategies to resolve this problem.
Conventional ML algorithms rely on feature engineering to extract meaningful representations as model inputs, which can be difficult. This process often leans heavily on prior knowledge from domain experts. Even with carefully designed feature engineering, balancing dimensionality reduction against the loss of information is a nontrivial issue when seeking the optimal solution. In contrast, DL leverages the gradient descent algorithm to automatically search for a series of nonlinear transformations for feature extraction (61), which is more efficient and can learn task-specific representations from minimally preprocessed input data. Additionally, in combination with ensemble learning, DL-based brain age estimation has achieved superior predictive performance (62, 63). Our results also demonstrate that, with a well-trained brain age model, DL could improve predictive accuracy and decrease prediction bias. Furthermore, DL handled large-scale datasets efficiently because it relies on first-order gradient descent optimization. Although we demonstrated that the ensemble DL framework exhibited superior predictive performance when compared with the ML-based framework, the design of the DL framework leaves scope for further refinement. The model design space is extensive, and DL experts typically design models based on their own experience. A recent AutoML pipeline (64) may reduce the required effort and automatically determine the optimal model. Applying DL frameworks to brain age prediction or other neuroimaging analyses requires the concerted effort and collaboration of neuroscientists and data scientists.
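The two ingredients discussed above, first-order gradient descent and ensembling across feature sets, can be sketched in a few lines. This is a simplified illustration and not the network architecture of this study: a linear model stands in for a deep network, the data are simulated, and all function names and hyperparameters are assumptions.

```python
import numpy as np


def fit_sgd_regressor(x, y, lr=0.05, epochs=200, batch_size=32, seed=0):
    """Linear model trained with mini-batch (first-order) gradient descent on squared error."""
    rng = np.random.default_rng(seed)
    w, b = np.zeros(x.shape[1]), float(y.mean())
    for _ in range(epochs):
        order = rng.permutation(len(y))
        for start in range(0, len(y), batch_size):
            idx = order[start:start + batch_size]
            err = x[idx] @ w + b - y[idx]            # residuals on the mini-batch
            w -= lr * x[idx].T @ err / len(idx)      # gradient step for the weights
            b -= lr * err.mean()                     # gradient step for the intercept
    return w, b


def ensemble_predict(models, feature_sets):
    """Average the predictions of one model per feature set."""
    preds = [x @ w + b for (w, b), x in zip(models, feature_sets)]
    return np.mean(preds, axis=0)


def demo(seed=0):
    """Return (list of single-model test MAEs, ensemble test MAE) on simulated data."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=300)                         # latent "aging" factor
    age = 50 + 15 * z
    # Two hypothetical feature sets, each a noisy view of the same factor.
    sets = [np.outer(z, c) + 0.5 * rng.normal(size=(300, 3))
            for c in ([1.0, 0.5, -0.5], [0.8, -0.6, 0.4])]
    train, test = slice(0, 200), slice(200, 300)
    models = [fit_sgd_regressor(x[train], age[train], seed=seed) for x in sets]
    maes = [np.mean(np.abs(x[test] @ w + b - age[test]))
            for (w, b), x in zip(models, sets)]
    ens = ensemble_predict(models, [x[test] for x in sets])
    return maes, np.mean(np.abs(ens - age[test]))
```

Because the absolute value is convex, the MAE of the averaged prediction can never exceed the mean of the individual models' MAEs. The gain is largest when the models' errors are weakly correlated, which is one motivation for feeding different feature sets to different ensemble members.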
Conclusion
In summary, ensemble DL-based brain age prediction models that combine multiple input feature sets and objective-specific regularization functions provide more accurate predictions, decrease the age-related bias of the predictions, and maintain the correspondence between predicted and chronological age even with insufficient data. Our study provides valuable insight into ML approaches and DL frameworks in brain age prediction. Our findings may facilitate the development of training strategies for brain age prediction models in the future.
Data Availability Statement
Requests to access the datasets should be directed to Tim Hahn (hahnt@wwu.de) and Ramona Leenings (leenings@uni-muenster.de).
Author Contributions
C-YK, T-MT, P-LL, C-WT, C-YC, L-KC, C-KL, K-HC, SS, and C-PL contributed to the conception, design, and interpretation of data. C-YK, P-LL, and K-HC performed the image preprocessing and conventional machine learning approaches. T-MT, C-WT, C-YC, and C-KL performed the deep learning approaches. C-YK and P-LL contributed to the creation of the figures. C-YK, T-MT, P-LL, C-KL, and K-HC participated in drafting the manuscript. All authors have read and approved the final version of the manuscript.
Funding
This work was supported by the Aging and Health Research Center at National Yang Ming University, Taiwan (MOST 110-2634-F-010-001); Center for Geriatrics and Gerontology of Taipei Veterans General Hospital of Taiwan (MOST 108-2321-B-010-013-MY2); Ministry of Science and Technology, Taiwan (MOST 107-2221-E-010-010-MY3; MOST 108-2420-H-010-001; MOST 108-2321-B-010-010-MY2; MOST 110-2321-B-010-004); The Brain Center at National Yang-Ming University, Taiwan (109BRC-B501); Veterans General Hospitals University System of Taiwan (VGHUST109-V1-3-3); The Brain Research Center, National Yang-Ming University from The Featured Areas Research Center Program within the framework of the Higher Education Sprout Project by the Ministry of Education (MOE), Taipei, Taiwan.
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
We thank all the partners who participated in this study. The authors would like to thank NVIDIA AI Technology Center (NVAITC) for discussing and training deep learning approaches. We acknowledge support from the Open Access Publication Fund of Aging and Health Research Center, National Yang-Ming University, Taipei, Taiwan.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyt.2021.626677/full#supplementary-material
References
1. Chan MY, Park DC, Savalia NK, Petersen SE, Wig GS. Decreased segregation of brain systems across the healthy adult lifespan. Proc Natl Acad Sci U S A. (2014) 111:E4997–5006. doi: 10.1073/pnas.1415122111
2. Lemaitre H, Goldman AL, Sambataro F, Verchinski BA, Meyer-Lindenberg A, Weinberger DR, et al. Normal age-related brain morphometric changes: nonuniformity across cortical thickness, surface area and gray matter volume? Neurobiol Aging. (2012) 33:617.e1–9. doi: 10.1016/j.neurobiolaging.2010.07.013
3. Sala-Llonch R, Bartres-Faz D, Junque C. Reorganization of brain networks in aging: a review of functional connectivity studies. Front Psychol. (2015) 6:663. doi: 10.3389/fpsyg.2015.00663
4. Biswal BB, Mennes M, Zuo XN, Gohel S, Kelly C, Smith SM, et al. Toward discovery science of human brain function. Proc Natl Acad Sci U S A. (2010) 107:4734–9. doi: 10.1073/pnas.0911855107
5. Sowell ER, Peterson BS, Thompson PM, Welcome SE, Henkenius AL, Toga AW. Mapping cortical change across the human life span. Nat Neurosci. (2003) 6:309–15. doi: 10.1038/nn1008
6. Westlye LT, Walhovd KB, Dale AM, Bjornerud A, Due-Tonnessen P, Engvig A, et al. Life-span changes of the human brain white matter: diffusion tensor imaging (DTI) and volumetry. Cereb Cortex. (2010) 20:2055–68. doi: 10.1093/cercor/bhp280
7. Driscoll I, Davatzikos C, An Y, Wu X, Shen D, Kraut M, et al. Longitudinal pattern of regional brain volume change differentiates normal aging from MCI. Neurology. (2009) 72:1906–13. doi: 10.1212/WNL.0b013e3181a82634
8. Fjell AM, Westlye LT, Grydeland H, Amlien I, Espeseth T, Reinvang I, et al. Alzheimer disease neuroimaging: critical ages in the life course of the adult brain: nonlinear subcortical aging. Neurobiol Aging. (2013) 34:2239–47. doi: 10.1016/j.neurobiolaging.2013.04.006
9. Cole JH, Ritchie SJ, Bastin ME, Valdes Hernandez MC, Munoz Maniega S, Royle N, et al. Brain age predicts mortality. Mol Psychiatry. (2018) 23:1385–92. doi: 10.1038/mp.2017.62
10. Franke K, Ziegler G, Kloppel S, Gaser C, Alzheimer's Disease Neuroimaging Initiative. Estimating the age of healthy subjects from T1-weighted MRI scans using kernel methods: exploring the influence of various parameters. Neuroimage. (2010) 50:883–92. doi: 10.1016/j.neuroimage.2010.01.005
11. Cole JH, Leech R, Sharp DJ, Alzheimer's Disease Neuroimaging Initiative. Prediction of brain age suggests accelerated atrophy after traumatic brain injury. Ann Neurol. (2015) 77:571–81. doi: 10.1002/ana.24367
12. Steffener J, Habeck C, O'Shea D, Razlighi Q, Bherer L, Stern Y. Differences between chronological and brain age are related to education and self-reported physical activity. Neurobiol Aging. (2016) 40:138–44. doi: 10.1016/j.neurobiolaging.2016.01.014
13. Gaser C, Franke K, Kloppel S, Koutsouleris N, Sauer H, Alzheimer's Disease Neuroimaging Initiative. BrainAGE in mild cognitive impaired patients: predicting the conversion to Alzheimer's disease. PLoS One. (2013) 8:e67346. doi: 10.1371/journal.pone.0067346
14. Koutsouleris N, Davatzikos C, Borgwardt S, Gaser C, Bottlender R, Frodl T, et al. Accelerated brain aging in schizophrenia and beyond: a neuroanatomical marker of psychiatric disorders. Schizophr Bull. (2014) 40:1140–53. doi: 10.1093/schbul/sbt142
15. Schnack HG, van Haren NE, Nieuwenhuis M, Hulshoff Pol HE, Cahn W, Kahn RS. Accelerated brain aging in schizophrenia: a longitudinal pattern recognition study. Am J Psychiatry. (2016) 173:607–16. doi: 10.1176/appi.ajp.2015.15070922
16. Kolenic M, Franke K, Hlinka J, Matejka M, Capkova J, Pausova Z, et al. Obesity, dyslipidemia and brain age in first-episode psychosis. J Psychiatr Res. (2018) 99:151–8. doi: 10.1016/j.jpsychires.2018.02.012
17. Franke K, Hagemann G, Schleussner E, Gaser C. Changes of individual BrainAGE during the course of the menstrual cycle. Neuroimage. (2015) 115:1–6. doi: 10.1016/j.neuroimage.2015.04.036
18. Le TT, Kuplicki R, Yeh HW, Aupperle RL, Khalsa SS, Simmons WK, et al. Effect of ibuprofen on BrainAGE: a randomized, placebo-controlled, dose-response exploratory study. Biol Psychiatry Cogn Neurosci Neuroimaging. (2018) 3:836–43. doi: 10.1016/j.bpsc.2018.05.002
19. Cole JH, Poudel RPK, Tsagkrasoulis D, Caan MWA, Steves C, Spector TD. Predicting brain age with deep learning from raw imaging data results in a reliable and heritable biomarker. Neuroimage. (2017) 163:115–24. doi: 10.1016/j.neuroimage.2017.07.059
20. Vieira S, Pinaya WH, Mechelli A. Using deep learning to investigate the neuroimaging correlates of psychiatric and neurological disorders: methods and applications. Neurosci Biobehav Rev. (2017) 74(Pt A):58–75. doi: 10.1016/j.neubiorev.2017.01.002
21. Jiang H, Lu N, Chen K, Yao L, Li K, Zhang J, Guo X. Predicting brain age of healthy adults based on structural MRI parcellation using convolutional neural networks. Front Neurol. (2019) 10:1346. doi: 10.3389/fneur.2019.01346
22. Feng X, Lipton ZC, Yang J, Small SA, Provenzano FA, Alzheimer's Disease Neuroimaging Initiative, et al. Estimating brain age based on a uniform healthy population with deep learning and structural magnetic resonance imaging. Neurobiol Aging. (2020) 91:15–25. doi: 10.1016/j.neurobiolaging.2020.02.009
23. Yeatman JD, Wandell BA, Mezer AA. Lifespan maturation and degeneration of human brain white matter. Nat Commun. (2014) 5:4932. doi: 10.1038/ncomms5932
24. Pan I, Thodberg HH, Halabi SS, Kalpathy-Cramer J, Larson BD. Improving automated pediatric bone age estimation using ensembles of models from the 2017 RSNA machine learning challenge. Radiol Artif Intell. (2019) 1:e190053. doi: 10.1148/ryai.2019190053
25. Engemann DA, Kozynets O, Sabbagh D, Lemaitre G, Varoquaux G, Liem F, Gramfort A. Combining magnetoencephalography with magnetic resonance imaging enhances learning of surrogate-biomarkers. Elife. (2020) 9:e54055. doi: 10.7554/eLife.54055
26. Lorio S, Fresard S, Adaszewski S, Kherif F, Chowdhury R, Frackowiak RS, et al. New tissue priors for improved automated classification of subcortical brain structures on MRI. Neuroimage. (2016) 130:157–66. doi: 10.1016/j.neuroimage.2016.01.062
27. Liu HY, Lee PL, Chou KH, Lai KL, Wang YF, Chen SP, et al. The cerebellum is associated with 2-year prognosis in patients with high-frequency migraine. J Headache Pain. (2020) 21:29. doi: 10.1186/s10194-020-01096-4
28. Schaefer A, Kong R, Gordon EM, Laumann TO, Zuo XN, Holmes AJ, et al. Local-global parcellation of the human cerebral cortex from intrinsic functional connectivity MRI. Cereb Cortex. (2018) 28:3095–114. doi: 10.1093/cercor/bhx179
29. Diedrichsen J. A spatially unbiased atlas template of the human cerebellum. Neuroimage. (2006) 33:127–38. doi: 10.1016/j.neuroimage.2006.05.056
30. Lee PL, Chou KH, Lu CH, Chen HL, Tsai NW, Hsu AL, et al. Extraction of large-scale structural covariance networks from grey matter volume for Parkinson's disease classification. Eur Radiol. (2018) 28:3296–305. doi: 10.1007/s00330-018-5342-1
31. Kuo CY, Lee PL, Hung SC, Liu LK, Lee WJ, Chung CP, et al. Large-scale structural covariance networks predict age in middle-to-late adulthood: a novel brain aging biomarker. Cereb Cortex. (2020) 30:5844–62. doi: 10.1093/cercor/bhaa161
32. Hafkemeijer A, Moller C, Dopper EG, Jiskoot LC, van den Berg-Huysmans AA, van Swieten JC, et al. Differences in structural covariance brain networks between behavioral variant frontotemporal dementia and Alzheimer's disease. Hum Brain Mapp. (2016) 37:978–88. doi: 10.1002/hbm.23081
33. Niu X, Zhang F, Kounios J, Liang H. Improved prediction of brain age using multimodal neuroimaging data. Hum Brain Mapp. (2020) 41:1626–43. doi: 10.1002/hbm.24899
34. Hoerl AE, Kennard RW. Ridge regression: biased estimation for nonorthogonal problems. Technometrics. (1970) 12:55–67. doi: 10.1080/00401706.1970.10488634
35. Smola AJ, Schölkopf B. A tutorial on support vector regression. Stat Comput. (2004) 14:199–222. doi: 10.1023/B:STCO.0000035301.49549.88
36. Bennett KP, Campbell C. Support vector machines: hype or hallelujah? SIGKDD Explor. (2003) 2:1–13. doi: 10.1145/380995.380999
37. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O. Scikit-learn: machine learning in python. J Mach Learn Res. (2011) 12:2825–30.
38. Ambroise C, McLachlan GJ. Selection bias in gene extraction on the basis of microarray gene-expression data. Proc Natl Acad Sci U S A. (2002) 99:6562–6. doi: 10.1073/pnas.102102699
39. Varoquaux G, Raamana PR, Engemann DA, Hoyos-Idrobo A, Schwartz Y, Thirion B. Assessing and tuning brain decoders: cross-validation, caveats, and guidelines. Neuroimage. (2017) 145(Pt B):166–79. doi: 10.1016/j.neuroimage.2016.10.038
40. Abraham A, Pedregosa F, Eickenberg M, Gervais P, Mueller A, Kossaifi J, et al. Machine learning for neuroimaging with scikit-learn. Front Neuroinform. (2014) 8:14. doi: 10.3389/fninf.2014.00014
41. Galton F. Regression towards mediocrity in hereditary stature. J Anthropol Inst G B Irel. (1886) 15:246–63. doi: 10.2307/2841583
43. Ioffe S, Szegedy C. Batch normalization: accelerating deep network training by reducing internal covariate shift. Int Conf Mach Learn. (2015) 37:448–56.
44. Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res. (2014) 15:1929–58.
45. Ravi D, Alexander DC, Oxtoby NP. Degenerative Adversarial Neuroimage Nets: Generating Images That Mimic Disease Progression. Cham: Springer International Publishing (2019).
46. Shin H-C, Tenenholtz NA, Rogers JK, Schwarz CG, Senjem ML, Gunter JL, et al. Medical Image Synthesis for Data Augmentation and Anonymization Using Generative Adversarial Networks. Cham: Springer International Publishing (2018).
47. Li Q, Yu Z, Wang Y, Zheng H. TumorGAN: a multi-modal data augmentation framework for brain tumor segmentation. Sensors (Basel). (2020) 20:4203. doi: 10.3390/s20154203
48. Sutskever I, Martens J, Dahl G, Hinton G. On the importance of initialization and momentum in deep learning. Int Conf Mach Learn. (2013).
49. Khundrakpam BS, Tohka J, Evans AC, Brain Development Cooperative Group. Prediction of brain maturity based on cortical thickness at different spatial resolutions. Neuroimage. (2015) 111:350–9. doi: 10.1016/j.neuroimage.2015.02.046
50. Gennatas ED, Avants BB, Wolf DH, Satterthwaite TD, Ruparel K, Ciric R, et al. Age-related effects and sex differences in gray matter density, volume, mass, and cortical thickness from childhood to young adulthood. J Neurosci. (2017) 37:5065–73. doi: 10.1523/JNEUROSCI.3550-16.2017
51. Valizadeh SA, Hanggi J, Merillat S, Jancke L. Age prediction on the basis of brain anatomical measures. Hum Brain Mapp. (2017) 38:997–1008. doi: 10.1002/hbm.23434
52. Liem F, Varoquaux G, Kynast J, Beyer F, Kharabian Masouleh S, Huntenburg JM, et al. Predicting brain-age from multimodal imaging data captures cognitive impairment. Neuroimage. (2017) 148:179–88. doi: 10.1016/j.neuroimage.2016.11.005
53. Brown TT, Kuperman JM, Chung Y, Erhart M, McCabe C, Hagler DJ, et al. Neuroanatomical assessment of biological maturity. Curr Biol. (2012) 22:1693–8. doi: 10.1016/j.cub.2012.07.002
54. Cole JH. Multimodality neuroimaging brain-age in UK biobank: relationship to biomedical, lifestyle, cognitive factors. Neurobiol Aging. (2020) 92:34–42. doi: 10.1016/j.neurobiolaging.2020.03.014
55. Domingos P. A few useful things to know about machine learning. Commun ACM. (2012) 55:78–87. doi: 10.1145/2347736.2347755
56. Mechelli A, Friston KJ, Frackowiak RS, Price JC. Structural covariance in the human cortex. J Neurosci. (2005) 25:8303–10. doi: 10.1523/JNEUROSCI.0357-05.2005
57. Alexander-Bloch A, Giedd JN, Bullmore E. Imaging structural co-variance between human brain regions. Nat Rev Neurosci. (2013) 14:322–36. doi: 10.1038/nrn3465
58. Hafkemeijer A, Altmann-Schneider I, de Craen AJ, Slagboom PE, van der Grond J, Rombouts SA. Associations between age and gray matter volume in anatomical brain networks in middle-aged to older adults. Aging Cell. (2014) 13:1068–74. doi: 10.1111/acel.12271
59. Gupta CN, Calhoun VD, Rachakonda S, Chen J, Patel V, Liu J, et al. Patterns of gray matter abnormalities in schizophrenia based on an international mega-analysis. Schizophr Bull. (2015) 41:1133–42. doi: 10.1093/schbul/sbu177
60. Li M, Li X, Das TK, Deng W, Li Y, Zhao L, et al. Prognostic utility of multivariate morphometry in schizophrenia. Front Psychiatry. (2019) 10:245. doi: 10.3389/fpsyt.2019.00245
62. Peng H, Gong W, Beckmann CF, Vedaldi A, Smith SM. Accurate brain age prediction with lightweight deep neural networks. Med Image Anal. (2021) 68:101871. doi: 10.1016/j.media.2020.101871
63. Couvy-Duchesne B, Faouzi J, Martin B, Thibeau-Sutre E, Wild A, Ansart M, et al. Ensemble learning of convolutional neural network, support vector machine, and best linear unbiased predictor for brain age prediction: ARAMIS Contribution to the Predictive Analytics Competition 2019 Challenge. Front Psychiatry. (2020) 11:593336. doi: 10.3389/fpsyt.2020.593336
Keywords: structural MRI, neuroimaging, brain age, machine learning, ensemble deep learning, regularization
Citation: Kuo C-Y, Tai T-M, Lee P-L, Tseng C-W, Chen C-Y, Chen L-K, Lee C-K, Chou K-H, See S and Lin C-P (2021) Improving Individual Brain Age Prediction Using an Ensemble Deep Learning Framework. Front. Psychiatry 12:626677. doi: 10.3389/fpsyt.2021.626677
Received: 06 November 2020; Accepted: 22 February 2021;
Published: 23 March 2021.
Edited by:
James H. Cole, University College London, United Kingdom
Reviewed by:
Daniele Ravi, University of Hertfordshire, United Kingdom
Hongyoon Choi, Seoul National University Hospital, South Korea
Copyright © 2021 Kuo, Tai, Lee, Tseng, Chen, Chen, Lee, Chou, See and Lin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Kun-Hsien Chou, dargonchow@gmail.com
†These authors share co-first authorship