AUTHOR=Xu Jiangbao, Yuan Cuijie, Yu Guofeng, Li Hao, Dong Qiutong, Mao Dandan, Zhan Chengpeng, Yan Xinjiang
TITLE=Predicting cerebral edema in patients with spontaneous intracerebral hemorrhage using machine learning
JOURNAL=Frontiers in Neurology
VOLUME=15
YEAR=2024
URL=https://www.frontiersin.org/journals/neurology/articles/10.3389/fneur.2024.1419608
DOI=10.3389/fneur.2024.1419608
ISSN=1664-2295
ABSTRACT=
Background

The early prediction of cerebral edema changes in patients with spontaneous intracerebral hemorrhage (SICH) may facilitate earlier interventions and result in improved outcomes. This study aimed to develop and validate machine learning models to predict cerebral edema changes within 72 h, using readily available clinical parameters, and to identify relevant influencing factors.

Methods

An observational study was conducted between April 2021 and October 2023 at the Quzhou Affiliated Hospital of Wenzhou Medical University. After data preprocessing, the study population was randomly divided into training and internal validation cohorts in a 7:3 ratio (training: N = 150; validation: N = 65). The most relevant variables were selected using the Support Vector Machine Recursive Feature Elimination (SVM-RFE) and Least Absolute Shrinkage and Selection Operator (LASSO) algorithms. The predictive performance of the random forest (RF), gradient boosting decision tree (GDBT), linear regression (LR), and XGBoost models was evaluated using the area under the receiver operating characteristic curve (AUROC), the area under the precision–recall curve (AUPRC), accuracy, F1-score, precision, recall, sensitivity, and specificity. Feature importance was calculated, and the SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME) methods were used to explain the top-performing model.
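The workflow described above (7:3 split, feature selection, model fitting, AUROC/AUPRC evaluation) can be sketched roughly as follows. This is a minimal illustration, not the authors' code. It rests on several assumptions: the data are synthetic (`make_classification`), scikit-learn's `GradientBoostingClassifier` stands in for the GDBT model, `LassoCV` is fitted directly on the binary outcome, the two feature selections are combined by intersection, and all hyperparameters (such as `n_features_to_select=8`) are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LassoCV
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the clinical table (hypothetical data, not the study's).
X, y = make_classification(n_samples=215, n_features=20, n_informative=5,
                           weights=[0.61, 0.39], random_state=0)

# Hold out 65 of 215 samples, matching the reported 150/65 split.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=65, stratify=y, random_state=0)

# Fit the scaler on the training cohort only, to avoid leaking validation data.
scaler = StandardScaler().fit(X_train)
X_train_s, X_val_s = scaler.transform(X_train), scaler.transform(X_val)

# SVM-RFE: repeatedly drop the feature with the smallest linear-SVM weight.
svm_rfe = RFE(SVC(kernel="linear"), n_features_to_select=8).fit(X_train_s, y_train)
svm_keep = set(np.flatnonzero(svm_rfe.support_))

# LASSO: keep features whose coefficients are not shrunk to zero.
# Assumption: the 0/1 outcome is treated as a numeric target here.
lasso = LassoCV(cv=5, random_state=0).fit(X_train_s, y_train)
lasso_keep = set(np.flatnonzero(lasso.coef_))

# Assumption: combine by intersection, falling back to the union if it is empty.
selected = sorted(svm_keep & lasso_keep) or sorted(svm_keep | lasso_keep)

# Gradient boosting classifier as a stand-in for the GDBT model.
model = GradientBoostingClassifier(random_state=0)
model.fit(X_train_s[:, selected], y_train)
proba = model.predict_proba(X_val_s[:, selected])[:, 1]

auroc = roc_auc_score(y_val, proba)
auprc = average_precision_score(y_val, proba)
print(f"features kept: {len(selected)}, AUROC={auroc:.3f}, AUPRC={auprc:.3f}")
```

The printed metrics describe only the synthetic data and have no bearing on the study's results. Performing scaling and feature selection on the training cohort alone keeps the validation estimate free of information leakage.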

Results

A total of 84 patients (39.1%) developed cerebral edema changes. In the validation cohort, GDBT outperformed LR and RF, achieving an AUC of 0.654 (95% CI: 0.611–0.699), compared with 0.578 for LR (95% CI: 0.535–0.623; DeLong p = 0.197) and 0.624 for RF (95% CI: 0.588–0.687; DeLong p = 0.236). XGBoost performed similarly, with an AUC of 0.660 (95% CI: 0.611–0.711; DeLong p = 0.963). In the training set, however, GDBT outperformed XGBoost, with an AUC of 0.603 ± 0.100 versus 0.575 ± 0.096. SHAP analysis revealed that serum sodium, HDL, subarachnoid hemorrhage volume, sex, and left basal ganglia hemorrhage volume were the five most important features for predicting cerebral edema changes in the GDBT model.
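The abstract does not state how the 95% confidence intervals for the AUC were obtained. One common approach is a percentile bootstrap over the validation cohort, sketched below with the standard library only. The toy labels and scores are hypothetical, the helper names `auc` and `bootstrap_ci` are illustrative, and the DeLong comparison between models is not implemented here.

```python
import random

def auc(labels, scores):
    """AUROC as the chance a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUROC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical validation cohort: 65 cases, roughly 39% positive, weakly informative scores.
rng = random.Random(1)
labels = [1 if rng.random() < 0.39 else 0 for _ in range(65)]
scores = [0.3 * l + rng.random() for l in labels]

point = auc(labels, scores)
lo, hi = bootstrap_ci(labels, scores)
print(f"AUC = {point:.3f} (95% CI: {lo:.3f}-{hi:.3f})")
```

With a validation cohort of only 65 patients, intervals from a resampling scheme like this tend to be wide, which is worth keeping in mind when comparing models whose AUCs differ by a few hundredths.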

Conclusion

The GDBT model demonstrated the best performance in predicting 72-h changes in cerebral edema. It has the potential to assist clinicians in identifying high-risk patients and guiding clinical decision-making.