AUTHOR=Ni Zhihui , Zhu Yehao , Qian Yiwei , Li Xinbo , Xing Zhenqiu , Zhou Yinan , Chen Yu , Huang Lijie , Yang Jianjing , Zhuge Qichuan TITLE=Synthetic minority over-sampling technique-enhanced machine learning models for predicting recurrence of postoperative chronic subdural hematoma JOURNAL=Frontiers in Neurology VOLUME=15 YEAR=2024 URL=https://www.frontiersin.org/journals/neurology/articles/10.3389/fneur.2024.1305543 DOI=10.3389/fneur.2024.1305543 ISSN=1664-2295 ABSTRACT=Objective

Chronic subdural hematoma (CSDH) is a neurological condition with high recurrence rates, primarily observed in the elderly population. Although several risk factors have been identified, predicting CSDH recurrence remains a challenge. Given the potential of machine learning (ML) to extract meaningful insights from complex data sets, our study aims to develop and validate ML models capable of accurately predicting postoperative CSDH recurrence.

Methods

Data from 447 CSDH patients treated with consecutive burr-hole irrigations at the First Affiliated Hospital of Wenzhou Medical University between December 2014 and April 2019 were studied. Of these, 312 patients formed the development cohort and 135 formed the test cohort. The Least Absolute Shrinkage and Selection Operator (LASSO) method was used to select the features most strongly associated with recurrence. Eight machine learning algorithms were then used to build prediction models for hematoma recurrence from demographic, laboratory, and radiological features. The Borderline Synthetic Minority Over-sampling Technique (SMOTE) was applied to address class imbalance, and Shapley Additive Explanation (SHAP) analysis was used to improve model visualization and interpretability. Model performance was assessed using the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, F1 score, calibration plots, and decision curve analysis (DCA).
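The oversampling step can be illustrated with a minimal sketch of the Borderline-SMOTE idea, assuming Euclidean distance and the common "danger set" rule (a minority sample is borderline when at least half, but not all, of its k nearest neighbours belong to the majority class). The function names, the value of k, and the toy points are hypothetical; they are not the study's code or data, and the study's actual implementation and settings are not stated here.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(point, pool, k):
    """Return the k points in pool closest to point, skipping point itself.

    Exclusion is by object identity, so duplicate coordinates held in
    separate tuples are still treated as neighbours.
    """
    others = [p for p in pool if p is not point]
    return sorted(others, key=lambda p: euclidean(point, p))[:k]


def borderline_smote(minority, majority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples near the class boundary."""
    rng = random.Random(seed)
    labelled = [(p, 1) for p in minority] + [(p, 0) for p in majority]

    # Danger set: minority samples whose neighbourhood is mostly majority.
    danger = []
    for p in minority:
        neighbours = sorted(
            (q for q in labelled if q[0] is not p),
            key=lambda q: euclidean(p, q[0]),
        )[:k]
        n_majority = sum(1 for _, label in neighbours if label == 0)
        if k / 2 <= n_majority < k:
            danger.append(p)

    # Assumption: fall back to plain SMOTE when no sample is borderline.
    seeds = danger or minority

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(seeds)
        neighbour = rng.choice(nearest(base, minority, k))
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy two-feature data (hypothetical values, not patient data).
minority = [(1.0, 1.0), (1.2, 0.9), (2.0, 2.0), (2.1, 1.8)]
majority = [(2.2, 2.1), (2.4, 1.9), (3.0, 3.0), (3.1, 2.8), (2.0, 2.3), (3.5, 3.2)]

new_points = borderline_smote(minority, majority, n_new=5)
```

Because every synthetic point is an interpolation between two existing minority samples, the new points stay inside the range already covered by the minority class. Oversampling of this kind is normally applied to the development data only, so that the test cohort keeps its original class balance.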

Results

Our optimized ML models achieved prediction accuracies ranging from 61.0% to 86.2% for hematoma recurrence in the validation set. The Random Forest (RF) model outperformed the other algorithms, achieving an accuracy of 86.2%. SHAP analysis supported these results and highlighted key clinical predictors of CSDH recurrence risk, including age, alanine aminotransferase level, fibrinogen level, thrombin time, and maximum hematoma diameter. In the test dataset, the RF model achieved an accuracy of 92.6% and an AUROC of 0.834.
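As a reminder of how the reported metrics relate to one another, the following sketch computes accuracy, sensitivity, specificity, F1 score, and AUROC from labels and predicted scores. The labels, scores, and the 0.5 threshold are invented for illustration and are not the study's data; the AUROC is computed with the rank-based (Mann-Whitney) formulation, which is one standard way to obtain it.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and F1 from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    sensitivity = tp / (tp + fn)   # recall on recurrence cases
    specificity = tn / (tn + fp)   # recall on non-recurrence cases
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (Mann-Whitney formulation of the AUROC).
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical imbalanced toy sample: 3 recurrences among 10 patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = auroc(y_true, scores)
```

On this toy sample the accuracy is 0.8 while the sensitivity is only 2/3, which shows why accuracy alone can look favourable on imbalanced data and why it is reported alongside threshold-independent measures such as the AUROC.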

Conclusion

Our findings underscore the efficacy of machine learning algorithms, notably the integration of the RF model with SMOTE, in forecasting the recurrence of postoperative chronic subdural hematoma. Leveraging the RF model, we devised an online calculator that may serve as a pivotal instrument in tailoring therapeutic strategies and implementing timely preventive interventions for high-risk patients.