Early neurological deterioration (END) is a frequent complication in patients with perforating artery territory infarction (PAI) and leads to poorer outcomes. We therefore aimed to apply machine learning (ML) algorithms to predict the occurrence of END in PAI and to investigate related risk factors.
This retrospective study analyzed a cohort of PAI patients, excluding those with severe stenosis of the parent artery. Candidate predictors comprised demographic characteristics, clinical features, laboratory data, and imaging variables. Recursive feature elimination with cross-validation (RFECV) was performed to identify critical features. Seven ML algorithms, namely logistic regression, random forest, adaptive boosting, gradient boosting decision tree, histogram-based gradient boosting, extreme gradient boosting, and category boosting, were trained on these critical features to predict END in PAI patients, and their predictive performance was compared. Additionally, SHapley Additive exPlanations (SHAP) values were used to interpret the best-performing model and to assess the importance of the input features.
The study enrolled 1,020 PAI patients with a mean age of 60.46 ± 11.35 years (49.11–71.81). Of these, 30.39% were women, and 129 (12.65%) experienced END. RFECV selected 13 critical features: blood urea nitrogen (BUN), total cholesterol (TC), low-density-lipoprotein cholesterol (LDL-C), apolipoprotein B (apoB), atrial fibrillation, loading dual antiplatelet therapy (DAPT), single antiplatelet therapy (SAPT), argatroban, the basal ganglia, the thalamus, the posterior choroidal arteries, maximal axial infarct diameter (< 15 mm), and stroke subtype. Among the seven ML algorithms, the gradient boosting decision tree achieved the highest area under the curve (AUC, 0.914). The SHAP analysis identified apoB as the most important variable for END.
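To make the idea behind a SHAP ranking concrete, the sketch below computes exact Shapley values by brute force for a toy three-feature scoring function. Everything in it is hypothetical: `toy_risk`, its coefficients, the instance `x`, and the all-zero `baseline` are invented for illustration and are unrelated to the study's model or data. The SHAP library itself relies on much faster model-specific algorithms rather than this enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for instance x relative to a baseline.

    Features absent from a coalition take their baseline value. The cost
    grows exponentially with the number of features, so this is only
    practical for a handful of inputs.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                without_i = [x[j] if j in present else baseline[j]
                             for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_risk(features):
    """Hypothetical risk score with one interaction term (illustration only)."""
    a, b, c = features
    return 0.5 * a + 0.2 * b + 0.1 * a * c


x = [2.0, 1.0, 1.0]         # hypothetical patient
baseline = [0.0, 0.0, 0.0]  # hypothetical reference point

phi = shapley_values(toy_risk, x, baseline)
ranking = sorted(range(len(phi)), key=lambda i: -abs(phi[i]))

print("Shapley values:", [round(v, 3) for v in phi])
print("features ranked by |value|:", ranking)
```

The values sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that lets SHAP attribute a single prediction to its inputs. Ranking features by mean absolute value across many instances gives the kind of global importance ordering reported above.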
Our results suggest that ML algorithms, especially the gradient boosting decision tree, can effectively predict the occurrence of END in PAI patients.