Machine learning (ML) is widely used to build high-performance prediction models. This study aimed to develop a preoperative ML-based model to predict functional recovery one year after hip fracture surgery.
We collected data from 176 elderly hip fracture patients who met the inclusion criteria and were admitted to the Department of Orthopaedics and Oncology at Shenzhen Second People's Hospital between May 2019 and December 2019. Patients' functional recovery was monitored for one year after surgery. We selected 26 candidate factors: 12 preoperative indicators, 8 surgical indicators, and 6 postoperative indicators. After the exclusion criteria were applied, 77 patients were included in the analysis. They were randomly allocated to a training set (70%) and a test set (30%) for internal validation. The Lasso method was used to screen prognostic variables. We compared several common machine learning classifiers to determine the best prediction model. Prediction performance was evaluated using the area under the receiver operating characteristic curve (AUC), calibration curves, and decision curve analysis. To rank the importance of the predictor variables, we applied the recursive feature elimination (RFE) algorithm based on Shapley Additive Explanations (SHAP) values.
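As an illustration of two of the steps described above, the random 70/30 allocation and the AUC computation can be sketched in plain Python. This is a minimal sketch, not the study's code: the function names `train_test_split_indices` and `roc_auc`, the seed, the rounding rule for the test-set size, and the toy labels and scores are all assumptions introduced here, and no patient data are involved.

```python
import random


def train_test_split_indices(n, test_fraction=0.3, seed=0):
    """Randomly split the indices 0..n-1 into (train, test) lists.

    Assumption: the test-set size is round(n * test_fraction); the study
    does not state how the 70/30 split was rounded.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def roc_auc(labels, scores):
    """Area under the ROC curve via its rank interpretation.

    The AUC equals the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case, with
    ties counted as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical allocation of 77 patient indices (seed chosen arbitrarily).
train_idx, test_idx = train_test_split_indices(77, test_fraction=0.3, seed=42)

# Toy labels (1 = recovered) and predicted scores, invented for illustration.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)  # 8 of 9 positive/negative pairs are ordered correctly
```

With a test set this small, each AUC rests on a limited number of positive/negative pairs, so the estimate is sensitive to how the random allocation falls.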
The AUCs on the test set were as follows: logistic regression (Logit) = 0.934, k-nearest neighbors (KNN) = 0.930, support vector machine (SVM) = 0.910, Gaussian naive Bayes (GNB) = 0.926, decision tree (DT) = 0.730, random forest (RF) = 0.957, and Extreme Gradient Boosting (XGB) = 0.902. Among the seven ML-based models tested, the RF model showed the best prediction performance. It incorporated four features: postoperative rehabilitation compliance, marital status, age-adjusted Charlson comorbidity index (aCCI), and clinical frailty scale (CFS).
We developed a prediction model, based on the random forest (RF) algorithm, for functional recovery one year after hip fracture surgery in elderly patients. This model achieved a higher test-set AUC than the other six models evaluated. The software application is available for use. External validation in a larger patient cohort or in diverse hospital settings is necessary to assess the clinical utility of this tool.