
ORIGINAL RESEARCH article
Front. Endocrinol., 25 March 2025
Sec. Clinical Diabetes
Volume 16 - 2025 | https://doi.org/10.3389/fendo.2025.1526098
Background: Diabetic foot ulcers (DFUs) constitute a significant complication among individuals with diabetes and serve as a primary cause of nontraumatic lower-extremity amputation (LEA) within this population. We aimed to develop machine learning (ML) models to predict the risk of LEA in DFU patients and used SHapley additive explanations (SHAPs) to interpret the model.
Methods: In this retrospective study, data from 1,035 patients with DFUs at Sun Yat-sen Memorial Hospital were utilized as the training cohort to develop the ML models. Data from 297 patients across multiple tertiary centers were used for external validation. We then used least absolute shrinkage and selection operator analysis to identify predictors of amputation. We developed five ML models [logistic regression (LR), support vector machine (SVM), random forest (RF), k-nearest neighbors (KNN) and extreme gradient boosting (XGBoost)] to predict LEA in DFU patients. The performance of these models was evaluated using several metrics, including the area under the receiver operating characteristic curve (AUC), decision curve analysis (DCA), precision, recall, accuracy, and F1 score. Finally, the SHAP method was used to ascertain the significance of the features and to interpret the model.
Results: In the final cohort comprising 1332 individuals, 600 patients underwent amputation. Following hyperparameter optimization, the XGBoost model achieved the best amputation prediction performance with an accuracy of 0.94, a precision of 0.96, an F1 score of 0.94 and an AUC of 0.93 for the internal validation set on the basis of the 17 features. For the external validation set, the model attained an accuracy of 0.78, a precision of 0.93, an F1 score of 0.78, and an AUC of 0.83. Through SHAP analysis, we identified white blood cell counts, lymphocyte counts, and blood urea nitrogen levels as the model’s main predictors.
Conclusion: The XGBoost algorithm-based prediction model can be used to dynamically estimate the risk of LEA in DFU patients, making it a valuable tool for preventing the progression of DFUs to amputation.
At present, more than 550 million people are diagnosed with type 2 diabetes mellitus (T2DM) globally, and the prevalence of T2DM continues to increase (1). Projections indicate that by 2045, the number of individuals diagnosed with diabetes worldwide will increase to 700 million (1). Moreover, advancements in medical treatment have substantially prolonged the life expectancy of individuals with diabetes, leading to a notable increase in the prevalence of chronic diabetic complications (2). Among the myriad of diabetic complications, diabetic foot ulcers (DFUs) constitute a particularly severe and prevalent issue. DFUs are distinguished not only by their notably high mortality rate but also by their substantial contribution to approximately 85% of nontraumatic amputations worldwide (3). A previous study indicated that patients with DFUs perceive the risk of lower-extremity amputation (LEA) as a more significant concern than mortality throughout the progression of the disease (4). This phenomenon is attributable to the significant effects of LEA on patients’ physical and psychological well-being, leading to prolonged hospitalization, considerable financial strain, intricate treatment protocols, and a significantly diminished quality of life. Moreover, patients with DFUs who have undergone LEA have a poor prognosis, with a 3-year mortality rate of 35–50% (5) and a 5-year mortality rate of 52–80% (6). Consequently, conducting personalized assessments for patients with DFUs to evaluate their risk of amputation and identify associated risk factors can provide essential insights for early intervention treatments. The results of this analysis are expected to be useful for reducing the incidence of amputation surgeries, decreasing patient mortality rates, and lowering healthcare costs.
Presently, widely utilized classification systems for DFU, such as the Diabetic Ulcer Severity Score, the Meggitt–Wagner classification, and the University of Texas Diabetic Wound Classification, serve as standard instruments for informing treatment strategies and predicting the risk of disease progression in patients with DFUs (7–9). Although these classification systems have the potential to predict amputation risk in patients, they have not been universally adopted as the gold standard (10). This limitation stems primarily from practitioners’ reliance on clinical experience rather than objective statistical data for scoring. Furthermore, these systems cannot be used to integrate demographic information, clinical and laboratory data, medical history, foot condition, and other pertinent risk factors comprehensively (11, 12). This limitation has led to diminished sensitivity and specificity in predicting amputation risk among patients with DFUs.
DFUs present substantial complexity owing to the clinical heterogeneity observed among patients and the multimodal data obtained from various disciplines, such as imaging, surgery, and endocrinology. To elucidate the complexity of DFUs, it is imperative to use advanced analytical methodologies, such as machine learning (ML) and artificial intelligence (AI) (13). These data analysis techniques are used to develop algorithms for predicting outcomes by “learning” from data (14). Through the utilization of ML, clinical physicians can now predict the healing trajectories of DFUs, assess the risk of amputation, and develop personalized treatment plans on the basis of clinical data. Several studies have investigated the application of ML techniques in predicting diabetic foot amputations (15). However, the sample sizes in some studies are relatively small, which may limit their representativeness of the population (16–18). Moreover, several of these small-sample studies employ only a single type of ML algorithm for model development (18, 19). Consequently, there is an urgent need for the development of more sophisticated and advanced models capable of effectively addressing the heterogeneity observed in patients with DFUs. While various ML algorithms have been employed in several studies, their complex nature might limit their interpretation by patients and clinicians in real-world clinical settings (16, 17). The “black-box” nature of traditional ML algorithms poses challenges in explaining the specific patient characteristics that contribute to a particular prediction. The limited interpretability of ML methods constrains their application in medical decision support and is one of the significant barriers to their implementation in real-world clinical settings (20). To overcome these limitations, our study combined the ML algorithm with SHapley Additive exPlanations (SHAPs) (21).
In addition to enhancing the precision of amputation risk prediction in patients with DFUs, SHAP provides intuitive explanations that empower patients to understand their own risk factors. It can aid clinicians in comprehending the decision-making process for evaluating disease severity and optimizing opportunities for early intervention, while also contributing to the development of interpretable and personalized risk prediction models.
In short, the small sample sizes of amputee patients, coupled with the limited interpretability of models, constrain the application of prior ML models in medical decision support systems. Thus, this study aimed to utilize data from multiple medical centers concerning patients with DFUs to develop and evaluate various ML models, ultimately identifying the best model for predicting the risk of amputation during hospitalization. Furthermore, SHAPs were used to visualize the optimal model and investigate the factors influencing the prognosis of DFUs. The objective was to equip healthcare providers with a concise and valuable instrument for identifying DFU patients at risk of amputation, enabling effective interventions to optimize clinical outcomes and improve the quality of life for these patients.
In this retrospective cohort study, we developed a series of ML models to predict the risk of amputation in patients with DFUs. This study involved the development, validation and subsequent interpretation of the models. Following the preprocessing of the data and the selection of relevant variables, models were constructed utilizing 5 distinct ML algorithms. The model performances were subsequently evaluated via both internal and external validation datasets to identify the optimal model. Finally, SHAP was used to elucidate the optimal model. Figure 1 illustrates the comprehensive research process, including the criteria for inclusion and exclusion, data preprocessing, feature selection, dataset partitioning, model development and validation, model comparison, and selection and interpretation of the optimal model.
Figure 1. Workflow for constructing explainable machine learning models for predicting the risk of amputation in diabetic foot ulcer patients. DFU, diabetic foot ulcer; T2DM, type 2 diabetes mellitus; KNN, k-nearest neighbors; LR, logistic regression; SVM, support vector machine; RF, random forest; XGBoost, extreme gradient boosting.
Data for the training cohort were derived from the clinical records of patients with DFUs (Wagner grades 1–5) (22) admitted to the Endocrinology Department at Sun Yat-sen Memorial Hospital from January 2015 to October 2023. The inclusion criteria were as follows: (1) a confirmed diagnosis of T2DM and (2) over the age of 18. The exclusion criteria included: (1) diabetes types other than T2DM; (2) refusal or discontinuation of treatment; (3) the presence of malignant tumors; and (4) significant deficiencies in clinical examination data or ambiguity in clinical outcomes. On the basis of these criteria, a total of 1035 T2DM patients with DFUs were included in the training cohort.
With data collected between January 2020 and October 2023, the external validation cohort consisted of patients with T2DM and DFUs from Shantou Central Hospital, Dongguan People’s Hospital, Jieyang People’s Hospital and Shenzhen Central Hospital. The inclusion and exclusion criteria were consistent with those of the training cohort. Finally, the external validation cohort included a total of 297 patients with T2DM complicated by DFUs. This study received approval from the Ethics Committee of Sun Yat-sen Memorial Hospital. Given that all patient data were anonymized and that the study did not influence clinical decision-making, the requirements for individual patient consent and an ethical informed consent statement were waived. The baseline characteristics of the demographic and clinical variables for the training cohort and external validation cohorts are detailed in Table 1 and Supplementary Table 1.
The outcome of our study was amputation, which included both minor and major amputations (any LEA). The term major amputation refers to amputations above the ankle, and minor amputation refers to any amputation below the ankle.
Drawing upon contemporary research and clinical guidelines, we selected 62 potential predictive factors that may influence the risk of lower-limb amputation in patients with DFUs. The variables selected for this study included the following demographic and clinical characteristics: age, weight, height, body mass index (BMI), Wagner grade, smoking history, alcohol consumption history, and history of previous ulcers. Four comorbidities were considered: hypertension, history of cardiovascular disease, diabetic nephropathy (DN), and diabetic peripheral neuropathy (PND). Additionally, we incorporated lower-limb vascular imaging examinations, which were conducted to assess vascular occlusion, vascular calcification, and arteriosclerosis. Furthermore, a total of 39 laboratory indicators were selected for the study, including D-dimer, C-reactive protein (CRP), the neutrophil count, hemoglobin (Hb), glycated Hb (HbA1c), triglycerides (TGs), low-density lipoprotein (LDL), albumin (ALB), blood urea nitrogen (BUN), and creatinine (Cr), among others.
Following the selection of study variables, data were extracted from the health information systems (HISs) of various hospitals. Indicators with more than 20% missing data were subsequently excluded from the analysis. Finally, a total of 55 candidate variables were selected within the training cohort. For variables with missing values constituting less than 5% of the data, imputation was performed using the median value. In cases where the proportion of missing data exceeded 5%, multiple imputation via random forest (RF) methodology was applied to address the missing values (23). To identify the most predictive factors for amputation and reduce the possibility of overfitting among the included variables, we used least absolute shrinkage and selection operator (LASSO) regression on the entire dataset of the training cohort. This approach facilitated the elimination of confounding variables, thereby enhancing model performance and mitigating the risk of overfitting. In LASSO regression, the coefficient estimates were regularized towards zero, with the degree of shrinkage being governed by an additional parameter, denoted as λ. To calculate the best possible values for λ, we used 10-fold cross-validation and iteratively applied LASSO regression to each fold. We subsequently identified the optimal tuning parameter (min λ) for the model by minimizing the cross-validation error and validated the model selection parameters to select the optimal predictive variable (24). After a thorough evaluation of the model features and their performance, we adopted the largest λ whose cross-validation error lay within one standard error of the minimum (λ-1se) as the final parameter for the LASSO model.
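The missing-data rules described above can be sketched in a few lines of Python. This is only an illustration of the thresholds stated in the text (exclude above 20% missing, median imputation below 5%, model-based imputation otherwise); the function names, the toy columns, and their missing fractions are hypothetical, and the treatment of a variable with exactly 5% missing values is an assumption because the text does not specify it. The random-forest imputation step itself is not implemented here.

```python
from statistics import median


def missing_fraction(values):
    """Fraction of entries that are missing (represented here as None)."""
    return sum(v is None for v in values) / len(values)


def plan_imputation(columns, drop_above=0.20, median_below=0.05):
    """Assign each variable a handling strategy from its share of missing values."""
    plan = {}
    for name, values in columns.items():
        frac = missing_fraction(values)
        if frac > drop_above:
            plan[name] = "drop"          # excluded from the analysis
        elif frac < median_below:
            plan[name] = "median"        # simple median imputation
        else:
            # Assumption: everything from 5% to 20% goes to model-based
            # imputation (random forest in the study; not implemented here).
            plan[name] = "model_based"
    return plan


def impute_median(values):
    """Replace missing entries with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical toy columns; the missing fractions are invented for illustration.
columns = {
    "ALB": [30.0] * 98 + [None] * 2,   # 2% missing
    "PCT": [0.5] * 90 + [None] * 10,   # 10% missing
    "ESR": [40.0] * 70 + [None] * 30,  # 30% missing
}
plan = plan_imputation(columns)
print(plan)
```

Separating the planning step from the imputation step keeps the thresholds in one place, so they can be changed without touching the imputation code.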
As mentioned above, we sequentially undertook the development, validation, interpretation, and application of ML models in the current study. Initially, the patients in the training cohort were randomly partitioned, with 70% allocated to the training set and 30% to the internal validation set. We used five ML algorithms [extreme gradient boosting (XGBoost), support vector machine (SVM), RF, k-nearest neighbors (KNN) and logistic regression (LR)] to develop predictive models (25). In the training set, model hyperparameters were optimized to reduce overfitting and improve accuracy via GridSearch with tenfold cross-validation. We subsequently constructed receiver operating characteristic (ROC) curves and decision curve analysis (DCA) curves via the internal validation set. We then assessed the predictive performance of various ML models by calculating the accuracy, area under the curve (AUC), recall rate, precision rate, and F1 score (26, 27). The performance of the five models was further validated using an external dataset, employing the same methodological approach. We evaluated model performance via AUCs and F1 scores as the principal metrics and selected the optimal model on the basis of these metrics. To improve the interpretability of the ML model outcomes and analyze the contributing factors, we used SHAP to evaluate the feature importance of the optimal model.
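For reference, the threshold-based metrics named above follow directly from the confusion matrix. The sketch below uses only the standard definitions; the function name and the toy labels are hypothetical and unrelated to the study data.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = amputation)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```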
The normality of continuous variables was assessed via the Kolmogorov–Smirnov (K–S) test. Continuous variables with normal distributions are reported as the means (standard deviations, SDs) and were compared via independent samples t tests. Conversely, variables that did not follow a normal distribution are presented as the median (interquartile range) and were compared via the Kruskal–Wallis test. Categorical variables are presented as frequencies and percentages (n, %). Comparative analyses between the amputee and nonamputee groups were conducted via Student’s t test, the Mann–Whitney U test, or the chi–square test, contingent upon the distribution of the variables. A p value of less than 0.05 was considered to indicate statistical significance. Data preprocessing, model construction, validation, and interpretation of the ML models were executed via R Studio version 4.2 and Python (v. 3.8.3).
As depicted in Figure 2, our study included a total of 1,332 patients diagnosed with DFUs. Among these, 1,035 patients were allocated to the training cohort, whereas 297 patients were assigned to the external validation cohort. The patients were categorized into two distinct groups on the basis of their posttreatment amputation status. In the training cohort, 45.6% (472/1035) of the patients underwent amputation, whereas in the external validation cohort, the incidence of amputation was 43.1% (128/297). Our data suggest that patients who underwent amputation presented elevated levels of inflammatory markers, including WBC and PCT, alongside increased serum Cr and BUN levels. Furthermore, these patients had increased levels of platelets (PLTs), D-dimers, and fibrinogen. Conversely, the levels of Hb, serum ALB, serum globulin (GLB), and lipids (including TGs, HDL, and LDL) were significantly lower in amputee patients than in nonamputee patients. The baseline characteristics of all the candidate variables for the training cohort are detailed in Table 1.
To further reduce data dimensionality, we utilized LASSO regression analysis to identify and select relevant features from the training cohort. The LASSO regression model incorporated a total of 55 variables, and the plot of the coefficients for this analysis is shown in Figure 3A. Each curve represents one variable. For each value of λ, the variables and their corresponding nonzero coefficients form a LASSO model. We subsequently used 10-fold cross-validation to analyze and determine the optimal LASSO regression parameters (28). When λ = 0.0058, the cross-validation error of the model is minimized. Nevertheless, to further reduce the number of variables included in the model, we opted for the largest λ whose cross-validation error lay within one standard error of the minimum (λ-1se; λ = 0.0111) as the final parameter for the LASSO regression analysis (Figure 3B) (29). Ultimately, seventeen variables were identified as predictive factors for amputation in the ML model. The regression coefficients of these variables are depicted in Figure 3C.
Figure 3. LASSO regression analysis was used to select potential variables. A total of 55 variables were initially included, and 17 variables were ultimately selected for further analysis. (A) LASSO coefficient analysis of the clinical features. (B) Tuning parameter selection in the LASSO regression model from 10-fold cross-validation. (C) Plot of the LASSO coefficient of the 17 candidate predictors for amputation. WBC, white blood cell; BUN, blood urea nitrogen; PLT, platelet; TG, triglyceride; Pct, procalcitonin; LDL, low-density lipoprotein; UA, uric acid; HGB, hemoglobin; ALB, albumin; GLB, globulin; LYM, lymphocyte.
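The choice between the error-minimizing λ and the more parsimonious one-standard-error λ can be illustrated with a short sketch. It assumes the usual one-standard-error rule (the largest λ whose mean cross-validation error stays within one standard error of the minimum). The two λ values of interest are taken from the text, but the remaining grid values and all cross-validation errors and standard errors are invented for illustration, and the function name is hypothetical.

```python
def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from cross-validation summaries."""
    # Index of the lambda with the lowest mean cross-validation error.
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    # Largest (most strongly regularizing) lambda whose mean error
    # does not exceed the minimum by more than one standard error.
    within = [lam for lam, err in zip(lambdas, cv_mean) if err <= threshold]
    return lambdas[i_min], max(within)


# Hypothetical cross-validation results (errors and SEs are made up).
lambdas = [0.0010, 0.0058, 0.0111, 0.0200, 0.0500]
cv_mean = [0.63, 0.60, 0.61, 0.64, 0.75]
cv_se = [0.02, 0.02, 0.02, 0.02, 0.02]

lam_min, lam_1se = select_lambdas(lambdas, cv_mean, cv_se)
print(lam_min, lam_1se)
```

A larger λ shrinks more coefficients to exactly zero, which is why the one-standard-error choice yields a smaller variable set at a small cost in cross-validated error.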
In the model development and validation phase, using GridSearchCV from the sklearn library, we initially identified the optimal hyperparameters for five distinct ML models. Comprehensive details regarding the hyperparameters of the ML models are presented in Table 2. The final models were subsequently trained using the optimized hyperparameters and the 17 variables selected through LASSO regression as input features. The five ML models (KNN, LR, SVM, RF, and XGBoost) demonstrated robust discriminative capabilities, as evidenced by their AUCs (95% CI) of 0.94 (0.91-0.97), 0.93 (0.90-0.96), 0.93 (0.90-0.96), 0.95 (0.92-0.98), and 0.93 (0.90-0.96), respectively, in the internal validation set. The ROC curves of the five ML models evaluated on the internal validation dataset are displayed in Figure 4A. Furthermore, the F1 score was also chosen to compare model performance, as it effectively measures accuracy in imbalanced datasets by harmonizing precision and recall (Table 2). Moreover, the DCA is presented in Figure 4B to represent the models’ clinical utility.
Table 2. Hyperparameters of the ML models and comparison of performance among the five models in the internal validation cohort.
Figure 4. ROC curve and DCA comparison of the 5 models in the internal validation set. (A) ROC curves of five ML models for predicting amputation in DFU patients. (B) DCA for the 5 models in the internal validation set. AUC, area under the receiver operating characteristic curve; KNN, k-nearest neighbors; LR, logistic regression; SVM, support vector machine; RF, random forest; XGBoost, extreme gradient boosting.
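The quantity plotted in a decision curve is the net benefit at a given threshold probability. A minimal sketch, assuming the standard net-benefit definition (true-positive rate per patient minus false-positive rate per patient weighted by the odds of the threshold), is given below; the function names and the toy data are hypothetical and do not reproduce the study's curves.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy outcomes and predicted risks for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2, 0.2, 0.1, 0.1, 0.1]

nb_model = net_benefit(y_true, y_prob, threshold=0.5)
nb_all = net_benefit_treat_all(y_true, threshold=0.5)
print(nb_model, nb_all)
```

A model is considered clinically useful at a threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy).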
To validate the model in different patients, we created an external validation cohort and rigorously assessed the model’s performance. The external cohort was recruited from four hospitals between 2020 and 2023. The baseline characteristics of the 17 variables selected as input features for the ML models in the external validation cohorts are detailed in Additional File 1. In the external validation cohort, the AUC values (95% CIs) for the KNN, LR, SVM, RF, and XGBoost models were 0.77 (0.73, 0.82), 0.80 (0.75, 0.85), 0.81 (0.76, 0.86), 0.83 (0.78, 0.87), and 0.84 (0.80, 0.90), respectively (Figure 5A). Table 3 summarizes the performance metrics for the five models, including the AUC, accuracy, precision, recall, and F1 score. In addition, Figure 5B presents the DCA in the external validation cohort, illustrating the models’ clinical utility.
Figure 5. ROC curve and DCA comparison of the 5 models in the external validation set. (A) ROC curves of 5 ML models for predicting amputation in DFU patients within the external validation set. (B) DCA for the 5 models in the external validation set. AUC, area under the receiver operating characteristic curve; KNN, k-nearest neighbors; LR, logistic regression; SVM, support vector machine; RF, random forest; XGBoost, extreme gradient boosting.
Table 3. Comparison of the performance of the five machine learning models within the external validation cohort.
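An AUC with a 95% confidence interval can be obtained in several ways, and the text does not state which method was used. The sketch below shows one common option, a percentile bootstrap, purely as an assumption; the function names, the seed, and the toy data are hypothetical.

```python
import random


def auc(y_true, scores):
    """AUC as the probability that a positive case outranks a negative case."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # Ties between a positive and a negative score count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes to define an AUC
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2, 0.2, 0.1, 0.1, 0.1]

point = auc(y_true, scores)
lower, upper = bootstrap_auc_ci(y_true, scores)
print(point, lower, upper)
```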
Compared with the other models, XGBoost consistently demonstrated superior AUCs and F1 scores when evaluated with external validation cohorts. Furthermore, it maintained a high recall rate in the external validation set, thereby demonstrating the model’s ability to accurately identify patients at risk of amputation among those suffering from DFUs. Consequently, subsequent analysis was conducted via the XGBoost model.
The SHAP algorithm was used to assess the importance of each predictive feature in relation to the amputation results, thereby providing further insight into the prediction mechanism used by the XGBoost model. Using SHAP, we assessed the global importance of all 17 features across the entire dataset of training cohorts to elucidate their overall impacts. The results of this analysis are depicted in a summary plot in Figure 6. In the SHAP summary plot, positive Shapley values for each feature signify an elevated risk of amputation, whereas negative values denote a diminished risk of amputation. Correspondingly, the colors depicted in the figure represent the magnitude of feature values, with red indicating high feature values and blue denoting low feature values. For example, patients with poor nutritional status (characterized by decreased levels of ALB, LDL, GLB, and TGs) during treatment are more likely to require amputation than those with elevated levels of these biomarkers. Additionally, patients with elevated white blood cell counts or decreased lymphocyte and Hb levels during treatment have a greater risk of amputation than do those without these hematological abnormalities.
Figure 6. SHAP summary plot of the 17 features of the XGBoost model. For each patient, a dot is generated corresponding to the attribution value of each feature in the model, resulting in one dot per feature per patient on the line. Each line represents a feature, and the abscissa is the SHAP value. The higher the SHAP value of a feature is, the greater the probability of an amputation event. The dots are colored according to the patient’s feature values and are accumulated vertically to describe the density. Red represents a high feature value, whereas blue represents a low feature value. WBC, white blood cell; BUN, blood urea nitrogen; PLT, platelet; TG, triglyceride; Pct, procalcitonin; LDL, low-density lipoprotein; UA, uric acid; HGB, hemoglobin; ALB, albumin; GLB, globulin; LYM, lymphocyte.
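The Shapley value underlying each dot can be illustrated by brute force on a tiny model. The sketch below enumerates all feature subsets and replaces "absent" features with a single baseline value; this is a simplification chosen for illustration, not the tree-specific algorithm that SHAP applies to XGBoost models. The toy scoring function, its coefficients, and the input values are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values by enumerating every subset of the other features."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the patient's values; all others are
        # set to the baseline (an illustrative simplification).
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_score(z):
    """Hypothetical risk score with one interaction term (not the study model)."""
    return 0.5 * z[0] - 0.3 * z[1] + 0.2 * z[0] * z[2]


x = [2.0, 1.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for this input and the prediction at the baseline, which is the additivity property that force plots rely on.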
In Figure 7, we demonstrate the application of the SHAP method in elucidating individual model predictions, providing an intuitive framework for guiding clinicians’ decision-making processes and deepening their understanding of the model’s predictive mechanisms. The force plots begin with the average of all predictions as their base value. Each predictor, along with its corresponding Shapley value, is depicted by an arrow that either increases (indicated in red) or decreases (indicated in blue) the model’s predicted value. The feature values are listed at the top of the plot. Finally, the convergence points of the red and blue arrows represent the predicted output values of the model. In Figure 7A, the patient was diagnosed with a Wagner stage-III DFU. The patient’s lymphocyte count (1.63×10⁹/L), WBC count (4.78×10⁹/L), BUN level (3.4 mmol/L), GLB level (40.6 g/L), and fasting blood glucose level (9.7 mmol/L) were critical parameters for accurately predicting the likelihood of avoiding amputation. However, the patient’s ALB level of 21.6 g/L and Wagner grade were inversely correlated with the prediction of amputation. On the basis of the prediction model, the outcome depicted in Figure 7A, where f(x) = −4.33, suggests a high probability of nonamputation. In contrast, the outcomes presented in Figures 7B, C, with f(x) values of 3.69 and 0.518, respectively, indicate a relatively high likelihood of amputation.
Figure 7. Force plot of model prediction results suggested for three randomly selected samples via SHAP values. The f(x) value represents the output values. The feature values are listed at the top of the plot with feature names. Each group of features was ranked from the center to both ends according to the extent of their impact. The length of the bar for each feature reflects the weight of that feature in the prediction. Factors that increased the predicted score are colored red, and those that decreased the predicted score are shown in blue. (A) A patient who did not undergo amputation; (B) A patient who underwent amputation; (C) Presentation of a patient without amputation via the SHAP method. WBC, white blood cell; BUN, blood urea nitrogen; PLT, platelet; FastingBG, fasting blood glucose; Pct, procalcitonin; LDL, low-density lipoprotein; UA, uric acid; HGB, hemoglobin; ALB, albumin; GLB, globulin; LYM, lymphocyte.
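The f(x) values in the force plots can be read as probabilities only after a transformation. The sketch below assumes that f(x) is reported on the log-odds scale, which is the usual raw output for a tree-based binary classifier explained with SHAP but is not stated explicitly in the text; under that assumption, the logistic function converts each value to a predicted probability of amputation. The dictionary name and panel labels are only a convenience for this example.

```python
import math


def sigmoid(log_odds):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# f(x) values quoted for the three force plots.
# Assumption: they are on the log-odds scale.
force_plot_outputs = {"7A": -4.33, "7B": 3.69, "7C": 0.518}

probabilities = {panel: sigmoid(fx) for panel, fx in force_plot_outputs.items()}
for panel, prob in probabilities.items():
    print(f"Figure {panel}: f(x) = {force_plot_outputs[panel]:+.3f} "
          f"-> probability of amputation {prob:.3f}")
```

Under this assumption, the three outputs correspond to predicted probabilities of roughly 1%, 98%, and 63%, respectively, so an f(x) of zero marks the 50% point.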
The aim of this study was to develop and validate five ML models for predicting amputation in patients with DFUs, thereby providing clinicians with reliable information to support diagnosis and treatment decisions. Ultimately, the XGBoost model was selected as the optimal model for the study. The model exhibited exceptional predictive performance across both the internal and the external validation datasets, achieving AUC values of 0.93 and 0.84, respectively. Moreover, with the use of SHAP values and corresponding visualizations, we elucidated the influence of each clinical feature on the performance of the overall XGBoost model. The illustration of feature importance contributes to a comprehensive understanding of the models used for predicting amputation in patients with DFUs.
In the era of big data, a growing array of ML algorithms has been increasingly applied in research on disease risk and prognosis. Previous studies have also conducted assessments of amputation risk for DFU patients. In a prior investigation involving retrospective data from a cohort of 618 patients diagnosed with DFUs, investigators used 37 clinical features to construct an ML model designed to predict the likelihood of amputation in hospitalized patients (18). The final model demonstrated high predictive performance, with an AUC of 0.90. However, the initial dataset used in that study comprised a relatively small sample size, with only 117 amputee patients. Additionally, a considerable number of features were incorporated into the model. These factors may have contributed to overfitting of the model, potentially leading to inaccurate results and undermining the generalizability of the findings. Wang et al. used ML to predict the outcomes of minor amputations in patients with severe wounds (University of Texas grade 3+), achieving an AUC of 0.881 (17). Owing to the limited number of amputee cases in the initial dataset, they utilized the synthetic minority oversampling technique (SMOTE) to perform oversampling. SMOTE is used to mitigate data imbalance; however, its use carries the potential risk of inducing model overfitting (30). In contrast to the previously cited studies, our hospital serves as a tertiary referral center, attracting patients with advanced-stage DFUs. This resulted in a more comprehensive and balanced dataset during the initial data collection phase. As previously mentioned, the AUC for the XGBoost model reached values of 0.93 in the internal validation cohorts and 0.83 in the external validation cohorts, demonstrating the strong predictive ability of the model. Nevertheless, the discrepancy in AUC between internal and external validation indicates potential variations in data distribution.
The data presented in Table 1 and Supplementary Table 1 demonstrate that patients within the training cohorts present a greater prevalence of Wagner grades IV-V. Furthermore, amputee patients in these cohorts are distinguished by advanced age, increased white blood cell counts, and compromised nutritional status. Collectively, these findings indicate a heightened severity of illness among patients in the training cohorts. These differences impact model performance in external validation cohorts and may reduce accuracy for patients with milder symptoms. It is imperative to refine and augment the model by utilizing a larger, multicenter dataset that encompasses patients exhibiting varying severities of DFUs.
Moreover, a notable advantage of our research was the incorporation of the XGBoost algorithm, which has garnered significant attention in recent years owing to its rapid computational efficiency, robust generalization properties, and superior predictive performance (31–33). Furthermore, we used GridSearchCV for the optimization of hyperparameters. In our analysis, the p value for the difference in the AUC between XGBoost and the other models was not statistically significant. Nonetheless, it is crucial to underscore that the selection of an optimal model transcends mere statistical significance. Unlike logistic regression (linear) or SVM (kernel-dependent), XGBoost automatically captures complex feature interactions through sequential tree-building, which is vital for patients with DFUs (34). Each new tree corrects residuals from previous trees, modeling intricate patterns that linear models or single decision trees miss. XGBoost incorporates L1 and L2 regularization directly into its objective function, penalizing overly complex trees. This reduces overfitting, a critical weakness of KNN. Moreover, unlike RF (which builds trees independently), XGBoost uses gradient boosting to improve predictions iteratively. This error-correction mechanism allows it to refine model performance more effectively. Parameters such as max_depth, learning_rate, and subsample allow fine-grained control over bias-variance tradeoffs. While RF requires the tuning of fewer parameters, it lacks adaptability to the boosting framework. Additionally, assessing metrics such as recall, accuracy, and the F1 score further substantiates the robust predictive performance of the XGBoost model.
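The regularized objective and the sequential update described above can be written compactly. The notation below is introduced here for illustration and does not appear in the original text: l is the per-patient loss, f_k is the k-th tree, T is its number of leaves, w_j are its leaf weights, γ, λ and α are the penalty strengths, and η is the learning rate. Note that λ here denotes the L2 penalty of XGBoost, not the LASSO tuning parameter used for feature selection.

```latex
% Regularized objective over K trees
\mathcal{L} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega\left(f_k\right),
\qquad
\Omega(f) = \gamma T + \frac{1}{2}\lambda \sum_{j=1}^{T} w_j^{2} + \alpha \sum_{j=1}^{T} \lvert w_j \rvert

% Sequential (boosting) update: tree t is fitted to correct the errors
% of the ensemble built from the first t-1 trees
\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + \eta \, f_t\left(x_i\right)
```

The first term rewards fit to the data, whereas the penalty term discourages trees with many leaves or large leaf weights; the update rule is what distinguishes boosting from RF, in which trees are built independently and averaged.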
An additional strength of our study is the application of SHAP to interpret the XGBoost model, which helped identify important variables linked to amputation risk. In the final model, the WBC count and lymphocyte count were the most influential features, consistent with the fact that these parameters are frequently associated with systemic inflammatory responses and the severity of infection in patients (35). A series of infection markers (such as the WBC count, CRP level and erythrocyte sedimentation rate) have long been regarded as predictive indicators for LEA in patients with DFUs (36, 37). This observation is consistent with clinical experience, indicating that a pronounced inflammatory response exacerbates tissue damage, impedes reparative mechanisms, and significantly elevates the risk of amputation. Nutritional status is also a predictor of amputation. In our study, nonamputee patients presented elevated levels of serum ALB, GLB, and Hb, which serve as indicators of patients’ nutritional status. In alignment with these findings, a study of 3,654 DFU patients revealed that lower Hb and plasma ALB levels independently increase the risk of amputation (38). Interestingly, although an adverse lipid profile or dyslipidemia is a significant risk factor for various diabetic complications (39–41), our study indicates that reduced lipid levels in DFU patients often signify a poor prognosis. Therefore, clinicians should pay close attention to the nutritional status of DFU patients and promptly address anemia and malnutrition to improve the overall condition of these patients and promote wound healing.
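To make concrete what a SHAP attribution represents, the following minimal sketch computes exact Shapley values for a toy risk function by enumerating all feature coalitions. It is not the SHAP library's tree algorithm and not the study's model: the function `toy_risk`, its coefficients, the feature values, and the baseline are all made-up assumptions, and "absent" features are simply set to a baseline value.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values; features outside a coalition take their baseline value."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        x = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return model(x)

    phi = {}
    for f in names:
        others = [k for k in names if k != f]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature f when added to coalition s.
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_risk(x):
    """Hypothetical score with one interaction term; coefficients are illustrative only."""
    return (0.04 * x["wbc"] - 0.10 * x["lymphocytes"]
            + 0.02 * x["bun"] + 0.01 * x["wbc"] * x["bun"])


# Made-up patient and reference values (not taken from the study).
instance = {"wbc": 15.0, "lymphocytes": 1.0, "bun": 9.0}
baseline = {"wbc": 7.0, "lymphocytes": 2.0, "bun": 5.0}

phi = shapley_values(toy_risk, instance, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The attributions sum to the difference between the score for the patient and the score for the baseline, which is the additivity property that makes SHAP plots readable as a decomposition of a single prediction. Exact enumeration grows exponentially with the number of features, so it is only practical for small toy cases like this one.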
In the present study, the predictors of amputation identified in patients with DFUs aligned with previous research. Specifically, our findings corroborate earlier studies indicating that elevated BUN levels are positively correlated with higher rates of amputation (42). BUN is a biomarker of renal function, and increased BUN levels are frequently associated with compromised kidney function (43). Compromised renal function can result in edema and metabolic disturbances, hindering the healing of DFUs and culminating in amputation. Moreover, elevated fasting blood glucose levels, stenosis of below-the-knee arteries, and increased uric acid levels are significantly correlated with poor prognosis in DFU patients (44–46). The incorporation of a diverse range of features has enabled our model to achieve favorable predictive performance, resulting in strong efficacy and generalizability across both internal and external validation cohorts.
In the future, the implementation of our XGBoost model in clinical settings could support more precise management strategies for patients with DFUs. The model can analyze patient data as they are collected, providing clinicians with immediate risk stratification and predictive insights. For example, at admission, the model could evaluate a patient's risk factors for amputation, thereby facilitating timely intervention. We anticipate that the model could be integrated into electronic health record (EHR) systems as a plugin or via an API, giving clinicians access to predictive analytics at the start of treatment. This could allow for better population-based strategies to identify patients at risk of amputation and to target prevention or treatment resources more precisely to those who would benefit the most.
Nevertheless, our study is subject to certain limitations. First, the predictive model is restricted primarily by the features used during its training. There may be additional useful predictors of amputation risk in DFU patients that were not identified in this study. For instance, patients’ DFUs were not classified as neuropathic, ischemic, or neuro-ischemic upon admission, so data on DFU type are insufficient. The omission of these factors could introduce bias. Second, the retrospective design of this study inherently leads to missing data. Further validation through prospective studies is warranted. Third, the features used for modeling were collected only at admission. Collecting these features at multiple time points and across various care settings would yield a more comprehensive dataset. Fourth, our hospital is a tertiary care facility in Guangdong Province, to which patients with severe DFUs are frequently referred through multiple channels. As a result, the patients managed at our institution typically present with more advanced conditions and an increased likelihood of requiring amputation. These differences may affect the accuracy of the models when they are applied to patients with milder symptoms, so a larger multicenter dataset is needed. Fifth, owing to the limited number of major amputation cases, we did not assess the risks of major and minor amputations separately. Future work should gather more patient data to analyze the distinct risk factors for major and minor amputations, which would improve risk prediction for both types of amputation.
In conclusion, ML models have emerged as reliable tools for amputation prediction in patients with DFUs. The adoption of explainable modeling techniques, such as SHAP, offers insights into the significance of individual features in contributing to the model’s output, thereby enhancing the transparency and feasibility of model deployment. Therefore, the model can serve as a valuable reference for clinicians in tailoring precise management strategies for patients with DFUs. With additional prospective validation and refinement, this model has the potential to identify patients at high risk for amputation, thereby contributing to a reduction in the overall amputation rate in DFU patients. Future research could incorporate additional novel biomarkers and prospective data to refine and enhance the prediction model, rendering it more comprehensive and holistic.
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
The studies involving humans were approved by the Ethics Committee of Sun Yat-sen Memorial Hospital. The studies were conducted in accordance with the local legislation and institutional requirements. The ethics committee/institutional review board waived the requirement of written informed consent for participation from the participants or the participants’ legal guardians/next of kin because the patient data were anonymized and the study did not influence clinical decision-making. Therefore, in accordance with national legislation and institutional regulations, written informed consent for participation was not required.
HT: Data curation, Investigation, Writing – original draft. LLY: Data curation, Investigation, Writing – original draft. YH: Data curation, Writing – review & editing. YC: Data curation, Writing – review & editing. LY: Writing – review & editing. DL: Writing – review & editing. SX: Data curation, Writing – review & editing. BY: Data curation, Writing – review & editing. MR: Writing – review & editing.
The author(s) declare that financial support was received for the research and/or publication of this article. This work was supported by grants from the National Natural Science Foundation of China (U20A20352, 82370822, 82470850), the Guang Dong Clinical Research Center for Metabolic Diseases (2020B1111170009), the Guangdong Basic and Applied Basic Research Foundation (2024A1515010503), and Guangzhou Science and Technology Projects (2024B03J1342).
The authors would like to express their gratitude to all the physicians and research personnel who participated in this study for their contributions to this research.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The author(s) declare that no Generative AI was used in the creation of this manuscript.
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fendo.2025.1526098/full#supplementary-material
1. Sun H, Saeedi P, Karuranga S, Pinkepank M, Ogurtsova K, Duncan BB, et al. Idf diabetes atlas: Global, regional and country-level diabetes prevalence estimates for 2021 and projections for 2045. Diabetes Res Clin Pract. (2022) 183:109119. doi: 10.1016/j.diabres.2021.109119
2. Zheng Y, Ley SH, Hu FB. Global aetiology and epidemiology of type 2 diabetes mellitus and its complications. Nat Rev Endocrinol. (2018) 14:88–98. doi: 10.1038/nrendo.2017.151
3. Schaper NC, van Netten JJ, Apelqvist J, Bus SA, Hinchliffe RJ, Lipsky BA, et al. Practical guidelines on the prevention and management of diabetic foot disease (iwgdf 2019 update). Diabetes Metab Res Rev. (2020) 36 Suppl 1:e3266. doi: 10.1002/dmrr.v36.S1
4. Wukich DK, Raspovic KM, Suder NC. Patients with diabetic foot disease fear major lower-extremity amputation more than death. Foot Ankle Spec. (2018) 11:17–21. doi: 10.1177/1938640017694722
5. Ugwu E, Adeleye O, Gezawa I, Okpe I, Enamino M, Ezeani I. Predictors of lower extremity amputation in patients with diabetic foot ulcer: Findings from medfun, a multi-center observational study. J Foot Ankle Res. (2019) 12:34. doi: 10.1186/s13047-019-0345-y
6. Thorud JC, Plemmons B, Buckley CJ, Shibuya N, Jupiter DC. Mortality after nontraumatic major amputation among patients with diabetes and peripheral vascular disease: A systematic review. J Foot Ankle Surg. (2016) 55:591–9. doi: 10.1053/j.jfas.2016.01.012
7. Lavery LA, Armstrong DG, Harkless LB. Classification of diabetic foot wounds. J Foot Ankle Surg. (1996) 35:528–31. doi: 10.1016/S1067-2516(96)80125-6
8. Beckert S, Witte M, Wicke C, Konigsrainer A, Coerper S. A new wound-based severity score for diabetic foot ulcers: A prospective analysis of 1,000 patients. Diabetes Care. (2006) 29:988–92. doi: 10.2337/dc05-2431
9. Ince P, Abbas ZG, Lutale JK, Basit A, Ali SM, Chohan F, et al. Use of the sinbad classification system and score in comparing outcome of foot ulcer management on three continents. Diabetes Care. (2008) 31:964–7. doi: 10.2337/dc07-2367
10. Monteiro-Soares M, Russell D, Boyko EJ, Jeffcoate W, Mills JL, Morbach S, et al. Guidelines on the classification of diabetic foot ulcers (iwgdf 2019). Diabetes Metab Res Rev. (2020) 36 Suppl 1:e3273. doi: 10.1002/dmrr.v36.S1
11. Mehraj M, Shah I. A review of wagner classification and current concepts in management of diabetic foot. Int J Orthop Sci. (2018) 4:933–5. doi: 10.22271/ortho.2018.v4.i1n.133
12. Kumar VH, Moghadam AGBJISJ. A study to test the validity of diabetic ulcer severity score (duss) at tertiary care hospital. Int Surg J. (2017) 4:4010–4. doi: 10.18203/2349-2902.isj20175401
13. Singh AV, Bhardwaj P, Laux P, Pradeep P, Busse M, Luch A, et al. Ai and ml-based risk assessment of chemicals: Predicting carcinogenic risk from chemical-induced genomic instability. Front Toxicol. (2024) 6:1461587. doi: 10.3389/ftox.2024.1461587
14. Stevens LM, Mortazavi BJ, Deo RC, Curtis L, Kao DP. Recommendations for reporting machine learning analyses in clinical research. Circ Cardiovasc Qual Outcomes. (2020) 13:e006556. doi: 10.1161/CIRCOUTCOMES.120.006556
15. Guan H, Wang Y, Niui P, et al. The role of machine learning in advancing diabetic foot: a review. Front Endocrinol (Lausanne). (2024) 15:1325434. doi: 10.3389/fendo.2024.1325434
16. Lin C, Yuan Y, Ji L, Yang X, Yin G, Lin S. The amputation and survival of patients with diabetic foot based on establishment of prediction model. Saudi J Biol Sci. (2020) 27:853–8. doi: 10.1016/j.sjbs.2019.12.020
17. Wang S, Wang J, Zhu MX, Tan Q. Machine learning for the prediction of minor amputation in university of texas grade 3 diabetic foot ulcers. PloS One. (2022) 17:e0278445. doi: 10.1371/journal.pone.0278445
18. Xie P, Li Y, Deng B, Du C, Rui S, Deng W, et al. An explainable machine learning model for predicting in-hospital amputation rate of patients with diabetic foot ulcer. Int Wound J. (2022) 19:910–8. doi: 10.1111/iwj.13691
19. Stefanopoulos S, Qiu Q, Ren G, Ahmed A, Osman M, Brunicardi FC, et al. A machine learning model for prediction of amputation in diabetics. J Diabetes Sci Technol. (2024) 18:874–81. doi: 10.1177/19322968221142899
20. Cabitza F, Rasoini R, Gensini GF. Unintended consequences of machine learning in medicine. JAMA. (2017) 318:517–8. doi: 10.1001/jama.2017.7797
21. Lundberg SM, Nair B, Vavilala MS, Horibe M, Eisses MJ, Adams T, et al. Explainable machine-learning predictions for the prevention of hypoxaemia during surgery. Nat BioMed Eng. (2018) 2:749–60. doi: 10.1038/s41551-018-0304-0
22. Hinchliffe RJ, Forsythe RO, Apelqvist J, Boyko EJ, Fitridge R, Hong JP, et al. Guidelines on diagnosis, prognosis, and management of peripheral artery disease in patients with foot ulcers and diabetes (iwgdf 2019 update). Diabetes Metab Res Rev. (2020) 36 Suppl 1:e3276. doi: 10.1002/dmrr.v36.S1
23. Yin JM, Li Y, Xue JT, Zong GW, Fang ZZ, Zou L. Explainable machine learning-based prediction model for diabetic nephropathy. J Diabetes Res. (2024) 2024:8857453. doi: 10.1155/2024/8857453
24. Feng X, Hong T, Liu W, Xu C, Li W, Yang B, et al. Development and validation of a machine learning model to predict the risk of lymph node metastasis in renal carcinoma. Front Endocrinol (Lausanne). (2022) 13:1054358. doi: 10.3389/fendo.2022.1054358
25. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine learning in python. J Mach Learn Res. (2011) 12:2825–30.
26. Bekkar M, Djemaa HK, Alitouche TA. Evaluation measures for models assessment over imbalanced data sets. Intl J Data Mining Knowledge Manage Process. (2013) 3:15–33. doi: 10.5121/ijdkp.2013.3402
27. Sokolova M, Lapalme G. A systematic analysis of performance measures for classification tasks. Inf Process Manage. (2009) 45:427–37. doi: 10.1016/j.ipm.2009.03.002
28. Singh AV, Bansod G, Schumann A, Bierkandt FS, Laux P, Nakhale SV, et al. Investigating tattoo pigments composition with uv-vis and ft-ir spectroscopy supported by chemometric modelling. Curr Analytical Chem. (2024) 20:17. doi: 10.2174/0115734110316443240725051037
29. Wang Q, Qiao W, Zhang H, Liu B, Li J, Zang C, et al. Nomogram established on account of lasso-cox regression for predicting recurrence in patients with early-stage hepatocellular carcinoma. Front Immunol. (2022) 13:1019638. doi: 10.3389/fimmu.2022.1019638
30. Xu Z, Shen D, Kou Y, Nie T. A synthetic minority oversampling technique based on gaussian mixture model filtering for imbalanced data classification. IEEE Trans Neural Netw Learn Syst. (2024) 35:3740–53. doi: 10.1109/TNNLS.2022.3197156
31. Hou N, Li M, He L, Xie B, Wang L, Zhang R, et al. Predicting 30-days mortality for mimic-iii patients with sepsis-3: A machine learning approach using xgboost. J Transl Med. (2020) 18:462. doi: 10.1186/s12967-020-02620-5
32. Deng X, Li M, Deng S, Wang L. Hybrid gene selection approach using xgboost and multi-objective genetic algorithm for cancer classification. Med Biol Eng Comput. (2022) 60:663–81. doi: 10.1007/s11517-021-02476-x
33. Fan Z, Jiang J, Xiao C, Chen Y, Xia Q, Wang J, et al. Construction and validation of prognostic models in critically ill patients with sepsis-associated acute kidney injury: Interpretable machine learning approach. J Transl Med. (2023) 21:406. doi: 10.1186/s12967-023-04205-4
34. Li X, Wang Z, Zhao W, Shi R, Zhu Y, Pan H, et al. Machine learning algorithm for predict the in-hospital mortality in critically ill patients with congestive heart failure combined with chronic kidney disease. Ren Fail. (2024) 46:2315298. doi: 10.1080/0886022X.2024.2315298
35. Gong H, Ren Y, Li Z, Zha P, Bista R, Li Y, et al. Clinical characteristics and risk factors of lower extremity amputation in the diabetic inpatients with foot ulcers. Front Endocrinol (Lausanne). (2023) 14:1144806. doi: 10.3389/fendo.2023.1144806
36. Gurney JK, Stanley J, York S, Rosenbaum D, Sarfati D. Risk of lower limb amputation in a national prevalent cohort of patients with diabetes. Diabetologia. (2018) 61:626–35. doi: 10.1007/s00125-017-4488-8
37. McDermott K, Fang M, Boulton AJM, Selvin E, Hicks CW. Etiology, epidemiology, and disparities in the burden of diabetic foot ulcers. Diabetes Care. (2023) 46:209–21. doi: 10.2337/dci22-0043
38. Gezawa ID, Ugwu ET, Ezeani I, Adeleye O, Okpe I, Enamino M. Anemia in patients with diabetic foot ulcer and its impact on disease outcome among Nigerians: Results from the medfun study. PloS One. (2019) 14:e0226226. doi: 10.1371/journal.pone.0226226
39. Peng X, Gou D, Zhang L, Wu H, Chen Y, Shao X, et al. Status and influencing factors of lower limb amputation in patients with diabetic foot ulcer. Int Wound J. (2023) 20:2075–81. doi: 10.1111/iwj.14076
40. Ikura K, Hanai K, Shinjyo T, Uchigata Y. Hdl cholesterol as a predictor for the incidence of lower extremity amputation and wound-related death in patients with diabetic foot ulcers. Atherosclerosis. (2015) 239:465–9. doi: 10.1016/j.atherosclerosis.2015.02.006
41. Eckert AJ, Zimny S, Altmeier M, Dugic A, Gillessen A, Bozkurt L, et al. Factors associated with diabetic foot ulcers and lower limb amputations in type 1 and type 2 diabetes supported by real-world data from the german/Austrian dpv registry. J Diabetes. (2024) 16:e13531. doi: 10.1111/1753-0407.13531
42. Xu J, Gao J, Li H, Zhu Z, Liu J, Gao C. The risk factors in diabetic foot ulcers and predictive value of prognosis of wound tissue vascular endothelium growth factor. Sci Rep. (2024) 14:14120. doi: 10.1038/s41598-024-64009-4
43. Mitrofanova A, Merscher S, Fornoni A. Kidney lipid dysmetabolism and lipid droplet accumulation in chronic kidney disease. Nat Rev Nephrol. (2023) 19:629–45. doi: 10.1038/s41581-023-00741-w
44. Chuter V, Schaper N, Hinchliffe R, Mills J, Azuma N, Behrendt CA, et al. Performance of non-invasive bedside vascular testing in the prediction of wound healing or amputation among people with foot ulcers in diabetes: A systematic review. Diabetes Metab Res Rev. (2024) 40:e3701. doi: 10.1002/dmrr.v40.3
45. Lin C, Liu J, Sun H. Risk factors for lower extremity amputation in patients with diabetic foot ulcers: A meta-analysis. PloS One. (2020) 15:e0239236. doi: 10.1371/journal.pone.0239236
Keywords: diabetic foot ulcer, lower extremity amputation, risk factor, machine learning, SHAP
Citation: Tao H, You L, Huang Y, Chen Y, Yan L, Liu D, Xiao S, Yuan B and Ren M (2025) An interpreting machine learning models to predict amputation risk in patients with diabetic foot ulcers: a multi-center study. Front. Endocrinol. 16:1526098. doi: 10.3389/fendo.2025.1526098
Received: 11 November 2024; Accepted: 10 March 2025;
Published: 25 March 2025.
Edited by:
Ping Wang, Michigan State University, United States
Reviewed by:
Michael Edwin Edmonds, King’s College Hospital NHS Foundation Trust, United Kingdom
Copyright © 2025 Tao, You, Huang, Chen, Yan, Liu, Xiao, Yuan and Ren. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Meng Ren, cmVubWVuZzgwQDEzOS5jb20=; Shan Xiao, c25vb3B5X3hzQDEyNi5jb20=; Bichai Yuan, MTIxNTE2NTc1NkBxcS5jb20=
†These authors have contributed equally to this work
‡ORCID: Meng Ren, orcid.org/0000-0001-9935-1449