Predictive models based on machine learning have been widely used in clinical practice. Patients with acute myocardial infarction (AMI) are at risk of acute kidney injury (AKI), which is associated with a poor prognosis. The aim of this study was to develop a machine learning model for predicting AKI in AMI patients.
Patients with AMI registered in the Medical Information Mart for Intensive Care (MIMIC) III and IV databases were enrolled. The primary outcome was the occurrence of AKI during hospitalization. Using AMI patients in the MIMIC-IV database, we developed Random Forest (RF), Naive Bayes (NB), Support Vector Machine (SVM), eXtreme Gradient Boosting (xGBoost), Decision Tree (DT), and Logistic Regression (LR) models. Variable importance was ranked with the SHapley Additive exPlanations (SHAP) method. AMI patients in the MIMIC-III database were used for model evaluation. The area under the receiver operating characteristic curve (AUC) was used to compare the performance of the models.
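As an illustration of the AUC comparison described above, the following is a minimal sketch rather than the study's actual pipeline. It computes the AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case (ties count as one half). The function name `roc_auc`, the model names, and the toy labels and scores are all hypothetical and are not taken from MIMIC.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    Equivalent to the normalized Mann-Whitney U statistic: the fraction of
    (positive, negative) pairs in which the positive case scores higher,
    with tied pairs counted as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = AKI, 0 = no AKI. Scores are made up.
labels = [1, 0, 1, 0, 0, 1]
model_scores = {
    "model_a": [0.9, 0.2, 0.7, 0.4, 0.6, 0.5],
    "model_b": [0.6, 0.5, 0.4, 0.7, 0.3, 0.8],
}

# Compare models by AUC on the same evaluation set.
aucs = {name: roc_auc(labels, s) for name, s in model_scores.items()}
best = max(aucs, key=aucs.get)
print(aucs, best)
```

The pairwise form is quadratic in the number of cases and is shown only because it makes the definition explicit; a rank-based implementation would be preferable for cohorts of several thousand patients.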
A total of 3,882 subjects with AMI were enrolled through screening of the MIMIC databases, of whom 1,098 (28.2%) developed AKI. We randomly assigned 70% of the patients in the MIMIC-IV data to the training cohort, which was used to develop the models, and the remaining 30% to the testing cohort. MIMIC-III patient data served as the external validation cohort. In total, 3,882 patients and 37 predictors were included in the analysis for model construction. The top 5 predictors were serum creatinine, activated partial prothrombin time, blood glucose concentration, platelets, and atrial fibrillation (SHAP values of 0.670, 0.444, 0.398, 0.389, and 0.381, respectively). In the testing cohort, using the top 20 features by importance, the RF, NB, SVM, xGBoost, DT, and LR models obtained AUCs of 0.733, 0.739, 0.687, 0.689, 0.663, and 0.677, respectively. Applying RF models built with different numbers of variables to the external validation cohort yielded AUCs of 0.711, 0.754, 0.778, 0.781, and 0.777, respectively.
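The 70%/30% random assignment described above can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not say whether the split was stratified by outcome or which random seed was used, so the sketch uses a plain shuffle with a fixed seed for reproducibility. The function name `split_train_test` and the placeholder patient identifiers are hypothetical. The external validation cohort comes from a separate database and is not involved in this split.

```python
import random


def split_train_test(records, train_fraction=0.7, seed=0):
    """Randomly partition records into training and testing cohorts.

    A seeded random.Random instance keeps the split reproducible without
    touching the global random state. No stratification is applied
    (an assumption; the source does not specify).
    """
    rng = random.Random(seed)
    shuffled = list(records)  # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical placeholder identifiers, not real MIMIC-IV subject IDs.
patient_ids = list(range(1000))
train, test = split_train_test(patient_ids)
print(len(train), len(test))
```

Every patient lands in exactly one cohort, so no record used for model development leaks into the testing cohort.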
Machine learning algorithms, particularly the random forest algorithm, improved the accuracy of risk stratification for AKI in AMI patients and can be applied to identify AMI patients at risk of AKI.