AUTHOR=Zhou Hongshan, Liu Leping, Zhao Qinyu, Jin Xin, Peng Zhangzhe, Wang Wei, Huang Ling, Xie Yanyun, Xu Hui, Tao Lijian, Xiao Xiangcheng, Nie Wannian, Liu Fang, Li Li, Yuan Qiongjing

TITLE=Machine learning for the prediction of all-cause mortality in patients with sepsis-associated acute kidney injury during hospitalization

JOURNAL=Frontiers in Immunology

VOLUME=Volume 14 - 2023

YEAR=2023

URL=https://www.frontiersin.org/journals/immunology/articles/10.3389/fimmu.2023.1140755

DOI=10.3389/fimmu.2023.1140755

ISSN=1664-3224

ABSTRACT=Background: Sepsis-associated acute kidney injury (S-AKI) is associated with high morbidity and mortality, so a commonly accepted model for predicting mortality is urgently needed. This study used machine learning to identify the key variables associated with in-hospital mortality in S-AKI patients and to predict their risk of in-hospital death. We hope that this model can help identify high-risk patients early and support the reasonable allocation of medical resources in the ICU.
Methods: A total of 16,154 S-AKI patients from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database were split into a training set (80%) and a validation set (20%). In total, 129 variables were collected, covering basic patient information, diagnoses, clinical data, and medication records. We developed and validated machine learning models using eleven different algorithms and selected the best-performing one. Recursive feature elimination was then used to select the key variables. Several indicators were used to compare the predictive performance of the models. The SHapley Additive exPlanations (SHAP) package was applied to interpret the best-performing model in a web tool intended for clinicians. Finally, we collected clinical data on S-AKI patients from two hospitals for external validation.
Results: Fifteen critical variables were finally selected: urine output, maximum blood urea nitrogen, injection rate of norepinephrine, maximum anion gap, maximum creatinine, maximum red blood cell volume distribution width, minimum international normalized ratio, maximum heart rate, maximum temperature, maximum respiratory rate, minimum fraction of inspired O2, minimum creatinine, minimum Glasgow Coma Scale score, and diagnoses of diabetes and stroke. The categorical boosting (CatBoost) model showed significantly better predictive performance than the other models, with an area under the ROC curve of 0.83 (accuracy: 75%, Youden index: 50%, sensitivity: 75%, specificity: 75%, F1 score: 0.56, PPV: 44%, NPV: 92%). The model also performed well on external validation data from two hospitals in China (area under the ROC curve: 0.75).
Conclusions: After fifteen crucial variables were selected, a machine learning-based model for predicting the mortality of S-AKI patients was successfully established, and the CatBoost model demonstrated the best predictive performance.
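
Note on the reported metrics: the Youden index and F1 score in the Results can be derived from the other reported figures, which gives a quick consistency check. The sketch below is illustrative only and is not taken from the study; the function names are hypothetical, and the inputs are the rounded percentages from the abstract, so small rounding differences from the published values are expected.

```python
# Illustrative consistency check on the metrics reported in the abstract.
# Function names are hypothetical; inputs are the rounded published values.

def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def f1_from_ppv_and_sensitivity(ppv: float, sensitivity: float) -> float:
    """F1 score as the harmonic mean of precision (PPV) and recall (sensitivity)."""
    return 2.0 * ppv * sensitivity / (ppv + sensitivity)


# Values as reported for the CatBoost model (rounded in the abstract).
sensitivity = 0.75
specificity = 0.75
ppv = 0.44

j = youden_index(sensitivity, specificity)
f1 = f1_from_ppv_and_sensitivity(ppv, sensitivity)

print(f"Youden index: {j:.2f}")
print(f"F1 score from rounded inputs: {f1:.3f}")
```

With these rounded inputs the Youden index comes out at the reported 50%, and the recomputed F1 score lands close to the reported 0.56; the small gap is what one would expect from rounding the PPV and sensitivity to whole percentages.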