ORIGINAL RESEARCH article

Front. Artif. Intell.
Sec. Medicine and Public Health
Volume 7 - 2024 | doi: 10.3389/frai.2024.1348907
This article is part of the Research Topic: Physiologically-driven Precision Critical Care using Big Data

Development and validation of an interpretable machine learning for mortality prediction in patients with sepsis

Provisionally accepted
Bihua He, Zheng Qiu *
  • Hubei Provincial Third People's Hospital (Zhongshan Hospital), Wuhan, Hubei Province, China

The final, formatted version of the article will be published soon.

    Sepsis is a leading cause of death, yet useful models for predicting its outcome are lacking. This study aimed to develop an explainable machine learning (ML) model for predicting 28-day mortality in patients with sepsis defined by the Sepsis 3.0 criteria. We obtained data from the Medical Information Mart for Intensive Care (MIMIC)-III database (version 1.4). The data were randomly assigned to training and testing sets at a ratio of 3:1. After applying LASSO regression to select the modeling variables, we developed models using Extreme Gradient Boosting (XGBoost), Logistic Regression (LR), Support Vector Machine (SVM), and Random Forest (RF) with 5-fold cross-validation. The optimal model was selected by its area under the curve (AUC). Finally, the Shapley additive explanations (SHAP) method was used to interpret the optimal model. Results: A total of 5834 septic adults were enrolled; the median age was 66 years (IQR, 54-78 years), and 2342 (40.1%) were women. After feature selection, 14 variables were included for model development in the training set. The XGBoost model (AUC: 0.806) outperformed the RF (AUC: 0.794), LR (AUC: 0.782), and SVM (AUC: 0.687) models. SHAP summary analysis of the XGBoost model showed that urine output on day 1, age, blood urea nitrogen, and body mass index were the top four contributors. SHAP dependence analysis revealed nonlinear, interactive associations between these factors and the outcome. SHAP force analysis illustrated the model's predictions for three individual samples. Conclusions: ML models were effective in predicting 28-day mortality in patients with sepsis, and the SHAP method can improve model transparency and support clinical decision-making.
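
The modeling workflow summarized in the abstract (3:1 split, LASSO-based variable selection, four classifiers compared by 5-fold cross-validated AUC) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic rather than MIMIC-III, L1-penalized logistic regression stands in for the LASSO step, scikit-learn's GradientBoostingClassifier stands in for XGBoost, and every hyperparameter shown is a placeholder. The SHAP interpretation step is not shown, since it would require the separate shap package.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the cohort (assumption: NOT MIMIC-III data).
X, y = make_classification(n_samples=800, n_features=30, n_informative=8,
                           random_state=0)

# Random 3:1 split into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Standardize using statistics from the training set only.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)

# LASSO-style selection: keep variables with non-zero L1-penalized coefficients.
# C=0.1 is an arbitrary placeholder, not a value from the article.
l1_model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1,
                              random_state=0)
selector = SelectFromModel(l1_model, threshold=1e-5).fit(X_train_std, y_train)
X_train_sel = selector.transform(X_train_std)
X_test_sel = selector.transform(X_test_std)

# Four candidate models; the gradient boosting model is a stand-in for XGBoost.
models = {
    "GBM (stand-in for XGBoost)": GradientBoostingClassifier(random_state=0),
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(probability=True, random_state=0),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Mean 5-fold cross-validated AUC on the training set for each model.
cv_auc = {
    name: cross_val_score(model, X_train_sel, y_train, cv=5,
                          scoring="roc_auc").mean()
    for name, model in models.items()
}

# Select the model with the highest cross-validated AUC, refit, and test it.
best_name = max(cv_auc, key=cv_auc.get)
best_model = models[best_name].fit(X_train_sel, y_train)
test_auc = roc_auc_score(y_test, best_model.predict_proba(X_test_sel)[:, 1])

for name, auc in cv_auc.items():
    print(f"{name}: CV AUC = {auc:.3f}")
print(f"Selected: {best_name}; test AUC = {test_auc:.3f}")
```

In this sketch the variable selection is fitted once on the training set before cross-validation, mirroring the order described in the abstract; wrapping the selector and each classifier in a single pipeline would be a stricter alternative that avoids reusing the selected variables across folds.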

    Keywords: sepsis, machine learning, mortality, prediction, database

    Received: 03 Dec 2023; Accepted: 26 Jun 2024.

    Copyright: © 2024 He and Qiu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Zheng Qiu, Hubei Provincial Third People's Hospital (Zhongshan Hospital), Wuhan, 430000, Hubei Province, China

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.