Heart failure is a cardiovascular disorder, while sepsis is a common non-cardiac cause of mortality. Patients with both heart failure and sepsis have significantly higher mortality and a poor prognosis, so early identification of high-risk patients and appropriate allocation of medical resources are critically important.
We constructed a survival prediction model for patients with heart failure and sepsis using the eICU-CRD database and externally validated it using the MIMIC-IV database. The primary outcome was 28-day all-cause mortality. The Boruta method was used for initial feature selection, followed by feature ranking with the XGBoost algorithm. Four machine learning models were compared: Logistic Regression (LR), eXtreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost), and Gaussian Naive Bayes (GNB). Model performance was assessed using metrics including area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity, and the SHAP method was used to visualize feature importance and interpret model results.
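As an illustration of the evaluation metrics named above, the following minimal sketch computes AUC, accuracy, sensitivity, and specificity from predicted mortality probabilities. It is not the study's code: the function names `roc_auc` and `threshold_metrics`, the 0.5 cut-off, and the eight example patients are all assumptions made for this example.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the fraction of (death, survivor) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a pair.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def threshold_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity at a fixed probability cut-off."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Made-up example: 1 = died within 28 days, 0 = survived.
example_labels = [1, 1, 1, 0, 0, 0, 0, 1]
example_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.8]

print(roc_auc(example_labels, example_scores))            # 0.9375
print(threshold_metrics(example_labels, example_scores))  # all three equal 0.75
```

AUC does not depend on a cut-off, whereas accuracy, sensitivity, and specificity do, so reported values of the latter three are only comparable across models when the thresholds are chosen in the same way.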
We developed a survival prediction model for heart failure complicated by sepsis using data from 3891 patients in the eICU-CRD and validated it externally with 2928 patients from the MIMIC-IV database. The LR model achieved the highest validation-set AUC of the four algorithms at 0.746 (XGBoost: 0.726, AdaBoost: 0.744, GNB: 0.722), with an accuracy of 0.685, a sensitivity of 0.666, and a specificity of 0.712. The final model incorporates 10 features: age, ventilation, norepinephrine, white blood cell count, total bilirubin, temperature, phenylephrine, respiratory rate, neutrophil count, and systolic blood pressure. We used the SHAP method to make the LR-based model more interpretable. In external validation on MIMIC-IV, the model achieved an AUC of 0.699.
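For a logistic regression model, SHAP values have a simple closed form when features are treated as independent: in log-odds units, the contribution of feature i is its coefficient times the difference between the patient's value and the background mean. The sketch below illustrates this for three of the ten features. The coefficients, intercept, background rows, and patient values are hypothetical and are not taken from the fitted model; `linear_shap` and `log_odds` are names introduced only for this example.

```python
from typing import Dict, List

Row = Dict[str, float]

# Hypothetical coefficients and intercept (NOT the study's fitted values).
COEFS: Row = {"age": 0.03, "respiratory_rate": 0.05, "systolic_bp": -0.02}
INTERCEPT = -2.0


def log_odds(row: Row) -> float:
    """Linear predictor of the logistic model for one patient."""
    return INTERCEPT + sum(COEFS[name] * row[name] for name in COEFS)


def feature_means(background: List[Row]) -> Row:
    """Mean of each model feature over the background sample."""
    n = len(background)
    return {name: sum(r[name] for r in background) / n for name in COEFS}


def linear_shap(background: List[Row], patient: Row) -> Row:
    """SHAP values in log-odds units: coef_i * (x_i - mean_i).

    Assumes features are independent, which is the standard simplification
    for linear models.
    """
    means = feature_means(background)
    return {name: COEFS[name] * (patient[name] - means[name]) for name in COEFS}


# Made-up background sample and patient.
background = [
    {"age": 60.0, "respiratory_rate": 18.0, "systolic_bp": 120.0},
    {"age": 70.0, "respiratory_rate": 22.0, "systolic_bp": 110.0},
    {"age": 80.0, "respiratory_rate": 26.0, "systolic_bp": 100.0},
]
patient = {"age": 80.0, "respiratory_rate": 30.0, "systolic_bp": 90.0}

phi = linear_shap(background, patient)
base_value = log_odds(feature_means(background))

# Additivity: base value plus all contributions recovers the patient's log-odds.
print(phi)
print(base_value + sum(phi.values()), log_odds(patient))
```

In this made-up case, older age, a faster respiratory rate, and a lower systolic blood pressure each push the predicted log-odds of death upward relative to the background average, and the contributions sum exactly to the gap between the patient's prediction and the base value.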
We constructed an LR-based model that predicts 28-day all-cause mortality in patients with heart failure complicated by sepsis. Using the model's predictions, clinicians can promptly identify high-risk patients and obtain guidance for clinical practice.