Survival prediction for heart failure complicated by sepsis: based on machine learning methods

Zhang, Qitian; Xu, Lizhen; He, Weibin; Lai, Xinqi; Huang, Xiaohong

doi:10.3389/fmed.2024.1410702

ORIGINAL RESEARCH article

Front. Med., 03 October 2024

Sec. Intensive Care Medicine and Anesthesiology

Volume 11 - 2024 | https://doi.org/10.3389/fmed.2024.1410702

This article is part of the Research Topic: Clinical Application of Artificial Intelligence in Emergency and Critical Care Medicine, Volume V

Qitian Zhang¹†

Lizhen Xu²†

Weibin He¹

Xinqi Lai¹

Xiaohong Huang¹*

¹Department of Cardiology, Zhangzhou Affiliated Hospital of Fujian Medical University, Zhangzhou, Fujian, China
²Department of Endocrinology, Shengli Clinical Medical College of Fujian Medical University, Fujian Provincial Hospital, Fuzhou University Affiliated Provincial Hospital, Fuzhou, China

Background: Heart failure is a cardiovascular disorder, while sepsis is a common non-cardiac cause of mortality. Patients with combined heart failure and sepsis have a significantly higher mortality rate and poor prognosis, making early identification of high-risk patients and appropriate allocation of medical resources critically important.

Methods: We constructed a survival prediction model for patients with heart failure and sepsis using the eICU-CRD database and externally validated it using the MIMIC-IV database. The primary outcome was 28-day all-cause mortality. The Boruta method was used for initial feature selection, followed by feature ranking with the XGBoost algorithm. Four machine learning models were compared: Logistic Regression (LR), eXtreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost), and Gaussian Naive Bayes (GNB). Model performance was assessed using area under the curve (AUC), accuracy, sensitivity, and specificity, and the SHAP method was used to visualize feature importance and interpret model results.

Results: We developed a survival prediction model for heart failure complicated by sepsis using data from 3891 patients in the eICU-CRD and validated it externally with 2928 patients from the MIMIC-IV database. The LR model outperformed all other machine learning algorithms with a validation set AUC of 0.746 (XGBoost: 0.726, AdaBoost: 0.744, GNB: 0.722), alongside accuracy (0.685), sensitivity (0.666), and specificity (0.712). The final model incorporates 10 features: age, ventilation, norepinephrine, white blood cell count, total bilirubin, temperature, phenylephrine, respiratory rate, neutrophil count, and systolic blood pressure. We employed the SHAP method to enhance the interpretability of the model based on the LR algorithm. Additionally, external validation was conducted using the MIMIC-IV database, with an external validation AUC of 0.699.

Conclusion: Based on the LR algorithm, a model was constructed to effectively predict the 28-day all-cause mortality rate in patients with heart failure complicated by sepsis. Utilizing our model predictions, clinicians can promptly identify high-risk patients and receive guidance for clinical practice.

1 Introduction

The 2018 medical insurance data reveals that sepsis and heart failure, respectively, ranked first and second in 30-day readmission rates among patients (1). Sepsis is defined as a dysregulated host response to infection, leading to organ failure (2). In 2017, an estimated 48.9 million cases of sepsis were recorded globally, resulting in 11 million sepsis-related deaths, which accounted for 19.7% of all global deaths (3). The mortality rates of sepsis in intensive care units and hospitals are reported to be 25.8 and 35.3%, respectively (4), with annual losses exceeding $24 billion (5, 6). Heart failure is a cardiovascular disorder characterized by high incidence and mortality rates, representing an escalating global epidemic (7). Over 64 million individuals worldwide are afflicted with heart failure, severely compromising their quality of life (8). Chronic heart failure is the leading complication in septic patients, with two-thirds of critically ill cases having prior heart failure (9, 10). Heart failure patients may exhibit underlying circulatory dysfunction and impaired cardiac reserve, placing them at increased risk if they develop sepsis. Alon et al. discovered that heart failure patients admitted for sepsis had a higher mortality rate compared to those without heart failure (51% vs. 41%; p = 0.015) (11). Walker et al. studied the effect of sepsis on heart failure patient mortality and found it caused one-fourth of deaths (12). The high incidence and mortality rates stress the importance of early identification, assessment, and management of heart failure patients with sepsis.

Currently, there are no identified predictive models for survival in patients with heart failure complicated by sepsis. The Sequential Organ Failure Assessment (SOFA), Simplified Acute Physiology Score II (SAPS II), and Acute Physiology Score III (APS III) are frequently used tools for predicting disease prognosis (13, 14). Despite their wide use, they have limitations, including complex assessment, insufficient specificity, and applicability that may be restricted to specific disease types or clinical contexts. The current research trend is to integrate novel biomarkers (15, 16) into established scoring systems or to revise these systems (17) to improve their prognostic accuracy. In clinical practice, machine learning is widely applied to outcome prediction, diagnosis, medical image interpretation, disease risk assessment, and treatment planning (18, 19). Compared with traditional statistical methods, machine learning handles complex data well and can achieve higher accuracy and efficiency (20). In the past, the application of machine learning was constrained by limited interpretability. However, with the emergence of techniques such as SHAP, users can now understand model predictions with greater clarity (21).

Our research aims to build survival prediction models using various machine learning algorithms to assess the overall in-hospital mortality rate among patients with heart failure complicated by sepsis. We utilize the eICU-CRD database to build machine learning models, selecting the one with optimal predictive performance. Subsequently, we conduct external validation using the MIMIC-IV database. Additionally, the SHAP method is used to explain model predictions and assess the importance of features. The objective of this study is to identify critically ill patients and offer guidance for clinical practice.

2 Materials and methods

2.1 Data sources and study population

This study draws data from two primary sources: the eICU Collaborative Research Database (eICU-CRD) and The Medical Information Mart for Intensive Care IV database (MIMIC-IV). The eICU-CRD database encompasses various ICU units across the United States, offering a comprehensive array of clinical data, physiological parameters, and medical events. Spanning from 2014 to 2015, it meticulously documents information for over 200,000 patients, facilitating medical research endeavors and data-informed clinical decision-making (22). On the other hand, MIMIC-IV (version 2.2) represents an extensive repository of intensive care data, featuring detailed records of more than 190,000 ICU patients from 2008 to 2019 (23). This database is characterized by its exhaustive collection of clinical details, including demographic profiles, laboratory findings, and medication histories, serving as invaluable resources for rigorous clinical investigations.

We identified patients with heart failure complicated by sepsis from both the eICU-CRD and MIMIC-IV databases using ICD-9 and ICD-10 codes. The exclusion criteria for the study population are: (1) age under 18 years, (2) ICU stay duration less than 24 h, and (3) clinical information missing rate exceeding 30% at data collection. For patients with multiple hospital admissions or ICU visits, only the first ICU experience during the initial hospital admission is considered. Heart failure was defined as a syndrome resulting from structural or functional cardiac abnormalities that lead to inadequate cardiac output and congestion in the systemic or pulmonary circulation, encompassing all types of heart failure with different ejection fractions. Sepsis was diagnosed based on the Sepsis-3.0 guidelines, which define it as life-threatening organ dysfunction caused by a dysregulated host response to infection. A SOFA score ≥ 2 (or a qSOFA score ≥ 2 for suspected infection in non-ICU settings) was used to diagnose sepsis.
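The selection rules above can be sketched as a small filter. This is a minimal illustration only: the record fields (`admission_seq`, `stay_seq`, `missing_fraction`) are hypothetical names rather than columns of eICU-CRD or MIMIC-IV, and the order of operations (first stay chosen before the exclusions are applied) is an assumption, since the text does not state it.

```python
from dataclasses import dataclass


@dataclass
class IcuStay:
    patient_id: int
    admission_seq: int       # 1 = first hospital admission (hypothetical field)
    stay_seq: int            # 1 = first ICU stay in that admission (hypothetical field)
    age_years: float
    icu_hours: float
    missing_fraction: float  # share of candidate variables with no recorded value


def select_cohort(stays):
    """Keep each patient's first ICU stay of the first admission, then apply exclusions."""
    first_stays = {}
    ordered = sorted(stays, key=lambda s: (s.patient_id, s.admission_seq, s.stay_seq))
    for stay in ordered:
        # setdefault keeps only the earliest stay seen for each patient
        first_stays.setdefault(stay.patient_id, stay)
    return [
        s for s in first_stays.values()
        if s.age_years >= 18            # exclusion 1: age under 18 years
        and s.icu_hours >= 24           # exclusion 2: ICU stay shorter than 24 h
        and s.missing_fraction <= 0.30  # exclusion 3: more than 30% missing
    ]


# Toy records, not study data
stays = [
    IcuStay(1, 1, 1, 72, 50, 0.10),
    IcuStay(1, 2, 1, 73, 90, 0.05),  # later admission, ignored
    IcuStay(2, 1, 1, 17, 40, 0.00),  # excluded: under 18
    IcuStay(3, 1, 1, 65, 12, 0.00),  # excluded: ICU stay under 24 h
    IcuStay(4, 1, 1, 80, 30, 0.45),  # excluded: over 30% missing
    IcuStay(5, 1, 1, 59, 48, 0.20),
    IcuStay(5, 1, 2, 59, 70, 0.20),  # second ICU stay, ignored
]
cohort_ids = [s.patient_id for s in select_cohort(stays)]
print(cohort_ids)
```

In the actual databases the diagnosis step (ICD-9/ICD-10 codes plus the Sepsis-3.0 criteria) would precede this filter; it is omitted here.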

2.2 Data extraction and preprocessing

In this study, we included ICU patients diagnosed with heart failure and sepsis, and extracted the following data: (1) Demographics: age, gender, height, and weight; (2) Vital signs: temperature (T), heart rate (HR), respiratory rate (R), systolic blood pressure (SBP), diastolic blood pressure (DBP), mean blood pressure (MBP), and peripheral oxygen saturation (SpO2); (3) Laboratory parameters: complete blood count, liver and kidney function tests, electrolytes, lipid profile, blood gas analysis, coagulation function, cardiac enzymes, and B-type natriuretic peptide (BNP); (4) Comorbidities: hypertension, diabetes mellitus, hyperlipidemia, chronic obstructive pulmonary disease (COPD), pneumonia, chronic kidney disease (CKD), and atrial fibrillation (AF); (5) Medication data: angiotensin-converting enzyme inhibitors/angiotensin II receptor blockers (ACEI/ARB), beta-blockers, furosemide, spironolactone, dobutamine, dopamine, epinephrine, milrinone, norepinephrine, and phenylephrine; and (6) Other indicators: ventilation and 24-h fluid balance. The primary outcome was 28-day all-cause mortality.

Initially, we transformed certain indicators, such as computing BMI from height and weight and determining 28-day in-hospital mortality using hospitalization duration and survival status. Variables with over 30% missing data were removed, and missing values in the remaining features were imputed using KNN. Outliers were identified using the 1.5 times interquartile range method, particularly focusing on BMI, mechanical ventilation time, and 24-h fluid balance, and were subsequently removed. Additionally, Spearman correlation coefficients were calculated to evaluate variable relationships, while VIF values assessed multicollinearity. Variables with high correlation or VIF exceeding 5 underwent pre-screening. Continuous variables were standardized for model stability, and categorical variables were transformed into dummy variables via one-hot encoding. Despite minor sample imbalances in the outcome variable, we chose not to employ sample balancing techniques.
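Two of the preprocessing steps described above, the 1.5 times interquartile range rule and standardization, can be illustrated with the Python standard library. This is a sketch under stated assumptions, not the authors' code: the BMI values are invented, the quartile convention (`method="inclusive"`) is one of several in common use, and the population standard deviation is used for scaling. KNN imputation and the VIF screen are not shown.

```python
import statistics


def iqr_fences(values, k=1.5):
    """Return (low, high) fences at Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(values, k=1.5):
    """Keep only the values that lie within the IQR fences."""
    low, high = iqr_fences(values, k)
    return [v for v in values if low <= v <= high]


def standardize(values):
    """Rescale to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


# Invented BMI values; 95.0 is an implausible entry that the rule should flag
bmi = [22.0, 24.5, 27.0, 29.5, 31.0, 33.5, 26.0, 95.0]
kept = drop_outliers(bmi)
z = standardize(kept)
print(iqr_fences(bmi), kept)
```

Because the fences depend on the quartile convention, borderline values may be kept under one convention and removed under another; this matters most in small samples.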

2.3 Model construction and evaluation

The Boruta method was used for initial feature screening; it determines feature importance by comparing each feature with randomly generated “shadow features” (24). The XGBoost method was then used to rank the importance of the preliminarily selected features. Model construction and validation were conducted on the eICU-CRD dataset, with 10-fold cross-validation used to generate training and validation sets, and Logistic Regression (LR), eXtreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost), and Gaussian Naive Bayes (GNB) models were built and validated. Model performance was evaluated on the validation set using the area under the curve (AUC) for discrimination, the calibration curve for agreement between predicted and observed risk, and the decision curve analysis (DCA) curve for clinical utility, together with accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score. The final predictive model was optimized by hyperparameter tuning with grid search. Additionally, the MIMIC-IV dataset was used for external validation, following the same data processing methods as the eICU-CRD dataset, with AUC, accuracy, sensitivity, and specificity used to assess model generalization.
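A comparison of this kind might be set up as follows with scikit-learn. Several things here are assumptions: the data are synthetic (`make_classification`), the folds are stratified and shuffled, all models use default settings, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so that the sketch needs only one library. It illustrates the 10-fold comparison by AUC and does not reproduce the study pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 10 features, roughly 15% positives (not the study data)
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5, weights=[0.85], random_state=0
)

models = {
    # Scaling sits inside the pipeline so each fold is standardized on its own training part
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "GradientBoosting (stand-in for XGBoost)": GradientBoostingClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "GNB": GaussianNB(),
}

# 10-fold cross-validation; stratification and shuffling are assumptions
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
mean_auc = {
    name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
    for name, model in models.items()
}
for name, auc in mean_auc.items():
    print(f"{name}: mean AUC = {auc:.3f}")
```

Placing the scaler inside the pipeline keeps validation-fold statistics out of the training step, a common safeguard against optimistic estimates.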

2.4 Model interpretation

SHAP (SHapley Additive exPlanations) is a technique based on game theory’s Shapley values (25). It’s used to interpret machine learning predictions by dissecting the contribution of each feature. This enhances model transparency and ensures fair decision-making. We employ SHAP to analyze the outcomes of our top-performing machine learning model. This method not only identifies crucial features for optimizing model performance but also provides detailed insights through feature contribution charts, summary plots, and explanations for individual predictions. These tools help us understand the extent of each feature’s influence, whether it’s positive or negative, and how they collectively impact model outcomes.
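For a linear model such as logistic regression, Shapley values have a simple closed form when features are treated as independent: the contribution of feature i is its coefficient times the distance of the patient's value from the background mean, measured on the log-odds scale. The sketch below illustrates this with invented coefficients and values; it does not use the study's fitted model, and the exact SHAP configuration used by the authors is not stated in the text.

```python
import math


def linear_shap(weights, x, background_means):
    """Per-feature contributions w_i * (x_i - mean_i) on the log-odds scale."""
    return [w * (xi - mu) for w, xi, mu in zip(weights, x, background_means)]


def log_odds(weights, intercept, x):
    """Linear predictor of a logistic regression model."""
    return intercept + sum(w * xi for w, xi in zip(weights, x))


# Hypothetical coefficients for three features: age, ventilation, temperature
weights = [0.60, 0.45, -0.30]
intercept = -1.8
background_means = [0.0, 0.3, 0.0]  # feature means in the reference data (invented)
patient = [1.2, 1.0, -0.8]          # older, ventilated, low temperature (invented)

phi = linear_shap(weights, patient, background_means)
base_value = log_odds(weights, intercept, background_means)
prediction = log_odds(weights, intercept, patient)
risk = 1.0 / (1.0 + math.exp(-prediction))  # predicted probability

# Additivity: the base value plus all contributions recovers the model output
print(phi, base_value + sum(phi), prediction, risk)
```

In this toy case all three contributions are positive, so each arrow in a force plot would push the prediction above the base value; the negative temperature coefficient combined with a below-average temperature yields a positive contribution.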

2.5 Statistical analysis

Continuous variables are presented as mean and standard deviation and compared using t-tests (or Wilcoxon rank-sum tests); categorical variables are presented as percentages and compared using chi-square tests (or Fisher’s exact tests). A p-value <0.05 was considered statistically significant. All statistical analyses were performed using R version 4.2.3 and Python version 3.11.4.
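As an illustration of the categorical comparison, the Pearson chi-square statistic for a 2x2 table can be computed directly. The counts below are hypothetical, no continuity correction is applied, and significance is judged against the standard critical value of 3.841 for one degree of freedom at the 0.05 level; statistical software would normally report the exact p-value instead.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts: rows = died / survived, columns = ventilated / not ventilated
stat = chi_square_2x2(60, 40, 150, 250)

CRITICAL_DF1_ALPHA_05 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05
significant = stat > CRITICAL_DF1_ALPHA_05
print(round(stat, 3), significant)
```

When any expected cell count is small, Fisher's exact test is the usual alternative, as the text notes.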

3 Results

3.1 Baseline characteristics

According to the inclusion and exclusion criteria, our study cohort comprised a total of 6819 patients with heart failure and sepsis. Among these, 3891 cases from the eICU-CRD were used for model construction, while 2928 cases from the MIMIC-IV database were used for external validation. The screening process is illustrated in Figure 1. During selection, patients with ICU stays of less than 1 day or aged under 18 years were excluded. Subsequently, data processing involved removing outliers and handling missing values. In the eICU-CRD database, 560 patients (14.4%) died within 28 days, compared to 660 patients (22.5%) in the MIMIC-IV database. Differences in baseline characteristics are summarized in Table 1. In the eICU-CRD database, compared to the survival group, patients in the death group had higher age, white blood cell count, neutrophil count, TBIL (total bilirubin), ALT (alanine aminotransferase), BUN (blood urea nitrogen), respiratory rate, fluid balance, and mechanical ventilation time, and lower BMI (body mass index), calcium, blood pressure, and peripheral oxygen saturation. Differences in comorbidities, such as atrial fibrillation, hypertension, and pneumonia, were also observed between the two groups. Additionally, medication usage differed between the two groups, including the use of ACEI/ARB (ACE inhibitors/angiotensin receptor blockers), beta-blockers, furosemide, spironolactone, dobutamine, dopamine, epinephrine, norepinephrine, and phenylephrine.

Figure 1. Flowchart of patient selection and research methodology.

Table 1. Baseline characteristics of the eICU-CRD and MIMIC-IV databases, categorized by survival and death groups.

3.2 Feature selection

We eliminated features with a missing rate exceeding 30%, as demonstrated in Appendix Figure 1. Features with notably high missing rates are primarily found in laboratory tests such as cardiac enzymes, blood gas analysis, lipid profile, and coagulation function. Additionally, guided by the correlation heatmap showing features with correlation coefficients greater than 0.5 and features with VIF exceeding 5, as illustrated in Appendix Figure 2, we conducted further screening. Prior to model construction, we excluded features with high correlation and VIF, including hemoglobin, lymphocyte count, chloride, aspartate aminotransferase, blood urea nitrogen, mean blood pressure, and diastolic blood pressure.

The Boruta method, based on random forests, assesses feature importance by comparing original features with randomly generated “shadow features.” We applied Boruta for initial feature selection, as depicted in Figure 2. Green denotes important features included in the model to enhance predictive capability; red represents unimportant features excluded from consideration; yellow indicates features with uncertain importance requiring further investigation. Blue represents shadow features for comparison but not used in model training. Boruta identified 22 initial features, including age, WBC (white blood cell count), NE (neutrophil count), MONO (monocyte count), PLT (platelet count), sodium, calcium, TBIL (total bilirubin), alanine aminotransferase, creatinine, T (temperature), R (respiratory rate), SBP (systolic blood pressure), oxygen saturation, ventilation time, BMI, atrial fibrillation, dopamine, epinephrine, norepinephrine, phenylephrine, and ventilation.

Figure 2. Feature selection analyzed by Boruta algorithm.

The XGBoost algorithm ranks feature importance based on split frequency and gain in decision trees. Appendix Figure 3 shows our feature importance ranking using XGBoost. The top 10 variables, in descending order of importance, are: age, ventilation, norepinephrine, WBC, TBIL, T, phenylephrine, R, NE, SBP.

3.3 Model construction

This study utilized four binary classification machine learning algorithms, Logistic Regression (LR), eXtreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost), and Gaussian Naive Bayes (GNB), to construct predictive models. Using the eICU-CRD database, we implemented 10-fold cross-validation to establish training and validation sets, followed by evaluation on a separate test cohort. Figure 4 and Table 2 illustrate the performance of these models. The ROC curve (Figure 4A) highlights LR’s superior performance, achieving an AUC of 0.746 in the test cohort, compared to XGBoost (0.726), AdaBoost (0.744), and GNB (0.722). Furthermore, LR’s calibration curve (Figure 4B) closely aligns with the ideal line, indicating excellent calibration. Decision curve analysis (DCA) (Figure 4C) indicates that LR has the highest net benefit within the 0–80% threshold range. The precision-recall (PR) curve (Figure 4D) illustrates LR’s higher recall at sustained high precision. Additionally, LR demonstrates robust performance across various metrics, including accuracy (0.685), sensitivity (0.666), specificity (0.712), positive predictive value (0.285), negative predictive value (0.914), and F1 score (0.397). Consequently, we selected the LR algorithm for model construction, incorporating 10 variables: age, ventilation, norepinephrine, WBC, TBIL, T, phenylephrine, R, NE, and SBP. Through hyperparameter tuning and grid search optimization, we established the model parameters as follows: tol (convergence measure): 1e-06, penalty (regularization type): l2, max_iter (maximum number of iterations): 100, C (regularization factor): 1.0.
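The threshold-based metrics listed above all derive from the four cells of a confusion matrix. The following sketch shows the definitions on an invented table in which deaths are the minority class; the counts are illustrative and are not taken from Table 2.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of deaths correctly flagged
    specificity = tn / (tn + fp)  # share of survivors correctly cleared
    ppv = tp / (tp + fp)          # positive predictive value (precision)
    npv = tn / (tn + fn)          # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Invented counts: 60 deaths and 340 survivors (15% event rate)
m = classification_metrics(tp=40, fp=90, tn=250, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With a low event rate, even a modest false-positive rate produces many more false alarms than true positives, which is why the positive predictive value can be low while the negative predictive value stays high.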

Figure 3. (A) SHAP summary plot and (B) SHAP force plot.

Table 2. Model performance comparison: AUC, accuracy, sensitivity, specificity, PPV, NPV, F1 score, and Brier score.

Figure 4. Summary plot of machine learning performance evaluation. (A) ROC curve, (B) calibration plot, (C) DCA curve, (D) PR curve.
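The quantity plotted in a DCA curve such as panel (C) is the net benefit at each threshold probability: the true-positive rate per patient minus the false-positive rate per patient weighted by the odds of the threshold. The sketch below uses the standard definition with invented labels and predicted probabilities, and includes the "treat all" reference strategy that decision curves are usually compared against.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: every patient is treated as high risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Invented outcomes (1 = died) and predicted risks for ten patients
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_prob = [0.80, 0.10, 0.30, 0.60, 0.20, 0.05, 0.40, 0.35, 0.15, 0.25]

nb_model = net_benefit(y_true, y_prob, threshold=0.30)
nb_all = net_benefit_treat_all(y_true, threshold=0.30)
print(round(nb_model, 4), round(nb_all, 4))
```

A model is useful at a given threshold only if its net benefit exceeds both the "treat all" value and zero, the net benefit of treating no one.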

3.4 Model interpretation

This study employs the SHAP method to interpret model results, presenting both SHAP summary plots and SHAP force plots. In the SHAP summary plot, the Y-axis represents features, while the X-axis indicates the impact of features on outcomes. Each point represents a sample, with red indicating high feature values and blue indicating low feature values. As shown in Figure 3A, the LR model’s feature importance from top to bottom is: age, ventilation, norepinephrine, T, R, TBIL, SBP, WBC, NE, phenylephrine. Older age (red points) correlates with higher SHAP values, predicting an increased risk of mortality. Additionally, higher white blood cell count, total bilirubin, and respiratory rate are associated with increased mortality risk. Patients receiving ventilation, norepinephrine, or phenylephrine also show increased mortality risk. Furthermore, lower temperature and lower systolic blood pressure are associated with increased mortality risk. In the SHAP force plot, each Shapley value is represented by an arrow, indicating whether it increases or decreases the prediction. As illustrated in Figure 3B, increases in white blood cell count, decreases in temperature, and increases in total bilirubin push the predicted mortality risk higher, while younger age and lower neutrophil count push the predicted mortality risk lower. Note that because numerical variables were standardized to a mean of 0 and a variance of 1, the data in the plots are not on their original scale.

3.5 External validation

We selected 2928 patients with heart failure and sepsis from the MIMIC-IV database for external validation. In the MIMIC-IV database, these patients had a 28-day in-hospital mortality rate of 22.5%, higher than that of the eICU-CRD database (14.4%). Prior to external validation, we applied the same data processing methods to the MIMIC-IV data as we did to the eICU-CRD data. The validation results revealed an AUC of 0.699 and a Brier score of 0.169. Additionally, the accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score were 0.699, 0.156, 0.403, 0.673, 0.648, and 0.261, respectively. With the AUC difference between the external validation and validation/test sets being less than 0.1, we conclude that the LR model demonstrates favorable stability.
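The two headline external-validation measures can be computed from predicted probabilities alone. As a sketch with invented labels and predictions: the AUC is the probability that a randomly chosen patient who died received a higher predicted risk than a randomly chosen survivor (ties count one half), and the Brier score is the mean squared difference between predicted risk and observed outcome.

```python
def roc_auc(y_true, y_prob):
    """AUC as the share of (death, survivor) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Invented outcomes (1 = died) and predicted risks for eight patients
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_prob = [0.70, 0.20, 0.40, 0.35, 0.10, 0.30, 0.60, 0.50]

auc = roc_auc(y_true, y_prob)
brier = brier_score(y_true, y_prob)
print(round(auc, 4), round(brier, 4))
```

When reading a Brier score, a useful reference point is a model that always predicts the observed event rate r, which scores r(1 − r); lower values indicate better combined calibration and discrimination.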

4 Discussion

This study represents the pioneering application of machine learning algorithms to forecast in-hospital mortality among patients with heart failure and sepsis. Our model can be applied to heart failure patients with sepsis upon ICU admission. Our model exhibits exceptional performance in distinguishing between survival and mortality outcomes, coupled with robust calibration and clinical relevance. The utilization of external validation bolsters the model’s reliability and generalizability, validating its efficacy across diverse datasets and fortifying the study’s scientific robustness and credibility. Leveraging SHAP for visual interpretation of model outcomes enhances the interpretability of predictive results. Furthermore, the model’s reliance on a concise and readily accessible set of predictive variables underscores its suitability for clinical deployment. Our model could be integrated into clinical decision support systems within hospitals, especially in ICU. The model would automatically calculate the mortality risk for patients with heart failure complicated by sepsis using routinely collected clinical data, with outputs presented to clinicians via the electronic health record system. This would provide real-time risk assessments to help prioritize care and optimize resource allocation.

Our research findings suggest that the Logistic Regression (LR) model exhibits superior performance in predicting the survival rate of patients with heart failure complicated by sepsis. Moreover, studies indicate that the LR algorithm performs effectively in forecasting various clinical binary outcomes (26, 27). LR offers several advantages, including its simplicity, broad applicability, and straightforward result interpretation, establishing it as a pivotal and dependable modeling technique for binary classification problems (28). However, LR has its limitations; it is sensitive to the quality of feature engineering, vulnerable to outliers, and unable to handle complex non-linear relationships. Moreover, in situations with large feature spaces and predominantly sparse features, LR’s performance may be limited, potentially resulting in overfitting (29). Before model construction, we conducted comprehensive data preprocessing, encompassing correlation and multicollinearity assessments, outlier and missing value handling, and data standardization, aiming to enhance the LR model’s efficacy. Additionally, external validation has confirmed that the LR model we constructed avoids overfitting and demonstrates reliable generalization ability.

Sepsis and heart failure are common complications in critically ill patients, characterized by complex pathological conditions. Cardiac dysfunction in sepsis, indicated by reduced EF, may accelerate the progression to septic shock by lowering cardiac output and metabolic demand (30). Treatment strategies for sepsis and heart failure often conflict, influenced by varying severity and patient conditions (31). Fluid resuscitation, recommended in sepsis management guidelines, addresses tissue perfusion deficits but may exacerbate congestive symptoms and worsen prognosis in heart failure (32, 33). Our study indicates that higher fluid balance predicts increased mortality in heart failure and sepsis. Singh et al. found that septic patients receiving >3L of fluid experienced reduced EF and higher in-hospital mortality (34). Additionally, other studies have shown that higher fluid balance during hospitalization is associated with increased mortality in patients with heart failure combined with sepsis (35–37). Zhang et al. discovered that higher fluid balance within 24 h of admission is strongly associated with in-hospital mortality in patients with heart failure and sepsis (OR 2.53, 95% CI 1.60–3.99, p < 0.001) (31). Due to myocardial edema and oxidative stress, excessive fluid intake is a factor contributing to myocardial injury. For patients with high fluid balance, increased atrial and venous pressures can lead to fluid shift into the interstitium, exacerbating tissue edema, causing tissue distortion and microcirculatory disturbances, thereby resulting in cellular metabolic dysregulation (38, 39). There remains controversy surrounding fluid resuscitation. Duttuluri et al. retrospectively evaluated heart failure patients with severe sepsis, finding increased in-hospital mortality and intubation rates in the hypotensive subgroup receiving inadequate fluid (<30 mL/kg) (40).

Our research reveals that patients with elevated respiratory rates or hypotension, and those requiring interventions such as norepinephrine, phenylephrine, or ventilation, have a higher predicted risk of mortality. Norepinephrine and phenylephrine are typically employed to augment cardiac contractility and blood pressure in order to maintain organ perfusion, while ventilation is essential for respiratory support. This elevated predicted risk likely reflects the severity of these patients’ conditions and the associated hazards they face. Moreover, it underscores the need for prompt and assertive therapeutic interventions tailored to these patients, alongside vigilant monitoring and comprehensive support measures. Sepsis guidelines recommend norepinephrine as the first-line vasopressor for sepsis and septic shock (33). De Backer et al. found that, among 280 patients with cardiogenic shock, norepinephrine was more effective than dopamine, with a significantly lower 28-day mortality rate (p = 0.03) (41). A 2015 meta-analysis also indicated that, in the treatment of septic shock, norepinephrine lowered the mortality rate compared with dopamine (RR: 0.89; 95% CI: 0.81–0.98) (42). Other studies indicate that, compared with adrenaline, norepinephrine carries a lower risk of tachycardia and is associated with reduced mortality risk (43, 44).

With population aging, the incidence of sepsis among the elderly is gradually increasing, making it one of the leading causes of mortality in this demographic (45). Age has been demonstrated to be an independent risk factor for mortality in sepsis patients, with mortality rates increasing linearly with advancing age (46). Our findings indicate that advanced age is associated with a higher predicted risk of mortality in patients with sepsis complicated by heart failure. Elderly patients commonly exhibit compromised immune function, diminished organ reserve, and a higher prevalence of comorbidities such as diabetes and coronary artery disease than younger patients (47). In this demographic, sepsis frequently progresses swiftly to multi-organ failure. De Matteis et al. studied 6930 elderly patients with heart failure and found that in-hospital mortality increased with advancing age, with infection correlating with an elevated risk of in-hospital death (48). We also found that elevated white blood cell and neutrophil counts were associated with poor outcomes. In bacterial and fungal infections, elevated blood neutrophil levels serve as early and sensitive indicators of inflammation (49). Elevated white blood cell or neutrophil counts in sepsis patients suggest immune system activation and an intensified inflammatory response, potentially indicating an excessively activated inflammatory state associated with increased mortality risk. Heightened vigilance and proactive therapeutic interventions are warranted to mitigate inflammation and prevent further deterioration in such cases. Additionally, low body temperature and elevated total bilirubin (TBIL) were associated with a higher predicted risk of mortality. Either finding in a sepsis patient may suggest a severe condition and poor prognosis. Low body temperature could indicate a suppressed inflammatory response or impaired metabolic function, compromising the body’s resistance to infection. Elevated TBIL may signify impaired liver function, possibly due to infection or inflammation.

This study has several limitations. Firstly, we acknowledge that the quality and completeness of the data in the MIMIC-IV and eICU-CRD databases may have certain limitations, especially with the potential absence of key clinical variables (such as ejection fraction or NT-proBNP), which could affect the accuracy of the model’s predictions. Future studies will need to incorporate more comprehensive data to improve the model and conduct further validation to enhance its accuracy. Secondly, as this is a retrospective study, the data primarily comes from ICU patients, which may introduce selection bias, limiting the model’s broader applicability to other clinical settings. Therefore, we suggest that future research validate the model using multi-center data to reduce selection bias and improve its generalizability. Additionally, the imbalance between survival and death in the dataset may affect the model’s performance in predicting mortality. Lastly, the data were collected at different time points, leading to potential temporal discrepancies, which may cause data drift and result in inconsistent model performance across different periods. Thus, future research should validate and adjust the model using data from various timeframes to address these challenges and ensure the model’s robustness in different temporal and clinical settings.

5 Conclusion

In this study, we constructed a machine learning model to predict 28-day all-cause mortality in ICU patients with heart failure complicated by sepsis. The final Logistic Regression model incorporates commonly used clinical indicators such as age, mechanical ventilation, respiratory rate, blood pressure, white blood cell count, and vasopressor use. This combination of variables enables the model to predict short-term mortality risk early, upon ICU admission, providing a clinical alert for high-risk patients and assisting clinicians in more effectively allocating medical and nursing resources. Furthermore, the model’s generalizability and potential clinical utility were validated across two large ICU databases (eICU-CRD and MIMIC-IV). Despite its strong predictive performance, further updates and validations with larger, multicenter patient cohorts are required to enhance the model’s generalizability and practical application in broader clinical settings.

Data availability statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary material.

Ethics statement

Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent from the patients/participants was not required to participate in this study in accordance with the national legislation and the institutional requirements.

Author contributions

QZ: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing. LX: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing. WH: Conceptualization, Data curation, Investigation, Methodology, Software, Writing – original draft, Writing – review & editing. XL: Conceptualization, Data curation, Investigation, Methodology, Software, Writing – original draft, Writing – review & editing. XH: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing.

Funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.

Acknowledgments

We would like to acknowledge the contributions of specific colleagues and institutions that aided the efforts of the authors in this study.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmed.2024.1410702/full#supplementary-material

References

1. Hajj, J, Blaine, N, Salavaci, J, and Jacoby, D. The "Centrality of Sepsis": A review on incidence, mortality, and cost of care. Healthcare. (2018) 6. doi: 10.3390/healthcare6030090

2. Napolitano, LM. Sepsis 2018: definitions and guideline changes. Surg Infect. (2018) 19:117–25. doi: 10.1089/sur.2017.278