Early prediction of sepsis associated encephalopathy in elderly ICU patients using machine learning models: a retrospective study based on the MIMIC-IV database

Han, Yupeng; Xie, Xiyuan; Qiu, Jiapeng; Tang, Yijie; Song, Zhiwei; Li, Wangyu; Wu, Xiaodan

doi:10.3389/fcimb.2025.1545979

ORIGINAL RESEARCH article

Front. Cell. Infect. Microbiol., 17 April 2025

Sec. Clinical Infectious Diseases

Volume 15 - 2025 | https://doi.org/10.3389/fcimb.2025.1545979

This article is part of the Research Topic: Molecular mechanisms and clinical studies of multi-organ dysfunction in sepsis associated with pathogenic microbial infection

Yupeng Han¹†

Xiyuan Xie¹†

Jiapeng Qiu¹†

Yijie Tang¹

Zhiwei Song²

Wangyu Li³

Xiaodan Wu¹,⁴*

¹Department of Anesthesiology, Shengli Clinical Medical College of Fujian Medical University, Fujian Provincial Hospital, Fuzhou University Affiliated Provincial Hospital, Fuzhou, Fujian, China
²Department of Neurology, Shengli Clinical Medical College of Fujian Medical University, Fujian Provincial Hospital, Fuzhou University Affiliated Provincial Hospital, Fuzhou, Fujian, China
³Department of Pain Management, The First Affiliated Hospital of Fujian Medical University, Fuzhou, Fujian, China
⁴Fujian Provincial Key Laboratory of Critical Care Medicine, Fujian Provincial Hospital, Fuzhou University Affiliated Provincial Hospital, Fuzhou, Fujian, China

Background: Sepsis associated encephalopathy (SAE) is prevalent among elderly patients in the ICU and significantly affects patient prognosis. Due to the symptom similarity with other neurological disorders and the absence of specific biomarkers, early clinical diagnosis remains challenging. This study aimed to develop a predictive model for SAE in elderly ICU patients.

Methods: Data on elderly sepsis patients were extracted from the MIMIC-IV database (version 3.1) and divided into training and test sets in a 7:3 ratio. Feature variables were selected using the combined LASSO-Boruta algorithm, and five machine learning (ML) models, namely Extreme Gradient Boosting (XGBoost), Categorical Boosting (CatBoost), Light Gradient Boosting Machine (LGBM), Multilayer Perceptron (MLP), and Support Vector Machine (SVM), were then developed using these variables. A comprehensive set of performance metrics was used to assess the predictive accuracy, calibration, and clinical applicability of these models. For the best-performing model, we employed the SHapley Additive exPlanations (SHAP) method to interpret and visualize its predictions.

Results: Based on strict inclusion and exclusion criteria, a total of 3,156 elderly sepsis patients were enrolled in the study, with an SAE incidence of 48.7%. The mortality rate of elderly sepsis patients who developed SAE was significantly higher than that of patients in the non-SAE group (28.78% vs. 12.59%, P < 0.001). A total of 18 feature variables were selected for the construction of the ML models using the combined LASSO-Boruta algorithm. Compared to the other four models and traditional scoring systems, the XGBoost model demonstrated the best overall predictive performance, with area under the curve (AUC) = 0.898, accuracy = 0.830, recall = 0.819, F1-score = 0.820, specificity = 0.840, and precision = 0.821. Furthermore, the results from the Decision Curve Analysis (DCA) and calibration curves demonstrated that the XGBoost model has significant clinical value and stable predictive performance. Ten-fold cross-validation further confirmed the robustness and generalizability of the model. In addition, we simplified the model based on the SHAP feature importance ranking, and the results indicated that the simplified XGBoost model retains excellent predictive ability (AUC = 0.858).

Conclusions: The XGBoost model effectively predicts SAE in elderly ICU patients and may serve as a reliable tool for clinicians to identify high-risk patients.

1 Introduction

Sepsis-associated encephalopathy (SAE) is a severe neurological syndrome characterized by sepsis-induced acute diffuse cerebral dysfunction (Bleck et al., 1993; Gofton and Young, 2012; Heming et al., 2017). Early clinical features of SAE include impaired consciousness, cognitive decline, altered mental status, and, in severe cases, coma, which may lead to long-term neurocognitive deficits and a high mortality rate (Sonneville et al., 2023). The onset of SAE involves complex mechanisms, including abnormal neuroinflammatory and immune responses, blood-brain barrier damage, and microcirculatory disorders (Ren et al., 2020; Hong et al., 2023). Previous studies have reported a significant correlation between the occurrence and severity of SAE and patient prognosis, with even mild alterations in consciousness (GCS score of 13 or 14) significantly increasing the risk of death (Sonneville et al., 2017).

With the aging population, elderly patients with septic encephalopathy have garnered increased attention (Manabe and Heneka, 2022). Existing research suggests that the incidence of sepsis and SAE is notably higher among elderly ICU patients, with age being a significant high-risk factor affecting short-term survival in patients who have developed SAE (Ljungström et al., 2019; Chen et al., 2020; Zhang et al., 2024). This reciprocal causality adds significant complexity to the clinical treatment and management of elderly patients with SAE. Additionally, it has been reported that even among elderly patients who survive sepsis, many develop long-term cognitive deficits during recovery (Iwashyna et al., 2010; Muzambi et al., 2021). Although early recognition of high-risk patients and prompt interventions can substantially enhance prognosis, factors such as atypical clinical symptoms and comorbidities associated with elderly SAE complicate the identification and early prediction of risk factors. Therefore, an in-depth analysis of the clinical characteristics of this patient group is essential.

In recent years, machine learning (ML) techniques have demonstrated significant potential for predicting, diagnosing, and assessing the risk of sepsis and its associated complications (Yue et al., 2022; Li et al., 2024; Lin et al., 2024; Prithula et al., 2024). ML is a computational method that builds data models by analyzing large and multidimensional datasets to make predictions. It works by using historical data (such as patient age, medical history, and lab results) to identify variables that may significantly impact the onset and progression of diseases, rather than relying on predefined programming rules. As a result, ML can provide more accurate diagnosis and prognosis assessments compared to traditional predictive models, helping clinicians identify early risks and intervention opportunities. In diagnosing and prognosticating SAE, researchers have utilized machine learning models to conduct comprehensive analyses of clinical, laboratory, and demographic characteristics, providing more personalized prediction tools for clinical risk screening of SAE patients (Lu et al., 2022; Peng et al., 2022). However, evidence supporting the superiority of ML models in the early prediction of SAE occurrence in elderly sepsis patients remains limited. This study aims to develop several ML models for the early prediction of SAE occurrence in elderly sepsis patients in ICU and identify the model with the highest predictive performance.

2 Materials and methods

2.1 Data source

The data used to construct the model were obtained from the single-center Medical Information Mart for Intensive Care IV database (MIMIC-IV, version 3.1). This database contains clinical information on patients admitted to the ICU at Beth Israel Deaconess Medical Center from 2008 to 2022, including demographics, length of hospitalization, ICU admissions and discharges, vital signs, laboratory data, medications, and nursing care records (Johnson et al., 2023; Johnson et al., 2024). To obtain access to the database, one author of this study (YP.H.) completed the Collaborative Institutional Training Initiative (CITI) program examination and received a certificate (ID: 59425375). Because the MIMIC database is de-identified and does not contain private patient information, the Institutional Review Board at Beth Israel Deaconess Medical Center waived the requirement for informed consent.

2.2 Participants

The study included patients who fulfilled the following criteria: (1) age ≥65 years; (2) met Sepsis-3.0 criteria and were admitted to the ICU; (3) no diagnosis of sepsis-associated encephalopathy at the time of ICU admission; (4) length of stay in the ICU ≥24 hours; (5) for patients with multiple ICU admissions, data were collected only for the first admission. In addition, we excluded the following categories of patients: (1) patients with combined brain parenchymal injury (cerebral infarction, cerebral hemorrhage, traumatic brain injury) and other cerebrovascular diseases; (2) patients with mental disorders or dementia; (3) patients with alcohol or drug addiction; (4) patients with hepatic or renal encephalopathy and suspected metabolic encephalopathy (e.g., hyponatremia [<120 mmol/L], hyperglycemia [>180 mg/dL], or hypoglycemia [<54 mg/dL]); and (5) patients with missing delirium assessment or missing data >30%.

The primary outcome of the study was the occurrence of SAE in elderly sepsis patients after the first day of ICU admission. The diagnostic criteria for SAE were fulfilment of the Sepsis-3.0 criteria together with a GCS score <15 or a positive ICU delirium assessment (Hong et al., 2023; Sonneville et al., 2023; Kurtz et al., 2024). The Confusion Assessment Method for the ICU (CAM-ICU) was used to assess delirium in ICU patients. In line with these diagnostic criteria, and to ensure baseline consistency among study participants, we excluded patients who had altered consciousness (GCS <15) or delirium within the first 24 hours of admission. Additionally, because this was a hypothesis-driven epidemiological study, no formal sample size calculation was performed, and all eligible elderly septic patients in the dataset were included to maximize statistical power. The flow chart of patient selection is shown in Figure 1.
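The outcome definition in this section can be read as a simple labelling rule. The following is a minimal sketch of that rule, not the authors' extraction code: the `Assessment` record and its field names are hypothetical (they are not MIMIC-IV column names), every patient is assumed to already meet the Sepsis-3.0 criteria, and treating a missing assessment as "not abnormal" is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Assessment:
    """One consciousness assessment (hypothetical record layout)."""
    hours_since_admission: float
    gcs: Optional[int] = None                # Glasgow Coma Scale, 3-15
    cam_icu_positive: Optional[bool] = None  # CAM-ICU delirium screen


def is_abnormal(a: Assessment) -> bool:
    """GCS < 15 or a positive CAM-ICU counts as altered consciousness."""
    low_gcs = a.gcs is not None and a.gcs < 15
    delirium = a.cam_icu_positive is True
    return low_gcs or delirium


def classify_patient(assessments: List[Assessment]) -> str:
    """Return 'excluded', 'SAE', or 'non-SAE' for a septic patient."""
    first_day = [a for a in assessments if a.hours_since_admission < 24]
    later = [a for a in assessments if a.hours_since_admission >= 24]
    # Abnormal findings in the first 24 hours lead to exclusion.
    if any(is_abnormal(a) for a in first_day):
        return "excluded"
    # New abnormal findings after the first day define the outcome.
    if any(is_abnormal(a) for a in later):
        return "SAE"
    return "non-SAE"
```

For example, a patient with a GCS of 15 at hour 6 and a GCS of 14 at hour 30 would be labelled `SAE` under this sketch.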

Figure 1. Flow chart of the patient selection.

2.3 Data extraction

Structured Query Language (SQL) in PostgreSQL was used to extract case information from the database for patients who met the inclusion criteria. The extracted data included the following variables: (1) Demographic information: age, gender, and race; (2) Comorbidities: hypertension (HTN), type 1 diabetes (T1DM), type 2 diabetes (T2DM), hyperlipidemia (HLD), coronary artery disease (CAD), chronic obstructive pulmonary disease (COPD), chronic kidney disease (CKD), and Charlson Comorbidity Index; (3) Vital signs and disease scores at the time of admission: heart rate (HR), respiratory rate (RR), non-invasive mean arterial pressure (MAP), oxygen saturation (SpO₂), and disease scores including the Sequential Organ Failure Assessment (SOFA), Oxford Acute Severity of Illness Score (OASIS), Simplified Acute Physiology Score II (SAPS II), and Acute Physiology Score III (APS III); (4) Laboratory indices within 24 hours of admission: white blood cell count (WBC), platelet count (PLT), red blood cell count (RBC), red cell distribution width (RDW), hemoglobin, hematocrit, PCO₂, pH, glucose, potassium, sodium, chloride, anion gap, lactate, prothrombin time (PT), international normalized ratio (INR), creatinine, and blood urea nitrogen (BUN); (5) Prognostic data: therapeutic measures during hospitalization (e.g., ventilation, continuous renal replacement therapy (CRRT), vasopressors, sedatives and analgesics, and glucocorticoids), length of ICU stay, and 28-day mortality. Additionally, to minimize the impact of missing data on model construction, variables with less than 20% missing data were imputed using the K-nearest neighbors imputation method (KNNImputer), while those with more than 20% missing data were discarded. The advantage of KNNImputer lies in its ability to fill missing values by leveraging the similarity between data points, thereby preserving the inherent structure and relationships of the data. Compared to traditional methods such as mean or median imputation, KNNImputer handles complex, nonlinear data distributions more effectively, particularly in cases with multiple missing features. Furthermore, KNNImputer does not require specific distribution assumptions, offering greater flexibility and enhancing the accuracy and stability of predictive models (Akter et al., 2024; Guan et al., 2024).
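The missing-data handling described above has two steps: drop variables with too many missing values, then fill the remaining gaps from the most similar patients. The sketch below illustrates the idea with the Python standard library only. It is not the study's pipeline and does not reproduce the exact distance scaling of any particular library's KNN imputer; the function names, the choice of `k`, and the treatment of a variable with exactly 20% missing values (dropped here) are assumptions.

```python
import math


def missing_fraction(column):
    """Share of entries in a column that are missing (None)."""
    return sum(v is None for v in column) / len(column)


def drop_sparse_columns(rows, max_missing=0.20):
    """Keep only columns whose missing fraction is below max_missing.

    The source does not say what happens at exactly 20%; this sketch
    drops such columns (assumption).
    """
    n_cols = len(rows[0])
    keep = [j for j in range(n_cols)
            if missing_fraction([r[j] for r in rows]) < max_missing]
    return [[r[j] for j in keep] for r in rows], keep


def _distance(a, b):
    """Root-mean-square difference over features observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(rows, k=2):
    """Fill each missing value with the mean of its k nearest donors."""
    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are other rows that have feature j observed.
            donors = [(_distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed
```

In practice the features would usually be standardized before computing distances, so that variables on large scales (for example platelet count) do not dominate the neighbor search.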

2.4 Feature selection

After including patients based on strict adherence to the inclusion and exclusion criteria, we split the dataset into training and test sets in a 7:3 ratio using the Bootstrap sampling technique. The Bootstrap method is a robust and flexible tool for statistical inference and model evaluation. Bootstrap generates a large volume of sample data through repeated sampling, which helps balance differences in sample distributions and enables the full modeling of the sampling distributions for both the model and control groups, thus facilitating the assessment of differences or relationships between groups. Additionally, the final feature variables used in the logistic regression and machine learning models were selected using LASSO regression and Boruta methods based on the validation set (Li et al., 2024). The LASSO method selects features and reduces dimensionality by shrinking coefficients, retaining features with larger contributions and eliminating redundant ones. The Boruta algorithm identifies the most important features by comparing the Z-value of each feature to that of the “shadow features.” The common feature variables identified by both methods were selected as the final feature set for the model. This approach enhances model accuracy while reducing the risk of overfitting and excluding irrelevant predictors. Considering that variables within the first 24 hours of admission better reflect the patient’s initial health status and disease severity, which are crucial for predicting early risks and clinical outcomes, while daily treatment measures during hospitalization are more influenced by changes in the patient’s condition and are less effective in predicting early SAE occurrence, this study included only monitoring data from the first 24 hours of admission in the variable selection.
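The two mechanical steps in this section, the 7:3 partition and the intersection of the two selectors' outputs, can be sketched as follows. This is an illustration under stated assumptions: it uses a plain seeded random shuffle rather than the Bootstrap procedure the study reports, the seed and function names are hypothetical, and the feature lists passed to `consensus_features` would come from separately fitted LASSO and Boruta procedures that are not shown here.

```python
import random


def split_train_test(ids, train_fraction=0.7, seed=42):
    """Shuffle patient identifiers and split them 7:3 (simple random split)."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def consensus_features(lasso_selected, boruta_confirmed):
    """Keep features chosen by both selectors, in LASSO order."""
    boruta = set(boruta_confirmed)
    return [f for f in lasso_selected if f in boruta]


# With the 3,156 patients reported in this study, a 7:3 split gives
# 2,209 training and 947 test patients.
train_ids, test_ids = split_train_test(range(3156))
print(len(train_ids), len(test_ids))
```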

2.5 Construction and validation of machine learning models

After feature selection, five machine learning models were employed for constructing and validating the diagnostic model using identical training and validation sets. These models include three ensemble algorithms—Categorical Boosting (CatBoost), Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LGBM)—and two conventional base algorithms: Multilayer Perceptron (MLP) and Support Vector Machine (SVM). The models were trained on the training set, and the test set was used for model validation. We evaluated the performance of the ML prediction models using metrics such as Area Under the Curve (AUC), specificity, recall, F1 score, and accuracy to identify the best diagnostic model. Additionally, calibration curves, precision-recall (PR) curves, and Decision Curve Analysis (DCA) were used to evaluate the calibration and clinical applicability of the ML models. Finally, for the best-performing diagnostic models, we revalidated their generalization ability and robustness using 10-fold cross-validation to prevent overfitting.
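For readers less familiar with the metrics listed above, the sketch below shows how they follow from a confusion matrix and from ranked prediction scores. It is a generic standard-library illustration rather than the evaluation code used in the study, and it assumes binary labels (1 = SAE, 0 = non-SAE) with both classes present.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def classification_metrics(y_true, y_pred):
    """Accuracy, recall, specificity, precision, and F1 from hard labels."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "recall": recall,
        "specificity": tn / (tn + fp),
        "precision": precision,
        "f1": f1_score(precision, recall),
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

As a quick consistency check, the precision (0.821) and recall (0.819) reported in the abstract give F1 = 2 × 0.821 × 0.819 / (0.821 + 0.819) ≈ 0.820, which matches the reported F1-score.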

2.6 Model interpretation and feature importance

SHapley Additive exPlanations (SHAP) is a game-theory-based method for explaining the output of machine learning models. SHAP quantifies the impact of each feature on the final prediction by considering all possible combinations and orderings of features and calculating each feature’s contribution within them (Wang et al., 2021). The SHAP feature importance ranking and SHAP beeswarm plot show each feature’s contribution to the final prediction, while the SHAP force plot offers an intuitive visualization of how different features influence an individual prediction. In this work, we used these SHAP methods to visualize the best-performing ML model, thereby enhancing its interpretability.
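The idea of averaging a feature's contribution over all feature orderings can be made concrete with a brute-force calculation on a toy model. The sketch below is illustrative only: `toy_risk`, its coefficients, and the example values are invented for the demonstration and have no connection to the fitted XGBoost model, and practical SHAP implementations for tree models use far more efficient algorithms than enumerating permutations.

```python
from itertools import permutations


def shapley_values(model, instance, baseline):
    """Exact Shapley values by enumerating every feature ordering.

    Features start at their baseline value and are switched to the
    instance value one at a time; each feature is credited with the
    change in model output it causes, averaged over all orderings.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = instance[name]
            value = model(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_risk(x):
    """Hypothetical score with an interaction term (illustration only)."""
    return 0.02 * x["oasis"] + 0.01 * x["map"] + 0.0005 * x["oasis"] * x["map"]


patient = {"oasis": 32, "map": 73}
reference = {"oasis": 30, "map": 80}
phi = shapley_values(toy_risk, patient, reference)
print(phi)
```

The contributions always sum to the difference between the prediction for the patient and the prediction for the reference point, which is the property that force plots visualize.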

2.7 Comparison of the optimal model with traditional scoring systems

To assess whether the ML prediction models outperform traditional methods in the early prediction of SAE in elderly ICU patients, we evaluated the best model against traditional scoring systems using the same dataset.

2.8 Simplification of the best machine learning prediction model

In this study, we aim to simplify the model with the highest predictive efficacy based on the SHAP feature importance ranking results. The simplified model not only reduces the complexity of clinical decision-making but also enables clinicians to quickly assess the patient’s condition in daily practice, thereby enhancing the efficiency and accuracy of clinical decision-making.

2.9 Statistical analysis

Data analysis was performed using DecisionLinnc 1.0 software, a platform that integrates multiple programming language environments for data processing, analysis, and machine learning model construction through a visual interface (DecisionLinnc Core Team, 2023). The Kolmogorov-Smirnov test was used to assess the normality of continuous variables. As these variables were non-normally distributed, they are presented as median (interquartile range), with differences between groups assessed using the Mann-Whitney U test. Categorical variables are presented as percentages (%), and group differences were compared with the Pearson chi-square test. P-values < 0.05 were considered statistically significant.
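As an illustration of the rank-based group comparison named above, the following sketch computes the Mann-Whitney U statistic and a two-sided p-value from the large-sample normal approximation. It is a simplified teaching example, not the procedure implemented in the statistical software used in the study: it applies no continuity correction and no tie correction, and the function names are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Smaller of the two U statistics; ties count as one half."""
    u_x = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    u_y = len(x) * len(y) - u_x
    return min(u_x, u_y)


def two_sided_p_normal_approx(u, n_x, n_y):
    """Two-sided p-value from the normal approximation to U
    (no continuity or tie correction; assumption of this sketch)."""
    mean = n_x * n_y / 2
    sd = math.sqrt(n_x * n_y * (n_x + n_y + 1) / 12)
    z = (u - mean) / sd
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)).
    return math.erfc(abs(z) / math.sqrt(2))
```

With group sizes as large as those in this study (more than 1,500 patients per group), the normal approximation is the usual way such p-values are obtained.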

3 Results

3.1 Comparison of clinical information of patients

A total of 3,156 elderly sepsis patients were included in the study based on rigorous inclusion and exclusion criteria, with 1,620 patients in the non-SAE group and 1,536 in the SAE group, resulting in an SAE incidence of 48.7%. The overall missing data are summarized in Supplementary Table 1. Patients in the SAE group had a higher median age than those in the non-SAE group (P < 0.05). In terms of comorbidities, the SAE group exhibited a higher Charlson Comorbidity Index, a greater proportion of patients with comorbid COPD and AKI, and a lower proportion of those with hyperlipidemia and ischemic heart disease (IHD) (P < 0.05). Concerning vital signs upon admission and disease severity scores, the SAE group had significantly higher median values for MAP, RR, HR, and disease scores (SOFA, OASIS, SAPS II, and APS III), while their median SpO₂ was lower than in the non-SAE group (P < 0.05). These results are presented in detail in Table 1.

Table 1. Comparison of baseline characteristics in the non-SAE and SAE groups.

Laboratory data comparison (Table 2) showed that WBC, PLT, RBC, RDW, hemoglobin, hematocrit, PCO₂, glucose, sodium, anion gap, creatinine, and BUN levels were higher in the elderly SAE group than in the non-SAE group (P < 0.05). In contrast, pH and chloride levels were lower in the SAE group (P < 0.05). No significant differences were found in the coagulation indices (PTT, PT, and INR), potassium, or lactate (P > 0.05).

Table 2. Comparison of laboratory examination data between two groups of patients on admission.

Regarding treatment, a higher proportion of patients in the SAE group received vasopressin, sedatives, analgesics, and corticosteroids (P < 0.05). The SAE group also had a greater proportion of patients requiring ventilation and CRRT (P < 0.05). Moreover, ICU length of stay and 28-day mortality were significantly higher in the SAE group than in the non-SAE group (P < 0.05). The results are presented in Table 3.

Table 3. Comparison of treatment and prognosis between two groups of patients.

3.2 Feature selection

We sequentially used Lasso regression and the Boruta method to identify relevant features from the training set. In LASSO regression, the variable coefficients are presented in Figure 2A, and the relationship between the regularization parameter (λ) and the mean cross-validation error (CVM) is depicted in Figure 2B. These results indicate that λ = 0.0009 (i.e., Logλ-min = -6.968) is the optimal value for achieving the model’s highest efficacy. The 18 variables identified in the LASSO regression as strongly associated with the occurrence of SAE in elderly sepsis patients included MAP, RR, HTN, COPD, CKD, SOFA, OASIS, SAPS II, Charlson, WBC, sodium, PLT, hematocrit, glucose, anion gap, PCO₂, PTT, and BUN. The regression coefficients for the variables in the LASSO regression are provided in Supplementary Table 2. Subsequently, the Boruta method identified only T1DM as an irrelevant variable (Figure 2C). Ultimately, the 18 variables identified above were included in the subsequent analysis.

Figure 2. Feature selection. (A) The relationship between Lambda (regularization parameter) and CVM (mean cross-validation error) in LASSO regression; (B) LASSO regression Lambda and coefficients plot; (C) Feature selection based on the Boruta algorithm.

3.3 Model performance comparisons

Five ML models were developed to assess the risk of SAE in elderly sepsis patients. The ROC curves demonstrated that the three ensemble models (XGBoost, LGBM, and CatBoost) exhibited good predictive performance for new-onset SAE in elderly sepsis patients, outperforming the conventional MLP and SVM models (Figure 3A). Among these models, XGBoost (AUC = 0.898) performed best, followed by LGBM (AUC = 0.882), CatBoost (AUC = 0.872), MLP (AUC = 0.691), and SVM (AUC = 0.672). In terms of clinical applicability, the three ensemble models demonstrated consistent net benefits across various threshold probabilities, with the XGBoost model providing the greatest net benefit (Figure 3B), supporting the clinical value of the developed models. Additionally, the calibration curves confirmed the stability of the results for each model (Figure 3C). The detailed performance metrics of these ML models (Table 4) likewise indicated that the XGBoost model outperformed the other four models.
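The net benefit plotted in a decision curve has a simple closed form: true positives are credited, and false positives are penalized by the odds of the threshold probability. The sketch below shows that calculation for a model and for the "treat all" reference strategy. It is a generic illustration with hypothetical function names and toy inputs, not the code used to produce Figure 3B.

```python
def net_benefit(y_true, probs, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, probs) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, probs) if p >= threshold and t == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of treating every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy example: four patients, two of whom develop the outcome.
labels = [1, 1, 0, 0]
risks = [0.9, 0.6, 0.4, 0.1]
print(net_benefit(labels, risks, 0.5), net_benefit_treat_all(labels, 0.5))
```

A model is clinically useful at a given threshold when its net benefit exceeds both the "treat all" value and zero (the "treat none" strategy).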

Figure 3. Machine learning model construction and diagnostic performance evaluation. (A) ROC Curve Plot; (B) DCA Curve Plot; (C) Calibration Plot.

Table 4. Performances of the machine learning models for predicting SAE in elderly ICU patients.

Furthermore, to assess the robustness and generalization ability of the three models—XGBoost, LGBM, and CatBoost—we re-examined the predictive efficacy of the models using ten-fold cross-validation. The detailed performance metrics (Table 5) and ROC curve results (Figure 4) confirm that our constructed model demonstrates good robustness and generalization ability, with no signs of overfitting or underfitting. Based on these results, the XGBoost model was identified as the best-performing model in this study.
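Ten-fold cross-validation partitions the data into ten disjoint folds and, in turn, holds each fold out for evaluation while training on the other nine. The sketch below generates such index splits with the standard library. It is a minimal illustration: the seed and function name are hypothetical, and the folds are not stratified by outcome, since the study does not state whether stratification was used.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (training_indices, held_out_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        held_out = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i
                    for idx in fold]
        yield training, held_out


for fold_number, (train_idx, test_idx) in enumerate(k_fold_indices(25), 1):
    print(fold_number, len(train_idx), len(test_idx))
```

Each sample is held out exactly once, so averaging the ten per-fold metrics gives a performance estimate that does not depend on a single train/test partition.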

Table 5. Ten-fold cross-validation evaluation of three ensemble algorithm ML models.

Figure 4. Ten-fold cross-validation evaluation. (A) ROC Curve Plot of XGBoost; (B) ROC Curve Plot of LGBM; (C) ROC Curve Plot of CatBoost.

3.4 Comparison of the optimal model with traditional scoring systems

We compared the XGBoost model with traditional scoring systems using the same dataset. Our results revealed that traditional scoring systems (SOFA, OASIS, SAPS II, APS III) exhibited poor predictive efficacy, with all AUC values falling below 0.7 (Table 6).

Table 6. Comparison of the optimal model with traditional scoring systems.

3.5 Model visualization based on SHAP principle

We visualized the contribution of feature variables in the XGBoost model using SHAP (SHapley Additive exPlanations). The SHAP feature importance ranking and beeswarm plots display the relative contribution of each feature to the model’s global predictions, while the SHAP force plot illustrates the contribution of these features for a specific individual. The SHAP-based feature importance ranking plot (Figure 5A) and the beeswarm plot (Figure 5B) show the global contribution of the feature variables incorporated in the XGBoost model, with the horizontal axis representing the SHAP values, which indicate each feature’s contribution to the model’s predicted outcome. The vertical axis ranks the features by the impact of their cumulative SHAP values. Our results demonstrate that OASIS, MAP, PCO₂, SOFA, and PLT are the five most important features influencing new-onset SAE in elderly patients with sepsis. The SHAP force plot (Figure 5C) highlights the direction and magnitude of each feature’s influence on the XGBoost prediction for a specific elderly patient with SAE. In this visualization, red indicators represent a positive impact, while blue indicators denote a negative impact. Notably, for this particular patient, the values with the greatest influence on the predicted likelihood of SAE were PLT = 199, OASIS = 32, sodium = 147, PCO₂ = 54, and MAP = 73. In addition, we simplified the XGBoost predictive model by restricting it to the five top-ranked features. The results indicate that the simplified model maintains excellent predictive ability (AUC = 0.858), and the DCA curve, along with the calibration curve, supports the reliability of the findings (Figure 6).

Figure 5. Model interpretation and feature importance. (A) SHAP Importance Plot; (B) SHAP Beeswarm Plot; (C) SHAP Force Plot.

Figure 6. Construction and diagnostic performance evaluation of the simplified XGBoost prediction model. (A) ROC Curve Plot; (B) DCA Curve Plot; (C) Calibration Plot.

4 Discussion

In this study, we constructed ML-based prediction models for the risk of SAE in elderly sepsis patients admitted to the ICU. We identified 18 clinical variables using LASSO combined with the Boruta method and constructed five machine learning models using these variables. The results demonstrated that the XGBoost model exhibited the best predictive performance among all ML models and provided substantial clinical utility, as confirmed by the DCA curve analysis. The ten-fold cross-validation results provided additional confirmation of the stability of the XGBoost model for predicting SAE in elderly patients.

SAE is recognized as the most common encephalopathy in the ICU, with incidence rates varying across studies and populations, typically ranging from 10% to 50% globally (Gofton and Young, 2012; Sonneville et al., 2017; Sonneville et al., 2023). In our study, the incidence was 48.7%, which is relatively high. Compared to elderly sepsis patients without SAE, those with SAE had significantly longer ICU stays and higher 28-day mortality, highlighting the severity and complexity of sepsis in the elderly. Several previous studies have identified age as an important risk factor in the onset and progression of SAE (Ljungström et al., 2019; Chen et al., 2020; Zhang et al., 2024). Therefore, prompt identification of patients at high risk of developing SAE is crucial among elderly sepsis patients admitted to the ICU. Unfortunately, although existing guidelines for sepsis management emphasize early recognition and prompt treatment of sepsis (Evans et al., 2021), they mainly focus on signs of sepsis, source control of infection, and organ function support, and fail to provide an in-depth analysis of elderly septic patients, including those with SAE (Gamboa-Antiñolo, 2021). Traditional scoring systems such as SOFA, OASIS, SAPS II, and APS III are widely used in the critical care assessment of sepsis (Qiu et al., 2023; Fan and Ma, 2024), but our results demonstrated limited effectiveness in identifying SAE in elderly patients, with AUC values consistently below 0.7. These traditional scoring systems primarily focus on physiological parameters and organ failure. Therefore, despite their validity in assessing the severity and short-term prognosis of sepsis, they are not well suited to diagnosing and predicting sepsis-associated encephalopathy.

In recent years, ML has demonstrated great potential in the diagnosis and prognosis of sepsis and its associated complications. By analyzing comprehensive clinical data, ML models are able to identify potential risk factors for the development of sepsis and predict the progression of the disease, with primary applications in early prediction, risk assessment, and the development of personalized treatment strategies (Fleuren et al., 2020; Komorowski et al., 2022; Upadhyaya et al., 2025). The introduction of machine learning has improved the accuracy and efficiency of sepsis management and offered robust support for clinical decision-making. For example, Li et al. constructed an ML-based model from the data of 1,663 patients receiving RRT in the MIMIC database and found that the LR model exhibited outstanding performance in predicting the risk of clinical prognosis in patients with sepsis-associated acute kidney injury undergoing RRT (Li et al., 2024). Notably, ML methods have also demonstrated significant potential in the diagnosis and prognosis of the SAE, and several researchers have built models for early diagnosis and short-term prognostic risk assessment of high-risk patients with SAE using ML approaches (Ge et al., 2022; Lu et al., 2022; Peng et al., 2022; Guo et al., 2023). However, there remains a gap in machine learning models specifically for predicting SAE risk in elderly sepsis patients.

Among the five ML prediction models developed in this study, the three ensemble models (XGBoost, LGBM, and CatBoost) outperformed the MLP and SVM models, which are based on conventional algorithms. This finding suggests that ensemble learning is more effective for this type of prediction task, likely because it combines multiple weak learners and thereby enhances the generalization and accuracy of the model. Our results demonstrate that the SAE prediction model for elderly patients based on the XGBoost algorithm exhibits the best overall performance, surpassing the other four ML models and the traditional ICU severity scoring tools. This outcome is consistent with findings from previous studies (Lu et al., 2022; Zhang et al., 2023). XGBoost is an efficient gradient boosting algorithm widely used for classification, regression, and ranking tasks. It has the advantage of combining multiple weak predictive models to generate accurate predictions. Owing to its strong overall performance, XGBoost has garnered increasing attention for predicting adverse clinical outcomes (Yue et al., 2022; Guan et al., 2024).

SHAP is a method used to explain the degree of contribution of feature variables in ML models and provides a clearer visual interpretation of model predictions. The global analysis utilizing SHAP identifies OASIS, MAP, PCO2, SOFA, and platelets as the top five factors influencing the occurrence of SAE in elderly sepsis patients. OASIS and SOFA are commonly employed in ICUs to assess patient condition (He et al., 2022; Fan and Ma, 2024), with higher scores typically indicating more severe disease states. This suggests that the patient’s physiological systems are under greater stress, leading to a significantly increased risk of SAE. Elevated MAP generally reflects higher blood pressure, which may impact cerebral blood flow, causing endothelial damage in cerebral vessels and exacerbating neurological symptoms or cerebral complications (Slessarev et al., 2020; Wang et al., 2022). This, in turn, increases the risk of SAEs in elderly patients with sepsis. Our modeling results also indicated that elevated blood indicators, such as platelets and PCO2, were associated with the occurrence of SAEs. Platelets, key cells in blood coagulation, are often activated in septic conditions, promoting thrombosis. When platelet counts are elevated, they not only raise the risk of thrombosis but also contribute to SAE by aggravating microvascular damage and promoting an inflammatory response, which can lead to organ failure (Fodil and Zafrani, 2022; Leung and Middleton, 2024). Finally, elevated PCO₂ may be linked to brain tissue hypoxia and inadequate cerebral perfusion during sepsis. Sepsis-induced metabolic disturbances lead to carbon dioxide accumulation, raising PCO₂ levels, which may worsen the mismatch in cerebral blood flow regulation (Carr et al., 2023; Caldwell et al., 2024), thereby impairing neurological function and the normal metabolism of brain cells. 
In summary, the combination of multiple physiological indicators and pathological states in elderly septic patients significantly increases the risk of SAE. We also built a simplified model using only these five indicators, and it retained excellent predictive ability (AUC = 0.858). Because it relies on only five key indicators, the simplified model allows clinicians to obtain and evaluate the required inputs quickly, thereby improving prediction efficiency. These findings also underscore the value of machine learning models in developing disease prediction systems.
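
How a global feature ranking follows from per-patient attributions can be sketched with exact Shapley values on a toy model. The code below is a minimal sketch: it enumerates all feature subsets and replaces absent features with a single baseline value. Both are simplifying assumptions, and practical SHAP implementations for tree models use far more efficient algorithms. The risk function, baseline, samples, and function names are hypothetical and unrelated to the study's data.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating all feature subsets.

    Features outside a subset are replaced by their baseline value.
    """
    n = len(x)

    def value(subset):
        mixed = [x[j] if j in subset else baseline[j] for j in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for size in range(n):
            # Shapley weight for subsets of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                contribution += weight * (value(members | {i}) - value(members))
        phi.append(contribution)
    return phi

def global_ranking(predict, samples, baseline):
    """Rank feature indices by mean absolute Shapley value across samples."""
    n = len(baseline)
    totals = [0.0] * n
    for x in samples:
        for i, v in enumerate(shapley_values(predict, x, baseline)):
            totals[i] += abs(v)
    means = [t / len(samples) for t in totals]
    order = sorted(range(n), key=lambda i: means[i], reverse=True)
    return order, means

# Hypothetical risk function with an interaction between features 0 and 2.
def toy_risk(v):
    return 0.5 * v[0] + 0.2 * v[1] + 0.1 * v[0] * v[2]

baseline = [0.0, 0.0, 0.0]
samples = [[2.0, 1.0, 3.0], [1.0, 4.0, 0.0], [3.0, 2.0, 1.0]]

phi = shapley_values(toy_risk, samples[0], baseline)
ranking, mean_abs = global_ranking(toy_risk, samples, baseline)
top_two = ranking[:2]  # the inputs a simplified model would keep
print(phi, ranking)
```

The attributions for one sample sum to the difference between its prediction and the baseline prediction. Averaging absolute attributions across samples gives the kind of global ranking from which a reduced feature set can be chosen.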

This study has several limitations. First, patients with SAE often require sedation to control agitation, reduce metabolic demand, or improve ventilation tolerance (Helbing et al., 2018); however, sedation can obscure symptoms of delirium and cognitive impairment, potentially distorting assessment scores. The consciousness assessment data in this study were derived from electronic medical records in the MIMIC database, which lack key information such as the sedation-to-assessment interval and the state of recovery of consciousness; this may have biased the results. Second, the retrospective design and reliance on a single database may restrict the applicability and generalizability of the findings. Future studies should incorporate multicenter ICU data and conduct prospective investigations to stratify patient populations more effectively (e.g., considering the timing of SAE onset and the type and timing of sedative medication use), allowing for a more comprehensive evaluation and refinement of the model. Third, the MIMIC database primarily includes ICU patients from the United States, so its demographic composition may not fully represent global populations. Ethnic differences can influence SAE presentation, treatment response, and outcomes (Haddad et al., 2020), while disparities in healthcare access and comorbidities may affect the external validity of our model. Future studies should validate these findings in diverse populations and assess whether race-specific model adjustments enhance performance across demographic groups. Finally, although the SHAP method was used to visually illustrate the high-risk factors for SAE occurrence in elderly patients, aiding physicians in identifying and understanding the most relevant clinical characteristics, further optimization of the model is needed.
Future efforts should focus on optimizing the model, developing user-friendly interfaces (e.g., a mobile application or web tool), and exploring its integration with electronic medical record systems. This would enable clinicians to input patient data and retrieve predictive results without requiring programming expertise, thereby facilitating clinical decision-making.
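
One simple starting point for the subgroup validation proposed above is to compute discrimination separately for each group. The sketch below assumes binary outcome labels, model risk scores, and a group label per patient; the records, group names, and function names are hypothetical and do not come from the study. AUC is computed directly from its pairwise definition, with ties counted as one half.

```python
from collections import defaultdict

def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined unless both classes are present
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_by_group(records):
    """Compute AUC separately for each group in (group, label, score) records."""
    grouped = defaultdict(lambda: ([], []))
    for group, label, score in records:
        grouped[group][0].append(label)
        grouped[group][1].append(score)
    return {g: auc(ls, ss) for g, (ls, ss) in grouped.items()}

# Hypothetical records: (group, observed outcome, predicted risk).
records = [
    ("group_a", 1, 0.9), ("group_a", 0, 0.2),
    ("group_a", 1, 0.7), ("group_a", 0, 0.4),
    ("group_b", 1, 0.6), ("group_b", 0, 0.7),
    ("group_b", 1, 0.8), ("group_b", 0, 0.3),
]

result = auc_by_group(records)
print(result)
```

A marked gap between groups would suggest that the model transfers unevenly. With small subgroups the estimates are noisy, so confidence intervals would be needed before drawing conclusions.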

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

Author contributions

YH: Conceptualization, Data curation, Writing – original draft, Writing – review & editing. XX: Conceptualization, Writing – original draft, Data curation, Writing – review & editing. JQ: Conceptualization, Writing – original draft, Writing – review & editing, Data curation. YT: Formal Analysis, Writing – original draft. ZS: Formal Analysis, Writing – original draft. WL: Methodology, Writing – original draft. XW: Funding acquisition, Supervision, Writing – original draft, Writing – review & editing.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. This article was supported by grant 82271238 from the National Natural Science Foundation of China.

Acknowledgments

The authors would like to thank all the colleagues who contributed to this work.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fcimb.2025.1545979/full#supplementary-material

References

Akter, S., Simul Hasan Talukder, M., Mondal, S. K., Aljaidi, M., Bin Sulaiman, R., Alshammari, A. A. (2024). Brain tumor classification utilizing pixel distribution and spatial dependencies higher-order statistical measurements through explainable ML models. Sci. Rep. 14, 25800. doi: 10.1038/s41598-024-74731-8

Bleck, T. P., Smith, M. C., Pierre-Louis, S. J., Jares, J. J., Murray, J., Hansen, C. A. (1993). Neurologic complications of critical medical illnesses. Crit. Care Med. 21, 98–103. doi: 10.1097/00003246-199301000-00019

Caldwell, H. G., Hoiland, R. L., Bain, A. R., Howe, C. A., Carr, J., Gibbons, T. D., et al. (2024). Evidence for direct CO2-mediated alterations in cerebral oxidative metabolism in humans. Acta physiologica (Oxford England) 240, e14197. doi: 10.1111/apha.14197

Carr, J., Day, T. A., Ainslie, P. N., Hoiland, R. L. (2023). The jugular venous-to-arterial PCO2 difference during rebreathing and end-tidal forcing: Relationship with cerebral perfusion. J. Physiol. (Lond.) 601, 4251–4262. doi: 10.1113/JP284449

Chen, J., Shi, X., Diao, M., Jin, G., Zhu, Y., Hu, W., et al. (2020). A retrospective study of sepsis-associated encephalopathy: epidemiology, clinical features and adverse outcomes. BMC Emerg Med. 20, 77. doi: 10.1186/s12873-020-00374-3

DecisionLinnc Core Team (2023). DecisionLinnc. 1.0. Available online at: https://www.statsape.com/ (Accessed October 10, 2024).

Evans, L., Rhodes, A., Alhazzani, W., Antonelli, M., Coopersmith, C. M., French, C., et al. (2021). Surviving sepsis campaign: international guidelines for management of sepsis and septic shock 2021. Intensive Care Med. 47, 1181–1247. doi: 10.1007/s00134-021-06506-y

Fan, S., Ma, J. (2024). The value of five scoring systems in predicting the prognosis of patients with sepsis-associated acute respiratory failure. Sci. Rep. 14, 4760. doi: 10.1038/s41598-024-55257-5

Fleuren, L. M., Klausch, T., Zwager, C. L., Schoonmade, L. J., Guo, T., Roggeveen, L. F., et al. (2020). Machine learning for the prediction of sepsis: a systematic review and meta-analysis of diagnostic test accuracy. Intensive Care Med. 46, 383–400. doi: 10.1007/s00134-019-05872-y

Fodil, S., Zafrani, L. (2022). Severe thrombotic thrombocytopenic purpura (TTP) with organ failure in critically ill patients. J. Clin. Med. 11, 1103. doi: 10.3390/jcm11041103

Gamboa-Antiñolo, F. M. (2021). Prognostic tools for elderly patients with sepsis: in search of new predictive models. Intern Emerg Med. 16, 1027–1030. doi: 10.1007/s11739-021-02729-5

Ge, C., Deng, F., Chen, W., Ye, Z., Zhang, L., Ai, Y., et al. (2022). Machine learning for early prediction of sepsis-associated acute brain injury. Front. Med. (Lausanne) 9. doi: 10.3389/fmed.2022.962027

Gofton, T. E., Young, G. B. (2012). Sepsis-associated encephalopathy. Nat. Rev. Neurol. 8, 557–566. doi: 10.1038/nrneurol.2012.183

Guan, C., Gong, A., Zhao, Y., Yin, C., Geng, L., Liu, L., et al. (2024). Interpretable machine learning model for new-onset atrial fibrillation prediction in critically ill patients: a multi-center study. Crit. Care (London England) 28, 349. doi: 10.1186/s13054-024-05138-0

Guo, J., Cheng, H., Wang, Z., Qiao, M., Li, J., Lyu, J. (2023). Factor analysis based on SHapley Additive exPlanations for sepsis-associated encephalopathy in ICU mortality prediction using XGBoost - a retrospective study based on two large database. Front. Neurol. 14. doi: 10.3389/fneur.2023.1290117

Haddad, D. N., Mart, M. F., Wang, L., Lindsell, C. J., Raman, R., Nordness, M. F., et al. (2020). Socioeconomic factors and intensive care unit-related cognitive impairment. Ann. Surg. 272, 596–602. doi: 10.1097/SLA.0000000000004377

He, Y., Xu, J., Shang, X., Fang, X., Gao, C., Sun, D., et al. (2022). Clinical characteristics and risk factors associated with ICU-acquired infections in sepsis: A retrospective cohort study. Front. Cell Infect. Microbiol 12. doi: 10.3389/fcimb.2022.962470

Helbing, D. L., Böhm, L., Witte, O. W. (2018). Sepsis-associated encephalopathy. CMAJ: Can. Med. Assoc. J. = J. l’Association medicale Can. 190, E1083. doi: 10.1503/cmaj.180454

Heming, N., Mazeraud, A., Verdonk, F., Bozza, F. A., Chrétien, F., Sharshar, T. (2017). Neuroanatomy of sepsis-associated encephalopathy. Crit. Care (London England) 21, 65. doi: 10.1186/s13054-017-1643-z

Hong, Y., Chen, P., Gao, J., Lin, Y., Chen, L., Shang, X. (2023). Sepsis-associated encephalopathy: From pathophysiology to clinical management. Int. Immunopharmacol. 124, 110800. doi: 10.1016/j.intimp.2023.110800

Iwashyna, T. J., Ely, E. W., Smith, D. M., Langa, K. M. (2010). Long-term cognitive impairment and functional disability among survivors of severe sepsis. JAMA 304, 1787–1794. doi: 10.1001/jama.2010.1553

Johnson, A., Bulgarelli, L., Pollard, T., Gow, B., Moody, B., Horng, S., et al. (2024). MIMIC-IV (version 3.1). PhysioNet. doi: 10.13026/kpb9-mt58

Johnson, A., Bulgarelli, L., Shen, L., Gayles, A., Shammout, A., Horng, S., et al. (2023). MIMIC-IV, a freely accessible electronic health record dataset. Sci. Data 10, 1. doi: 10.1038/s41597-022-01899-x

Komorowski, M., Green, A., Tatham, K. C., Seymour, C., Antcliffe, D. (2022). Sepsis biomarkers and diagnostic tools with a focus on machine learning. EBioMedicine 86, 104394. doi: 10.1016/j.ebiom.2022.104394

Kurtz, P., van den Boogaard, M., Girard, T. D., Hermann, B. (2024). Acute encephalopathy in the ICU: a practical approach. Curr. Opin. Crit. Care 30, 106–120. doi: 10.1097/MCC.0000000000001144

Leung, G., Middleton, E. A. (2024). The role of platelets and megakaryocytes in sepsis and ARDS. J. Physiol. (Lond.) 602, 6047–6063. doi: 10.1113/JP284879

Li, C., Zhao, K., Ren, Q., Chen, L., Zhang, Y., Wang, G., et al. (2024). Development and validation of a model for predicting in-hospital mortality in patients with sepsis-associated kidney injury receiving renal replacement therapy: a retrospective cohort study based on the MIMIC-IV database. Front. Cell Infect. Microbiol 14. doi: 10.3389/fcimb.2024.1488505

Lin, J., Gu, C., Sun, Z., Zhang, S., Nie, S. (2024). Machine learning-based model for predicting the occurrence and mortality of nonpulmonary sepsis-associated ARDS. Sci. Rep. 14, 28240. doi: 10.1038/s41598-024-79899-7

Ljungström, L., Andersson, R., Jacobsson, G. (2019). Incidences of community onset severe sepsis, Sepsis-3 sepsis, and bacteremia in Sweden - A prospective population-based study. PloS One 14, e0225700. doi: 10.1371/journal.pone.0225700

Lu, X., Kang, H., Zhou, D., Li, Q. (2022). Prediction and risk assessment of sepsis-associated encephalopathy in ICU based on interpretable machine learning. Sci. Rep. 12, 22621. doi: 10.1038/s41598-022-27134-6

Manabe, T., Heneka, M. T. (2022). Cerebral dysfunctions caused by sepsis during ageing. Nat. Rev. Immunol. 22, 444–458. doi: 10.1038/s41577-021-00643-7

Muzambi, R., Bhaskaran, K., Smeeth, L., Brayne, C., Chaturvedi, N., Warren-Gash, C. (2021). Assessment of common infections and incident dementia using UK primary and secondary care data: a historical cohort study. Lancet Healthy Longevity 2, e426–e435. doi: 10.1016/S2666-7568(21)00118-5

Peng, L., Peng, C., Yang, F., Wang, J., Zuo, W., Cheng, C., et al. (2022). Machine learning approach for the prediction of 30-day mortality in patients with sepsis-associated encephalopathy. BMC Med. Res. Methodol 22, 183. doi: 10.1186/s12874-022-01664-z

Prithula, J., Islam, K. R., Kumar, J., Tan, T. L., Reaz, M., Rahman, T., et al. (2024). A novel classical machine learning framework for early sepsis prediction using electronic health record data from ICU patients. Comput. Biol. Med. 184, 109284. doi: 10.1016/j.compbiomed.2024.109284

Qiu, X., Lei, Y. P., Zhou, R. X. (2023). SIRS, SOFA, qSOFA, and NEWS in the diagnosis of sepsis and prediction of adverse outcomes: a systematic review and meta-analysis. Expert Rev. Anti Infect. Ther. 21, 891–900. doi: 10.1080/14787210.2023.2237192

Ren, C., Yao, R. Q., Zhang, H., Feng, Y. W., Yao, Y. M. (2020). Sepsis-associated encephalopathy: a vicious cycle of immunosuppression. J. Neuroinflammation 17, 14. doi: 10.1186/s12974-020-1701-3

Slessarev, M., Mahmoud, O., McIntyre, C. W., Ellis, C. G. (2020). Cerebral blood flow deviations in critically ill patients: potential insult contributing to ischemic and hyperemic injury. Front. Med. (Lausanne) 7. doi: 10.3389/fmed.2020.615318

Sonneville, R., Benghanem, S., Jeantin, L., de Montmollin, E., Doman, M., Gaudemer, A., et al. (2023). The spectrum of sepsis-associated encephalopathy: a clinical perspective. Crit. Care (London England) 27, 386. doi: 10.1186/s13054-023-04655-8

Sonneville, R., de Montmollin, E., Poujade, J., Garrouste-Orgeas, M., Souweine, B., Darmon, M., et al. (2017). Potentially modifiable factors contributing to sepsis-associated encephalopathy. Intensive Care Med. 43, 1075–1084. doi: 10.1007/s00134-017-4807-z

Upadhyaya, P., Wang, J., Mathew, D. T., Ali, A., Tallowin, S., Gann, E., et al. (2025). Predicting sepsis induced hypotension patient attributes for restrictive vs liberal fluid strategy. Shock 63, 309–405. doi: 10.1097/SHK.0000000000002506

Wang, S., Tang, C., Liu, Y., Border, J. J., Roman, R. J., Fan, F. (2022). Impact of impaired cerebral blood flow autoregulation on cognitive impairment. Front. Aging 3. doi: 10.3389/fragi.2022.1077302

Wang, K., Tian, J., Zheng, C., Yang, H., Ren, J., Liu, Y., et al. (2021). Interpretable prediction of 3-year all-cause mortality in patients with heart failure caused by coronary heart disease based on machine learning and SHAP. Comput. Biol. Med. 137, 104813. doi: 10.1016/j.compbiomed.2021.104813

Yue, S., Li, S., Huang, X., Liu, J., Hou, X., Zhao, Y., et al. (2022). Machine learning for the prediction of acute kidney injury in patients with sepsis. J. Transl. Med. 20, 215. doi: 10.1186/s12967-022-03364-0

Zhang, Z., Guo, L., Jia, L., Duo, H., Shen, L., Zhao, H. (2024). Factors contributing to sepsis-associated encephalopathy: a comprehensive systematic review and meta-analysis. Front. Med. (Lausanne) 11. doi: 10.3389/fmed.2024.1379019

Zhang, Y., Hu, J., Hua, T., Zhang, J., Zhang, Z., Yang, M. (2023). Development of a machine learning-based prediction model for sepsis-associated delirium in the intensive care unit. Sci. Rep. 13, 12697. doi: 10.1038/s41598-023-38650-4

Keywords: machine learning, early prediction, sepsis associated encephalopathy, elderly, MIMIC-IV

Citation: Han Y, Xie X, Qiu J, Tang Y, Song Z, Li W and Wu X (2025) Early prediction of sepsis associated encephalopathy in elderly ICU patients using machine learning models: a retrospective study based on the MIMIC-IV database. Front. Cell. Infect. Microbiol. 15:1545979. doi: 10.3389/fcimb.2025.1545979

Received: 16 December 2024; Accepted: 31 March 2025; Published: 17 April 2025.

Edited by:

Lina Zhao, Tianjin Medical University General Hospital, China

Reviewed by:

Qiyang Li, Southern Medical University, China
Peter Klein Klouwenberg, Analytical Diagnostic Center (ADC), Curaçao

Copyright © 2025 Han, Xie, Qiu, Tang, Song, Li and Wu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Xiaodan Wu, wxiaodan@sina.com

^†These authors have contributed equally to this work
