- Department of Nephrology, Hunan Key Laboratory of Kidney Disease and Blood Purification, The Second Xiangya Hospital of Central South University, Changsha, China
Background: Sepsis-associated acute kidney injury (SA-AKI) is common in critically ill patients and is associated with significantly increased mortality. Existing mortality prediction tools show insufficient predictive power or fail to reflect patients' dynamic clinical evolution. Therefore, this study aimed to develop and validate machine learning-based models for real-time mortality prediction in critically ill patients with SA-AKI.
Methods: This multi-center retrospective study included patients from two distinct databases. A total of 12,132 SA-AKI patients from the Medical Information Mart for Intensive Care IV (MIMIC-IV) were randomly allocated to the training, validation, and internal test sets. An additional 3,471 patients from the eICU Collaborative Research Database (eICU-CRD) served as an external test set. For every 12 h during the ICU stays, the state-of-the-art eXtreme Gradient Boosting (XGBoost) algorithm was used to predict the risk of in-hospital death in the following 48, 72, and 120 h and in the first 28 days after ICU admission. Areas under the receiver operating characteristic curve (AUCs) were calculated to evaluate the models' performance.
Results: The XGBoost models, based on routine clinical variables updated every 12 h, showed better performance in mortality prediction than the SOFA score and SAPS-II. The AUCs of the XGBoost models for mortality over different time periods ranged from 0.848 to 0.804 in the internal test set and from 0.818 to 0.748 in the external test set. The Shapley additive explanation method provided interpretability for the XGBoost models, which improved the understanding of the association between the predictor variables and future mortality.
Conclusions: The interpretable machine learning XGBoost models showed promising performance in real-time mortality prediction in critically ill patients with SA-AKI, and are useful tools for early identification of high-risk patients and timely clinical interventions.
Introduction
Sepsis is life-threatening organ dysfunction due to a dysregulated host response to infection. It is a major cause of health loss worldwide (1, 2). Acute kidney injury (AKI), characterized by an abrupt increase in serum creatinine (SCr) or decrease in urine output, is a common complication of critical illness (3–5). AKI has been shown to be more frequent, less likely to resolve, and associated with higher mortality in critically ill patients with sepsis than in those without (6). Considering the critical condition of patients with sepsis-associated AKI (SA-AKI), the accurate prediction of their outcomes is a topic of interest.
Studies have shown that widely used severity scores, such as the Simplified Acute Physiology Score II (SAPS-II) and the Sequential Organ Failure Assessment (SOFA) score, exhibit insufficient power for outcome prediction in SA-AKI patients (7, 8). A few prediction models for mortality in patients with SA-AKI have been established (7, 8). However, they were limited by small sample sizes or inadequate predictive performance. In addition, the models incorporated static measurements at single time points, typically in the early period after intensive care unit (ICU) admission, and failed to reflect patients' dynamic clinical evolution. There is still a lack of feasible ways to assess the real-time risk of death and guide individualized treatment decisions in critically ill patients with SA-AKI.
The rapid development of big data analytics and machine learning techniques, along with the data-rich environment in ICU settings, provides unprecedented opportunities to establish novel mortality prediction tools in SA-AKI patients (9–11). Advanced machine learning methods are adept at handling high-order interactions and fitting complex non-linear relationships, and can therefore be used to integrate large amounts of data from electronic health records (EHRs). The application of data-driven analytics by machine learning has shown promise to improve predictive performance in medical fields (12–15).
The study aimed to develop and validate machine learning-based models for real-time mortality prediction in critically ill patients with SA-AKI, in an attempt to provide useful tools for early prognostic assessment and clinical decision-making.
Methods
Source of Data
Data were obtained from the Medical Information Mart for Intensive Care IV (MIMIC-IV) v1.0 and the eICU Collaborative Research Database (eICU-CRD) v2.0 (16–19). The MIMIC-IV is a large and publicly available database containing records from patients admitted to the ICUs of the Beth Israel Deaconess Medical Center from 2008 to 2019. The eICU-CRD is a multi-center telehealth database including data from more than 200,000 admissions to 335 ICUs at 208 hospitals across the United States between 2014 and 2015. The study was an analysis of the third-party databases with pre-existing institutional review board approval and all protected patient information de-identified. One of the authors has completed the Collaborative Institutional Training Initiative course and can access the databases (certification number 40010711).
Study Population
The study included adult patients with sepsis who developed AKI within 48 h after ICU admission. In the MIMIC-IV, sepsis was diagnosed based on the Sepsis-3 criteria, including suspected infection and a SOFA score ≥ 2 (1). We identified patients with suspected infection (antibiotics administration concomitant with body fluid cultures) during the first 24 h after ICU admission and calculated SOFA scores using data from the same period (20). In the eICU-CRD, sepsis was identified according to the admission diagnosis recorded in the Acute Physiology and Chronic Health Evaluation IV dataset (21). AKI was defined based on the 2012 Kidney Disease: Improving Global Outcomes Clinical Practice Guideline, using both SCr and urine output criteria (3). Baseline SCr was defined as the minimum SCr value in the 7 days prior to ICU admission, or the first SCr value after ICU admission if no pre-admission SCr was available (22, 23). If a patient had multiple ICU admissions during a hospital stay, only the first ICU stay was included in the analysis to ensure the independence of the data. Patients aged < 18 years, those with end-stage renal disease (identified by diagnosis codes), and those with an ICU stay < 48 hours were excluded.
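The baseline SCr rule described above can be sketched in a few lines. This is an illustrative Python sketch only (the study itself was performed in R); the function name `baseline_scr`, the `(hours_from_icu_admission, value)` tuple layout, and the treatment of the 7-day boundary as inclusive are assumptions made for the example, not details reported by the study.

```python
from typing import List, Optional, Tuple

def baseline_scr(measurements: List[Tuple[float, float]]) -> Optional[float]:
    """Return a baseline SCr value from (hours_from_icu_admission, value) pairs.

    Negative times are before ICU admission. Rule taken from the text:
    minimum value in the 7 days (168 h) before admission; if none exists,
    the first value recorded after admission. The data layout is hypothetical.
    """
    pre_admission = [v for t, v in measurements if -168 <= t < 0]
    if pre_admission:
        return min(pre_admission)
    post_admission = sorted((t, v) for t, v in measurements if t >= 0)
    return post_admission[0][1] if post_admission else None
```

Values measured more than 7 days before admission are ignored, and a patient with no SCr measurement at all yields `None` rather than a guessed value.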
Outcomes and Predictor Variables
The primary outcome was in-hospital mortality within 28 days after ICU admission, censored at hospital discharge or 28 days, whichever occurred first. Each patient's ICU stay within 28 days was separated into 12-hour windows, which were labeled as “death” or “survival”. Specifically, to predict mortality in the next 48, 72, and 120 h, the time windows in the corresponding hours before death were labeled as “death” and the remaining as “survival”. To predict mortality in the first 28 days after ICU admission, all time windows were labeled as “death” in patients who died and “survival” in patients who survived. The final objective of the model was to predict the correct label for each time window. Additionally, the secondary outcomes were ICU length of stay, hospital length of stay and use of renal replacement therapy (RRT) within the first 28 days.
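The window-labeling scheme above can be illustrated as follows. This Python sketch is a reading of the description, not the authors' code: the function name `label_windows` is hypothetical, and the boundary rule (a window is labeled as death when death occurs no more than the prediction horizon after the end of that window) is an assumption, since the text does not state how window boundaries were handled.

```python
import math
from typing import List, Optional

WINDOW_H = 12            # window length in hours (from the text)
FOLLOW_UP_H = 28 * 24    # follow-up capped at 28 days (from the text)

def label_windows(stay_hours: float,
                  death_hour: Optional[float],
                  horizon_h: Optional[float]) -> List[str]:
    """Label each 12-hour window of one ICU stay as 'death' or 'survival'.

    stay_hours: ICU length of stay in hours.
    death_hour: hours from ICU admission to in-hospital death, None if survived.
    horizon_h:  48, 72 or 120 for the short-term models; None for the 28-day model.
    """
    n_windows = math.ceil(min(stay_hours, FOLLOW_UP_H) / WINDOW_H)
    died = death_hour is not None and death_hour <= FOLLOW_UP_H
    labels = []
    for i in range(n_windows):
        window_end = (i + 1) * WINDOW_H
        if not died:
            labels.append("survival")
        elif horizon_h is None:
            # 28-day model: every window of a non-survivor is labeled death
            labels.append("death")
        elif death_hour - window_end <= horizon_h:
            # assumed rule: death falls within the horizon after this window
            labels.append("death")
        else:
            labels.append("survival")
    return labels
```

For a patient who dies 96 h after admission, the 48 h model labels the first three windows as survival and the remaining five as death, whereas the 28-day model labels all eight windows as death.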
The predictor variables within each time window contained four static features (age, sex, ethnicity, and baseline SCr) and sets of dynamic features including hours from ICU admission, vital signs, laboratory values, and interventions. The list of all predictor variables included for modeling is provided in Table 1. For dynamic features, their values were time-varying and updated on a 12-hour basis. We used the mean value of variables measured multiple times and the lowest Glasgow Coma Scale (GCS) score in each time window. For variables with no recorded measurements during the 12-hour windows, their values were carried forward from the most recent measurements.
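The per-window aggregation and carry-forward rules can be sketched as below. This is a minimal Python illustration under stated assumptions: the function name `aggregate_windows` and the `(hour, value)` record layout are hypothetical, the value carried forward is taken to be the most recent raw measurement (the text does not say whether the raw value or the previous window's aggregate was used), and windows preceding the first measurement are left missing.

```python
from statistics import mean
from typing import List, Optional, Tuple

def aggregate_windows(records: List[Tuple[float, float]],
                      n_windows: int,
                      use_min: bool = False,
                      window_h: int = 12) -> List[Optional[float]]:
    """Summarize one dynamic variable per 12-hour window.

    records: (hours_from_icu_admission, value) pairs for a single patient.
    use_min: take the lowest value in the window (as for the GCS score)
             instead of the mean.
    """
    ordered = sorted(records)
    values: List[Optional[float]] = []
    last_raw: Optional[float] = None   # most recent measurement seen so far
    for i in range(n_windows):
        start, end = i * window_h, (i + 1) * window_h
        in_window = [v for t, v in ordered if start <= t < end]
        if in_window:
            values.append(min(in_window) if use_min else mean(in_window))
            last_raw = in_window[-1]
        else:
            # no measurement in this window: carry the last value forward
            values.append(last_raw)
    return values
```

A variable never measured before a given window stays `None`, which corresponds to a missing value passed to the model.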
Statistical Analysis
Statistical analyses were performed using R 4.1.2 (https://cran.r-project.org). Continuous variables were presented as medians with interquartile ranges and categorical variables were presented as numbers with percentages. The schematic diagram of methods is shown in Supplementary Figure S1. We divided the study population in the MIMIC-IV into the training (50%), validation (30%), and internal test (20%) sets, randomized at the patient level to ensure that each patient was allocated to only one subset. We used the cohort of SA-AKI patients in the eICU-CRD as an external test set. In the training set, the eXtreme Gradient Boosting (XGBoost) algorithm was used to establish mortality prediction models with all predictor variables as input. XGBoost, a scalable end-to-end tree boosting system, is an optimized implementation of the gradient boosting framework designed to be highly efficient, flexible, and portable (24). During the training process, it generates a series of decision trees, each of which is fitted to the gradient of the loss function given the predictions of the preceding trees, so that the loss decreases as trees are added. The final prediction model is the ensemble of these decision trees. The XGBoost algorithm can handle missing values by adding a default direction for them in each tree node and learning the optimal direction from the data. Therefore, missing values were directly input into the XGBoost models as not available (NA) values. Supplementary Table S1 provides the percentages of missing values in the predictor variables. For machine learning approaches, hyperparameter tuning is required to fit the complex relationship in the data and avoid overfitting. The hyperparameters in the XGBoost models (learning rate, minimum sum of instance weight, maximum tree depth, and minimum loss reduction) and the maximum number of boosting iterations were optimized on the validation set to achieve the maximum area under the receiver operating characteristic curve (AUC).
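The patient-level randomization described above matters because each patient contributes many 12-hour windows; splitting by window would let the same patient appear in both training and test data. A minimal Python sketch of such a split is given below. The function name `split_patients`, the seed, and the rounding of subset sizes are assumptions for illustration; the study's own split was performed in R and its exact procedure is not reported here.

```python
import random
from typing import Dict, Iterable, List, Tuple

def split_patients(patient_ids: Iterable[int],
                   fractions: Tuple[float, float, float] = (0.5, 0.3, 0.2),
                   seed: int = 0) -> Dict[str, List[int]]:
    """Randomly assign each unique patient to exactly one subset.

    All 12-hour windows of a patient then inherit that patient's subset.
    """
    ids = sorted(set(patient_ids))      # de-duplicate: one entry per patient
    rng = random.Random(seed)           # fixed seed for reproducibility
    rng.shuffle(ids)
    n_train = round(fractions[0] * len(ids))
    n_valid = round(fractions[1] * len(ids))
    return {
        "train": ids[:n_train],
        "validation": ids[n_train:n_train + n_valid],
        "internal_test": ids[n_train + n_valid:],
    }
```

Window-level rows can then be filtered by membership of their patient identifier in the corresponding subset.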
The xgboost package was used for XGBoost modeling. Details on the functions and tuning parameters used for the XGBoost algorithm can be found in Supplementary Table S2. More details about the XGBoost algorithm can be found at XGBoost Documentation (https://xgboost.readthedocs.io/).
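The idea of learning a default direction for missing values at a tree node can be illustrated with a toy example. The Python sketch below is a deliberate simplification: it chooses the direction by comparing squared error around the leaf means, whereas XGBoost itself uses a regularized gain computed from gradient statistics. The function names `default_direction` and `_sse` are hypothetical and do not correspond to any function in the xgboost package.

```python
from typing import List, Optional

def _sse(values: List[float]) -> float:
    """Sum of squared errors around the mean (0 for an empty leaf)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def default_direction(x: List[Optional[float]],
                      y: List[float],
                      threshold: float) -> str:
    """Choose where observations with a missing feature should go at one split.

    x: feature values, None when missing; y: outcomes (e.g., 0/1).
    Both options are tried and the one with the lower total error is kept.
    """
    left = [yi for xi, yi in zip(x, y) if xi is not None and xi < threshold]
    right = [yi for xi, yi in zip(x, y) if xi is not None and xi >= threshold]
    missing = [yi for xi, yi in zip(x, y) if xi is None]
    loss_if_left = _sse(left + missing) + _sse(right)
    loss_if_right = _sse(left) + _sse(right + missing)
    return "left" if loss_if_left <= loss_if_right else "right"
```

When the observations with missing values have outcomes resembling those on the right-hand side of the split, the missing values are routed right, and vice versa, so that missingness itself can carry predictive information.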
The performance of the prediction models was assessed on the internal and the external test sets. AUC was selected as the primary evaluation metric. Other metrics included sensitivity, specificity, and accuracy. We reported the metrics under multiple cutoff values, based on the local maxima of the receiver operating characteristic curves. We compared the performance of the XGBoost models with traditional risk scores, including the SOFA score (25) and SAPS-II (26). We did not calculate the risk scores in each 12-hour window for patients in the eICU-CRD because some required variables were unavailable.
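For reference, the evaluation metrics named above can be computed as in the following Python sketch. It uses the rank interpretation of the AUC (the probability that a randomly chosen death window receives a higher predicted risk than a randomly chosen survival window, with ties counted as one half). The function names are hypothetical, the pairwise loop is quadratic and intended only for illustration, and treating a predicted risk equal to the cutoff as positive is an assumption.

```python
from typing import List, Tuple

def auc(scores: List[float], labels: List[int]) -> float:
    """AUC via pairwise comparison of positives (1) and negatives (0)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sens_spec_acc(scores: List[float], labels: List[int],
                  cutoff: float) -> Tuple[float, float, float]:
    """Sensitivity, specificity and accuracy at a given risk cutoff."""
    tp = sum(1 for s, l in zip(scores, labels) if s >= cutoff and l == 1)
    fn = sum(1 for s, l in zip(scores, labels) if s < cutoff and l == 1)
    tn = sum(1 for s, l in zip(scores, labels) if s < cutoff and l == 0)
    fp = sum(1 for s, l in zip(scores, labels) if s >= cutoff and l == 0)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(labels)
```

Lowering the cutoff raises sensitivity at the expense of specificity, which is why the metrics are reported at several cutoffs rather than one.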
The XGBoost algorithm provides the importance of features in predicting the outcome. We used the gain as the measure, representing the fractional contribution of each feature to the model output based on the total gain of this feature's splits. To explore the interpretability of the XGBoost models, we used the Shapley Additive exPlanations (SHAP) method (27), which provides consistent and locally accurate attribution values for each feature. The influence of the predictor variables on the outcome can be explained by the summing effects of variable attributions in calculating the output risk for each observation.
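The "locally accurate" property of SHAP values, namely that the attributions for one observation sum to the difference between its predicted risk and a reference prediction, can be checked on a toy example. The Python sketch below computes exact Shapley values by enumerating feature subsets. It is not the tree-based SHAP algorithm applied to the fitted models: the risk function `toy_risk`, its thresholds, and the use of a fixed reference patient to represent "absent" features are all hypothetical choices made for illustration.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence

def shapley_values(f: Callable[[Sequence[float]], float],
                   x: Sequence[float],
                   baseline: Sequence[float]) -> List[float]:
    """Exact Shapley attributions of f(x) relative to f(baseline)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(len(others) + 1):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (f(with_i) - f(without_i))
    return phi

def toy_risk(v: Sequence[float]) -> float:
    """Hypothetical risk rule over (GCS, urine output, lactate); not the study model."""
    gcs, urine, lactate = v
    risk = 0.05
    if gcs < 9:
        risk += 0.20
    if urine < 0.5 and lactate > 4:   # interaction between two features
        risk += 0.30
    return risk

patient = [6, 0.2, 5.0]        # hypothetical high-risk observation
reference = [15, 1.0, 1.5]     # hypothetical reference observation
phi = shapley_values(toy_risk, patient, reference)
```

In this toy case the GCS term receives its full additive contribution, while the interaction term is shared equally between urine output and lactate; the three attributions add up to the gap between the patient's risk and the reference risk.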
In sensitivity analysis, we applied other frequently used machine learning algorithms such as random forest and support vector machine to our dataset for comparison (28, 29). Additionally, we assessed the performance of the SOFA score, SAPS-II and XGBoost model using data gathered in the early period after ICU admission, i.e., the first 12 h, in predicting in-hospital mortality in the first 28 days.
Results
Baseline Characteristics and Outcomes
A total of 15,603 critically ill patients with SA-AKI were included in our study, with 6,066 in the training set, 3,639 in the validation set, 2,427 in the internal test set, and 3,471 in the external test set (Figure 1). Baseline characteristics and outcomes of the study population in each dataset are shown in Table 2 and Supplementary Table S3. In the MIMIC-IV, 56.6% of SA-AKI patients were diagnosed by urine output criteria, 9.2% by SCr criteria, and 34.2% by both criteria. In the eICU-CRD, the proportions of SA-AKI patients meeting urine output criteria, SCr criteria, and both criteria were 38.5, 40.9, and 20.5%, respectively. The overall in-hospital mortality within 28 days was 18.6% in the training set, 17.0% in the validation set, 18.3% in the internal test set, and 22.7% in the external test set. For each 12 h window of the ICU stays, the number of in-hospital deaths in the first 28 days is shown in Supplementary Table S4. Distribution of the predictor variables within each 12-hour window of the ICU stays is shown in Supplementary Table S5.
Figure 1. Study flow diagram. SA-AKI, sepsis-associated acute kidney injury; ICU, intensive care unit; ESRD, end-stage renal disease.
Table 2. Baseline characteristics and outcomes of SA-AKI patients in the training, validation and internal test sets.
Model Performance
The receiver operating characteristic curves of the models for mortality in the following 48, 72, and 120 h and in the first 28 days after ICU admission are shown in Figure 2 and Supplementary Figures S2–S4. The XGBoost models showed better discrimination than the SOFA score and SAPS-II, with the AUCs ranging from 0.848 to 0.804 in the internal test set and from 0.818 to 0.748 in the external test set. The sensitivity, specificity, and accuracy of the XGBoost models at different cutoffs for mortality prediction in the internal and the external test sets are provided in Table 3 and Supplementary Tables S6–S8. In the internal test set, the XGBoost model achieved a sensitivity of 80.1% and specificity of 72.9% at the cutoff of 0.0349 for mortality in the following 48 h. The sensitivity was slightly higher, and the specificity was lower in the external test set than in the internal test set across different cutoffs. The calibration curves of the XGBoost models comparing the predicted and observed probability across deciles in the internal and the external test sets are shown in Figure 3 and Supplementary Figures S5–S7. The XGBoost models were well-calibrated, except that they might underestimate or overestimate the probability at the higher risk deciles.
Figure 2. Receiver operating characteristic curves of the models for mortality in the following 48 h in the training set (A), validation set (B), internal test set (C), and external test set (D). SOFA, sequential organ failure assessment; SAPS-II, simplified acute physiology score II; XGBoost, extreme gradient boosting.
Figure 3. Calibration curves of the XGBoost model for mortality in the following 48 h in the internal (A) and the external (B) test sets.
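The decile-based comparison of predicted and observed probability that underlies a calibration curve can be sketched as follows. This Python example is illustrative only: the function name `calibration_by_bin` is hypothetical, and forming bins of equal size after sorting by predicted risk is an assumption about how the deciles were constructed.

```python
from typing import List, Tuple

def calibration_by_bin(probs: List[float], outcomes: List[int],
                       n_bins: int = 10) -> List[Tuple[float, float]]:
    """Return (mean predicted risk, observed event rate) for each risk bin.

    Observations are sorted by predicted risk and cut into n_bins groups of
    (nearly) equal size; n_bins=10 gives deciles.
    """
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue                      # fewer observations than bins
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(o for _, o in chunk) / len(chunk)
        rows.append((mean_pred, observed))
    return rows
```

Points lying on the diagonal (mean predicted risk equal to observed rate) indicate good calibration; departures in the highest bins correspond to the under- or overestimation noted for the higher risk deciles.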
Model Interpretability
Figure 4 and Supplementary Figures S8–S10 illustrate the feature importance derived from the XGBoost models. The top five most important predictor variables in the XGBoost model for mortality in the following 48 h were urine output, GCS score, hours from admission, serum lactate level, and age. Figure 5 and Supplementary Figures S11–S13 provide the SHAP summary plots of the XGBoost models, revealing the impact of the predictor variables on model output. Lower GCS score, decreased urine output, prolonged ICU length of stay, older age, and higher blood urea nitrogen (BUN) level were the top five factors associated with increased risk of death in the following 48 h.
Figure 4. Feature importance derived from the XGBoost model for mortality in the following 48 h. The importance value represents the fractional contribution of each feature to the XGBoost model based on the total gain of this feature's splits. A higher percentage means a more important feature. GCS, Glasgow Coma Scale; PaCO2, partial pressure of arterial carbon dioxide; PaO2, partial pressure of arterial oxygen; INR, international normalized ratio; RRT, renal replacement therapy.
Figure 5. SHAP summary plot of the XGBoost model for mortality in the following 48 h. A higher SHAP value means a higher probability of death within the next 48 h. Purple represents higher feature values and yellow represents lower feature values. A dot is created for each feature attribution in calculating the output risk for each observation. GCS, Glasgow Coma Scale; INR, international normalized ratio; RRT, renal replacement therapy; PaO2, partial pressure of arterial oxygen; PaCO2, partial pressure of arterial carbon dioxide.
Sensitivity Analysis
In sensitivity analysis, the XGBoost models showed higher AUCs than the random forest and the support vector machine models in the internal and the external test sets (Supplementary Table S9). In addition, the XGBoost model using data gathered during the first 12 h after ICU admission showed poor predictive performance for in-hospital mortality in the first 28 days, with the AUC being 0.770 (95% CI 0.747–0.794) in the internal test set and 0.676 (95% CI 0.655–0.697) in the external test set (Supplementary Figure S14).
Discussion
In this multi-center retrospective study, we developed and validated interpretable machine learning-based models using the XGBoost algorithm for real-time mortality prediction in critically ill patients with SA-AKI. The XGBoost models exhibited better performance than traditional risk scores (including the SOFA score and SAPS-II) or other machine learning models (including the random forest and support vector machine models) in predicting death in the following 48, 72, and 120 h and in the first 28 days after ICU admission. The XGBoost models could help identify high-risk patients in real time for early clinical interventions.
SA-AKI is common in critically ill patients and is characterized by rapid clinical evolution and significantly higher mortality than in patients without AKI or with AKI attributed to other causes (6). Reliable prediction models are essential for clinicians to assess the risk of death and make proper clinical decisions in critically ill patients with SA-AKI. Generic scores, such as the SOFA score and SAPS-II, are widely used for outcome prediction in critical care settings. However, they have shown controversial results on predictive performance for mortality in AKI patients (7, 8, 30–32). Recently, several models have been proposed to predict AKI mortality in unselected ICU patients (31, 32), but few have been validated in patients with SA-AKI. Da Hora Passos et al. (7) proposed a clinical score to predict 7-day mortality in a cohort of 186 SA-AKI patients who required continuous RRT. The five-variable score showed better performance than the generic models, with a C-statistic of 0.82, but was limited to a single center and a small sample size. In addition, Hu et al. (8) established a prediction model for in-hospital mortality in critically ill patients with SA-AKI. However, the model included only static clinical variables and showed insufficient predictive power.
Compared with the other risk prediction tools, our models have several strengths. First, the study demonstrated the applicability of the XGBoost algorithm in mortality prediction in critically ill patients with SA-AKI. The XGBoost models had stronger predictive power than the traditional risk scores. Sensitivity analysis further showed that the XGBoost models were superior to the random forest and the support vector machine models. XGBoost-based models have shown exciting performance in various situations, such as volume responsiveness in patients with oliguric AKI (14), long-term kidney outcomes in patients with IgA nephropathy (33), and mortality in ICU patients with rhabdomyolysis (34). The reasons for the improvement in predictive abilities observed in the XGBoost models may be multifactorial. The XGBoost algorithm, based on the gradient tree boosting framework, is adept at fitting non-linearities, discontinuities and complex high-order interactions. It is also robust to outliers in, and multicollinearity among, predictor variables. Besides, the XGBoost algorithm can handle missing values automatically, allowing the input of only available predictor variables in its clinical application.
Second, the real-time mortality prediction models can provide dynamic risk assessment and guide clinical decision-making. Patients in the ICU environment are clinically unstable, change rapidly between states of deterioration and improvement, and require continuous monitoring and interventions (35). This has promoted the establishment of real-time prediction models in critical care, such as models for mortality in critically ill children (35), the development of AKI (36), and sepsis onset (37, 38). Previously published models for mortality prediction in SA-AKI patients included static physiological parameters gathered during the early stages of the ICU stays. However, SA-AKI patients with similar disease severity at the early stage of ICU admission may exhibit different clinical outcomes due to distinct disease trajectories and treatment responses. The real-time prediction models can provide the risk of death updated on a 12-hour basis, which is more accurate and allows clinicians to make predictions dynamically.
Third, our models achieved promising predictive performance in both the internal and the external test sets, which demonstrated their robustness and generalizability. The predictor variables included in our models are routinely collected and usually available in the EHRs, and their values are rarely influenced by the examiner. Using only the most basic and commonly measured clinical data can facilitate the generalizability of the prediction models in other ICUs. Our models were further validated in an external test set, including 3,471 SA-AKI patients from a large multi-center critical care database whose features were distributed significantly differently. Furthermore, automated data extraction from EHRs and data input can save additional labor and cost and reduce the possibility of incorrect entry in future clinical applications of the models (35).
Fourth, the interpretability of the models was explored to reveal the predictors for death over different time periods. Most recently, the relationship between the evolution of SA-AKI and mortality has been revealed. Uhel et al. (39) found that persistent AKI, but not transient AKI, was associated with increased mortality in critically ill septic patients. Ozrazgat-Baslanti et al. (40) also showed that persistent AKI and the absence of renal recovery were associated with worse clinical outcomes. Our results further demonstrated that decreased urine output and higher BUN level were important factors for increased real-time risk of death, suggesting the necessity for continuous renal function monitoring in SA-AKI patients. Additionally, the discovery of other potentially modifiable extra-renal risk factors, such as lower GCS score, higher lactate level, higher heart rate, and higher respiratory rate, may help improve patient care and outcomes.
Our study was subject to some limitations. Firstly, it was a retrospective analysis based on the publicly accessible databases. The diagnosis of sepsis in the eICU-CRD may not meet the updated Sepsis-3 criteria. It remains unclear whether the prediction model performs well for individual prognostication and whether its clinical application can improve patient outcomes. Secondly, although the XGBoost algorithm can handle missing values automatically, the presence of missing data may lead to bias. Thirdly, clinical data beyond the ICU stays were unavailable, limiting the continuous assessment of the risk of death for SA-AKI patients who were transferred to the general wards or other locations. Finally, the visualization and application of the models are still limited. In our subsequent study, we will prospectively investigate the effectiveness of our models and develop a web-based risk calculator that automatically extracts data from EHRs and performs risk calculations.
Conclusions
This study developed and externally validated interpretable machine learning XGBoost models for real-time mortality prediction in critically ill patients with SA-AKI. The XGBoost models, based on routine clinical variables updated every 12 h, showed promising performance in predicting death in the following 48, 72, and 120 h and in the first 28 days after ICU admission. The real-time prediction models are useful tools for early identification of high-risk patients and timely clinical interventions. Future studies are required to determine the robustness and effectiveness of the prediction models in a prospective way.
Data Availability Statement
The datasets analyzed for this study can be found in the MIMIC-IV (https://mimic.mit.edu/) and eICU-CRD (https://eicu-crd.mit.edu/).
Ethics Statement
The studies involving human participants were reviewed and approved by the Institutional Review Boards of the Beth Israel Deaconess Medical Center and Massachusetts Institute of Technology. Written informed consent for participation was not required for this study in accordance with the National Legislation and the Institutional Requirements.
Author Contributions
S-BD designed and supervised the study and drafted the manuscript. X-QL performed the data extraction, analyzed and interpreted the data, and drafted the manuscript. PY and Y-XK analyzed and interpreted the data and critically revised the manuscript. Y-HD, TW, and XW analyzed the data and revised the manuscript critically for important intellectual content. All authors have read and approved the final manuscript.
Funding
This study was supported by the National Natural Science Foundation of China (Grant No. 81873607).
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher's Note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmed.2022.853102/full#supplementary-material
References
1. Singer M, Deutschman CS, Seymour CW, Shankar-Hari M, Annane D, Bauer M, et al. The third international consensus definitions for sepsis and septic shock (Sepsis-3). JAMA. (2016) 315:801–10. doi: 10.1001/jama.2016.0287
2. Rudd KE, Johnson SC, Agesa KM, Shackelford KA, Tsoi D, Kievlan DR, et al. Global, regional, and national sepsis incidence and mortality, 1990–2017: analysis for the Global Burden of Disease Study. Lancet. (2020) 395:200–11. doi: 10.1016/s0140-6736(19)32989-7
3. Kidney Disease: Improving Global Outcomes (KDIGO) Acute Kidney Injury Work Group. KDIGO clinical practice guideline for acute kidney injury. Kidney Int Suppl. (2012) 2:1–138. doi: 10.1038/kisup.2012.1
4. Hoste EA, Bagshaw SM, Bellomo R, Cely CM, Colman R, Cruz DN, et al. Epidemiology of acute kidney injury in critically ill patients: the multinational AKI-EPI study. Intensive Care Med. (2015) 41:1411–23. doi: 10.1007/s00134-015-3934-7
5. Hoste EAJ, Kellum JA, Selby NM, Zarbock A, Palevsky PM, Bagshaw SM, et al. Global epidemiology and outcomes of acute kidney injury. Nat Rev Nephrol. (2018) 14:607–25. doi: 10.1038/s41581-018-0052-0
6. Peters E, Antonelli M, Wittebole X, Nanchal R, Francois B, Sakr Y, et al. A worldwide multicentre evaluation of the influence of deterioration or improvement of acute kidney injury on clinical outcome in critically ill patients with and without sepsis at ICU admission: results from the intensive care over nations audit. Crit Care. (2018) 22:188. doi: 10.1186/s13054-018-2112-z
7. da Hora Passos R, Ramos JG, Mendonca EJ, Miranda EA, Dutra FR, Coelho MF, et al. A clinical score to predict mortality in septic acute kidney injury patients requiring continuous renal replacement therapy: the HELENICC score. BMC Anesthesiol. (2017) 17:e21. doi: 10.1186/s12871-017-0312-8
8. Hu H, Li L, Zhang Y, Sha T, Huang Q, Guo X, et al. A prediction model for assessing prognosis in critically ill patients with sepsis-associated acute kidney injury. Shock. (2021) 56:564–72. doi: 10.1097/SHK.0000000000001768
9. Bailly S, Meyfroidt G, Timsit JF. What's new in ICU in 2050: big data and machine learning. Intensive Care Med. (2018) 44:1524–7. doi: 10.1007/s00134-017-5034-3
10. Sanchez-Pinto LN, Luo Y, Churpek MM. Big data and data science in critical care. Chest. (2018) 154:1239–48. doi: 10.1016/j.chest.2018.04.037
11. Gutierrez G. Artificial intelligence in the intensive care unit. Crit Care. (2020) 24:101. doi: 10.1186/s13054-020-2785-y
12. Kang MW, Kim J, Kim DK, Oh KH, Joo KW, Kim YS, et al. Machine learning algorithm to predict mortality in patients undergoing continuous renal replacement therapy. Crit Care. (2020) 24:42. doi: 10.1186/s13054-020-2752-7
13. Meyer A, Zverinski D, Pfahringer B, Kempfert J, Kuehne T, Sündermann SH, et al. Machine learning for real-time prediction of complications in critical care: a retrospective study. Lancet Respir Med. (2018) 6:905–14. doi: 10.1016/s2213-2600(18)30300-x
14. Zhang Z, Ho KM, Hong Y. Machine learning for the prediction of volume responsiveness in patients with oliguric acute kidney injury in critical care. Crit Care. (2019) 23:112. doi: 10.1186/s13054-019-2411-z
15. Luo X-Q, Yan P, Zhang N-Y, Luo B, Wang M, Deng Y-H, et al. Machine learning for early discrimination between transient and persistent acute kidney injury in critically ill patients with sepsis. Sci Rep. (2021) 11:20269. doi: 10.1038/s41598-021-99840-6
16. Pollard TJ, Johnson AEW, Raffa JD, Celi LA, Mark RG, Badawi O. The eICU collaborative research database, a freely available multi-center database for critical care research. Sci Data. (2018) 5:180178. doi: 10.1038/sdata.2018.178
17. Johnson A, Bulgarelli L, Pollard T, Horng S, Celi LA, Mark R. MIMIC-IV (version 1.0). PhysioNet. (2020). doi: 10.13026/s6n6-xd98
18. Yang J, Li Y, Liu Q, Li L, Feng A, Wang T, et al. Brief introduction of medical database and data mining technology in big data era. J Evid Based Med. (2020) 13:57–69. doi: 10.1111/jebm.12373
19. Wu WT, Li YJ, Feng AZ, Li L, Huang T, Xu AD, et al. Data mining in clinical big data: the frequently used databases, steps, and methodological models. Mil Med Res. (2021) 8:44. doi: 10.1186/s40779-021-00338-z
20. Johnson AEW, Aboab J, Raffa JD, Pollard TJ, Deliberato RO, Celi LA, et al. A comparative analysis of sepsis identification methods in an electronic database. Crit Care Med. (2018) 46:494–9. doi: 10.1097/CCM.0000000000002965
21. Zimmerman JE, Kramer AA, McNair DS, Malila FM. Acute Physiology and Chronic Health Evaluation (APACHE) IV: hospital mortality assessment for today's critically ill patients. Crit Care Med. (2006) 34:1297–310. doi: 10.1097/01.CCM.0000215112.84523.F0
22. Zhao GJ, Xu C, Ying JC, Lu WB, Hong GL, Li MF, et al. Association between furosemide administration and outcomes in critically ill patients with acute kidney injury. Crit Care. (2020) 24:75. doi: 10.1186/s13054-020-2798-6
23. Chaudhary K, Vaid A, Duffy A, Paranjpe I, Jaladanki S, Paranjpe M, et al. Utilization of deep learning for subphenotype identification in sepsis-associated acute kidney injury. Clin J Am Soc Nephrol. (2020) 15:1557–65. doi: 10.2215/CJN.09330819
24. Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco, CA (2016). p. 785–94.
25. Vincent JL, Moreno R, Takala J, Willatts S, De Mendonça A, Bruining H, et al. The SOFA (Sepsis-related Organ Failure Assessment) score to describe organ dysfunction/failure. On behalf of the Working Group on Sepsis-Related Problems of the European Society of Intensive Care Medicine. Intensive Care Med. (1996) 22:707–10. doi: 10.1007/bf01709751
26. Le Gall JR, Lemeshow S, Saulnier F. A new simplified acute physiology score (SAPS II) based on a European/North American multicenter study. JAMA. (1993) 270:2957–63. doi: 10.1001/jama.270.24.2957
27. Lundberg SM, Lee S-I. A unified approach to interpreting model predictions. Adv Neural Inf Process Syst. (2017) 30:4765–74.
29. Cortes C, Vapnik V. Support-vector networks. Mach Learn. (1995) 20:273–97. doi: 10.1023/A:1022627411411
30. Demirjian S, Chertow GM, Zhang JH, O'Connor TZ, Vitale J, Paganini EP, et al. Model to predict mortality in critically ill adults with acute kidney injury. Clin J Am Soc Nephrol. (2011) 6:2114–20. doi: 10.2215/CJN.02900311
31. Lin K, Hu Y, Kong G. Predicting in-hospital mortality of patients with acute kidney injury in the ICU using random forest model. Int J Med Inform. (2019) 125:55–61. doi: 10.1016/j.ijmedinf.2019.02.002
32. Huang H, Liu Y, Wu M, Gao Y, Yu X. Development and validation of a risk stratification model for predicting the mortality of acute kidney injury in critical care patients. Ann Transl Med. (2021) 9:323. doi: 10.21037/atm-20-5723
33. Chen T, Li X, Li Y, Xia E, Qin Y, Liang S, et al. Prediction and risk stratification of kidney outcomes in IgA nephropathy. Am J Kidney Dis. (2019) 74:300–9. doi: 10.1053/j.ajkd.2019.02.016
34. Liu C, Liu X, Mao Z, Hu P, Li X, Hu J, et al. Interpretable machine learning model for early prediction of mortality in ICU patients with rhabdomyolysis. Med Sci Sports Exerc. (2021) 53:1826–34. doi: 10.1249/mss.0000000000002674
35. Kim SY, Kim S, Cho J, Kim YS, Sol IS, Sung Y, et al. A deep learning model for real-time mortality prediction in critically ill children. Crit Care. (2019) 23:279. doi: 10.1186/s13054-019-2561-z
36. Le S, Allen A, Calvert J, Palevsky PM, Braden G, Patel S, et al. Convolutional neural network model for intensive care unit acute kidney injury prediction. Kidney Int Rep. (2021) 6:1289–98. doi: 10.1016/j.ekir.2021.02.031
37. Nemati S, Holder A, Razmi F, Stanley MD, Clifford GD, Buchman TG. An interpretable machine learning model for accurate prediction of sepsis in the ICU. Crit Care Med. (2018) 46:547–53. doi: 10.1097/CCM.0000000000002936
38. Li X, Xu X, Xie F, Xu X, Sun Y, Liu X, et al. A time-phased machine learning model for real-time prediction of sepsis in critical care. Crit Care Med. (2020) 48:e884–8. doi: 10.1097/CCM.0000000000004494
39. Uhel F, Peters-Sengers H, Falahi F, Scicluna BP, van Vught LA, Bonten MJ, et al. Mortality and host response aberrations associated with transient and persistent acute kidney injury in critically ill patients with sepsis: a prospective cohort study. Intensive Care Med. (2020) 46:1576–89. doi: 10.1007/s00134-020-06119-x
Keywords: sepsis, acute kidney injury, mortality, machine learning, critical care
Citation: Luo X-Q, Yan P, Duan S-B, Kang Y-X, Deng Y-H, Liu Q, Wu T and Wu X (2022) Development and Validation of Machine Learning Models for Real-Time Mortality Prediction in Critically Ill Patients With Sepsis-Associated Acute Kidney Injury. Front. Med. 9:853102. doi: 10.3389/fmed.2022.853102
Received: 12 January 2022; Accepted: 19 May 2022; Published: 15 June 2022.
Edited by:
Longxiang Su, Peking Union Medical College Hospital (CAMS), China
Reviewed by:
Yi Yang, Southeast University, China
Jun Lyu, First Affiliated Hospital of Jinan University, China
Copyright © 2022 Luo, Yan, Duan, Kang, Deng, Liu, Wu and Wu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Shao-Bin Duan, duansb528@csu.edu.cn