AUTHOR=Bai Yu, Xia Jingen, Huang Xu, Chen Shengsong, Zhan Qingyuan

TITLE=Using machine learning for the early prediction of sepsis-associated ARDS in the ICU and identification of clinical phenotypes with differential responses to treatment

JOURNAL=Frontiers in Physiology

VOLUME=Volume 13 - 2022

YEAR=2022

URL=https://www.frontiersin.org/journals/physiology/articles/10.3389/fphys.2022.1050849

DOI=10.3389/fphys.2022.1050849

ISSN=1664-042X

ABSTRACT=<p><bold>Background:</bold> An early diagnosis model with clinical phenotype classification is key for the early identification and precise treatment of sepsis-associated acute respiratory distress syndrome (ARDS). This study aimed to: 1) build a machine learning diagnostic model for patients with sepsis-associated ARDS using easily accessible early clinical indicators, 2) rapidly classify clinical phenotypes in this population, and 3) explore differences in the clinical characteristics, outcomes, and treatment responses of the phenotypes.</p><p><bold>Methods:</bold> This study is based on data from the Telehealth Intensive Care Unit (eICU) and Medical Information Mart for Intensive Care IV (MIMIC-IV) databases. We trained and tested the early diagnostic model for sepsis-associated ARDS in the eICU cohort. We used key predictive indicators to cluster sepsis-associated ARDS patients and determine the characteristics and clinical outcomes of the resulting phenotypes, as well as to explore differences in in-hospital mortality across positive end-expiratory pressure (PEEP) levels within each phenotype. These results were then verified in MIMIC-IV to evaluate whether they were repeatable.</p><p><bold>Results:</bold> Among the diagnostic models constructed in 19,249 sepsis patients and 5,947 sepsis-associated ARDS patients, the AdaBoost (Decision Tree) model achieved the best performance, with an area under the receiver operating characteristic curve (AUC) of 0.895, which was higher than that of the traditional Logistic Regression model (Z = −2.40, <italic>p</italic> = 0.013), an accuracy of 70.06%, a sensitivity of 78.11%, and a specificity of 78.74%. We also identified three sepsis-associated ARDS phenotypes. 
Cluster 0 (<italic>n</italic> = 3,669) had the lowest in-hospital mortality rate (6.51%) and fewer laboratory abnormalities (lower WBC (median: 15.000 K/mcL), lower blood glucose (median: 158.000 mg/dL), lower creatinine (median: 1.200 mg/dL), lower lactic acid (median: 3.000 mmol/L); <italic>p</italic> &lt; 0.001). Cluster 1 (<italic>n</italic> = 1,554) had the highest in-hospital mortality rate (75.29%) and the most laboratory abnormalities (higher WBC (median: 18.300 K/mcL), higher blood glucose (median: 188.000 mg/dL), higher creatinine (median: 2.300 mg/dL), higher lactic acid (median: 3.900 mmol/L); <italic>p</italic> &lt; 0.001). Cluster 2 (<italic>n</italic> = 724) had the most complex condition, with a moderate in-hospital mortality rate (29.7%) and the longest intensive care unit stay. In Clusters 0 and 1, patients with high PEEP had a higher in-hospital mortality rate than those with low PEEP, but the opposite trend was seen in Cluster 2. These results were repeatable in the MIMIC-IV cohort of 11,935 patients with sepsis and 2,699 patients with sepsis-associated ARDS.</p><p><bold>Conclusion:</bold> A machine learning diagnostic model for sepsis-associated ARDS was established. Three phenotypes with different clinical features, outcomes, and therapeutic responses were identified by clustering. These findings are helpful for the early and rapid identification of sepsis-associated ARDS patients and their precise individualized treatment.</p>