Identification and Prediction of Novel Clinical Phenotypes for Intensive Care Patients With SARS-CoV-2 Pneumonia: An Observational Cohort Study

Chen, Hui; Zhu, Zhu; Su, Nan; Wang, Jun; Gu, Jun; Lu, Shu; Zhang, Li; Chen, Xuesong; Xu, Lei; Shao, Xiangrong; Yin, Jiangtao; Yang, Jinghui; Sun, Baodi; Li, Yongsheng

doi:10.3389/fmed.2021.681336

ORIGINAL RESEARCH article

Front. Med., 04 June 2021

Sec. Intensive Care Medicine and Anesthesiology

Volume 8 - 2021 | https://doi.org/10.3389/fmed.2021.681336

This article is part of the Research Topic: Clinical Application of Artificial Intelligence in Emergency and Critical Care Medicine, Volume I

Hui Chen¹*†

Zhu Zhu²†

Nan Su³

Jun Wang¹

Jun Gu⁴

Shu Lu⁵

Li Zhang⁶

Xuesong Chen⁷

Lei Xu⁸

Xiangrong Shao⁹

Jiangtao Yin¹⁰

Jinghui Yang¹¹

Baodi Sun¹²

Yongsheng Li¹³*

¹Department of Critical Care Medicine, The First Affiliated Hospital of Soochow University, Soochow University, Suzhou, China
²Department of General Surgery, The Affiliated Suzhou Science & Technology Town Hospital of Nanjing Medical University, Suzhou, China
³Department of Respiratory and Critical Care Medicine, The First Affiliated Hospital of Soochow University, Soochow University, Suzhou, China
⁴Department of Respiratory Medicine, Affiliated Hospital of Nantong University, Nantong, China
⁵Department of Intensive Care Unit, Affiliated Hospital of Nantong University, Nantong, China
⁶Department of Respiratory Medicine, Zhongda Hospital Southeast University, Nanjing, China
⁷Department of Respiratory and Critical Care Medicine, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
⁸Department of Emergency Medicine, The Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
⁹Department of Respiratory Medicine, The Affiliated Hospital of Yangzhou University, Yangzhou, China
¹⁰Department of Intensive Care Unit, The Affiliated Hospital of Jiangsu University, Zhenjiang, China
¹¹Department of Critical Care Medicine, Sir Run Run Hospital, Nanjing Medical University, Nanjing, China
¹²Department of Emergency, Sir Run Run Hospital, Nanjing Medical University, Nanjing, China
¹³Department of Intensive Care Medicine, Tongji Medical College, Tongji Hospital, Huazhong University of Science and Technology, Wuhan, China

Background: Phenotypes have been identified within heterogeneous diseases, such as acute respiratory distress syndrome and sepsis, and these phenotypes carry important prognostic and therapeutic implications. The present study sought to assess whether phenotypes can be derived from intensive care patients with coronavirus disease 2019 (COVID-19), to assess their correlation with prognosis, and to develop a parsimonious model for phenotype identification.

Methods: Adult patients with COVID-19 admitted to Tongji Hospital between January 2020 and March 2020 were included. Consensus k means clustering and latent class analysis (LCA) were applied to identify phenotypes using 26 clinical variables. We then employed machine learning algorithms to select a maximum of five important classifier variables, which were further used to establish nested logistic regression models for phenotype identification.

Results: Both consensus k means clustering and LCA showed that a two-phenotype model was the best fit for the present cohort (N = 504). A total of 182 patients (36.1%) were classified as the hyperactive phenotype and exhibited a higher 28-day mortality and higher rates of organ dysfunction than those in the hypoactive phenotype. The top five variables used to assign phenotypes were neutrophil-to-lymphocyte ratio (NLR), the ratio of pulse oxygen saturation to the fractional concentration of oxygen in inspired air (SpO₂/FIO₂), lactate dehydrogenase (LDH), tumor necrosis factor α (TNF-α), and urea nitrogen. From the nested logistic models, the three-variable (NLR, SpO₂/FIO₂ ratio, and LDH) and four-variable (three-variable plus TNF-α) models were adjudicated to be the best performing, with areas under the curve of 0.95 [95% confidence interval (CI) = 0.94–0.97] and 0.97 (95% CI = 0.96–0.98), respectively.

Conclusion: We identified two phenotypes within COVID-19, with different host responses and outcomes. The phenotypes can be accurately identified with parsimonious classifier models using three or four variables.

Introduction

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pneumonia is a newly recognized infectious disease that was first reported in Wuhan, China, and rapidly spread to hundreds of countries, causing high mortality (1–4). The clinical spectrum of coronavirus disease 2019 (COVID-19) ranges from asymptomatic infection to critical illness and results in high rates of hospitalization and intensive care unit (ICU) admission (5). However, reported ICU mortality for COVID-19 varied widely (6–8), and treatment responses were disparate (9–11), indicating that COVID-19 is clinically and biologically heterogeneous.

Various studies have proposed different phenotypes of COVID-19. In 85 consecutive ICU patients with COVID-19, Azoulay et al. identified three clinical and biological phenotypes at ICU admission using hierarchical clustering. ICU mortality rates were 8, 18, and 39% in clusters 1, 2, and 3, respectively (12). Gattinoni et al. identified two primary phenotypes based on respiratory mechanics and response to ventilatory support (13). Rello et al. classified COVID-19 patients into five specific individual phenotypes, according to disease severity and hypoxemia management strategy (14). These phenotypes were derived in isolation and limited by sample size, whereas host responses to SARS-CoV-2 infection are vast and multidimensional and include immune dysfunction, abnormal coagulation, and varying degrees of organ failure (15). Different combinations of these features may cluster into novel clinical phenotypes, and patients in each phenotype may respond differently to treatments. However, whether such COVID-19 phenotypes can be derived from clinical data has not been explored.

Unsupervised machine learning approaches, such as consensus k means clustering (16) and latent class analysis (LCA) (17), have been used to identify distinct phenotypes in sepsis (18), acute respiratory distress syndrome (ARDS) (19) and other critical illnesses (20). Consensus clustering is a partitioning approach in which the clustering framework incorporates results from multiple runs of an inner-loop clustering algorithm. LCA is a well-validated statistical technique, which is a form of distribution mixture modeling used to estimate the best-fitting model for a dataset, based on the hypothesis that the data contain several unobserved groups or classes that are concealed within the observed multivariate distribution. Here, we used consensus k means clustering to derive phenotypes and assessed the reproducibility of the phenotypes using LCA.

The first goal of the study was to identify novel clinical phenotypes in ICU COVID-19 patients, using consensus k means clustering and LCA. The second goal was to develop parsimonious models that could ultimately be used prospectively to identify COVID-19 phenotypes.

Materials and Methods

Study Design and Participants

This single-center, retrospective, observational study was performed at Tongji Hospital, which was designated to admit patients with SARS-CoV-2 infection in Wuhan. Adult patients (≥18 years) with laboratory-confirmed SARS-CoV-2 infection who were admitted to ICUs between January 2020 and March 2020 were included in the present study. According to the World Health Organization guidance (21), laboratory confirmation of SARS-CoV-2 was defined as a positive result of a real-time reverse transcriptase–polymerase chain reaction assay of nasal and pharyngeal swabs.

This study was approved by the Research Ethics Commission of Tongji Hospital. Written informed consent was waived by the Ethics Commission because of the emergency circumstances. Patient-level informed consent was not required. Some of the patients in the present study have been described previously by Chen et al. (22) and Wang et al. (23).

Data Collection

All data were drawn from electronic health records at Tongji Hospital (Tongji cohort). Demographic data, chronic comorbidities, vital signs, and laboratory results within the first 24 h after ICU admission were collected, as well as treatments and outcomes. Because arterial oxygen partial pressure (PaO₂) was incompletely measured and recorded, we used pulse oxygen saturation (SpO₂) instead of PaO₂, together with the fraction of inspired oxygen (FIO₂). Sequential Organ Failure Assessment (SOFA) scores were calculated to determine the severity of illness using data from the first 24 h of ICU admission. All patients were closely followed until 28 days after ICU admission. Data were collected using a case record form modified from the standardized form of the International Severe Acute Respiratory and Emerging Infection Consortium.

Outcomes

The primary outcome in the present study was 28-day mortality. Secondary outcomes were the duration of hospital stay and complications during hospitalization, which included ARDS, septic shock, acute kidney injury, acute cardiac injury, and coagulopathy. The diagnosis of complications is presented in the Supplementary Material.

Clinical Variables for Phenotyping

We selected 26 candidate clinical variables based on their association with severity or outcome of COVID-19, including age, vital signs (heart rate, respiratory rate, temperature, mean blood pressure), markers of inflammation [white blood cell count (WBC count), neutrophil-to-lymphocyte ratio (NLR), high-sensitivity C-reactive protein (hs-CRP), interleukin 2R (IL-2R), IL-6, IL-8, and tumor necrosis factor α (TNF-α)], markers of organ dysfunction [hypersensitive troponin I (hs-TnI), international normalized ratio (INR), platelet (PLT) count, total bilirubin, creatinine, urea nitrogen, lactate dehydrogenase (LDH), and SpO₂/FIO₂ ratio], hemoglobin, red blood cell distribution width (RDW), D-dimer, fibrinogen, albumin, and glucose. All variables were collected within 24 h of ICU admission, and we recorded the most abnormal value if a variable was recorded more than once.
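As a minimal illustration of how two of the derived candidate variables and the "most abnormal value" rule might be computed, the sketch below uses hypothetical helper names. The units (SpO₂ in percent, FIO₂ as a fraction) and the direction treated as "abnormal" for each variable are assumptions made for this example, not details stated in the article.

```python
def neutrophil_lymphocyte_ratio(neutrophils, lymphocytes):
    """NLR from two counts expressed in the same unit (assumed 10^9 cells/L)."""
    return neutrophils / lymphocytes


def spo2_fio2_ratio(spo2_percent, fio2_fraction):
    """SpO2/FIO2 with SpO2 in percent and FIO2 as a fraction (assumed units)."""
    return spo2_percent / fio2_fraction


def most_abnormal(values, direction):
    """Pick the most abnormal of repeated measurements within 24 h.

    `direction` is an assumption supplied by the analyst: "high" when larger
    values are worse (e.g., NLR), "low" when smaller values are worse
    (e.g., SpO2/FIO2).
    """
    if direction not in ("high", "low"):
        raise ValueError("direction must be 'high' or 'low'")
    return max(values) if direction == "high" else min(values)


# Hypothetical repeated readings for one patient in the first 24 h.
nlr_readings = [neutrophil_lymphocyte_ratio(6.0, 1.0),
                neutrophil_lymphocyte_ratio(8.0, 0.5)]
sf_readings = [spo2_fio2_ratio(95.0, 0.25), spo2_fio2_ratio(90.0, 0.5)]

print(most_abnormal(nlr_readings, "high"))
print(most_abnormal(sf_readings, "low"))
```

Making the direction an explicit argument keeps the rule auditable, since "most abnormal" means the maximum for some variables and the minimum for others.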

Consensus k Means Clustering

Consensus k means clustering was applied to the 26 variables using a partitioning approach. We first assessed the candidate variable distributions, missingness, and correlation. Multiple imputation with chained equations (Additional Methods in Supplementary Material) was used to account for missing data; the dataset was standardized, and non-normally distributed variables were log-transformed prior to standardization. We then determined the optimal number of phenotypes with consensus k means clustering, according to the gap statistics, consensus matrix heatmaps, and adequate pairwise-consensus values between cluster members (>0.8). Once the optimal number was determined, we used rank plots of variables by mean standardized difference between phenotypes to visualize the patterns of clinical variables. We also conducted a sensitivity analysis after excluding highly correlated variables using rank-order statistics (r > 0.5). Additional details of consensus k means clustering are presented in the Supplementary Material.
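The preprocessing and resampling idea described above can be sketched as follows. This is not the code used in the study (the analyses were run in R); it is a small NumPy stand-in with hypothetical function names, an assumed 80% subsampling fraction, and synthetic data, shown only to make the consensus matrix and the pairwise-consensus criterion concrete.

```python
import numpy as np


def standardize(X, log_cols=()):
    """Log-transform the listed (skewed) columns, then z-score every column."""
    X = np.asarray(X, dtype=float).copy()
    for j in log_cols:
        X[:, j] = np.log(X[:, j])
    return (X - X.mean(axis=0)) / X.std(axis=0)


def kmeans_labels(X, k, rng, n_iter=50):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels


def consensus_matrix(X, k, n_resamples=100, frac=0.8, seed=0):
    """Fraction of resamples in which each pair of rows shares a cluster."""
    rng = np.random.default_rng(seed)
    n = len(X)
    together = np.zeros((n, n))
    sampled = np.zeros((n, n))
    for _ in range(n_resamples):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        labels = kmeans_labels(X[idx], k, rng)
        sampled[np.ix_(idx, idx)] += 1.0
        together[np.ix_(idx, idx)] += labels[:, None] == labels[None, :]
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(sampled > 0, together / sampled, 0.0)


def cluster_consensus(C, labels):
    """Mean pairwise consensus among members of each final cluster."""
    out = {}
    for c in np.unique(labels):
        members = np.flatnonzero(labels == c)
        block = C[np.ix_(members, members)]
        out[int(c)] = float(block[~np.eye(len(members), dtype=bool)].mean())
    return out


# Synthetic demo: two well-separated groups, one skewed column.
demo_rng = np.random.default_rng(0)
raw = np.vstack([demo_rng.normal(0, 1, (30, 4)), demo_rng.normal(10, 1, (30, 4))])
raw[:, 0] = np.exp(raw[:, 0])
X_demo = standardize(raw, log_cols=[0])
C_demo = consensus_matrix(X_demo, k=2)
final = kmeans_labels(X_demo, 2, np.random.default_rng(1))
print(cluster_consensus(C_demo, final))  # compare against the >0.8 criterion
```

In practice the matrix would be computed for several candidate values of k and inspected alongside the gap statistic, as described in the text.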

Latent Class Analysis

We further employed LCA to assess the reproducibility of the phenotypes. Similarly, all variables underwent standardized transformation and were log-transformed as appropriate. In the LCA, we estimated models ranging from one to five classes. The Akaike information criterion (AIC), Bayesian information criterion (BIC), entropy, class size (classes containing relatively small numbers of patients were not considered clinically meaningful), and the Vuong–Lo–Mendell–Rubin (VLMR) likelihood ratio test (which compares the fit of a model with k classes to one with k-1 classes) were used to determine the optimal number of classes. Once the number of classes was determined, each individual was assigned to a class according to model-generated probabilities. More details of LCA are presented in the Supplementary Material.
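For readers unfamiliar with these fit indices, the sketch below shows their standard definitions applied to the outputs of a fitted latent class model (a log-likelihood, a parameter count, and a matrix of posterior class probabilities). The function names are hypothetical, and the code is an illustration of the usual formulas rather than a reproduction of the Mplus analysis; the VLMR test is not implemented here.

```python
import numpy as np


def information_criteria(log_lik, n_params, n_obs):
    """AIC = -2 logL + 2p and BIC = -2 logL + p ln(n); lower is better."""
    aic = -2.0 * log_lik + 2.0 * n_params
    bic = -2.0 * log_lik + n_params * np.log(n_obs)
    return aic, bic


def relative_entropy(posterior):
    """Relative entropy in [0, 1]; values near 1 indicate clear class separation.

    `posterior` is an (n_patients, n_classes) array of class probabilities.
    """
    post = np.asarray(posterior, dtype=float)
    n, k = post.shape
    safe = np.clip(post, 1e-300, 1.0)          # avoid log(0)
    total = -(post * np.log(safe)).sum()
    return 1.0 - total / (n * np.log(k))


def assign_classes(posterior):
    """Assign each patient to the class with the highest posterior probability."""
    return np.asarray(posterior).argmax(axis=1)


# Hypothetical posterior probabilities for four patients, two classes.
post = np.array([[0.98, 0.02], [0.95, 0.05], [0.10, 0.90], [0.03, 0.97]])
print(assign_classes(post))
print(round(relative_entropy(post), 3))
print(information_criteria(log_lik=-100.0, n_params=5, n_obs=504))
```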

Parsimonious Algorithms to Classify COVID-19

Based on previous research, we attempted to construct a parsimonious model (three-variable or four-variable model) to predict phenotypes. First, machine learning algorithms, including classification tree with bootstrapped aggregating (bagging), extreme gradient boosting (XGBoost), and gradient boosted model (GBM), were used to identify the most important classifier variables. To select the most important variables, variable importance was used for the bagging model and XGBoost, and the relative influence of each variable was used for GBM. More details of the machine learning algorithms are presented in the Supplementary Material. Second, the five most important classifier variables common to all three machine learning algorithms were used to generate five logistic regression models (generated by sequential addition of the variables), and the receiver operating characteristic curve and area under the curve (AUC) were calculated for each model. AIC and DeLong's test were used to compare model performance. The best model was determined by a combination of accuracy, parsimony, and simplicity in clinical use. Additionally, to assess the clinical usefulness of the best model, decision curve analysis (DCA) was conducted by quantifying the net benefits at different threshold probabilities. Finally, after the best model was selected, a 10-fold cross-validation was applied to internally validate the stability of the model. This was performed by randomly splitting the patients into 10 equal samples. Nine-tenths of these samples were used to construct logistic regression models, and the model coefficients were applied to the remaining sample (1/10). This process was repeated 10 times, and the AUC for each fold was generated.
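The nested-model and cross-validation steps described above can be illustrated with the following NumPy sketch. It is not the study's R code: the function names, the Newton-Raphson fitting routine with a tiny ridge term, and the synthetic data are all assumptions made for the example, and DeLong's test is omitted.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25, ridge=1e-3):
    """Logistic regression by Newton-Raphson. The tiny ridge term is an
    assumption added only to keep the Hessian invertible."""
    Xb = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ beta, -30, 30)))
        grad = Xb.T @ (y - p) - ridge * beta
        hess = (Xb * (p * (1.0 - p))[:, None]).T @ Xb + ridge * np.eye(len(beta))
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def predict(X, beta):
    """Predicted probability of the positive class."""
    Xb = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-np.clip(Xb @ beta, -30, 30)))


def auc(y, score):
    """AUC as the probability that a positive outranks a negative."""
    diff = score[y == 1][:, None] - score[y == 0][None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()


def aic(y, prob, n_params):
    """AIC = 2p - 2 logL for a fitted binary model."""
    prob = np.clip(prob, 1e-12, 1 - 1e-12)
    log_lik = np.sum(y * np.log(prob) + (1 - y) * np.log(1 - prob))
    return 2 * n_params - 2 * log_lik


def nested_models(X, y):
    """Model j uses the first j columns of X (sequential addition)."""
    rows = []
    for j in range(1, X.shape[1] + 1):
        prob = predict(X[:, :j], fit_logistic(X[:, :j], y))
        rows.append((j, auc(y, prob), aic(y, prob, j + 1)))
    return rows


def cross_validated_auc(X, y, n_folds=10, seed=0):
    """Fit on nine-tenths of the patients, score the held-out tenth, repeat."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    scores = []
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        beta = fit_logistic(X[train], y[train])
        scores.append(auc(y[test_idx], predict(X[test_idx], beta)))
    return np.array(scores)


# Synthetic demo: five standardized predictors, only the first three informative.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
logit = 1.5 * X[:, 0] + 1.5 * X[:, 1] + 1.5 * X[:, 2] - 0.5
y = (rng.random(500) < 1 / (1 + np.exp(-logit))).astype(float)
for j, a, ic in nested_models(X, y):
    print(f"model {j}: AUC={a:.3f}  AIC={ic:.1f}")
print("10-fold AUCs, 3-variable model:", cross_validated_auc(X[:, :3], y).round(3))
```

The columns of `X` are assumed to be ordered by variable importance, so that model j contains the j most important predictors.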

Statistical Analysis

Values are presented as the mean (standard deviation) or median (interquartile range) for continuous variables as appropriate and as the total number (percentage) for categorical variables. Comparisons between groups were made using the χ² test or Fisher exact test for categorical variables and Student t-test or Mann–Whitney U-test for continuous variables as appropriate. A p < 0.05 was used to determine statistical significance for all tests. LCA was conducted using Mplus software (version 8.3). All other analyses were done using R (version 3.6.0).

Results

Patients

During the study period, a total of 504 patients with COVID-19 were included in the Tongji cohort. The study schematic is shown in Figure 1. In the Tongji cohort, 259 patients (51.4%) were male, the median age was 64 (52–72) years, and the median SOFA score was 3 (2–6). Within the first 24 h after ICU admission, 16 patients (3.2%) received vasopressor therapy, and 23 patients (4.6%) received invasive mechanical ventilation. The overall 28-day mortality rate was 33.7%.

FIGURE 1

Figure 1. Schematic of study. LCA, latent class analysis.

Derivation of Clinical Phenotypes for COVID-19

In the Tongji cohort, based on gap statistics, consensus matrix plots, and consensus values (Supplementary Figure 1), consensus k means clustering found that a two-class model was the optimal fit, yielding two distinct phenotypes of COVID-19. Ultimately, 322 patients (63.9%) were classified as the hypoactive phenotype, and 182 (36.1%) were classified as the hyperactive phenotype. Sensitivity analysis indicated that no substantial changes were evident after excluding variables with high correlation (Supplementary Table 3 and Supplementary Figure 2).

The characteristics of phenotypes in the two-class model are shown in Table 1 and Supplementary Figure 3. Rank plots of variables by the standardized mean difference between phenotypes are presented in Figure 2. Most variables were significantly different between the two phenotypes. Compared to patients with the hypoactive phenotype, those with the hyperactive phenotype were older and more likely to have elevated measures of inflammation (e.g., WBC count, NLR, hs-CRP, IL-2R, IL-6, IL-8, TNF-α), higher D-dimer, higher heart rate, higher respiratory rate, and more extreme laboratory values reflecting organ dysfunction (e.g., hs-TnI, INR, PLT count, total bilirubin, creatinine, urea nitrogen, LDH, and SpO₂/FIO₂). Additionally, in comparison with the hypoactive phenotype, the hyperactive phenotype had a significantly higher SOFA score on ICU admission and higher comorbidity rates (Supplementary Table 4).

TABLE 1

Table 1. Class-defining variables of phenotypes using consensus k means clustering.

FIGURE 2

Figure 2. Comparison of variables that contribute to clinical phenotypes in the Tongji cohort. Clinical phenotypes were derived from consensus k means clustering (A) and LCA (B). In all panels, the variables are standardized such that all means are scaled to 0 and SDs to 1. A value of 1 for the standardized variable value (x-axis) signifies that the mean value for the phenotype was 1 SD higher than the mean value for both phenotypes shown in the graph as a whole. RDW, red blood cell distribution width; MAP, mean arterial pressure; TNF-α, tumor necrosis factor α; INR, international normalized ratio; hs-CRP, high-sensitivity C-reactive protein; BUN, urea nitrogen; hs-TnI, hypersensitive troponin I; NLR, neutrophil-to-lymphocyte ratio.
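The standardization described in this caption can be sketched in a few lines. The function name, the tiny data set, and the two-phenotype labels below are hypothetical; the code only illustrates how per-phenotype means of z-scored variables, and their difference, could be computed and ranked for such a plot.

```python
import numpy as np


def phenotype_profile(X, labels, names):
    """Rank variables by the difference in standardized means between phenotypes.

    Each column of X is z-scored over the whole cohort (mean 0, SD 1); the
    returned rows are (name, mean in phenotype 0, mean in phenotype 1,
    difference), sorted from the most negative to the most positive difference.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    mean_0 = Z[labels == 0].mean(axis=0)
    mean_1 = Z[labels == 1].mean(axis=0)
    diff = mean_1 - mean_0
    order = np.argsort(diff)
    return [(names[j], float(mean_0[j]), float(mean_1[j]), float(diff[j]))
            for j in order]


# Hypothetical values for four patients: NLR and SpO2/FIO2.
X = [[3.0, 400.0], [5.0, 350.0], [15.0, 180.0], [20.0, 120.0]]
labels = [0, 0, 1, 1]  # 0 = hypoactive, 1 = hyperactive (assumed coding)
for row in phenotype_profile(X, labels, ["NLR", "SpO2/FIO2"]):
    print(row)
```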

Treatments and Outcomes in COVID-19 Phenotypes

A larger proportion of patients with the hyperactive phenotype received corticosteroid therapy (78.6 vs. 44.1%; p < 0.001), high-flow nasal cannula oxygen therapy (17.0 vs. 4.7%; p < 0.001), non-invasive mechanical ventilation (45.6 vs. 7.1%; p < 0.001), invasive mechanical ventilation (59.3 vs. 3.4%; p < 0.001), and renal replacement therapy (11.5 vs. 1.6%; p < 0.001) during their ICU stay, compared to those with the hypoactive phenotype (Supplementary Table 4). Patients assigned to the hyperactive phenotype had significantly higher 28-day mortality (74.3 vs. 10.8%; p < 0.001) and higher rates of organ dysfunction during their ICU stay compared to those assigned to the hypoactive phenotype (Table 2).

TABLE 2

Table 2. Comparison of clinical outcomes according to phenotypes using consensus k means clustering.

Reproducibility Using LCA

LCA confirmed the statistical fit of the two-class model. In LCA, using the VLMR test, a two-class model showed significantly improved fit compared with a one-class model (p = 0.0066), and no further improvement in model fit was observed with the three-class (p = 0.058), four-class (p = 0.41), or five-class model (p = 0.40). The two-class model had an entropy > 0.80, indicating strong separation between the classes (Supplementary Table 5). The two-class model classified 341 patients (67.7%) in class 1 (referred to as the hypoactive phenotype) and 163 patients (32.3%) in class 2 (referred to as the hyperactive phenotype). Average latent class probabilities were 0.98 for class 1 and 0.96 for class 2. The clinical characteristics of the phenotypes were similar when derived using this method, as were the rank plots (Figure 2 and Supplementary Table 6).

Parsimonious Algorithms to Predict Phenotypes of COVID-19

The most important classifier variables from the bagging, XGBoost, and GBM models are presented in Supplementary Table 7 and Supplementary Figures 4, 5. The top five variables were consistent across all three machine learning models and included NLR, SpO₂/FIO₂ ratio, LDH, TNF-α, and urea nitrogen; they were therefore selected as the best predictors for the parsimonious models. After five logistic models were constructed by sequential addition of the best predictors, improved model performance, with increased AUC and decreased AIC, was observed from model 1 to model 4 (Supplementary Table 8). Because TNF-α is not routinely tested in other hospitals, the three-variable (NLR, SpO₂/FIO₂ ratio, and LDH) and four-variable models (NLR, SpO₂/FIO₂ ratio, LDH, TNF-α) were both considered the best in terms of balancing classification accuracy and model simplicity.

Multivariable analyses showed that three variables or four variables in the model were all predictors of the phenotypes (Supplementary Table 9). The AUC was 0.95 [95% confidence interval (95% CI) = 0.94–0.97] for the three-variable model and 0.97 (95% CI = 0.96–0.98) for the four-variable model. The DCA curves indicated that the threshold probabilities were 0–0.95 for the three-variable model and 0–0.94 for the four-variable model (Figure 3). The mean AUCs of cross-validation for the three- and four-variable models were 0.95 (0.03) and 0.97 (0.02), respectively.
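To make the net-benefit quantity behind a decision curve concrete, the sketch below applies the standard definition (true positives per patient minus false positives per patient weighted by the odds of the threshold probability). The function names and the toy data are hypothetical, and this is an illustration of the usual formula rather than the code that produced Figure 3.

```python
import numpy as np


def net_benefit(y, prob, threshold):
    """Net benefit of classifying patients with prob >= threshold as positive."""
    y = np.asarray(y)
    prob = np.asarray(prob)
    flagged = prob >= threshold
    n = len(y)
    true_pos = np.sum(flagged & (y == 1))
    false_pos = np.sum(flagged & (y == 0))
    return true_pos / n - false_pos / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y, threshold):
    """Reference strategy: classify every patient as positive."""
    prevalence = np.mean(y)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical phenotype labels (1 = hyperactive) and model probabilities.
y = np.array([1, 1, 0, 0])
prob = np.array([0.9, 0.8, 0.2, 0.1])
for pt in (0.1, 0.5, 0.9):
    print(pt, net_benefit(y, prob, pt), net_benefit_treat_all(y, pt))
```

A model is considered useful over the range of threshold probabilities where its net benefit exceeds both the treat-all line and zero (the treat-none strategy).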

FIGURE 3

Figure 3. Receiver operating characteristic curves (A) and DCA (B) of the two best-performing regression models in Tongji cohort.

Discussion

The novel findings of our analyses can be summarized as follows. We identified two distinct COVID-19 phenotypes with different clinical and biological characteristics, mortality, and other clinical outcomes. We also developed a parsimonious model to predict phenotypes of COVID-19 using machine learning algorithms. These findings have important implications for early detection of patients who are likely to develop critical illness, as well as for future research in COVID-19.

Clinical and biological heterogeneity of critical illness (e.g., ARDS, sepsis) is thought to be a reason that pharmacotherapy trials have reached dead ends. No single clinical or biological variable was sufficient to identify phenotype (24). Put simply, no single clinical variable could be used to subdivide COVID-19. By contrast, based on 26 candidate clinical variables, we found that two distinct phenotypes of COVID-19 best described the present cohort using consensus k means clustering, and these phenotypes strongly correlated with the degree of the host response to SARS-CoV-2 infection. Specifically, compared to patients with the hypoactive phenotype, the host response of patients with the hyperactive phenotype seems to be more dysregulated, characterized by high plasma concentrations of inflammatory biomarkers, extreme coagulation abnormalities, and a high proportion of organ failure or injury on ICU admission. Furthermore, replication of these findings using LCA substantiates the robustness of the two phenotypes in the present cohort.

Several phenotypes of COVID-19 have been documented, with the aim of delivering “precision therapy.” Patients with COVID-19 pneumonia who present with low elastance, low ventilation-to-perfusion ratio, low lung weight, and low lung recruitability were classified as type L, whereas type H patients were characterized by high elastance, high ventilation-to-perfusion ratio, high lung weight, and high lung recruitability. Response to treatments, including higher FIO₂, higher positive end-expiratory pressure (PEEP), and prone positioning, may differ between type L and type H (13). The hyperactive phenotype in the present study and type H both seemed to represent a subset of COVID-19 patients who were severely ill. Unlike previous COVID-19 phenotypes, the phenotypes in the present study used only routinely available data associated with the degree of host response, regardless of the characteristics of chest imaging or respiratory mechanics, and can therefore be identified at the time of ICU admission. Besides, these phenotypes were multidimensional, differed in their laboratory abnormalities and patterns of organ dysfunction, and were not homologous with traditional patient groupings such as those based on severity score or a single variable.

We proposed a three-variable (NLR, SpO₂/FIO₂ ratio, and LDH) and a four-variable model (NLR, SpO₂/FIO₂ ratio, LDH, and TNF-α) for identifying the hyperactive phenotype of COVID-19. Unlike traditional forward stepwise modeling, we used three machine learning algorithms to identify the most important classifier variables. The ability to identify phenotypes using a small set of variables is a crucial step toward their clinical application. First, such models can help predict the occurrence of critical illness in COVID-19. Using data from 1,590 COVID-19 patients, Liang et al. (25) constructed a risk score including 10 variables to predict a patient's risk of developing critical illness; NLR [odds ratio (OR) = 1.06; 95% CI = 1.02–1.10] and LDH (OR = 1.002; 95% CI = 1.001–1.004) were likewise included in that risk model. However, the definition of “critical illness” was imprecise, being a composite of admission to the ICU, invasive ventilation, or death. Besides, the overall mortality was only 3.2%, implying that such a risk score may not be valid in real intensive care patients with COVID-19. In the present study, the ICU mortality of the Tongji cohort was in line with prior reports, and critically ill patients (hyperactive phenotype) were identified based on the clustering analysis and LCA, which maximized the differences between patients without taking the clinical outcome into account (26). Second, such models can help select more homogeneous patients for clinical trials: hypothetically, as in the series of ARDS studies, where the interactions between phenotypes and treatments (PEEP, fluid management, and simvastatin) were significant, treatment effects may differ between COVID-19 phenotypes.

Interestingly, unlike the ARDS phenotypes (24, 27), we observed that none of the inflammatory cytokines except TNF-α could predict COVID-19 phenotypes. Proinflammatory cytokine levels (IL-6, IL-8) in hyperinflammatory ARDS were at least 20-fold higher than those in hyperactive COVID-19 in our study, suggesting that COVID-19 is associated with only mild inflammatory cytokine elevation. An alternative mechanism of disease therefore seems likely (28) and warrants further research. Additionally, pulmonary-specific variables, such as PaO₂/FIO₂ ratio, seem to contribute less to phenotype identification in ARDS; nevertheless, SpO₂/FIO₂ ratio is a primary variable for classifying COVID-19 phenotype in the present study. A potential explanation for this finding is that patients were enrolled into ARDS clinical trials based on specific pulmonary criteria (e.g., PaO₂/FIO₂ ratio), whereas COVID-19 patients in the Tongji cohort are more heterogeneous with respect to pulmonary variables (e.g., SpO₂/FIO₂ ratio).

The first strength of our study is the identification of two phenotypes for intensive care patients with COVID-19 and the development of the first parsimonious model for predicting the hyperactive phenotype. The observational nature of the present study is another strength, as it included all consecutive patients with COVID-19 over 3 months, and the results are therefore more likely to represent the population encountered in the ICU in clinical practice.

This study also has several limitations. First, our study is a single-center, retrospective, observational study, and we lack external validation of the phenotypes and the parsimonious model. Testing for COVID-19 phenotypes in more heterogeneous samples is an important direction for future research. Second, the 26 candidate clinical variables did not fully reflect the host response to SARS-CoV-2 infection; we cannot exclude that adding other markers would yield different phenotypes. Third, whether these phenotypes are dynamic and change over time, resulting in distinct COVID-19 trajectories, is unknown. Finally, although the three- and four-variable models have good accuracy in predicting the phenotypes, when phenotypes are defined by the parsimonious model rather than by the clustering analysis or LCA, we may no longer detect statistically significant differences in outcomes and treatment responses.

Conclusion

In summary, this analysis confirmed the existence of two distinct phenotypes for intensive care patients with COVID-19. We also provide evidence for accurate parsimonious classifier models of COVID-19 phenotypes. Promisingly, these simple models may aid clinicians in predicting which COVID-19 patients are likely to develop critical illness, delivering timely treatments, and improving patient selection in clinical trials, which in turn could significantly impact patient outcomes.

Data Availability Statement

The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding authors.

Ethics Statement

The studies involving human participants were reviewed and approved by Tongji Hospital Ethics Committee. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

Author Contributions

YL and HC conceptualized the research aims, designed the study, and take responsibility for the integrity of the data and the accuracy of the data analysis. HC did the statistical analysis. HC and ZZ wrote the first draft of the manuscript. All authors contributed to acquisition of data, provided comments, and approved the final manuscript.

Funding

This work was supported by The Emergency Project for the Prevention and Control of the Novel Coronavirus Outbreak in Suzhou, Jiangsu Province, China (sys2020012).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

We thank all doctors who worked in the hospital during the period of patient recruitment as well as the patients who were involved in this study.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmed.2021.681336/full#supplementary-material

References

1. Huang C, Wang Y, Li X, Ren L, Zhao J, Hu Y, et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet. (2020) 395:497–506. doi: 10.1016/S0140-6736(20)30183-5

2. Guan WJ, Ni ZY, Hu Y, Liang WH, Ou CQ, He JX, et al. Clinical characteristics of coronavirus disease 2019 in China. New Engl J Med. (2020) 382:1708–20. doi: 10.1056/NEJMoa2002032

3. Richardson S, Hirsch JS, Narasimhan M, Crawford JM, McGinn T, Davidson KW, et al. Presenting characteristics, comorbidities, and outcomes among 5700 patients hospitalized with COVID-19 in the New York City area. JAMA. (2020) 323:2052–9. doi: 10.1001/jama.2020.6775

4. Chen N, Zhou M, Dong X, Qu J, Gong F, Han Y, et al. Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study. Lancet. (2020) 395:507–13. doi: 10.1016/S0140-6736(20)30211-7

5. Grasselli G, Pesenti A, Cecconi M. Critical care utilization for the COVID-19 outbreak in Lombardy, Italy: early experience and forecast during an emergency response. JAMA. (2020) 323:1545–6. doi: 10.1001/jama.2020.4031

6. Yang X, Yu Y, Xu J, Shu H, Liu H, Wu Y, et al. Clinical course and outcomes of critically ill patients with SARS-CoV-2 pneumonia in Wuhan, China: a single-centered, retrospective, observational study. Lancet Respir Med. (2020) 8:475–81. doi: 10.1016/S2213-2600(20)30079-5

7. Grasselli G, Zangrillo A, Zanella A, Antonelli M, Cabrini L, Castelli A, et al. Baseline characteristics and outcomes of 1591 patients infected with SARS-CoV-2 admitted to ICUs of the Lombardy Region, Italy. JAMA. (2020) 323:1574–81. doi: 10.1001/jama.2020.5394

8. Wang D, Hu B, Hu C, Zhu F, Liu X, Zhang J, et al. Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in Wuhan, China. JAMA. (2020) 323:1061–9. doi: 10.1001/jama.2020.1585

9. Shang L, Zhao J, Hu Y, Du R, Cao B. On the use of corticosteroids for 2019-nCoV pneumonia. Lancet. (2020) 395:683–4. doi: 10.1016/S0140-6736(20)30361-5

10. Russell CD, Millar JE, Baillie JK. Clinical evidence does not support corticosteroid treatment for 2019-nCoV lung injury. Lancet. (2020) 395:473–5. doi: 10.1016/S0140-6736(20)30317-2

11. Gattinoni L, Chiumello D, Rossi S. COVID-19 pneumonia: ARDS or not? Crit Care. (2020) 24:154. doi: 10.1186/s13054-020-02880-z

12. Azoulay E, Zafrani L, Mirouse A, Lengliné E, Darmon M, Chevret S. Clinical phenotypes of critically ill COVID-19 patients. Intens Care Med. (2020) 46:1651–2. doi: 10.1007/s00134-020-06120-4

13. Gattinoni L, Chiumello D, Caironi P, Busana M, Romitti F, Brazzi L, et al. COVID-19 pneumonia: different respiratory treatments for different phenotypes? Intens Care Med. (2020) 46:1099–102. doi: 10.1007/s00134-020-06033-2

14. Rello J, Storti E, Belliato M, Serrano R. Clinical phenotypes of SARS-CoV-2: implications for clinicians and researchers. Eur Respir J. (2020) 55:2001028. doi: 10.1183/13993003.01028-2020

15. Li H, Liu L, Zhang D, Xu J, Dai H, Tang N, et al. SARS-CoV-2 and viral sepsis: observations and hypotheses. Lancet. (2020) 395:1517–20. doi: 10.1016/S0140-6736(20)30920-X

16. Wilkerson MD, Hayes DN. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics. (2010) 26:1572–3. doi: 10.1093/bioinformatics/btq170

17. Rindskopf D, Rindskopf W. The value of latent class analysis in medical diagnosis. Stat Med. (1986) 5:21–7. doi: 10.1002/sim.4780050105

18. Seymour CW, Kennedy JN, Wang S, Chang CC, Elliott CF, Xu Z, et al. Derivation, validation, and potential treatment implications of novel clinical phenotypes for sepsis. JAMA. (2019) 321:2003–17. doi: 10.1001/jama.2019.5791

19. Sinha P, Delucchi KL, McAuley DF, O'Kane CM, Matthay MA, Calfee CS. Development and validation of parsimonious algorithms to classify acute respiratory distress syndrome phenotypes: a secondary analysis of randomised controlled trials. Lancet Respir Med. (2020) 8:247–57. doi: 10.1016/S2213-2600(19)30369-8

20. Vranas KC, Jopling JK, Sweeney TE, Ramsey MC, Milstein AS, Slatore CG, et al. Identifying distinct subgroups of ICU patients: a machine learning approach. Crit Care Med. (2017) 45:1607–15. doi: 10.1097/CCM.0000000000002548

21. Arabi YM, Mandourah Y, Al-Hameed F, Sindi AA, Almekhlafi GA, Hussein MA, et al. Corticosteroid therapy for critically ill patients with middle east respiratory syndrome. Am J Respir Crit Care Med. (2018) 197:757–67. doi: 10.1164/rccm.201706-1172OC

22. Chen H, Wang J, Su N, Bao X, Li Y, Jin J. Simplified immune-dysregulation index: a novel marker predicts 28-day mortality of intensive care patients with COVID-19. Intens Care Med. (2020) 46:1645–7. doi: 10.1007/s00134-020-06114-2

23. Wang Y, Lu X, Li Y, Chen H, Chen T, Su N, et al. Clinical course and outcomes of 344 intensive care patients with COVID-19. Am J Respir Crit Care Med. (2020) 201:1430–4. doi: 10.1164/rccm.202003-0736LE

24. Sinha P, Delucchi KL, Thompson BT, McAuley DF, Matthay MA, Calfee CS. Latent class analysis of ARDS subphenotypes: a secondary analysis of the statins for acutely injured lungs from sepsis (SAILS) study. Intens Care Med. (2018) 44:1859–69. doi: 10.1007/s00134-018-5378-3

25. Liang W, Liang H, Ou L, Chen B, Chen A, Li C, et al. Development and validation of a clinical risk score to predict the occurrence of critical illness in hospitalized patients with COVID-19. JAMA Intern Med. (2020) 180:1–9. doi: 10.1001/jamainternmed.2020.2033

26. Bos LD, Schouten LR, Van Vught LA, Wiewel MA, Ong DS, Cremer O, et al. Identification and validation of distinct biological phenotypes in patients with acute respiratory distress syndrome by cluster analysis. Thorax. (2017) 72:876–83. doi: 10.1136/thoraxjnl-2016-209719

27. Calfee CS, Delucchi K, Parsons PE, Thompson BT, Ware LB, Matthay MA. Subphenotypes in acute respiratory distress syndrome: latent class analysis of data from two randomised controlled trials. Lancet Respir Med. (2014) 2:611–20. doi: 10.1016/S2213-2600(14)70097-9

28. Leisman DE, Deutschman CS, Legrand M. Facing COVID-19 in the ICU: vascular dysfunction, thrombosis, and dysregulated inflammation. Intens Care Med. (2020) 46:1105–8. doi: 10.1007/s00134-020-06059-6

Keywords: COVID-19, phenotypes, machine learning, intensive care unit, 28-day mortality

Citation: Chen H, Zhu Z, Su N, Wang J, Gu J, Lu S, Zhang L, Chen X, Xu L, Shao X, Yin J, Yang J, Sun B and Li Y (2021) Identification and Prediction of Novel Clinical Phenotypes for Intensive Care Patients With SARS-CoV-2 Pneumonia: An Observational Cohort Study. Front. Med. 8:681336. doi: 10.3389/fmed.2021.681336

Received: 16 March 2021; Accepted: 29 April 2021;
Published: 04 June 2021.

Edited by:

Zhongheng Zhang, Sir Run Run Shaw Hospital, China

Reviewed by:

Arif Nur Muhammad Ansori, Airlangga University, Indonesia
Qilin Yang, The Second Affiliated Hospital of Guangzhou Medical University, China
Wei Cao, Peking Union Medical College Hospital (CAMS), China

Copyright © 2021 Chen, Zhu, Su, Wang, Gu, Lu, Zhang, Chen, Xu, Shao, Yin, Yang, Sun and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yongsheng Li, dr_ysli@126.com; Hui Chen, 15905162429@163.com

^†These authors have contributed equally to this work

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.