AUTHOR=Chen Hui, Zhu Zhu, Su Nan, Wang Jun, Gu Jun, Lu Shu, Zhang Li, Chen Xuesong, Xu Lei, Shao Xiangrong, Yin Jiangtao, Yang Jinghui, Sun Baodi, Li Yongsheng
TITLE=Identification and Prediction of Novel Clinical Phenotypes for Intensive Care Patients With SARS-CoV-2 Pneumonia: An Observational Cohort Study
JOURNAL=Frontiers in Medicine
VOLUME=8
YEAR=2021
URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2021.681336
DOI=10.3389/fmed.2021.681336
ISSN=2296-858X
ABSTRACT=
Background: Phenotypes with important prognostic and therapeutic implications have been identified within heterogeneous diseases such as acute respiratory distress syndrome and sepsis. The present study sought to assess whether phenotypes can be derived from intensive care patients with coronavirus disease 2019 (COVID-19), to assess their correlation with prognosis, and to develop a parsimonious model for phenotype identification.
Methods: Adult patients with COVID-19 from Tongji Hospital between January 2020 and March 2020 were included. Consensus k-means clustering and latent class analysis (LCA) were applied to identify phenotypes using 26 clinical variables. We then employed machine learning algorithms to select a maximum of five important classifier variables, which were further used to establish nested logistic regression models for phenotype identification.
Results: Both consensus k-means clustering and LCA showed that a two-phenotype model was the best fit for the present cohort (N = 504). A total of 182 patients (36.1%) were classified as having the hyperactive phenotype; they exhibited higher 28-day mortality and higher rates of organ dysfunction than those with the hypoactive phenotype. The top five variables used to assign phenotypes were neutrophil-to-lymphocyte ratio (NLR), the ratio of pulse oxygen saturation to the fractional concentration of oxygen in inspired air (Spo2/Fio2), lactate dehydrogenase (LDH), tumor necrosis factor α (TNF-α), and urea nitrogen. Among the nested logistic models, the three-variable (NLR, Spo2/Fio2 ratio, and LDH) and four-variable (three-variable plus TNF-α) models were adjudicated to be the best performing, with areas under the curve of 0.95 [95% confidence interval (CI) = 0.94–0.97] and 0.97 (95% CI = 0.96–0.98), respectively.
Conclusion: We identified two phenotypes within COVID-19, with different host responses and outcomes. The phenotypes can be accurately identified with parsimonious classifier models using three or four variables.
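
Illustrative sketch (not the authors' code): the nested logistic regression comparison described in the Methods and Results can be outlined as follows. Variables are added one at a time in the ranked order given in the abstract, and each model is scored by the area under the ROC curve. Everything here is an assumption made for illustration: the cohort is synthetic, the between-phenotype shifts in SHIFTS are arbitrary in size and sign, the function names are hypothetical, and the AUC is computed in-sample because the abstract does not describe the validation scheme. The numbers it prints therefore say nothing about the study's results.

```python
import math
import random

# Classifier variables in the order the abstract ranks them.
VARIABLES = ["NLR", "Spo2/Fio2", "LDH", "TNF-α", "urea nitrogen"]
# Hypothetical standardized mean shifts between phenotypes.
# Magnitudes and signs are arbitrary, not taken from the study.
SHIFTS = [1.5, -1.5, 1.2, 1.0, 0.8]


def make_synthetic_cohort(n=504, prevalence=0.36, seed=0):
    """Draw a synthetic cohort: label 1 = 'hyperactive', 0 = 'hypoactive'."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < prevalence else 0
        X.append([rng.gauss(label * s, 1.0) for s in SHIFTS])
        y.append(label)
    return X, y


def predict(w, x):
    """Logistic probability for one row; w = [bias, w1, ..., wk]."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.2, epochs=200):
    """Fit logistic regression by plain batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def auc(labels, scores):
    """ROC AUC via the Mann-Whitney statistic; ties count as 0.5."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def nested_aucs(X, y):
    """Return {k: in-sample AUC} for models using the first k variables."""
    out = {}
    for k in range(1, len(VARIABLES) + 1):
        Xk = [row[:k] for row in X]
        w = fit_logistic(Xk, y)
        out[k] = auc(y, [predict(w, row) for row in Xk])
    return out


if __name__ == "__main__":
    X, y = make_synthetic_cohort()
    for k, a in nested_aucs(X, y).items():
        print(f"{k}-variable model ({', '.join(VARIABLES[:k])}): AUC = {a:.3f}")
```

In this kind of comparison, the model is usually chosen at the point where adding another variable stops improving discrimination meaningfully. That is the sense in which the three- and four-variable models are called parsimonious.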