- 1Department of Clinical Pharmacology, Xiangya Hospital, Central South University, Changsha, China
- 2National Clinical Research Center for Geriatric Disorders, Changsha, China
- 3Shenzhen Center for Chronic Disease Control, Shenzhen, China
- 4Department of Pulmonary and Critical Care Medicine, Hunan Provincial People's Hospital, The First Affiliated Hospital of Hunan Normal University, Changsha, China
- 5Department of Pulmonary and Critical Care Medicine, The Second Xiangya Hospital, Central South University, Changsha, China
- 6Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- 7B7 Department, Zhongfa District of Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
Motivation: The sudden deterioration of patients with novel coronavirus disease 2019 (COVID-19) into critical illness is a matter of great concern. Early identification and effective triaging of patients at high risk of developing critical COVID-19 upon admission can aid in improving patient care, increasing the cure rate, and mitigating the burden on the medical care system. This study applied and extended classical least absolute shrinkage and selection operator (LASSO) logistic regression to objectively identify clinical determinants and risk factors for the early identification of patients at high risk of progression to critical illness at the time of hospital admission.
Methods: In this retrospective multicenter study, data of 1,929 patients with COVID-19 were assessed. The association between laboratory characteristics measured at admission and critical illness was screened with logistic regression. LASSO logistic regression was utilized to construct predictive models for estimating the risk that a patient with COVID-19 will develop a critical illness.
Results: The development cohort consisted of 1,363 patients with COVID-19, of whom 133 (9.8%) developed critical illness. Univariate logistic regression analysis revealed that 28 variables were prognostic factors for critical COVID-19 (p < 0.05). Elevated CK-MB, neutrophils, PCT, α-HBDH, D-dimer, LDH, glucose, PT, APTT, RDW (SD and CV), fibrinogen, and AST were predictors for the early identification of patients at high risk of progression to critical illness. Lymphopenia, thrombopenia, low proportions of basophils and eosinophils, low red blood cell count, hematocrit, hemoglobin concentration, and blood platelet count, and decreased levels of K, Na, albumin, albumin to globulin ratio, and uric acid were clinical determinants associated with the development of critical illness at the time of hospital admission. The risk score accurately predicted critical illness in the development cohort [area under the curve (AUC) = 0.83, 95% CI: 0.78–0.86] and in the external validation cohort (n = 566, AUC = 0.84).
Conclusion: A risk prediction model based on laboratory findings of patients with COVID-19 was developed for the early identification of patients at high risk of progression to critical illness. This cohort study identified 28 indicators associated with critical illness in patients with COVID-19. The risk model might enable treatment of critical illness as early as possible and allow for optimized use of medical resources.
Introduction
The coronavirus disease 2019 (COVID-19) pandemic is spreading worldwide. COVID-19 is a communicable disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). As of 14 February 2022, the WHO had reported 412,044,520 confirmed COVID-19 cases globally, with an average mortality rate of 1.4%. The clinical spectrum of COVID-19 ranges from asymptomatic infection and mild upper respiratory tract illness to critical illness (1). It has been reported that about 5% of patients with COVID-19 deteriorate rapidly from the onset of symptoms into critical illness (2), with a mortality rate of 61.5% among critical cases within 28 days of hospital admission (3). Treating critically ill patients places great pressure on medical services and, in particular, leads to a shortage of intensive care resources. Therefore, early identification and effective triaging of patients at high risk of developing critical COVID-19 upon admission can aid in improving patient care, increasing the cure rate, and mitigating the burden on the medical care system.
The risk factors for critical illness are not well understood. Previous reports have identified that older age, organ dysfunction, neutrophilia, preexisting concurrent cardiovascular or cerebrovascular diseases, coagulopathy, amounts of CD3+CD8+ T cells, and elevated D-dimer levels are associated with the development of acute respiratory distress syndrome and increased mortality risk (1, 4–9). A limited number of publications have identified chest radiographic abnormality, older age, hemoptysis, dyspnea, unconsciousness, number of comorbidities, cancer history, neutrophil-to-lymphocyte ratio (10), lactate dehydrogenase (LDH), and direct bilirubin as risk factors associated with the development of critical illness (11, 12). Clinical scores for predicting which patients with COVID-19 will develop critical illness were developed with these 10 factors (11, 12) and showed good discrimination. In addition, Schalekamp et al. developed an integrated model using patient history, laboratory markers, and chest radiography at hospital admission to predict critical illness (13). However, in these models, some diagnoses of co-existing illnesses and symptoms came from patients' self-reports at admission, which might introduce recall bias.
Mathematical modeling with appropriate inputs can make predictions about the dynamics and control of infectious diseases. A series of mathematical models have been developed on the transmission dynamics and control of COVID-19 or the SARS-CoV-2 virus in different regions (14–24), including Wuhan, Italy, and the USA. In this retrospective multicenter study, we applied and extended classical least absolute shrinkage and selection operator (LASSO) logistic regression for the early identification of patients at high risk of progression to critical illness. We systematically analyzed the accessible laboratory findings of 1,929 patients with confirmed COVID-19 and clear prognostic information from 32 hospitals in Hubei and Hunan provinces of China and identified robust and meaningful factors associated with critical illness. The laboratory findings were measured objectively. A risk prediction model was constructed using LASSO logistic regression to help identify, at the time of hospital admission, patients who are at high risk of developing critical illness. This model aims to distinguish patients at imminent risk of critical illness, thereby optimizing the allocation of limited healthcare resources and potentially lowering the mortality rate.
Methods
Data Collection
This study was approved by the Institute of Clinical Pharmacology, Central South University. Given the urgent need to collect and analyze data on this emerging pathogen, the ethics committee of the Institute of Clinical Pharmacology, Central South University granted a waiver of written informed consent from study participants. Medical records of hospitalized patients with COVID-19 diagnosed in 31 hospitals in China (4 hospitals in Hubei Province and 27 hospitals in Hunan Province) were collected. All patients who were diagnosed with COVID-19 by positive high-throughput sequencing or real-time reverse-transcription PCR (RT-PCR) assay of nasal and pharyngeal swab specimens were screened. Our study enrolled all adult inpatients (≥18 years old) who were hospitalized for COVID-19 and had an explicit outcome with respect to critical illness. The data were cross-checked by experienced respiratory clinicians. All patients with data on clinical status at hospitalization (laboratory findings, critical illness, and discharge status) were included.
Clinical Outcome
The outcome of this study was critical illness, defined as a composite of invasive ventilation, admission to the intensive care unit (ICU), or death in patients with COVID-19 (25). The follow-up time was calculated from the first day of hospitalization to the date of death or discharge, or the censoring date (12 April 2020 for the development cohort and 11 June 2020 for the validation cohort).
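The outcome definition and follow-up calculation described above can be sketched in Python as follows. This is an illustrative sketch only, not the authors' code (the study was analyzed in R). The function and argument names are hypothetical, and the choice not to count the admission day itself in the follow-up time is an assumption.

```python
from datetime import date

# Censoring date of the development cohort, as stated in the text.
DEVELOPMENT_CENSOR_DATE = date(2020, 4, 12)

def is_critical(invasive_ventilation, icu_admission, died):
    """Composite outcome: any one of the three events counts as critical illness."""
    return bool(invasive_ventilation or icu_admission or died)

def follow_up_days(admission, end_date, censor_date):
    """Days from admission to death/discharge, or to the censoring date.

    end_date is None for a patient still hospitalized at censoring.
    Assumption: the admission day itself is not counted.
    """
    if end_date is None or end_date > censor_date:
        end_date = censor_date
    return (end_date - admission).days
```

A patient admitted on 1 February 2020 and discharged on 15 February 2020 would have 14 days of follow-up under this convention.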
Potential Predictive Variables
Demographic variables and laboratory findings of patients at hospital admission were collected as potential predictive variables. Demographic variables included age and gender. Laboratory findings were taken as the first measurement within 2 days after admission. Laboratory indexes with complete measurements for more than 50% of the patients in the development cohort were collected: hematologic (hematocrit, basophils, eosinophils, lymphocytes, monocytes, neutrophils, mean corpuscular volume, hemoglobin concentration, coefficient of variation [CV] and SD of red blood cell volume distribution width [RDW], blood platelet count, thrombocytocrit, red blood cell, and white blood cells), biochemical [levels of glucose, K, Na, total Ca, Cl, total protein, lactate dehydrogenase (LDH), glutamic-pyruvic transaminase, creatine kinase, aspartate transaminase (AST), creatine kinase muscle-brain isoform (CK-MB), creatinine, ureophil, albumin, globulin, albumin to globulin ratio, and glomerular filtration rate (GFR)], coagulation function indexes [levels of D-dimer and fibrinogen, activated partial thromboplastin time (APTT), and prothrombin time (PT)], infection-related indices [levels of C-reactive protein (CRP), procalcitonin (PCT), and alpha hydroxybutyrate dehydrogenase (α-HBDH)], and the level of uric acid. For the complete laboratory findings and the corresponding ratio of missing values, please refer to Supplementary Table 1.
Statistical Analysis
Continuous variables were presented as mean (SD) or median [interquartile range (IQR)], and categorical variables as n (%).
A total of 1,255 patients hospitalized with COVID-19 in the development cohort were included for variable selection. To assess the association between the quantitative laboratory findings described above and the occurrence of critical illness, a univariate logistic regression analysis was conducted. Because the odds ratio (OR) is interpreted per unit change, ORs were standardized across variables with different ranges by fitting logistic regression to the dichotomous outcome (1 = with the occurrence of critical illness and 0 = without the occurrence of critical illness), with quartiles of each of the 38 laboratory findings modeled as a continuous score (<25% quartile = 1; ≥25% and <50% quartile = 2; ≥50% and <75% quartile = 3; and ≥75% quartile = 4). The association between the occurrence of critical illness and age (≥55 vs. <55 years) was also evaluated.
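The quartile coding described above can be illustrated with a short Python sketch. It is not the authors' R code; the function name is hypothetical, and the particular quantile definition (the "inclusive" method of the standard library) is an assumption, since the text does not say how the quartile cut points were computed.

```python
from bisect import bisect_right
from statistics import quantiles

def quartile_scores(values):
    """Map each measurement to a score of 1 to 4 by the quartile it falls into."""
    # Three cut points: the 25th, 50th, and 75th percentiles of the sample.
    cuts = quantiles(values, n=4, method="inclusive")
    # bisect_right counts the cut points that are <= value, so
    # value < Q1 -> 1, Q1 <= value < Q2 -> 2, Q2 <= value < Q3 -> 3, value >= Q3 -> 4.
    return [bisect_right(cuts, v) + 1 for v in values]
```

The resulting score would then enter the univariate logistic regression as a single continuous covariate, so that the OR for every laboratory finding is expressed per one-quartile increase.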
The 28 statistically significant covariates (p < 0.05) in the univariate logistic analysis were selected as candidates for risk score development of critical illness. A total of 1,064 patients with at least 80% data completeness for these 28 variables were used for model establishment. We applied predictive mean matching to impute numeric features (laboratory findings) for these 1,064 patients with the “mice” package in R.
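The completeness criterion above can be sketched as a simple filter. This is a hedged Python illustration with hypothetical names and record layout (each patient as a dictionary with `None` for a missing measurement); it is not the authors' code, and the imputation step itself is not shown.

```python
def completeness(record, variables):
    """Fraction of the listed variables that are not missing (None) in one record."""
    present = sum(1 for name in variables if record.get(name) is not None)
    return present / len(variables)

def filter_complete(records, variables, threshold=0.8):
    """Keep records with at least `threshold` completeness on the candidate variables."""
    return [r for r in records if completeness(r, variables) >= threshold]
```

With 28 candidate variables, a threshold of 0.8 means a patient needs at least 23 non-missing measurements to be retained (22/28 is about 0.79, which falls below the threshold).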
Prediction models were developed with LASSO logistic regression, support vector regression (SVR), artificial neural network (ANN), regression tree (RT), and multivariate adaptive regression splines (MARS) machine learning techniques. We used the “glmnet” (14) package for LASSO, the “e1071” package for SVR, the “RSNNS” package for ANN, the “rpart” package for RT, and the “earth” package for MARS. Default parameters were used. L1-penalized (LASSO) logistic regression with 1,000-fold cross-validation for internal validation was utilized. LASSO logistic regression is a logistic regression model that penalizes the absolute size of the regression coefficients according to the value of λ. During estimation, the coefficients of unimportant variables can be shrunk exactly to 0, which achieves variable screening. In contrast to ridge regression, the penalty term in LASSO regression is the sum of the absolute values of the coefficients, i.e., L1 regularization. The estimates of weaker factors shrink toward zero as the penalty increases, so that only the strongest predictors are left in the model. We selected the most predictive covariates using the value of λ that minimized the cross-validated error (minimum criteria). Subsequently, variables identified by LASSO regression analysis were used to construct the risk score (RS) with their coefficients:
$$\mathrm{RS} = \sum_{i=1}^{n} \mathrm{Coe}_i \times \mathrm{Value}_i$$

where n stands for the number of prognostic variables in the model; Value_i is the original value of variable i; and Coe_i is the estimated coefficient of Value_i in the LASSO logistic regression model. The probability of developing critical illness was calculated with the following formula: probability = exp(RS)/[1 + exp(RS)].
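The shrinkage behavior of the L1 penalty and the conversion of a risk score into a probability can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: soft-thresholding is the exact LASSO solution only in the special case of standardized, uncorrelated predictors with squared-error loss, and in solvers such as glmnet it appears as the per-coefficient update inside coordinate descent. All names and coefficient values below are made up for illustration and are not the fitted values of Table 2.

```python
import math

def soft_threshold(z, lam):
    """Soft-thresholding: sign(z) * max(|z| - lam, 0)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # weak coefficients are set exactly to zero

def risk_score(values, coefficients):
    """RS = sum of Coe_i * Value_i over the variables kept in the model."""
    return sum(coef * values[name]
               for name, coef in coefficients.items() if coef != 0.0)

def probability(rs):
    """exp(RS) / (1 + exp(RS)), written to avoid overflow for large |RS|."""
    if rs >= 0:
        return 1.0 / (1.0 + math.exp(-rs))
    e = math.exp(rs)
    return e / (1.0 + e)

# Hypothetical unpenalized coefficients (illustrative numbers only).
raw = {"var_a": 2.0, "var_b": -0.5, "var_c": 0.25}
# With penalty 0.5, only the strongest coefficient survives.
shrunk = {name: soft_threshold(b, 0.5) for name, b in raw.items()}
```

Here `shrunk` keeps a nonzero coefficient only for `var_a`, so the risk score of a patient depends on that variable alone; increasing the penalty further would eventually remove it as well.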
We used receiver operating characteristic (ROC) curves to compare the sensitivity and specificity of scores generated with different machine learning techniques. The abscissa and ordinate of a ROC curve are the false-positive rate and the true-positive rate, respectively. Each point on the curve corresponds to one classification threshold, so the ROC curve shows the performance of a classification model at all classification thresholds. The area under the ROC curve (AUROC), namely, the entire two-dimensional area underneath the ROC curve, was used as the measure of discrimination. AUROC shows how well the model distinguishes between classes: the larger the AUROC value, the better the model separates them. The R package “ROCR” was used to calculate the AUROC.
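As a minimal sketch of what the AUROC measures, the Python function below computes it through its rank interpretation: the probability that a randomly chosen patient with critical illness receives a higher score than a randomly chosen patient without it, with ties counted as one half. This is an illustration with hypothetical names, not the ROCR implementation used in the study.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison of positive (1) and negative (0) cases."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive case ranked above negative case
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in the number of patients, which is acceptable for cohorts of this size; a rank-based formula would give the same value more efficiently.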
To explore temporal changes in laboratory findings during hospitalization, differences between critical illness groups during follow-up in laboratory findings were estimated from linear mixed models with R package “nlme.”
Details of the samples used at each stage of statistical analysis are depicted in Figure 1. All statistical analyses were conducted with R software (version 3.6.2, R Foundation), and p-values were computed from two-tailed tests of statistical significance with a type I error rate of 5%.
Figure 1. Study flowchart detailing which samples were utilized at each phase of statistical analysis. COVID-19, coronavirus disease 2019.
External Model Validation
To validate the generalizability of the risk scores, we used an independent cohort from hospitals in Hunan province including 566 patients. We collected the same variables required for calculating the risk score from the validation cohort and cross-checked them. The 432 patients with at least 80% data completeness of the 28 variables used for model development were selected. The laboratory findings were imputed and the risk score was calculated as described for the development cohort. To assess the discriminative ability, the AUCs were evaluated.
Results
Characteristics of the Cohorts
The development cohort included 1,363 patients from 4 hospitals in Hubei, of whom 133 (9.8%) eventually developed critical illness. The median follow-up time was 14 days. The mean (SD) age of patients in this cohort was 57.84 (16.29) years; 634 patients (46.52%) were men. The validation cohort included 566 patients with a mean (SD) age of 45.94 (15.33) years; 291 (51.41%) were men. The median follow-up time was 13 days. Critical illness eventually developed in 28 (4.24%) of these patients.
Prognostic Factors of Critical Illness
A total of 39 features were tested for associations with critical illness in the development cohort with univariate logistic regression analysis. The results for the 1,255 patients showed that 28 variables were prognostic factors for critical COVID-19 (p < 0.05, Table 1, Figure 2). The odds of critical illness were higher in patients older than 65 years. Laboratory results showed that elevated CK-MB, neutrophils, PCT, α-HBDH, D-dimer, LDH, glucose, PT, APTT, RDW (SD and CV), fibrinogen, and AST were associated with critical illness. Compared with the non-critical illness group, patients in the critical illness group showed lymphopenia and thrombopenia; lower proportions of basophils and eosinophils; lower red blood cell count, hematocrit, hemoglobin concentration, and blood platelet count; and decreased levels of K, Na, albumin, albumin to globulin ratio, and uric acid.
Table 1. Laboratory characteristics among patients who did not or did develop critical illness in the development cohort.
Figure 2. Prognostic associations of clinical characteristics and laboratory findings in the development dataset. Unadjusted ORs (boxes) and corresponding 95% CIs (horizontal lines) for variables associated with the development of critical illness are represented. Box size is inversely proportional to the standard error of OR. The variables are stratified as quartiles. OR, odds ratio. CI, confidence interval.
Longitudinal Observations of Laboratory Variables
To determine the major clinical features that appeared during COVID-19 disease progression, the dynamic changes in the 28 clinical laboratory parameters that were measured within 2 days after hospital admission and associated with critical illness, including hematological and biochemical parameters, were recorded from day 3 to day 25 after hospital admission. The temporal changes in laboratory findings during hospitalization were explored (Figure 3). Baseline lymphocyte count was significantly lower in critical illness than in non-critical illness patients. Levels of CRP, D-dimer, LDH, and glucose were clearly elevated in the critical illness group compared with the non-critical illness group throughout the clinical course in the development dataset. Furthermore, compared with the non-critical illness group, neutrophils, α-HBDH, and globulin were increased in the critical illness group, while eosinophils and albumin were decreased.
Figure 3. Temporal changes in laboratory findings from illness onset in patients hospitalized with COVID-19. Temporal changes in neutrophils (A), lymphocytes (B), eosinophils (C), D-dimer (D), alpha hydroxybutyrate dehydrogenase (E), lactate dehydrogenase (F), C-reactive protein (G), albumin (H), and glucose (I) in the development dataset are presented. Differences between critical illness patients and non-critical illness patients are shown with p-values calculated with linear mixed models. The dashed lines in black and red show the lower and upper normal limits of each laboratory finding.
Construction of the Risk Models and their Performances
A total of 28 variables determined at hospital admission and associated with critical illness (Figure 2) were included in the model development. Prediction models were constructed using LASSO logistic regression, SVR, ANN, RT, and MARS, and their performance was evaluated by ROC analysis (Figure 4). Although the predictive ability of ANN and SVR in the development cohort was better than that of the other algorithms, the LASSO logistic regression and ANN models outperformed the other algorithms in the validation dataset (Figure 4D). We selected the LASSO logistic regression model for its high predictive power and interpretability. In LASSO regression, after excluding irrelevant and redundant features (Figures 4A,B), 21 features remained, including age, use of ARB drugs, and blood test results: lymphocytes, neutrophils, blood platelet, thrombocytocrit, RDW (CV and SD), hematocrit, hemoglobin concentration, AST, CK-MB, albumin, LDH, glucose, K, Na, CRP, PCT, PT, APTT, fibrinogen, and uric acid. The risk score was constructed based on the coefficients from the LASSO logistic model (Table 2) and then converted into a probability with the formulas presented in the Methods section. In internal validation with 100 bootstrap resamples, the mean AUC based on data from the development cohort was 0.83 (95% CI, 0.78–0.86) (Figure 4C). Variables utilized in the risk score for the validation cohort are shown in Table 3. The accuracy of the COVID risk score in the validation cohort was similar to that in the development cohort, with an AUC of 0.84 (Figure 4D).
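A bootstrap estimate of the mean AUC and its interval, of the kind reported above, could be obtained along the following lines. This Python sketch rests on assumptions the text does not confirm: that patients were resampled with replacement, that the interval is a percentile interval, and that resamples containing a single outcome class are redrawn. The function names and the fixed random seed are illustrative, and the code is not the authors' R implementation.

```python
import random

def auroc(scores, labels):
    """AUROC as the share of positive/negative pairs ranked correctly (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc(scores, labels, n_boot=100, seed=0):
    """Mean AUROC and an approximate 95% percentile interval over bootstrap resamples."""
    n = len(scores)
    if not 0 < sum(labels) < n:
        raise ValueError("labels must be 0/1 and contain both classes")
    rng = random.Random(seed)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:                         # redraw single-class resamples
            aucs.append(auroc([scores[i] for i in idx], ys))
    aucs.sort()
    mean = sum(aucs) / n_boot
    lower = aucs[int(0.025 * (n_boot - 1))]
    upper = aucs[int(0.975 * (n_boot - 1))]
    return mean, lower, upper
```

With only 100 resamples the percentile limits are coarse; a larger `n_boot` would give a smoother interval at proportionally higher cost.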
Figure 4. Feature selection using the least absolute shrinkage and selection operator (LASSO) logistic regression model. (A) LASSO coefficient profiles of the 29 baseline features. (B) Tuning parameter (λ) selection in the LASSO model used 1,000-fold cross-validation via minimum criteria. Receiver operating characteristic curves for the performance of different machine learning techniques in distinguishing individuals with COVID-19 from those with critical illness COVID-19 in the training cohort (C) and validation cohort (D), respectively. AUC, area under the receiver operating characteristic curve. The true positive rate represents model sensitivity, whereas the false positive rate is one minus the specificity.
Table 2. Coefficients of LASSO logistic regression model for predicting development of critical illness in 1,064 patients hospitalized with COVID-19 in the development dataset.
Discussion
Early identification of patients with COVID-19 at risk of progression to critical illness will aid in better patient management and effective use of healthcare resources. In this study, we found that older age, higher levels of laboratory indexes such as CRP, LDH, and glucose, and lower levels of laboratory indexes such as lymphocytes and albumin on admission were associated with a higher probability of critical COVID-19. In addition, a clinical risk score based on LASSO logistic regression was developed to predict the development of critical illness in patients with COVID-19 with satisfactory accuracy according to the AUC (0.83). Generally, the 21 variables required for estimating the probability of developing critical illness can be easily obtained from routine tests at hospital admission. The robustness and applicability of the risk score were confirmed in the independent validation dataset (AUC = 0.84).
Univariate analyses revealed that factors including age, neutrophils, D-dimer, LDH, CRP, glucose, APTT, fibrinogen, AST, and several other biochemical parameters were associated with critical illness. In addition, the dynamic profile of the significant laboratory findings was tracked. Levels of neutrophils, LDH, D-dimer, glucose, CRP, α-HBDH, and globulin were higher in the critical illness group than in the non-critical illness group, whereas eosinophil counts and albumin were lower. A prediction model for critical illness was developed with 21 predictors that were found to be independently correlated with critical illness via multivariate LASSO logistic regression analysis. Previous studies have found several of these variables to be prognostic factors for patients with COVID-19. It has been reported that elderly patients were more commonly critically ill with COVID-19 (3, 26, 27) and had a higher probability of death (28, 29). Modelli and colleagues revealed that the 28-day fatality rate was associated with increasing age, hypertension, cardiovascular disease, and higher body mass index (17), in agreement with the previous work.
Lymphopenia, leukocytosis (with increased absolute neutrophil counts), eosinopenia, neutrophilia, and increased CRP and PCT, which reflect a persistent state of inflammation (30), may be related to the cytokine storm and cellular immune deficiency induced by virus invasion (27, 31). Zhou et al. found lower lymphocyte counts and higher LDH in patients who died from COVID-19 (1). Injured alveolar epithelial cells could lead to the infiltration of lymphocytes, resulting in persistent lymphopenia (32, 33). Lymphopenia is a common characteristic in patients with COVID-19 and might play an important role in the disease process (34, 35). Zhang et al. noted that 53% of patients admitted with COVID-19 had eosinopenia on the day of hospital admission (36). Calabrese et al. reported that lymphocyte and platelet counts were the most important features able to stratify patients into different clinical clusters (37). Ewan et al. demonstrated that risk stratification was improved by blood and physiological parameters (C-reactive protein, neutrophil/lymphocyte ratio, and neutrophil count) measured at hospital admission (20). Such findings are consistent with this work. A higher level of LDH is an indication of the activity and severity of idiopathic pulmonary fibrosis and is one of the most important prognostic biomarkers of lung injury (37). LDH was reported to be higher in patients with severe COVID-19 and those who received ICU treatment than in mild and non-ICU patients (27, 30, 38, 39), and it is utilized as a valuable prognostic predictor (40, 41). In addition, patients with elevated CK-MB levels on hospital admission were at significantly increased risk of critical illness. Li and colleagues found that cardiac injury (elevated LDH and CK-MB levels) was associated with severe disease or ICU admission and death in patients with COVID-19 (42).
Increased PT and APTT and decreased blood platelet count, thrombocytocrit, and fibrinogen, which reflect coagulation activation, might be associated with the sustained inflammatory response. Banoei et al. noted that prothrombin and lactate were the most differentiating biochemical markers in the mortality prediction model (18).
Since hyperglycemia is harmful to the management of inflammation and viremia, the association between the level of glucose and critical illness in COVID-19 is not surprising. Based on big data analysis of a cohort of 7,337 COVID-19 cases, Zhu et al. revealed that diabetics with better-controlled blood glucose had a lower risk of death than diabetics with poorly controlled blood glucose (43). Banoei and colleagues demonstrated that disease, coronary artery disease, dementia, age > 65, and altered mental status were the topmost differentiating mortality predictors (22).
Previous studies have identified that 15–53% of cases reported abnormal levels of AST during disease progression (44–47). In a study conducted by Huang et al. (48), the elevation of AST was found in 8 (62%) of 13 patients in the ICU compared with 7 (25%) of 28 COVID-19 infected cases who did not need ICU care. Abnormal liver tests occur in most hospitalized patients with COVID-19 and may be associated with ICU admission, mechanical ventilation (48), and death (28, 48). Liver damage (decreased albumin and increased globulin) in patients with COVID-19 infections might be associated with the direct effect of the viral infection of liver cells, drug hepatotoxicity, or immune-mediated inflammation (37), such as cytokine storm and pneumonia-associated hypoxia.
The features retained in our model are broadly similar to those reported by prediction models for the dynamics and control of COVID-19 infection, particularly regarding aging, hypertension, CRP, LDH, prothrombin, lactate, and neutrophil levels (14–24). The main advantage of LASSO logistic regression is that a variable with a large parameter estimate is shrunk to a smaller estimate, while a variable with a small parameter estimate is shrunk to 0. The parameter estimation of the LASSO analysis is continuous, which makes it suitable for model selection with high-dimensional data.
In the development dataset, we found that the discriminative abilities of SVR, ANN, RT, and MARS exceeded that of LASSO logistic regression as evaluated by AUCs. However, in the independent validation dataset, the predictive ability of LASSO logistic regression was the best among all algorithms, so we selected it. The observation that the models incorporating the highest level of non-linearity displayed better in-sample prediction but worse out-of-sample performance may be explained by over-fitting of the ANN, RT, MARS, and SVR algorithms (45). The linear predictor used in LASSO logistic regression performed worse in-sample but generated the best out-of-sample predictions.
There are inevitably limitations in our retrospective study. The primary one is the incomplete laboratory findings in the electronic database and the lack of CT images, which decreases the statistical power of the LASSO logistic regression model. Therefore, important information might be missed, and further prospective studies are required. However, our model has a certain tolerance to missing data, as high performance measured by AUC was achieved on the development and external validation datasets for samples missing up to 20% of the predictors. Second, since the algorithms we tried are purely data-driven, the performance of these models may vary if they are developed with different datasets. We believe that more accurate models can be obtained as more datasets become available. Third, the data for risk score development and validation are from two provinces of China, which could potentially limit the generalizability of the risk model. Further studies on different populations all over the world with larger patient cohorts are needed to validate our findings.
Conclusion
In summary, this study identified 28 indicators (such as age, LDH, CRP, and lymphocytes) associated with critical illness in patients with COVID-19. The longitudinal laboratory variables were explored. A risk score to estimate the risk of developing critical illness among patients with COVID-19 was developed based on 21 variables independently associated with critical illness and commonly measured on hospital admission. The risk model is especially valuable for early detection of, and intervention in, critical COVID-19, thereby improving clinical strategies against COVID-19, optimizing the use of healthcare resources, and potentially reducing mortality in patients with COVID-19.
Data Availability Statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Author Contributions
YF: conceptualization and writing. WZho: resources and data curation. TL, JL, KX, XM, LX, and JJ: resources. HZ: supervision. RL: project administration and supervision. WZha: funding acquisition. All authors contributed to the article and approved the submitted version.
Funding
This study was supported by the National Scientific Foundation of China (Nos. 81874329, 81573511, and 81522048).
Conflict of Interest
YF was employed by Cofoe Medical Technology Co., Ltd.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher's Note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Acknowledgments
We thank all study participants, their families, medical staff, and participating hospitals for their involvement and support in this study. We are grateful to the High Performance Computing Center of Central South University for assistance with the computations.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpubh.2022.880999/full#supplementary-material
Keywords: COVID-19, risk factors, critical illness, machine learning, LASSO regression
Citation: Fu Y, Zhong W, Liu T, Li J, Xiao K, Ma X, Xie L, Jiang J, Zhou H, Liu R and Zhang W (2022) Early Prediction Model for Critical Illness of Hospitalized COVID-19 Patients Based on Machine Learning Techniques. Front. Public Health 10:880999. doi: 10.3389/fpubh.2022.880999
Received: 22 February 2022; Accepted: 13 April 2022; Published: 24 May 2022.
Edited by:
Subhas Khajanchi, Presidency University, India
Reviewed by:
Amar Nath Chatterjee, K.L.S. College, India
Fahad Al Basir, Asansol Girls' College, India
Copyright © 2022 Fu, Zhong, Liu, Li, Xiao, Ma, Xie, Jiang, Zhou, Liu and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Rong Liu, liuronghyw@csu.edu.cn; Wei Zhang, csuzhangwei@csu.edu.cn
†These authors have contributed equally to this work