AUTHOR=Hwangbo Suhyun , Kim Yoonjung , Lee Chanhee , Lee Seungyeoun , Oh Bumjo , Moon Min Kyong , Kim Shin-Woo , Park Taesung TITLE=Machine learning models to predict the maximum severity of COVID-19 based on initial hospitalization record JOURNAL=Frontiers in Public Health VOLUME=10 YEAR=2022 URL=https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2022.1007205 DOI=10.3389/fpubh.2022.1007205 ISSN=2296-2565 ABSTRACT=Background

As the worldwide spread of coronavirus disease 2019 (COVID-19) persists, early prediction of each patient's maximum severity is needed for effective treatment.

Objective

This study aimed to develop predictive models for the maximum severity of hospitalized COVID-19 patients using artificial intelligence (AI)/machine learning (ML) algorithms.

Methods

The medical records of 2,263 COVID-19 patients admitted to 10 hospitals in Daegu, Korea, from February 18, 2020, to May 19, 2020, were comprehensively reviewed. Each patient's maximum severity during hospitalization was classified into one of four levels: mild, moderate, severe, or critical. The patients' initial hospitalization records were used as predictors. The total dataset was randomly split into a training set and a testing set in a 2:1 ratio, taking into account the four maximum severity groups. Predictive models were developed using the training set and evaluated using the testing set. Two approaches were used: classification into the four original severity groups (i.e., 4-group classification) and classification into two groups formed by regrouping the four severity levels (i.e., binary classification). Three variable selection methods, including randomForestSRC, were applied. For 4-group classification, GUIDE and the proportional odds model were used. For binary classification, five AI/ML algorithms were used, including a deep neural network and GUIDE.
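The splitting and regrouping steps described above can be sketched as follows. This is a minimal illustration only, not the authors' code: the function names `stratified_split` and `binarize`, the random seed, and the toy labels are hypothetical, and reading "Above Moderate" as "moderate or worse" is an assumption.

```python
import random
from collections import defaultdict

# Ordinal severity levels named in the abstract, from least to most severe.
SEVERITY_LEVELS = ["mild", "moderate", "severe", "critical"]


def stratified_split(labels, train_fraction=2 / 3, seed=0):
    """Return (train_idx, test_idx), splitting each severity group separately
    so that both sets keep roughly the same group proportions."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for i, label in enumerate(labels):
        by_group[label].append(i)

    train_idx, test_idx = [], []
    for level in SEVERITY_LEVELS:
        idx = by_group.get(level, [])
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)  # about 2/3 of each group
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(test_idx)


def binarize(labels, cutoff):
    """Map ordinal severity to 0/1: 1 if the level is at or above `cutoff`.
    (Assumed reading of the regrouping, e.g. cutoff='moderate' gives
    'Mild vs. Above Moderate'.)"""
    threshold = SEVERITY_LEVELS.index(cutoff)
    return [int(SEVERITY_LEVELS.index(label) >= threshold) for label in labels]


# Synthetic toy labels, NOT study data.
labels = ["mild"] * 30 + ["moderate"] * 60 + ["severe"] * 18 + ["critical"] * 12
train_idx, test_idx = stratified_split(labels)

# One binary target per cutoff: moderate, severe, critical.
binary_targets = {c: binarize(labels, c) for c in SEVERITY_LEVELS[1:]}
```

Splitting within each group, rather than over the pooled records, keeps the smaller severe and critical groups represented in both the training and testing sets.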

Results

Of the four maximum severity groups, the moderate group was the largest (1,115 patients; 49.5%). Simple linear trend analysis identified 25 statistically significant predictors associated with worsening maximum severity. Among the models developed, the following three binary classification models showed high predictive performance: (1) Mild vs. Above Moderate, (2) Below Moderate vs. Above Severe, and (3) Below Severe vs. Critical. These three models achieved AUC values of 0.883, 0.879, and 0.887, respectively. Based on the results of the three predictive models, we developed web-based nomograms for clinical use (http://statgen.snu.ac.kr/software/nomogramDaeguCovid/).
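For readers unfamiliar with the AUC metric reported above, the sketch below shows one standard way to compute it: as the fraction of positive/negative pairs in which the positive case receives the higher predicted score. The function name `roc_auc` and the toy scores are hypothetical; this is not the evaluation code used in the study.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison: the probability that
    a randomly chosen positive case outscores a randomly chosen negative case,
    with ties counted as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic toy example, NOT study data: 1 = more severe group.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, scores)
print(auc)  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.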

Conclusions

We developed web-based nomograms that predict the maximum severity of COVID-19. These nomograms are expected to help clinicians plan effective treatment for each patient.