AUTHOR=Qin Qiangqiang, Li Qingxuan, Zhu Guiyin, Yu Haiyang, Peng Mingyan, Wu Shuang, Xu Xue, Gu Wen, Guo Xuejun
TITLE=Development of a COVID-19 early risk assessment system based on multiple machine learning algorithms and routine blood tests: a real-world study
JOURNAL=Frontiers in Immunology
VOLUME=15
YEAR=2024
URL=https://www.frontiersin.org/journals/immunology/articles/10.3389/fimmu.2024.1430899
DOI=10.3389/fimmu.2024.1430899
ISSN=1664-3224
ABSTRACT=Background

During the coronavirus disease 2019 (COVID-19) epidemic, the rapid spread of the disease placed an enormous burden on healthcare systems and economies worldwide. An early risk assessment system based on multiple machine learning (ML) algorithms may provide more accurate guidance on the classification of COVID-19 patients and offer predictive, preventive, and personalized medicine (PPPM) solutions in the future.

Methods

In this retrospective study, we first divided one portion of the data into training and validation cohorts in a 7:3 ratio and established models based on combinations of two ML algorithms. We then used another portion of the data as an independent testing cohort to identify the most accurate and stable model and compared it with other scoring systems. Finally, patients were categorized according to their risk scores, and the correlation between their clinical data and risk scores was examined.
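The cohort split and risk grouping described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `split_7_3` and `stratify_by_tertile` are hypothetical, and the tertile cut points are an assumption, since the abstract does not state how the risk groups were defined.

```python
import random


def split_7_3(records, seed=0):
    """Shuffle records and split them 70/30 into training and validation lists."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def stratify_by_tertile(scores):
    """Label each risk score low/medium/high.

    ASSUMPTION: tertile cut points are used here purely for illustration;
    the source does not specify the thresholds.
    """
    ordered = sorted(scores)
    n = len(ordered)
    low_cut = ordered[n // 3]
    high_cut = ordered[(2 * n) // 3]
    labels = []
    for s in scores:
        if s < low_cut:
            labels.append("low")
        elif s < high_cut:
            labels.append("medium")
        else:
            labels.append("high")
    return labels
```

In practice the split would be applied to patient records and the grouping to the model's predicted risk scores; the independent testing cohort is drawn from a separate portion of the data and is not part of this split.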

Results

The elderly accounted for the majority of hospitalized patients with COVID-19. The model built by combining the stepcox[both] and survivalSVM algorithms had a C-index of 0.840 in the training cohort and 0.815 in the validation cohort, and it had the highest C-index in the testing cohort among all model combinations, outperforming the other 119. Compared with current scoring systems, including CURB-65 and several previously reported prognostic models, our model had the highest AUC value (0.778), indicating better predictive performance. In addition, the model's AUC values at specific time points (days 7, 14, and 28) demonstrated excellent predictive performance. Most importantly, we stratified patients according to the model's risk score and demonstrated a difference in survival status among the high-risk, medium-risk, and low-risk groups, indicating that a new and stable risk assessment system had been built. Finally, we found that COVID-19 patients with a history of cerebral infarction had a significantly higher risk of death.
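For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. The function name `concordance_index` and the toy inputs are illustrative assumptions, not the authors' code; tied risk scores are counted as half-concordant, which is one common convention.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data.

    times: follow-up time per patient
    events: 1 if death was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk.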

Conclusion

This novel risk assessment system is highly accurate in predicting the prognosis of patients with COVID-19, especially elderly patients, and can be readily applied within the PPPM framework. Our ML model facilitates stratified patient management while promoting the optimal use of healthcare resources.