AUTHOR=Sheng Wenbo, Wang Xiaoli, Xu Wenxiang, Hao Zedong, Ma Handong, Zhang Shaodian TITLE=Development and validation of machine learning models for venous thromboembolism risk assessment at admission: a retrospective study JOURNAL=Frontiers in Cardiovascular Medicine VOLUME=10 YEAR=2023 URL=https://www.frontiersin.org/journals/cardiovascular-medicine/articles/10.3389/fcvm.2023.1198526 DOI=10.3389/fcvm.2023.1198526 ISSN=2297-055X ABSTRACT=Introduction

Venous thromboembolism (VTE) risk assessment at admission is important for early screening, timely prophylaxis, and management during hospitalization. This study aimed to develop and validate novel risk assessment models for use at admission based on machine learning (ML) methods.

Methods

In this retrospective study, 3078 individuals were included, together with their Caprini variables recorded within 24 hours of admission. Several ML models were then built, including logistic regression (LR), random forest (RF), and extreme gradient boosting (XGB). The prediction performance of the ML models and of the Caprini risk score (CRS) was validated and compared using a series of evaluation metrics.
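The two headline metrics reported in the Results, AUROC and AUPRC, can be illustrated with a small dependency-free sketch. This is not the authors' pipeline: the function names, the toy labels, and both score vectors are hypothetical, and average precision is used here as one common estimator of AUPRC (the abstract does not say which estimator was applied).

```python
def auroc(labels, scores):
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive case.

    Assumes no tied scores; ties would be broken by sort order.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_pos = sum(labels)
    hits = 0
    ap = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            ap += hits / rank
    return ap / total_pos


# Toy data, invented for illustration only (1 = VTE event, 0 = no event).
labels = [0, 0, 1, 0, 1, 0, 0, 1]
scores_a = [0.10, 0.20, 0.80, 0.30, 0.60, 0.40, 0.05, 0.35]  # hypothetical model A
scores_b = [0.30, 0.50, 0.70, 0.20, 0.40, 0.60, 0.10, 0.15]  # hypothetical model B

for name, scores in [("A", scores_a), ("B", scores_b)]:
    print(name, round(auroc(labels, scores), 3),
          round(average_precision(labels, scores), 3))
```

Because VTE events are rare, AUPRC is sensitive to how well the few positives are ranked, which is why it is usually read alongside AUROC rather than instead of it.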

Results

The AUROC and AUPRC were 0.798 and 0.303 for LR, 0.804 and 0.360 for RF, and 0.796 and 0.352 for XGB, respectively; all three models significantly outperformed CRS (0.714 and 0.180, P < 0.001). When prediction scores were stratified into three risk levels for application, RF produced more reasonable results than CRS, with fewer false-positive alerts and a larger proportion of individuals assigned to lower risk levels. The improvement in stratification was further verified by net reclassification improvement (NRI) analysis.
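As a rough illustration of the stratification and NRI step, the sketch below maps scores to three risk levels and computes a categorical NRI. It assumes the categorical form of NRI (the abstract does not state which variant was used); the cutoffs, function names, and toy data are all hypothetical and are not taken from the study.

```python
def stratify(score, low_cut, high_cut):
    """Map a score to risk level 0 (low), 1 (moderate), or 2 (high)."""
    if score < low_cut:
        return 0
    if score < high_cut:
        return 1
    return 2


def categorical_nri(labels, old_levels, new_levels):
    """Net reclassification improvement of new_levels over old_levels.

    Events should move up in risk level; non-events should move down.
    """
    up_event = down_event = up_nonevent = down_nonevent = 0
    n_events = sum(labels)
    n_nonevents = len(labels) - n_events
    for y, old, new in zip(labels, old_levels, new_levels):
        if new > old:
            if y == 1:
                up_event += 1
            else:
                up_nonevent += 1
        elif new < old:
            if y == 1:
                down_event += 1
            else:
                down_nonevent += 1
    event_part = (up_event - down_event) / n_events
    nonevent_part = (down_nonevent - up_nonevent) / n_nonevents
    return event_part + nonevent_part


# Toy data, invented for illustration only (1 = VTE event, 0 = no event).
labels = [1, 1, 0, 0, 0, 0]
old_levels = [1, 2, 2, 1, 1, 0]  # levels from a hypothetical reference score
new_levels = [2, 2, 1, 0, 1, 1]  # levels from a hypothetical new model

print(categorical_nri(labels, old_levels, new_levels))
```

A positive NRI means the new stratification moves events toward higher levels and non-events toward lower levels more often than the reverse.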

Discussion

This study indicated that machine learning models could improve VTE risk prediction at admission compared with CRS. Among the ML models, RF was found to have superior performance and great potential in clinical practice.