AUTHOR=Yu Tao , Shen Runnan , You Guochang , Lv Lin , Kang Shimao , Wang Xiaoyan , Xu Jiatang , Zhu Dongxi , Xia Zuqi , Zheng Junmeng , Huang Kai TITLE=Machine learning-based prediction of the post-thrombotic syndrome: Model development and validation study JOURNAL=Frontiers in Cardiovascular Medicine VOLUME=9 YEAR=2022 URL=https://www.frontiersin.org/journals/cardiovascular-medicine/articles/10.3389/fcvm.2022.990788 DOI=10.3389/fcvm.2022.990788 ISSN=2297-055X ABSTRACT=Background

Prevention plays a key role in reducing the incidence of post-thrombotic syndrome (PTS). We aimed to develop accurate models using machine learning (ML) algorithms to predict whether PTS would occur within 24 months.

Materials and methods

The clinical data used for model building were obtained from the Acute Venous Thrombosis: Thrombus Removal with Adjunctive Catheter-Directed Thrombolysis study, and the external validation cohort was drawn from Sun Yat-sen Memorial Hospital in China. The main outcome was defined as the occurrence of a PTS event (Villalta score ≥5). Twenty-three clinical variables were included, and four ML algorithms were applied to build the models. F scores were used to evaluate the predictive ability of the models in terms of discrimination and calibration. The external validation cohort was divided into ten groups based on deciles of the risk estimates to identify the hazard threshold.
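The decile grouping described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `decile_groups`, the equal-count binning rule, and the summary fields are assumptions introduced here, and the abstract does not state how ties or uneven group sizes were handled.

```python
from statistics import mean


def decile_groups(risks, outcomes, n_groups=10):
    """Sort patients by predicted risk and split them into near-equal groups.

    risks    : predicted PTS probabilities, one per patient
    outcomes : observed PTS events (1 = Villalta score >= 5, 0 = no PTS)
    Returns one summary dict per non-empty group, lowest risk first.
    Assumption: groups are formed by rank, so sizes differ by at most one.
    """
    if len(risks) != len(outcomes):
        raise ValueError("risks and outcomes must have the same length")

    # Patient indices ordered from lowest to highest predicted risk.
    order = sorted(range(len(risks)), key=lambda i: risks[i])
    n = len(order)

    summary = []
    for g in range(n_groups):
        start = g * n // n_groups
        stop = (g + 1) * n // n_groups
        idx = order[start:stop]
        if not idx:
            continue  # fewer patients than groups
        summary.append({
            "group": g + 1,
            "n": len(idx),
            "mean_risk": mean(risks[i] for i in idx),
            "observed_rate": mean(outcomes[i] for i in idx),
        })
    return summary
```

Comparing `mean_risk` with `observed_rate` across the groups is one way to look for the predicted probability at which the observed PTS rate starts to rise, which is how a hazard threshold might be read off in practice.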

Results

In total, 555 patients with deep vein thrombosis (DVT) were included to build models using ML algorithms, and the models were further validated in a Chinese cohort comprising 117 patients. When predicting PTS within 2 years after acute DVT, logistic regression based on gradient descent and L1 regularization achieved the highest area under the curve (AUC) of 0.83 (95% CI: 0.76–0.89) in external validation. When considering model performance in both the derivation and external validation cohorts, the eXtreme gradient boosting and gradient boosting decision tree models had similar results and showed better stability and generalization. The external validation cohort was divided into low-, intermediate-, and high-risk groups using predicted probabilities of 0.3 and 0.4 as cut-off points.
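The three-level risk stratification can be illustrated with a short sketch. Only the cut-off values 0.3 and 0.4 come from the abstract; the function name `risk_group` and the boundary handling (a probability exactly equal to a cut-off falls into the higher group) are assumptions made here for illustration.

```python
LOW_CUTOFF = 0.3   # from the abstract: boundary between low and intermediate risk
HIGH_CUTOFF = 0.4  # from the abstract: boundary between intermediate and high risk


def risk_group(probability, low=LOW_CUTOFF, high=HIGH_CUTOFF):
    """Map a predicted PTS probability to a risk group label.

    Assumption: cut-offs are inclusive on the upper group, i.e.
    p < 0.3 -> low, 0.3 <= p < 0.4 -> intermediate, p >= 0.4 -> high.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability < low:
        return "low"
    if probability < high:
        return "intermediate"
    return "high"
```

For example, a patient with a predicted probability of 0.35 would be labelled `"intermediate"` under these assumptions.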

Conclusion

The machine learning models built to predict PTS showed accurate predictive ability and stable generalization, which can further facilitate clinical decision-making, with potentially important implications for selecting patients who will benefit from endovascular surgery.