Prevention is central to reducing the incidence of post-thrombotic syndrome (PTS). We aimed to develop accurate machine learning (ML) models to predict whether PTS would occur within 24 months after acute deep vein thrombosis.
The clinical data used for model building were obtained from the Acute Venous Thrombosis: Thrombus Removal with Adjunctive Catheter-Directed Thrombolysis (ATTRACT) study, and the external validation cohort was drawn from Sun Yat-sen Memorial Hospital in China. The main outcome was the occurrence of PTS (Villalta score ≥5). Twenty-three clinical variables were included, and four ML algorithms were applied to build the models. The models' predictive ability was evaluated in terms of discrimination and calibration, using F scores. The external validation cohort was divided into ten groups by deciles of estimated risk to identify the hazard threshold.
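The F score and the decile grouping described above can be illustrated with a minimal sketch. It assumes binary labels (1 = PTS, 0 = no PTS) and predicted risks in [0, 1]; the function names `f_beta` and `decile_event_rates` are hypothetical and are not taken from the study's code.

```python
from typing import List, Sequence, Tuple


def f_beta(y_true: Sequence[int], y_pred: Sequence[int], beta: float = 1.0) -> float:
    """F-beta score from binary labels and binary predictions (beta=1 gives F1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def decile_event_rates(
    y_true: Sequence[int], risk: Sequence[float], n_groups: int = 10
) -> List[Tuple[float, float]]:
    """Sort patients by predicted risk, split them into n_groups nearly equal
    groups, and return (mean predicted risk, observed event rate) per group."""
    order = sorted(range(len(risk)), key=lambda i: risk[i])
    n = len(order)
    out = []
    for g in range(n_groups):
        idx = order[g * n // n_groups:(g + 1) * n // n_groups]
        if not idx:  # fewer patients than groups: skip empty groups
            continue
        mean_risk = sum(risk[i] for i in idx) / len(idx)
        observed = sum(y_true[i] for i in idx) / len(idx)
        out.append((mean_risk, observed))
    return out
```

Comparing the observed event rate with the mean predicted risk in each decile group is one common way to see where risk begins to rise sharply; how the authors chose the threshold from these groups is not specified here.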
In total, 555 patients with deep vein thrombosis (DVT) were included to build the models, which were further validated in a Chinese cohort of 117 patients. For predicting PTS within 2 years after acute DVT, logistic regression based on gradient descent and L1 regularization achieved the highest area under the curve (AUC) in external validation, 0.83 (95% CI: 0.76–0.89). Considering performance in both the derivation and external validation cohorts, the eXtreme gradient boosting and gradient boosting decision tree models performed similarly and showed better stability and generalization. The external validation cohort was divided into low-, intermediate-, and high-risk groups, using predicted probabilities of 0.3 and 0.4 as cutoffs.
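As a rough illustration of the two quantities reported above, the sketch below computes a rank-based AUC and assigns risk groups from the 0.3 and 0.4 cutoffs. The function names are hypothetical. How a probability exactly equal to a cutoff is classified is an assumption made here, since the text does not say which side the boundaries fall on.

```python
from typing import Sequence


def auc(y_true: Sequence[int], score: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, score) if t == 1]
    neg = [s for t, s in zip(y_true, score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def risk_group(p: float, low_cut: float = 0.3, high_cut: float = 0.4) -> str:
    """Map a predicted PTS probability to a risk group.

    Assumption: each cutoff belongs to the higher group
    (p == 0.3 is 'intermediate', p == 0.4 is 'high').
    """
    if p < low_cut:
        return "low"
    if p < high_cut:
        return "intermediate"
    return "high"
```

For example, `risk_group(0.35)` returns `"intermediate"` under the boundary convention assumed in this sketch.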
The ML models built for PTS showed accurate predictive ability and stable generalization. They could facilitate clinical decision-making, with potentially important implications for selecting patients who will benefit from endovascular surgery.