AUTHOR=Li Wenle , Zhou Qian , Liu Wencai , Xu Chan , Tang Zhi-Ri , Dong Shengtao , Wang Haosheng , Li Wanying , Zhang Kai , Li Rong , Zhang Wenshi , Hu Zhaohui , Shibin Su , Liu Qiang , Kuang Sirui , Yin Chengliang TITLE=A Machine Learning-Based Predictive Model for Predicting Lymph Node Metastasis in Patients With Ewing’s Sarcoma JOURNAL=Frontiers in Medicine VOLUME=9 YEAR=2022 URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2022.832108 DOI=10.3389/fmed.2022.832108 ISSN=2296-858X ABSTRACT=Objective

To provide a reference for clinicians and facilitate clinical work, we sought to develop and validate a risk prediction model for lymph node metastasis (LNM) of Ewing’s sarcoma (ES) based on machine learning (ML) algorithms.

Methods

Clinicopathological data of 923 ES patients from the Surveillance, Epidemiology, and End Results (SEER) database and 51 ES patients from a multi-center external validation set were retrospectively collected. We applied ML algorithms to establish a risk prediction model. Model performance was assessed using 10-fold cross-validation in the training set and receiver operating characteristic (ROC) curve analysis in the external validation set. After the best model was determined, a web-based calculator was built to promote clinical application.
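The 10-fold cross-validation step can be illustrated with a minimal sketch. It only shows how 923 training records might be partitioned into ten held-out folds; the function names, the shuffling seed, and the use of a plain (non-stratified) split are assumptions for illustration, not details reported by the study.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices 0..n_samples-1 and cut them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary assumption
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if i < extra else 0)
        folds.append(indices[start:start + size])
        start += size
    return folds


def train_validation_splits(n_samples, k=10, seed=0):
    """Yield (train, held_out) index lists, holding out each fold once."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, held_out in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, held_out


# 923 is the SEER training-set size given in the abstract.
splits = list(train_validation_splits(923, k=10))
```

Each record is held out exactly once, so every model is scored on data it was not fitted to in that round; the per-fold scores are then averaged.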

Results

LNM was confirmed or could not be evaluated in 13.86% (135 of 974) of ES patients. In multivariate logistic regression, race, T stage, M stage, and lung metastases were independent predictors of LNM in ES. Six prediction models were established using random forest (RF), naive Bayes classifier (NBC), decision tree (DT), XGBoost (XGB), gradient boosting machine (GBM), and logistic regression (LR). In 10-fold cross-validation, the average area under the curve (AUC) ranged from 0.705 to 0.764. In ROC curve analysis of the external validation set, AUC ranged from 0.612 to 0.727. The RF model performed best. Accordingly, a web-based calculator was developed (https://share.streamlit.io/liuwencai2/es_lnm/main/es_lnm.py).
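For readers less familiar with the AUC metric used to compare the six models, the sketch below computes it from its rank interpretation: the probability that a randomly chosen positive case receives a higher predicted risk than a randomly chosen negative case, with ties counted as one half. The function name and the toy labels and scores are made up for illustration; they are not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = LNM present, 0 = LNM absent.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly: 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the reported cross-validation and external-validation values.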

Conclusion

With the help of clinicopathological data, clinicians can better identify LNM in ES patients. The risk prediction models established in this study performed well, with the RF model performing best.