AUTHOR=Qiu Binxu , Shen Zixiong , Yang Dongliang , Wang Quan TITLE=Applying machine learning techniques to predict the risk of lung metastases from rectal cancer: a real-world retrospective study JOURNAL=Frontiers in Oncology VOLUME=13 YEAR=2023 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2023.1183072 DOI=10.3389/fonc.2023.1183072 ISSN=2234-943X ABSTRACT=Background

Metastasis in the lungs is common in patients with rectal cancer, and it can have severe consequences on their survival and quality of life. Therefore, it is essential to identify patients who may be at risk of developing lung metastasis from rectal cancer.

Methods

In this study, we utilized eight machine-learning methods to create a model for predicting the risk of lung metastasis in patients with rectal cancer. Our cohort consisted of 27,180 rectal cancer patients selected from the Surveillance, Epidemiology and End Results (SEER) database between 2010 and 2017 for model development. Additionally, we validated our models using 1118 rectal cancer patients from a Chinese hospital to evaluate model performance and generalizability. We assessed our models’ performance using various metrics, including the area under the curve (AUC), the area under the precision-recall curve (AUPR), the Matthews Correlation Coefficient (MCC), decision curve analysis (DCA), and calibration curves. Finally, we applied the best model to develop a web-based calculator for predicting the risk of lung metastasis in patients with rectal cancer.

Result

Our study employed tenfold cross-validation to assess the performance of eight machine-learning models for predicting the risk of lung metastasis in patients with rectal cancer. The AUC values ranged from 0.73 to 0.96 in the training set, with the extreme gradient boosting (XGB) model achieving the highest AUC value of 0.96. Moreover, the XGB model obtained the best AUPR and MCC in the training set, reaching 0.98 and 0.88, respectively. We found that the XGB model demonstrated the best predictive power, achieving an AUC of 0.87, an AUPR of 0.60, an accuracy of 0.92, and a sensitivity of 0.93 in the internal test set. Furthermore, the XGB model was evaluated in the external test set and achieved an AUC of 0.91, an AUPR of 0.63, an accuracy of 0.93, a sensitivity of 0.92, and a specificity of 0.93. The XGB model obtained the highest MCC in the internal test set and external validation set, with 0.61 and 0.68, respectively. Based on the DCA and calibration curve analysis, the XGB model had better clinical decision-making ability and predictive power than the other seven models. Lastly, we developed an online web calculator using the XGB model to assist doctors in making informed decisions and to facilitate the model’s wider adoption (https://share.streamlit.io/woshiwz/rectal_cancer/main/lung.py).

Conclusion

In this study, we developed an XGB model based on clinicopathological information to predict the risk of lung metastasis in patients with rectal cancer, which may help physicians make clinical decisions.