The final, formatted version of the article will be published soon.
ORIGINAL RESEARCH article
Front. Oncol.
Sec. Genitourinary Oncology
Volume 14 - 2024
doi: 10.3389/fonc.2024.1477166
Predicting distant metastasis of bladder cancer using multiple machine learning models: a study based on the SEER database with external validation
Provisionally accepted- Second Affiliated Hospital of Nanchang University, Nanchang, China
Background and purpose: Distant metastasis in bladder cancer is linked to poor prognosis and significant mortality. Machine learning, a key area of artificial intelligence, has shown promise in the diagnosis, staging, and treatment of bladder cancer. This study aims to employ various machine learning techniques to predict distant metastasis in bladder cancer patients.
Patients and methods: Patients diagnosed with bladder cancer in the Surveillance, Epidemiology, and End Results (SEER) database from 2000 to 2021 were included in this study. After a rigorous screening process, a total of 4,108 patients were selected for further analysis and divided in a 7:3 ratio into a training cohort and an internal validation cohort. Additionally, 118 patients treated at the Second Affiliated Hospital of Nanchang University were collected as an external validation cohort. Features were filtered using the Least Absolute Shrinkage and Selection Operator (LASSO) regression algorithm. Based on the significant features identified, three machine learning algorithms were employed to develop prediction models: Logistic Regression, Support Vector Machine (SVM), and Linear Discriminant Analysis (LDA). The predictive performance of the three models was evaluated using the area under the receiver operating characteristic curve (AUC), precision, accuracy, F1 score, and related metrics.
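The 7:3 split and the evaluation metrics named in the methods can be illustrated with a minimal standard-library sketch. It is not the authors' code: the function names, the random seed, and the rounding rule for the training-set size are assumptions made for illustration. The abstract also does not state how precision and F1 were averaged across classes, so the sketch computes them for the positive class (distant metastasis) only.

```python
import random


def split_7_3(n_patients, seed=0):
    """Shuffle patient indices and split them 70/30 (training / internal validation).

    The seed and the rounding of the training size are illustrative assumptions.
    """
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)
    n_train = round(0.7 * n_patients)
    return indices[:n_train], indices[n_train:]


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = distant metastasis)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case receives a higher
    score than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Calling `split_7_3(4108)` returns two disjoint index lists that together cover all 4,108 SEER patients. `classification_metrics` takes hard 0/1 predictions, whereas `roc_auc` takes continuous scores such as predicted probabilities, which is why the AUC is reported separately from the threshold-dependent metrics.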
Results: The proportion of patients with distant metastasis in the study population was 12.0% (n=495). LASSO regression analysis revealed that age, chemotherapy, tumor size, the examination of non-regional lymph nodes, and regional lymph node evaluation were significantly associated with distant metastasis of bladder cancer. In the internal validation cohort, the prediction accuracy of Logistic Regression, SVM, and LDA was 0.874, 0.877, and 0.845, respectively; precision was 0.805, 0.769, and 0.827, and F1 scores were 0.821, 0.819, and 0.835, respectively. The ROC curves showed that the area under the curve (AUC) was greater than 0.7 for all models. In the external validation cohort, the prediction accuracy of Logistic Regression, SVM, and LDA was 0.856, 0.848, and 0.797, respectively; precision was 0.877, 0.718, and 0.736, and F1 scores were 0.797, 0.778, and 0.762, respectively, with the ROC curves indicating that the AUCs also exceeded 0.7. Among the three models, logistic regression demonstrated better predictive efficiency than the other two. The three variables with the highest importance scores in logistic regression were non-regional lymph nodes, age, and chemotherapy.
Conclusion: The prediction models developed using three machine learning algorithms demonstrate strong accuracy and discriminative capability in predicting distant metastasis in patients with bladder cancer. This may aid clinicians in understanding patient prognosis and in formulating personalized treatment strategies, ultimately improving the overall prognosis for bladder cancer patients.
Keywords: machine learning, bladder cancer, SEER database, distant metastasis, predictive value
Received: 07 Aug 2024; Accepted: 19 Nov 2024.
Copyright: © 2024 Zou, Rao, Huang, Zhou, Zeng and Chao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Tao Zeng, Second Affiliated Hospital of Nanchang University, Nanchang, China
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.