Lymph node (LN) metastasis is strongly associated with distant metastasis of renal cell carcinoma (RCC) and indicates an adverse prognosis. Accurate prediction of LN status is essential for individualized treatment of patients with RCC and helps physicians make appropriate surgical decisions. A model that estimates the risk of LN metastasis in patients with RCC is therefore needed.
A subset of patient data was extracted from the Surveillance, Epidemiology, and End Results (SEER) database. Data from 492 patients with RCC treated at the Southwest Hospital in Chongqing, China, were used for external validation. Eight predictors of LN metastasis were identified. Six machine learning (ML) classifiers were built and tuned to predict LN metastasis in patients with RCC. The models combined big data analytics with ML algorithms. Based on the best-performing model, we developed an online risk calculator and plotted overall survival curves using Kaplan–Meier analysis.
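As a minimal sketch of the Kaplan–Meier (product-limit) estimator that underlies such survival curves, the Python below computes the survival probability at each observed event time. The function name `kaplan_meier`, the follow-up times, and the group labels are illustrative assumptions; the toy values are invented and are not taken from the study data or code.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Product-limit estimate of the survival function.

    durations: follow-up time for each patient (any consistent unit)
    events:    1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # deaths plus censorings at each time

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone who died or was censored at t leaves the risk set.
        at_risk -= exits[t]
    return curve


# Hypothetical toy data (months of follow-up), NOT from the study.
predicted_n0 = kaplan_meier([12, 30, 42, 60, 60], [0, 1, 0, 0, 0])
predicted_n1 = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])

for label, curve in (("predicted N0", predicted_n0), ("predicted N1", predicted_n1)):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

Censored patients contribute to the risk set until their last follow-up but never trigger a drop in the curve, which is why the estimator suits registry data with incomplete follow-up. A formal comparison between groups would additionally require a test such as the log-rank test, which this sketch does not implement.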
The extreme gradient-boosting (XGB) model outperformed the other models in both the internal and external tests. The area under the curve, accuracy, sensitivity, and specificity were 0.930, 0.857, 0.856, and 0.873, respectively, in the internal test and 0.958, 0.935, 0.769, and 0.944, respectively, in the external test. These metrics indicate that the XGB model is well suited to clinical application. The survival analysis showed that patients with predicted N1 tumors had significantly shorter survival (
Our study shows that integrating ML algorithms and clinical data can effectively predict LN metastasis in patients with confirmed RCC. Subsequently, a freely available online calculator (