AUTHOR=Yang Kangping, Wu Jiaqiang, Xu Tian, Zhou Yuepeng, Liu Wenchun, Yang Liang TITLE=Machine learning to predict distant metastasis and prognostic analysis of moderately differentiated gastric adenocarcinoma patients: a novel focus on lymph node indicators JOURNAL=Frontiers in Immunology VOLUME=15 YEAR=2024 URL=https://www.frontiersin.org/journals/immunology/articles/10.3389/fimmu.2024.1398685 DOI=10.3389/fimmu.2024.1398685 ISSN=1664-3224 ABSTRACT=Background

Moderately differentiated gastric adenocarcinoma (MDGA) carries a high risk of metastasis and varies markedly between individuals, both of which strongly affect patient prognosis. Prediction based on large-scale datasets and machine learning algorithms can improve individualized treatment. How well several lymph node indicators predict distant metastasis (DM) and patient prognosis in MDGA remains unclear.

Methods

We collected data on MDGA patients recorded in the Surveillance, Epidemiology, and End Results (SEER) database between 2010 and 2019, together with data on MDGA patients in China. We used nine machine learning algorithms to predict DM. We then used Cox regression analysis to determine the risk factors affecting overall survival (OS) and cancer-specific survival (CSS) in DM patients and constructed nomograms. Finally, we used logistic regression and Cox regression analyses to assess the specific impact of six lymph node indicators on DM incidence and patient prognosis.
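The abstract does not state how the lymph node indicators were calculated. As a minimal sketch, assuming the definitions commonly used in the literature (negative nodes as examined minus positive, the lymph node ratio as positive over examined, and the log odds as the natural log of positive over negative counts with a 0.5 continuity correction), three of them can be derived from two counts. The function name, the dictionary keys, and the choice of natural logarithm are illustrative assumptions, not the authors' implementation.

```python
import math


def lymph_node_indicators(examined: int, positive: int) -> dict:
    """Derive lymph node indicators from two counts.

    Assumed definitions (not taken from the abstract):
      negative_nodes   = examined - positive
      lymph_node_ratio = positive / examined
      lodds            = ln((positive + 0.5) / (negative + 0.5))
    """
    if examined <= 0 or not 0 <= positive <= examined:
        raise ValueError("need examined > 0 and 0 <= positive <= examined")
    negative = examined - positive
    return {
        "negative_nodes": negative,
        "lymph_node_ratio": positive / examined,
        # The 0.5 correction keeps the log finite when either count is zero.
        "lodds": math.log((positive + 0.5) / (negative + 0.5)),
    }


# Hypothetical patient: 20 regional nodes examined, 5 of them positive.
example = lymph_node_indicators(examined=20, positive=5)
```

For this hypothetical patient the ratio is 0.25 and the log odds are negative, because negative nodes outnumber positive ones.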

Results

We collected data on 5,377 MDGA patients from the SEER database and 109 MDGA patients from hospitals. T stage, N stage, tumor size, primary site, number of positive lymph nodes, and chemotherapy were identified as independent risk factors for DM. The random forest prediction model had the best overall predictive performance (AUC = 0.919). T stage, primary site, chemotherapy, and the number of regional lymph nodes were identified as prognostic factors for OS. The same four factors were also prognostic for CSS. The nomograms showed good predictive value and stability in predicting the 1-, 3-, and 5-year OS and CSS in DM patients. Additionally, the log odds of metastatic lymph nodes and the number of negative lymph nodes may be risk factors for DM, while the regional lymph node ratio and the number of regional lymph nodes are prognostic factors for OS.
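To make the reported AUC concrete: the area under the ROC curve equals the probability that a randomly chosen patient with DM receives a higher predicted score than a randomly chosen patient without DM, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The function name and the toy labels and scores are illustrative assumptions and are unrelated to the study data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 outcomes (1 = distant metastasis in this sketch)
    scores: predicted risk scores, higher meaning more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # A correctly ordered pair counts 1, a tie counts 0.5, a reversed pair 0.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data only: three of the four positive/negative pairs are ordered correctly.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

The pairwise form is quadratic in the number of patients, so it suits illustration rather than large cohorts, where a rank-based computation is usually preferred.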

Conclusion

The random forest prediction model accurately identified high-risk populations, and we established OS and CSS survival prediction models for MDGA patients with DM. Our hospital samples demonstrated different characteristics of lymph node indicators in terms of distant metastasis and prognosis.