ORIGINAL RESEARCH article

Front. Oncol.
Sec. Cancer Immunity and Immunotherapy
Volume 14 - 2024 | doi: 10.3389/fonc.2024.1488118
This article is part of the Research Topic: Unveiling Biomarkers and Mechanisms in the Tumor-Immune Nexus

Personalized three-year survival prediction and prognosis forecast by interpretable machine learning for pancreatic cancer patients: a population-based study and an external validation

Provisionally accepted
Buwei Teng 1, Xiaofeng Zhang 1, Mingshu Ge 1*, Miao Miao 1*, Wei Li 1*, Jun Ma 2*
  • 1 The First People’s Hospital of Lianyungang, Lianyungang, Jiangsu Province, China
  • 2 Affiliated Huai’an Hospital of Xuzhou Medical University, Xuzhou, Jiangsu Province, China

The final, formatted version of the article will be published soon.

    Purpose: The overall survival of patients with pancreatic cancer is extremely low. We aimed to establish machine learning (ML)-based models to accurately predict three-year survival and prognosis in patients with pancreatic cancer.

    Methods: We analyzed pancreatic cancer patients recorded in the Surveillance, Epidemiology, and End Results (SEER) database between 2000 and 2021. Univariate and multivariate logistic regression analyses were used to select variables. A Recursive Feature Elimination (RFE) method based on six ML algorithms was used for feature selection. To construct the predictive model, 13 ML algorithms were evaluated by area under the curve (AUC), accuracy, sensitivity, specificity, precision, cross-entropy, and Brier score. An optimal ML model was constructed to predict three-year survival, and its predictions were explained with the SHapley Additive exPlanations (SHAP) framework. In addition, 101 ML algorithm combinations were developed, and the combination with the highest C-index was selected to predict the prognosis of pancreatic cancer patients.

    Results: A total of 20,064 pancreatic cancer patients from the SEER database were consecutively enrolled. We used eight clinical variables to establish the prediction model for three-year survival. The CatBoost model was selected as the best prediction model, with AUCs of 0.932 [0.924, 0.939], 0.899 [0.873, 0.934], and 0.826 [0.735, 0.919] in the training, internal test, and external test sets, respectively, and with an accuracy of 0.839 [0.831, 0.847], a sensitivity of 0.872 [0.858, 0.887], a specificity of 0.803 [0.784, 0.825], and a precision of 0.832 [0.821, 0.853]. Surgery type had the greatest effect on three-year survival according to the SHAP results. For prognosis prediction, the "RSF+GBM" algorithm was the best prognostic model, with C-indices of 0.774, 0.722, and 0.674 in the training, internal test, and external test sets, respectively.

    Conclusions: Our ML models demonstrated high accuracy and reliability, offering more precise, personalized prognostic prediction for patients with pancreatic cancer.
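
The evaluation metrics named in the Methods can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' code: the function names (`binary_metrics`, `auc`, `harrell_c_index`), the 0.5 classification threshold, and the label coding are assumptions made for illustration only, and the study's actual pipeline (RFE, CatBoost, SHAP, RSF+GBM) is not reproduced here.

```python
import math


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Threshold-based metrics plus Brier score and cross-entropy.

    y_true: 0/1 labels (which outcome is coded as 1 is an assumption).
    y_prob: predicted probabilities of the positive class.
    threshold: assumed cut-off; the abstract does not state one.
    Assumes both classes occur and at least one positive prediction is made.
    """
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    n = len(y_true)
    eps = 1e-15  # clip probabilities so log() never receives 0
    clipped = [min(max(p, eps), 1 - eps) for p in y_prob]
    return {
        "accuracy": (tp + tn) / n,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),     # positive predictive value
        "brier": sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / n,
        "cross_entropy": -sum(
            y * math.log(p) + (1 - y) * math.log(1 - p)
            for y, p in zip(y_true, clipped)
        ) / n,
    }


def auc(y_true, y_prob):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in pos for b in neg
    )
    return wins / (len(pos) * len(neg))


def harrell_c_index(times, events, risk):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when subject i had an observed event
    (events[i] == 1) strictly before time j. It is concordant when the
    model assigned subject i the higher risk score (ties count as 0.5).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable
```

A C-index or AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the values reported in the Results should be read. For the Brier score and cross-entropy, lower values indicate better-calibrated probabilities.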

    Keywords: machine learning, pancreatic cancer, three-year survival, prognosis prediction, SEER

    Received: 29 Aug 2024; Accepted: 19 Sep 2024.

    Copyright: © 2024 Teng, Zhang, Ge, Miao, Li and Ma. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence:
    Mingshu Ge, The First People’s Hospital of Lianyungang, Lianyungang, 222002, Jiangsu Province, China
    Miao Miao, The First People’s Hospital of Lianyungang, Lianyungang, 222002, Jiangsu Province, China
    Wei Li, The First People’s Hospital of Lianyungang, Lianyungang, 222002, Jiangsu Province, China
    Jun Ma, Affiliated Huai’an Hospital of Xuzhou Medical University, Xuzhou, 221004, Jiangsu Province, China

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.