The final, formatted version of the article will be published soon.
METHODS article
Front. Oncol.
Sec. Gastrointestinal Cancers: Gastric and Esophageal Cancers
Volume 14 - 2024
doi: 10.3389/fonc.2024.1503047
Machine Learning Based Models for Predicting Presentation Delay Risk Among Gastric Cancer Patients
Provisionally accepted
- 1 School of Nursing, Chengdu Medical College, Chengdu, Sichuan Province, China
- 2 Sichuan Cancer Hospital, Chengdu, Sichuan Province, China
- 3 Zigong Fourth People's Hospital, Zigong, Sichuan, China
- 4 Chuanbei Medical College, Sichuan, China
- 5 Meishan Hospital of Traditional Chinese Medicine, Meishan, China
Objective: Presentation delay prevents cancer patients from receiving timely diagnosis and treatment, leading to poor prognosis. Predicting the risk of presentation delay is crucial for improving treatment outcomes. This study aimed to develop and validate prediction models for presentation delay risk in gastric cancer patients using several machine learning algorithms.

Methods: A total of 875 gastric cancer patients admitted to a tertiary oncology hospital from July 2023 to June 2024 formed the derivation cohort, and 200 gastric cancer patients admitted to four other tertiary hospitals formed the external validation cohort. Statistical analysis was performed to identify discriminative variables, and 13 statistically significant variables were selected to develop the machine learning models. The derivation cohort was randomly split into training and internal validation sets at a ratio of 7:3. Prediction models were developed with six machine learning algorithms: logistic regression (LR), support vector machine (SVM), random forest (RF), gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost), and multilayer perceptron (MLP). The discrimination and calibration of each model were assessed using accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), F1 score, area under the receiver operating characteristic curve (AUC), calibration curves, and Brier scores. The contribution of each feature to the predictions was analyzed with the permutation feature importance method.

Results: The incidence of presentation delay among gastric cancer patients was 39.3%. In the internal validation set, the models achieved an AUC of 0.893-0.925, accuracy of 0.817-0.847, sensitivity of 0.857-0.905, specificity of 0.783-0.854, PPV of 0.728-0.798, NPV of 0.897-0.927, F1 score of 0.791-0.826, and Brier score of 0.107-0.138, indicating good discrimination and calibration for predicting presentation delay in gastric cancer patients. The RF-based model was selected as the best model because it achieved good discrimination and calibration on both the internal and external validation sets. Feature ranking indicated that both subjective and objective factors have a significant impact on the occurrence of presentation delay in gastric cancer patients.

Conclusion: This study demonstrated that the RF-based model performs favorably in predicting presentation delay in gastric cancer patients. It can help medical staff identify gastric cancer patients at high risk of presentation delay and take appropriate, targeted interventions to reduce that risk.
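The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes binary labels (1 = presentation delay, 0 = no delay), predicted probabilities, and a 0.5 decision threshold. The function names and the toy data are hypothetical and are not taken from the study, which does not state its software or threshold.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Threshold-based metrics from a 2x2 confusion matrix.

    Assumes both classes occur in y_true and in the thresholded predictions,
    so that no denominator is zero.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = tp / (tp + fn)  # share of true delays that are flagged
    specificity = tn / (tn + fp)  # share of non-delays that are cleared
    ppv = tp / (tp + fp)          # positive predictive value (precision)
    npv = tn / (tn + fn)          # negative predictive value
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": 2 * ppv * sensitivity / (ppv + sensitivity),
    }


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg) and is
    meant for illustration only.
    """
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0 for pp in pos for pn in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration; not data from the study.
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

for name, value in classification_metrics(toy_true, toy_prob).items():
    print(f"{name}: {value:.3f}")
print(f"brier: {brier_score(toy_true, toy_prob):.3f}")
print(f"auc: {auc(toy_true, toy_prob):.4f}")
```

On this toy data each threshold-based metric equals 0.75, the Brier score is 0.125, and the AUC is 0.9375. A lower Brier score indicates better-calibrated probabilities, whereas the other metrics improve as they approach 1.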
Keywords: gastric cancer, presentation delay, risk prediction, machine learning, prediction model
Received: 28 Sep 2024; Accepted: 16 Dec 2024.
Copyright: © 2024 Zhou, Yang, Gu, Bao, Qiu, Zhang, Wang, Liu, Wu, Li, Ren, Qiu, Wang, Zhang, Qiao, Yuan, Ren, Luo and Huang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Qing Yang, Sichuan Cancer Hospital, Chengdu, 610041, Sichuan Province, China
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.