AUTHOR=Tran Tao Thi, Lee Jeonghee, Gunathilake Madhawa, Kim Junetae, Kim Sun-Young, Cho Hyunsoon, Kim Jeongseon

TITLE=A comparison of machine learning models and Cox proportional hazards models regarding their ability to predict the risk of gastrointestinal cancer based on metabolic syndrome and its components

JOURNAL=Frontiers in Oncology

VOLUME=Volume 13 - 2023

YEAR=2023

URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2023.1049787

DOI=10.3389/fonc.2023.1049787

ISSN=2234-943X

ABSTRACT=Background: Little is known about applying machine learning (ML) techniques to identify the important variables contributing to the occurrence of gastrointestinal (GI) cancer in epidemiological studies. We aimed to compare different ML models to a Cox proportional hazards (CPH) model regarding their ability to predict the risk of GI cancer based on metabolic syndrome (MetS) and its components.
Methods: A total of 41,837 participants were included in a prospective cohort study. Incident cancer cases were identified by following up with participants until December 2019. We used CPH, random survival forest (RSF), survival trees (ST), gradient boosting (GB), survival support vector machine (SSVM), and extra survival trees (EST) models to explore the impact of MetS on GI cancer prediction. We used the C-index and integrated Brier score (IBS) to compare the models. 
Results: In all, 540 incident GI cancer cases were identified. The GB and SSVM models performed comparably to the CPH model in terms of the C-index (0.725). The IBS was similar across all models (0.017). Fasting glucose and waist circumference were identified as important predictors.
Conclusions: The ML models and the CPH model performed comparably well in terms of the C-index. This finding suggests that ML models may be considered an alternative method for survival analysis when the CPH model’s conditions are not satisfied.
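
As an illustration of the C-index used above to compare the models, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not the authors' implementation; the function name `concordance_index`, the toy follow-up times, and the risk scores are all hypothetical, and pairs with tied follow-up times are simply skipped here.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[bool],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    A pair (i, j) is comparable when subject i had an observed event and a
    strictly shorter follow-up time than subject j. The pair is concordant
    when subject i also has the higher predicted risk; tied risk scores
    count as half. Pairs with tied follow-up times are ignored (assumption).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy data: follow-up time, event indicator (1 = cancer observed,
# 0 = censored), and a model's predicted risk score for four participants.
toy_times = [2, 4, 6, 8]
toy_events = [1, 0, 1, 0]
toy_risk = [0.9, 0.3, 0.1, 0.5]

c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of participants by risk, so a C-index of 0.725 indicates that the model orders roughly 72.5% of comparable pairs correctly.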