Background

AUTHOR=Qiu Xianxin, Gao Jing, Yang Jing, Hu Jiyi, Hu Weixu, Kong Lin, Lu Jiade J.

TITLE=A Comparison Study of Machine Learning (Random Survival Forest) and Classic Statistic (Cox Proportional Hazards) for Predicting Progression in High-Grade Glioma after Proton and Carbon Ion Radiotherapy

JOURNAL=Frontiers in Oncology

VOLUME=10

YEAR=2020

URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2020.551420

DOI=10.3389/fonc.2020.551420

ISSN=2234-943X

ABSTRACT=<sec><title>Background</title><p>Machine learning (ML) algorithms are increasingly explored in glioma prognostication. Random survival forest (RSF) is a common ML approach for analyzing time-to-event survival data. However, it remains controversial whether RSF or the traditional cornerstone method, Cox proportional hazards (CPH), is better suited to such data. The purpose of this study was to compare RSF and CPH in predicting tumor progression of high-grade glioma (HGG) after particle beam radiotherapy (PBRT).</p></sec><sec><title>Methods</title><p>The study enrolled 82 consecutive HGG patients who were treated with PBRT at Shanghai Proton and Heavy Ion Center between 6/2015 and 11/2019. The entire cohort was split into training and testing sets in an 80/20 ratio. Ten variables drawn from patient-related, tumor-related, and treatment-related information were used to develop CPH and RSF models for predicting progression-free survival (PFS). Model performance was compared using the concordance index (C-index) for discrimination (accuracy), the Brier score (BS) for calibration (precision), and variable importance for interpretability.</p></sec><sec><title>Results</title><p>The CPH model performed better in terms of integrated C-index (62.9%) and BS (0.159) than the RSF model (C-index = 61.1%, BS = 0.174). Regarding variable importance, the CPH model indicated that age (P = 0.024), WHO grade (P = 0.020), IDH gene (P = 0.019), and MGMT promoter status (P = 0.040) were significantly correlated with PFS in the univariate analysis; in the multivariate analysis, age (P = 0.041), surgical completeness (P = 0.084), IDH gene (P = 0.057), and MGMT promoter (P = 0.092) showed a significant association, or a trend toward association, with PFS. RSF showed that only IDH and age had positive importance for predicting PFS. A final nomogram was developed to predict tumor progression at the individual level based on the CPH model.</p></sec><sec><title>Conclusions</title><p>In a relatively small dataset of HGG patients treated with PBRT, CPH outperformed RSF for predicting tumor progression. A comprehensive criterion covering accuracy, precision, and interpretability is recommended when evaluating ML prognostication approaches for clinical deployment.</p></sec>
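
The abstract uses the concordance index (C-index) as its measure of discrimination. As a rough illustration of what that number means, the sketch below computes a simplified Harrell-style C-index in plain Python. It is not the authors' code and is not necessarily the "integrated" C-index reported in the study. The function name `concordance_index`, the toy data, and the choice to skip pairs with tied times are all assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified Harrell-style C-index (illustrative sketch only).

    times       : observed time to progression or censoring
    events      : 1 if progression was observed, 0 if censored
    risk_scores : higher score = higher predicted risk (earlier progression)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Assumption: pairs with tied times are skipped. A pair is also
        # skipped when the shorter time is censored, because the true
        # ordering of the two progression times is then unknown.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier progression received the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (months to progression or censoring), not study data.
toy_times = [5, 8, 12, 20, 24]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.6, 0.3, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives some context for the reported C-index values of 62.9% (CPH) and 61.1% (RSF).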