AUTHOR=Bao Li, Wang Yu-tong, Zhuang Jun-ling, Liu Ai-jun, Dong Yu-jun, Chu Bin, Chen Xiao-huan, Lu Min-qiu, Shi Lei, Gao Shan, Fang Li-juan, Xiang Qiu-qing, Ding Yue-hua

TITLE=Machine Learning–Based Overall Survival Prediction of Elderly Patients With Multiple Myeloma From Multicentre Real-Life Data

JOURNAL=Frontiers in Oncology

VOLUME=Volume 12 - 2022

YEAR=2022

URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2022.922039

DOI=10.3389/fonc.2022.922039

ISSN=2234-943X

ABSTRACT=Objective: To use machine learning methods to explore overall survival (OS)-related prognostic factors in elderly multiple myeloma (MM) patients. 
Methods: Data were cleaned and imputed using simple imputation methods. Two data resampling methods were implemented to facilitate model building and cross-validation. Four algorithms, the Cox proportional hazards (CPH) model, DeepSurv, DeepHit, and the random survival forest (RSF), were applied to 30 parameters, including baseline data, genetic abnormalities and treatment options, to construct a prognostic model for OS prediction in 338 elderly MM patients (>65 years old) from four hospitals in Beijing. The C-index and the integrated Brier score (IBS) were used to evaluate model performance.
Results: The 30 variables incorporated in the models comprised MM baseline data, induction treatment data and maintenance therapy data. The variable importance test showed that the OS predictions were largely affected by the maintenance schema variable. Survival curves stratified by maintenance schema showed that the immunomodulator group had the best survival rate. The CPH, DeepSurv, DeepHit and RSF models achieved C-indexes of 0.769, 0.780, 0.785 and 0.798 and IBS values of 0.142, 0.112, 0.108 and 0.099, respectively. The RSF model yielded the best scores in fivefold cross-validation, and the choice of data resampling method affected the model results.
Conclusion: We established an OS model for elderly MM patients without genomic data based on 30 characteristics and treatment data by machine learning.
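
Illustrative note: the two evaluation metrics named in the Methods (C-index and IBS) can be sketched in plain Python as below. This is a minimal sketch and not the authors' code. The function names and toy inputs are hypothetical. The C-index here is Harrell's estimator with tied observation times skipped, and the Brier score ignores censoring (no inverse-probability-of-censoring weights), which is a simplification of what survival libraries typically compute.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i has the shorter observed
    time and an observed event (events[i] is truthy). Tied times are
    skipped (simplifying assumption).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if i != j and events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # higher risk died earlier
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def brier_score_uncensored(times, surv_probs, t):
    """Brier score at time t, assuming NO censoring (assumption).

    surv_probs[i] is the predicted probability that subject i
    survives beyond t.
    """
    total = 0.0
    for observed_time, prob in zip(times, surv_probs):
        alive = 1.0 if observed_time > t else 0.0
        total += (alive - prob) ** 2
    return total / len(times)


def integrated_brier_score_uncensored(times, surv_matrix, grid):
    """Trapezoidal integral of the Brier score over grid, divided by
    the grid span. surv_matrix[k][i] is the predicted survival
    probability of subject i at grid[k].
    """
    scores = [brier_score_uncensored(times, surv_matrix[k], t)
              for k, t in enumerate(grid)]
    area = 0.0
    for k in range(1, len(grid)):
        area += 0.5 * (scores[k] + scores[k - 1]) * (grid[k] - grid[k - 1])
    return area / (grid[-1] - grid[0])
```

A higher C-index (closer to 1) and a lower IBS (closer to 0) indicate better performance, which is the sense in which the RSF model scored best in the Results.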