AUTHOR=Lu Feng , Yang Linlan , Luo Zhenglian , He Qiao , Shangguan Lijuan , Cao Mingfei , Wu Lichun TITLE=Laboratory blood parameters and machine learning for the prognosis of esophageal squamous cell carcinoma JOURNAL=Frontiers in Oncology VOLUME=14 YEAR=2024 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2024.1367008 DOI=10.3389/fonc.2024.1367008 ISSN=2234-943X ABSTRACT=Background

Predicting mortality in patients with esophageal squamous cell carcinoma (ESCC) requires precise and expedient prognostic methods.

Objective

To develop and validate a prognostic model for ESCC patients using machine learning (ML) techniques and a comprehensive dataset of laboratory-derived blood parameters.

Methods

Three ML approaches, namely Gradient Boosting Machine (GBM), Random Survival Forest (RSF), and the classical Cox method, were used to develop models on a dataset of 2521 ESCC patients with 27 features. The models were evaluated by concordance index (C-index) and time-dependent receiver operating characteristic (time-ROC) curves. We used the optimal model to evaluate the correlation between features and prognosis and to divide patients into low- and high-risk groups by risk stratification. The model's performance was assessed with Kaplan-Meier curves and by comparison with the AJCC 8th edition staging system. We further evaluated the overall effectiveness of the model in ESCC subgroups using risk scores and kernel density estimation (KDE) plots.
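As an illustration of the C-index used for model evaluation, the following is a minimal sketch of Harrell's concordance index in plain Python. It is not the authors' code: the function name `concordance_index`, the toy survival times, event flags, and risk scores are all hypothetical, and tied survival times are simply skipped for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index (simplified): share of comparable patient pairs
    in which the patient with the higher risk score has the shorter
    observed survival time. Pairs with tied times are skipped."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore tied survival times
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (not from the study): months, event flag (1 = death), risk score
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.6, 0.7, 0.1]

c_index = concordance_index(toy_times, toy_events, toy_risks)
print(c_index)  # 4 of the 5 comparable pairs are concordant
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported C-index should be read.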

Results

RSF’s C-index (0.746) and AUC (three-year AUC 0.761, five-year AUC 0.771) showed a slight advantage over those of GBM and the classical Cox method. Subsequently, 14 features (N stage, T stage, surgical margin, tumor length, age, dissected LN number, MCH, Na, FIB, DBIL, CL, treatment, vascular invasion, and tumor grade) were selected to build the model. Based on these features, overall survival (OS) differed significantly between low-risk (3-year OS 81.8%, 5-year OS 69.8%) and high-risk (3-year OS 25.1%, 5-year OS 11.5%) patients in the training set, a difference that was also verified in the test set (all P < 0.0001). Compared with the AJCC 8th edition staging system, the model showed greater discriminative ability while remaining in good agreement with that staging.
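To show how survival rates for low- and high-risk groups can be read off Kaplan-Meier curves, here is a minimal sketch in plain Python. It is an illustration only: the helper names `kaplan_meier` and `survival_at`, the toy data, and the use of the median risk score as the cutoff are assumptions, since the abstract does not state how the cutoff was chosen.

```python
from statistics import median


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate)."""
    survival = 1.0
    curve = []
    for t in sorted({u for u, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


def survival_at(curve, t):
    """Estimated survival probability at time t from a step curve."""
    s = 1.0
    for time, value in curve:
        if time > t:
            break
        s = value
    return s


# Hypothetical toy data (not from the study): months, event flag (1 = death), risk score
times = [12, 24, 30, 40, 50, 60]
events = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.2, 0.7, 0.3, 0.1]

# Assumption: split patients at the median risk score
cutoff = median(scores)
high = [i for i, s in enumerate(scores) if s > cutoff]
low = [i for i, s in enumerate(scores) if s <= cutoff]

high_curve = kaplan_meier([times[i] for i in high], [events[i] for i in high])
low_curve = kaplan_meier([times[i] for i in low], [events[i] for i in low])

# 36 months corresponds to 3-year overall survival
print("high-risk 3-year OS:", survival_at(high_curve, 36))
print("low-risk 3-year OS:", survival_at(low_curve, 36))
```

In practice, a log-rank test would accompany such curves to produce a P value for the difference between groups; that step is omitted here.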

Conclusion

We developed a well-performing ESCC prognostic model using clinical features and laboratory blood parameters.