AUTHOR=Yu Hao , Lam Ka-On , Wu Huanmei , Green Michael , Wang Weili , Jin Jian-Yue , Hu Chen , Jolly Shruti , Wang Yang , Kong Feng-Ming Spring TITLE=Weighted-Support Vector Machine Learning Classifier of Circulating Cytokine Biomarkers to Predict Radiation-Induced Lung Fibrosis in Non-Small-Cell Lung Cancer Patients JOURNAL=Frontiers in Oncology VOLUME=10 YEAR=2021 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2020.601979 DOI=10.3389/fonc.2020.601979 ISSN=2234-943X ABSTRACT=Background

Radiation-induced lung fibrosis (RILF) is an important late toxicity in patients with non-small-cell lung cancer (NSCLC) after radiotherapy (RT). Clinically significant RILF can impair quality of life and/or cause non-cancer-related death. This study aimed to determine whether pre-treatment plasma cytokine levels have a significant effect on the risk of RILF and to investigate the ability of machine learning algorithms to predict that risk.

Methods

This is a secondary analysis of prospective studies from two academic cancer centers. The primary endpoint was grade ≥2 RILF (RILF2), classified according to a system consistent with the consensus recommendation of an expert panel of the AAPM task for normal tissue toxicity. Eligible patients had at least 6 months of follow-up after the start of radiotherapy. Baseline levels of 30 cytokines, together with dosimetric and clinical characteristics, were analyzed. A support vector machine (SVM) algorithm was applied for model development. Data from one center were used for model training and development, and data from the other center were used for independent external validation.
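The abstract does not specify how the SVM was weighted or implemented. The following is only a minimal sketch of one common reading, a linear SVM whose hinge loss is weighted inversely to class frequency, trained on one cohort and evaluated on a separate one. All data are synthetic, the two features are generic stand-ins for standardized inputs, and every function name and hyperparameter here is an assumption for illustration, not the authors' method.

```python
import random

def make_cohort(n_neg, n_pos, seed):
    """Synthetic, well-separated two-feature cohort (NOT patient data); labels are -1 / +1."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_neg):
        X.append([-0.5 + rng.uniform(-0.3, 0.3), -0.5 + rng.uniform(-0.3, 0.3)])
        y.append(-1)
    for _ in range(n_pos):
        X.append([1.5 + rng.uniform(-0.3, 0.3), 1.5 + rng.uniform(-0.3, 0.3)])
        y.append(1)
    return X, y

def balanced_class_weights(y):
    """Weight each class inversely to its frequency: n / (n_classes * class_count)."""
    classes = sorted(set(y))
    return {c: len(y) / (len(classes) * y.count(c)) for c in classes}

def decision_value(w, b, x):
    """Signed score of the linear model w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_weighted_linear_svm(X, y, class_weight, lam=0.01, lr=0.01, epochs=200, seed=0):
    """Stochastic subgradient descent on class-weighted hinge loss with an L2 penalty."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * decision_value(w, b, X[i]) < 1
            w = [wj * (1 - lr * lam) for wj in w]  # L2 shrinkage on the weights
            if violated:
                # Margin violations in the rarer class take a larger step.
                step = lr * class_weight[y[i]] * y[i]
                w = [wj + step * xj for wj, xj in zip(w, X[i])]
                b += step
    return w, b

def predict(w, b, X):
    return [1 if decision_value(w, b, x) >= 0 else -1 for x in X]

# Cohort sizes loosely mimic an imbalanced training set and a smaller external set.
X_train, y_train = make_cohort(n_neg=49, n_pos=8, seed=1)
X_valid, y_valid = make_cohort(n_neg=31, n_pos=6, seed=2)

weights = balanced_class_weights(y_train)
w, b = train_weighted_linear_svm(X_train, y_train, weights)
valid_pred = predict(w, b, X_valid)
valid_accuracy = sum(p == t for p, t in zip(valid_pred, y_valid)) / len(y_valid)
print(weights, valid_accuracy)
```

Keeping the second cohort entirely out of training mirrors the external-validation design described above. The class weights matter because an unweighted classifier can reach high accuracy on imbalanced data simply by predicting the majority class.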

Results

There were 57 and 37 eligible patients in the training and validation datasets, with RILF2 rates of 14% and 16.2%, respectively. Of the 30 plasma cytokines evaluated, the SVM analysis identified baseline circulating CCL4 as the cytokine most significantly associated with RILF2 risk in both datasets (P = 0.003 and 0.07 for the training and validation sets, respectively). An SVM classifier predictive of RILF2 was generated in Cohort 1 with CCL4, mean lung dose (MLD), and chemotherapy as key model features. This classifier was validated in Cohort 2 with an accuracy of 0.757 and an area under the curve (AUC) of 0.855.
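For readers less familiar with the two validation metrics reported above, the sketch below shows how accuracy and AUC can be computed from a classifier's scores. The labels, scores, and the 0.5 decision threshold are invented for illustration and are unrelated to the study data; the AUC here uses the rank-based definition (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as one half).

```python
def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def roc_auc(y_true, scores):
    """Rank-based AUC: share of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = event) and classifier scores.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.10, 0.20, 0.35, 0.60, 0.55, 0.30, 0.80, 0.90]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(acc, auc)
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two numbers can differ noticeably on imbalanced data.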

Conclusions

Using machine learning, this study constructed and validated a weighted-SVM classifier that combines circulating CCL4 levels with significant dosimetric and clinical parameters and predicts RILF2 risk with reasonable accuracy. Further studies with larger sample sizes are needed to validate the role of CCL4 and of this SVM classifier in RILF2.