Radiation-induced dermatitis is one of the most common side effects in breast cancer patients treated with radiation therapy (RT). Such acute complications can have a considerable impact on tumor control and quality of life. In this study, we aimed to develop a novel, quantitative, high-accuracy machine learning tool to predict grade ≥ 2 radiation-induced dermatitis (RD 2+) before RT. The tool combines data encapsulation screening with multi-region dose-gradient-based radiomics, and is built on the pre-treatment planning computed tomography (CT) images and the clinical and dosimetric information of breast cancer patients.
Data from 214 patients with breast cancer who underwent RT between 2018 and 2021 were retrospectively collected from three cancer centers in China. The CT images and the clinical and dosimetric information of the patients were retrieved from the medical records. Six regions of interest (ROIs) were contoured for radiomics feature extraction: three planning target volume (PTV) dose-related ROIs, defined as the irradiated volumes covered by 100%, 105%, and 108% of the prescribed dose, and three skin dose-related ROIs, defined as the irradiated volumes within the skin covered by the 20-Gy, 30-Gy, and 40-Gy isodose lines. A total of 4280 radiomics features were extracted from the six ROIs, and 29 clinical and dosimetric characteristics were also included in the analysis. A data encapsulation screening algorithm was applied for data cleaning. Multivariable logistic regression and a gradient boosting decision tree (GBDT) with 5-fold cross-validation were used for model training and validation, and performance was evaluated by receiver operating characteristic (ROC) analysis.
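As an illustration of how the six dose-defined ROIs could be derived, the following minimal sketch thresholds a dose grid at the stated levels. It is not the study's contouring procedure: the function name `dose_gradient_rois`, the ROI labels, the 50-Gy prescription, and the toy one-dimensional dose grid are all assumptions made for the example, and the PTV dose-related ROIs are assumed here to be taken over the whole dose grid.

```python
import numpy as np

def dose_gradient_rois(dose_gy, skin_mask, prescription_gy):
    """Return boolean masks for six dose-defined ROIs (hypothetical helper).

    dose_gy         : array of dose values in Gy
    skin_mask       : boolean array of the same shape marking skin voxels
    prescription_gy : prescribed dose in Gy
    """
    rois = {}
    # PTV dose-related ROIs: voxels receiving at least 100%, 105%, 108%
    # of the prescribed dose.
    for pct in (100, 105, 108):
        rois[f"V{pct}%"] = dose_gy >= prescription_gy * pct / 100.0
    # Skin dose-related ROIs: skin voxels inside the 20-, 30-, 40-Gy isodose.
    for level in (20, 30, 40):
        rois[f"skin_{level}Gy"] = skin_mask & (dose_gy >= level)
    return rois

# Toy 1-D "dose grid" (assumed values, not patient data).
dose = np.array([10.0, 25.0, 35.0, 45.0, 50.0, 53.0, 54.5])
skin = np.array([True, True, True, True, False, False, False])
rois = dose_gradient_rois(dose, skin, prescription_gy=50.0)

for name, mask in rois.items():
    print(name, int(mask.sum()))
```

Each mask would then serve as the region from which radiomics features are extracted on the planning CT; the nested thresholds give progressively smaller regions that follow the dose gradient.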
The best predictors of symptomatic RD 2+ were a combination of 20 radiomics features and 8 clinical and dosimetric variables. With the 5-fold cross-validated GBDT model, this combination achieved an area under the curve (AUC) of 0.998 [95% CI: 0.996-1.0] in the training dataset and 0.911 [95% CI: 0.838-0.983] in the validation dataset. In addition, the 12 most important characteristics for RD 2+ prediction in the GBDT model were identified, together with their corresponding importance measures.
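The evaluation described above (training and validation AUC under 5-fold cross-validation, plus a ranking of the most important features) can be sketched with scikit-learn as follows. This is a hedged illustration only: the data are synthetic stand-ins generated with `make_classification`, the default `GradientBoostingClassifier` hyperparameters are an assumption, and averaging `feature_importances_` across folds is one possible way to rank features, not necessarily the one used in the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for the cohort: 214 samples and 28 predictors
# (20 radiomics + 8 clinical/dosimetric). Not patient data.
X, y = make_classification(n_samples=214, n_features=28,
                           n_informative=8, random_state=0)

n_splits = 5
cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)

train_aucs, val_aucs = [], []
importances = np.zeros(X.shape[1])

for train_idx, val_idx in cv.split(X, y):
    model = GradientBoostingClassifier(random_state=0)
    model.fit(X[train_idx], y[train_idx])

    # AUC on the data the model was fitted to, and on the held-out fold.
    train_prob = model.predict_proba(X[train_idx])[:, 1]
    val_prob = model.predict_proba(X[val_idx])[:, 1]
    train_aucs.append(roc_auc_score(y[train_idx], train_prob))
    val_aucs.append(roc_auc_score(y[val_idx], val_prob))

    # Accumulate the fold-averaged impurity-based feature importance.
    importances += model.feature_importances_ / n_splits

# Indices of the 12 features with the highest average importance.
top12 = np.argsort(importances)[::-1][:12]

print("mean training AUC:  ", np.mean(train_aucs))
print("mean validation AUC:", np.mean(val_aucs))
print("top-12 feature indices:", top12.tolist())
```

Reporting the training and validation AUC side by side, as in the results above, makes any gap between fitted and held-out performance visible.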
A novel multi-region dose-gradient-based GBDT machine learning framework that integrates a random-forest-based data encapsulation screening method can achieve high-accuracy prediction of acute RD 2+ in breast cancer patients.