Purpose

AUTHOR=Stenhouse Kailyn , Roumeliotis Michael , Ciunkiewicz Philip , Banerjee Robyn , Yanushkevich Svetlana , McGeachy Philip 

TITLE=Development of a Machine Learning Model for Optimal Applicator Selection in High-Dose-Rate Cervical Brachytherapy

JOURNAL=Frontiers in Oncology

VOLUME=11

YEAR=2021

URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2021.611437

DOI=10.3389/fonc.2021.611437

ISSN=2234-943X

ABSTRACT=<sec><title>Purpose</title><p>To develop and validate a preliminary machine learning (ML) model aiding in the selection of intracavitary (IC) versus hybrid interstitial (IS) applicators for high-dose-rate (HDR) cervical brachytherapy.</p></sec><sec><title>Methods</title><p>From a dataset of 233 treatments using IC or IS applicators, a set of geometric features of the structure set were extracted, including the volumes of OARs (bladder, rectum, sigmoid colon) and HR-CTV, proximity of OARs to the HR-CTV, mean and maximum lateral and vertical HR-CTV extent, and offset of the HR-CTV centre-of-mass from the applicator tandem axis. Feature selection using an ANOVA F-test and mutual information removed uninformative features from this set. Twelve classification algorithms were trained and tested over 100 iterations to determine the highest performing individual models through nested 5-fold cross-validation. Three models with the highest accuracy were combined using soft voting to form the final model. This model was trained and tested over 1,000 iterations, during which the relative importance of each feature in the applicator selection process was determined.</p></sec><sec><title>Results</title><p>Feature selection indicated that the mean and maximum lateral and vertical extent, volume, and axis offset of the HR-CTV were the most informative features and were thus provided to the ML models. Relative feature importances indicated that the HR-CTV volume and mean lateral extent were most important for applicator selection. From the comparison of the individual classification algorithms, it was found that the highest performing algorithms were tree-based ensemble methods – AdaBoost Classifier (ABC), Gradient Boosting Classifier (GBC), and Random Forest Classifier (RFC). The accuracy of the individual models was compared to the voting model for 100 iterations (ABC = 91.6 ± 3.1%, GBC = 90.4 ± 4.1%, RFC = 89.5 ± 4.0%, Voting Model = 92.2 ± 1.8%) and the voting model was found to have superior accuracy. Over the final 1,000 evaluation iterations, the final voting model demonstrated a high predictive accuracy (91.5 ± 0.9%) and F1 Score (90.6 ± 1.1%).</p></sec><sec><title>Conclusion</title><p>The presented model demonstrates high discriminative performance, highlighting the potential for utilization in informing applicator selection prospectively following further clinical validation.</p></sec>