AUTHOR=Liu Zhikai , Chen Wanqi , Guan Hui , Zhen Hongnan , Shen Jing , Liu Xia , Liu An , Li Richard , Geng Jianhao , You Jing , Wang Weihu , Li Zhouyu , Zhang Yongfeng , Chen Yuanyuan , Du Junjie , Chen Qi , Chen Yu , Wang Shaobin , Zhang Fuquan , Qiu Jie TITLE=An Adversarial Deep-Learning-Based Model for Cervical Cancer CTV Segmentation With Multicenter Blinded Randomized Controlled Validation JOURNAL=Frontiers in Oncology VOLUME=11 YEAR=2021 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2021.702270 DOI=10.3389/fonc.2021.702270 ISSN=2234-943X ABSTRACT=Purpose

To propose a novel deep-learning-based auto-segmentation model for CTV delineation in cervical cancer and to evaluate whether it can perform comparably well to manual delineation by a three-stage multicenter evaluation framework.

Methods

An adversarial deep-learning-based auto-segmentation model was trained and configured for cervical cancer CTV contouring using CT data from 237 patients. Then CT scans of additional 20 consecutive patients with locally advanced cervical cancer were collected to perform a three-stage multicenter randomized controlled evaluation involving nine oncologists from six medical centers. This evaluation system is a combination of objective performance metrics, radiation oncologist assessment, and finally the head-to-head Turing imitation test. Accuracy and effectiveness were evaluated step by step. The intra-observer consistency of each oncologist was also tested.

Results

In stage-1 evaluation, the mean DSC and the 95HD value of the proposed model were 0.88 and 3.46 mm, respectively. In stage-2, the oncologist grading evaluation showed the majority of AI contours were comparable to the GT contours. The average CTV scores for AI and GT were 2.68 vs. 2.71 in week 0 (P = .206), and 2.62 vs. 2.63 in week 2 (P = .552), with no significant statistical differences. In stage-3, the Turing imitation test showed that the percentage of AI contours, which were judged to be better than GT contours by ≥5 oncologists, was 60.0% in week 0 and 42.5% in week 2. Most oncologists demonstrated good consistency between the 2 weeks (P > 0.05).

Conclusions

The tested AI model was demonstrated to be accurate and comparable to the manual CTV segmentation in cervical cancer patients when assessed by our three-stage evaluation framework.