The purpose of this study was to compare atlas-based and deep learning (DL)-based auto-segmentation of organs at risk (OARs) in nasopharyngeal carcinoma cases, in order to support clinical practice.
A total of 120 nasopharyngeal carcinoma cases were used to build an atlas database in MIM Maestro and to train a DL-based model (AccuContour®), and another 20 nasopharyngeal carcinoma cases were randomly selected from outside the atlas database. Experienced physicians contoured 14 OARs for these 20 patients according to published consensus guidelines, and these contours were defined as the reference volumes (Vref). The same OARs were auto-contoured using an atlas-based model, a pre-built DL-based model, and an on-site trained DL-based model. The resulting volumes were named Vatlas, VDL-pre-built, and VDL-trained, respectively. The similarity of Vatlas, VDL-pre-built, and VDL-trained to Vref was assessed using the Dice similarity coefficient (DSC), Jaccard coefficient (JAC), maximum Hausdorff distance (HDmax), and deviation of centroid (DC). One-way ANOVA was used to test for differences between each pair of methods.
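The four similarity measures named above can be illustrated with a minimal sketch on binary voxel masks. It is not the implementation used in the study: the function names, the toy masks, and the 0.1 cm isotropic voxel spacing are assumptions made for illustration, and HDmax is computed here over all voxels of each mask rather than over extracted surfaces.

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient: 2|A∩B| / (|A| + |B|)."""
    a, b = a.astype(bool), b.astype(bool)
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def jaccard(a, b):
    """Jaccard coefficient: |A∩B| / |A∪B|."""
    a, b = a.astype(bool), b.astype(bool)
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union

def hausdorff_max(a, b, spacing):
    """Symmetric maximum Hausdorff distance in physical units.

    Brute-force pairwise distances; suitable only for small masks.
    """
    pa = np.argwhere(a) * np.asarray(spacing, dtype=float)
    pb = np.argwhere(b) * np.asarray(spacing, dtype=float)
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    # Largest nearest-neighbour distance in either direction.
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def centroid_deviation(a, b, spacing):
    """Euclidean distance between the two mask centroids."""
    sp = np.asarray(spacing, dtype=float)
    ca = np.argwhere(a).mean(axis=0) * sp
    cb = np.argwhere(b).mean(axis=0) * sp
    return float(np.linalg.norm(ca - cb))

# Hypothetical toy data: a 4x4x4 reference cube and the same cube
# shifted by one voxel along the first axis.
spacing = (0.1, 0.1, 0.1)  # assumed voxel size in cm
ref = np.zeros((10, 10, 10), dtype=bool)
auto = np.zeros((10, 10, 10), dtype=bool)
ref[2:6, 2:6, 2:6] = True
auto[3:7, 2:6, 2:6] = True

print("DSC  :", dice(ref, auto))
print("JAC  :", jaccard(ref, auto))
print("HDmax:", hausdorff_max(ref, auto, spacing), "cm")
print("DC   :", centroid_deviation(ref, auto, spacing), "cm")
```

In this toy case the one-voxel shift leaves 48 of 64 voxels overlapping, so DSC is 0.75 and JAC is 0.6, while HDmax and DC both equal one voxel width (0.1 cm). For clinical-size masks, a surface-based distance routine would be more practical than the brute-force pairwise computation shown here.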
The three methods gave similar results for the brainstem and eyes. The pre-built DL-based model performed worst for the inner ears and temporomandibular joints, and atlas-based auto-segmentation performed worst for the lens. For the optic nerves, the trained DL-based model performed best (p < 0.05). For the oral cavity, the DSC of VDL-pre-built was the smallest and that of VDL-trained was the largest (p < 0.05). For the parotid glands, the DSC of Vatlas was the lowest (approximately 0.80), while those of VDL-pre-built and VDL-trained were slightly higher (approximately 0.82). With the trained DL-based model, the maximum Hausdorff distances of all organs except the oral cavity, parotid glands, and brainstem were below 0.5 cm. The trained DL-based model performed well for all organs, with a maximum average centroid deviation of no more than 0.3 cm.
The trained DL-based segmentation performed significantly better than atlas-based segmentation for nasopharyngeal carcinoma, especially for OARs with small volumes. Although some delineations still require manual modification, auto-segmentation improves work efficiency and provides useful support for clinical work.