Consistent training and testing datasets can lead to good performance for deep learning (DL) models. However, a large, high-quality training dataset for unusual clinical scenarios is usually difficult to collect. This work aims to identify optimal training data collection strategies for DL-based dose prediction models.
A total of 325 clinically approved cervical IMRT plans were used. We designed comparison experiments to investigate the impact of (1) beam angles, (2) the number of beams, and (3) patient position on DL dose prediction models. In addition, we proposed a novel geometry-based beam mask generation method to provide beam setting information during model training, as well as a new training strategy named the “full-database pre-trained strategy”.
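To illustrate what a geometry-based beam mask can look like, the following is a minimal sketch and not the method used in this work. It assumes a 2D slice, a parallel-beam approximation, and a convention in which a gantry angle of 0° corresponds to a beam travelling along the row axis; the function name `beam_mask_2d` and the `margin` parameter are hypothetical. Each pixel is assigned the number of beams whose path, defined by the lateral extent of the PTV seen from that angle, passes through it.

```python
import numpy as np

def beam_mask_2d(ptv, gantry_angles_deg, margin=0.5):
    """Count, for every pixel, how many beams pass through it.

    Assumptions (illustrative only): parallel beams, a 2D slice, and a
    beam at angle 0 travelling along the row axis. `margin` is a lateral
    tolerance in pixels.
    """
    ptv = np.asarray(ptv, dtype=bool)
    mask = np.zeros(ptv.shape, dtype=np.float32)
    if not ptv.any():
        return mask  # no target, so no beam path can be defined

    rows, cols = np.indices(ptv.shape)
    for angle in gantry_angles_deg:
        theta = np.deg2rad(angle)
        # Unit vector perpendicular to the beam direction, as (row, col).
        n_row, n_col = -np.sin(theta), np.cos(theta)
        # Lateral coordinate of every pixel in the beam's-eye view.
        s = rows * n_row + cols * n_col
        s_min, s_max = s[ptv].min(), s[ptv].max()
        # A pixel lies in the beam if its lateral coordinate falls
        # within the lateral extent of the PTV.
        in_beam = (s >= s_min - margin) & (s <= s_max + margin)
        mask += in_beam.astype(np.float32)
    return mask

# Toy example: a 2 x 2 target in a 6 x 6 slice with two orthogonal beams.
toy_ptv = np.zeros((6, 6), dtype=bool)
toy_ptv[2:4, 2:4] = True
toy_mask = beam_mask_2d(toy_ptv, [0, 90])
```

A mask of this kind could be supplied to the network as an extra input channel alongside the anatomical contours, so that plans with different beam settings are distinguishable to the model.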
The model trained on a homogeneous dataset with the same beam settings achieved the best performance [mean prediction errors of planning target volume (PTV), bladder, and rectum: 0.29 ± 0.15%, 3.1 ± 2.55%, and 3.15 ± 1.69%] compared with the model trained on a large dataset of plans with mixed beam settings (mean errors of PTV, bladder, and rectum: 0.8 ± 0.14%, 5.03 ± 2.2%, and 4.45 ± 1.4%). Without additional processing, an accurate dose prediction model is easier to train from a homogeneous dataset (mean errors of PTV, bladder, and rectum: 2.2 ± 0.15%, 5 ± 2.1%, and 3.23 ± 1.53%) than from a non-homogeneous one (mean errors of PTV, bladder, and rectum: 2.55 ± 0.12%, 6.33 ± 2.46%, and 4.76 ± 2.91%). Adding the beam mask consistently improved model performance, especially for datasets with different beam settings (mean errors of PTV, bladder, and rectum improved from 0.8 ± 0.14%, 5.03 ± 2.2%, and 4.45 ± 1.4% to 0.29 ± 0.15%, 3.1 ± 2.55%, and 3.15 ± 1.69%).
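The exact definition of the prediction error is not given here. As a hedged illustration of how a "mean ± standard deviation" error of this kind might be computed, the sketch below assumes the per-patient error is the absolute difference in mean structure dose between the predicted and clinical dose, expressed as a percentage of the prescription dose. The function names and the toy values are hypothetical and are not results from this study.

```python
import numpy as np

def structure_dose_error(pred, truth, structure, prescription):
    """Absolute difference in mean structure dose, in % of prescription.

    Assumed metric for illustration; the actual definition may differ.
    """
    structure = np.asarray(structure, dtype=bool)
    pred_mean = np.asarray(pred, dtype=float)[structure].mean()
    true_mean = np.asarray(truth, dtype=float)[structure].mean()
    return 100.0 * abs(pred_mean - true_mean) / prescription

def summarize(errors):
    """Return the mean and sample standard deviation across patients."""
    errors = np.asarray(errors, dtype=float)
    return errors.mean(), errors.std(ddof=1)

# Toy example: a uniform 50 Gy clinical dose and a prediction that is
# 1 Gy too high everywhere, evaluated inside a 2 x 2 structure.
toy_truth = np.full((4, 4), 50.0)
toy_pred = toy_truth + 1.0
toy_structure = np.zeros((4, 4), dtype=bool)
toy_structure[1:3, 1:3] = True
toy_error = structure_dose_error(toy_pred, toy_truth, toy_structure, 50.0)
```

Applying `summarize` to the per-patient errors of a test set would give one "mean ± SD" pair per structure under this assumed definition.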
A consistent dataset is recommended for building a patient-specific IMRT dose prediction model. When a consistent dataset is not available, training on a large dataset with different beam angles, with beam information provided to the model, can also yield a relatively good model. The full-database pre-trained strategy can rapidly produce an accurate model from a pre-trained model. The proposed beam mask can effectively improve model performance. Our study may be helpful for further dose prediction studies in terms of training strategies or database establishment.