The purpose of this study was to explore the performance of different parameter combinations of deep learning (DL) models (Xception, DenseNet121, MobileNet, ResNet50 and EfficientNetB0) and input image resolutions (REZs) (224 × 224, 320 × 320 and 488 × 488 pixels) for breast cancer diagnosis.
This multicenter study retrospectively studied gray-scale ultrasound breast images enrolled from two Chinese hospitals. The data are divided into training, validation, internal testing and external testing set. Three-hundreds images were randomly selected for the physician-AI comparison. The Wilcoxon test was used to compare the diagnose error of physicians and models under
A total of 13,684 images of 3447 female patients are finally included. In external test the 224 and 320 REZ achieve the best performance in MobileNet and EfficientNetB0 respectively (AUC: 0.893 and 0.907). Meanwhile, 448 REZ achieve the best performance in Xception, DenseNet121 and ResNet50 (AUC: 0.900, 0.883 and 0.871 respectively). In physician-AI test set, the 320 REZ for EfficientNetB0 (AUC: 0.896,
Based on the gray-scale ultrasound breast images, we obtained the best DL combination which was better than the physicians.