Understanding performance of convolutional neural networks (CNNs) for binary (benign vs. malignant) lesion classification based on real world images is important for developing a meaningful clinical decision support (CDS) tool.
We developed a CNN based on real world smartphone images with histopathological ground truth and tested the utility of structured electronic health record (EHR) data on model performance. Model accuracy was compared against three board-certified dermatologists for clinical validity.
At a classification threshold of 0.5, the sensitivity was 79 vs. 77 vs. 72%, and specificity was 64 vs. 65 vs. 57% for image-alone vs. combined image and clinical data vs. clinical data-alone models, respectively. The PPV was 68 vs. 69 vs. 62%, AUC was 0.79 vs. 0.79 vs. 0.69, and AP was 0.78 vs. 0.79 vs. 0.64 for image-alone vs. combined data vs. clinical data-alone models. Older age, male sex, and number of prior dermatology visits were important positive predictors for malignancy in the clinical data-alone model.
Additional clinical data did not significantly improve CNN image model performance. Model accuracy for predicting malignant lesions was comparable to dermatologists (model: 71.31% vs. 3 dermatologists: 77.87, 69.88, and 71.93%), validating clinical utility. Prospective validation of the model in primary care setting will enhance understanding of the model’s clinical utility.