AUTHOR=Lerch Luc, Huber Lukas S., Kamath Amith, Pöllinger Alexander, Pahud de Mortanges Aurélie, Obmann Verena C., Dammann Florian, Senn Walter, Reyes Mauricio

TITLE=DreamOn: a data augmentation strategy to narrow the robustness gap between expert radiologists and deep learning classifiers

JOURNAL=Frontiers in Radiology

VOLUME=Volume 4 - 2024

YEAR=2024

URL=https://www.frontiersin.org/journals/radiology/articles/10.3389/fradi.2024.1420545

DOI=10.3389/fradi.2024.1420545

ISSN=2673-8740

ABSTRACT=Purpose. The performance of deep learning models for medical image analysis depends strongly on the quality of the images being analysed. Differences in imaging equipment and calibration, as well as patient-specific factors such as movement or biological variability (e.g., tissue density), lead to large variability in the quality of the acquired images. Consequently, robustness to noise is crucial for applying deep learning models in clinical contexts.

Materials and Methods. We evaluate the effect of various data augmentation strategies on the robustness of a ResNet-18 trained to classify breast ultrasound images and benchmark its performance against that of trained radiologists. Additionally, we introduce DreamOn, a novel, biologically inspired data augmentation strategy for medical image analysis. DreamOn uses a conditional generative adversarial network (GAN) to generate REM-dream-inspired interpolations of training images.

Results. We find that while available data augmentation approaches substantially improve robustness compared to models trained without any data augmentation, radiologists outperform models on noisy images. Using DreamOn data augmentation, we obtain a substantial improvement in robustness in the high noise regime.

Conclusions. We show that REM-dream-inspired conditional GAN-based data augmentation is a promising approach to improving deep learning model robustness against noise perturbations in medical imaging. Additionally, we highlight a gap in robustness between deep learning models and human experts, emphasizing the imperative for ongoing developments in AI to match human diagnostic expertise.