AUTHOR=Miranda Ruiz Felipe, Lahrmann Bernd, Bartels Liam, Krauthoff Alexandra, Keil Andreas, Härtel Steffen, Tao Amy S., Ströbel Philipp, Clarke Megan A., Wentzensen Nicolas, Grabe Niels TITLE=CNN stability training improves robustness to scanner and IHC-based image variability for epithelium segmentation in cervical histology JOURNAL=Frontiers in Medicine VOLUME=10 YEAR=2023 URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2023.1173616 DOI=10.3389/fmed.2023.1173616 ISSN=2296-858X ABSTRACT=Background

In digital pathology, image properties such as color, brightness, contrast, and blurriness may vary depending on the scanner and sample preparation. Convolutional Neural Networks (CNNs) are sensitive to these variations and may underperform on images from a different domain than the one used for training. Robustness to these image property variations is required to enable the use of deep learning in clinical practice and large-scale clinical research.

Aims

CNN Stability Training (CST) is proposed and evaluated as a method to increase CNN robustness to scanner and Immunohistochemistry (IHC)-based image variability.

Methods

CST was applied to segment epithelium in immunohistological cervical Whole Slide Images (WSIs). CST randomly distorts input tiles and incorporates the difference between the CNN predictions for the original and distorted inputs into the loss function. CNNs were trained using 114 p16-stained WSIs from the same scanner and evaluated on 6 WSI test sets, each with 23 to 24 WSIs of the same tissue but different scanner/IHC combinations. Relative robustness (rAUC) was measured as the difference between the AUC on the training-domain test set (i.e., baseline test set) and the AUC on each of the remaining test sets.
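The loss described above can be sketched in a few lines. This is a minimal illustration, not the paper's exact formulation: the squared-difference penalty and the weighting factor `alpha` are assumptions for the sketch (the released implementation at the linked repository defines the actual objective).

```python
import numpy as np

def stability_loss(pred_orig, pred_distorted, alpha=1.0):
    # Penalize divergence between the prediction on the original tile and
    # the prediction on its distorted copy. The squared difference and the
    # weight `alpha` are illustrative choices, not the paper's exact penalty.
    return alpha * float(np.mean((pred_orig - pred_distorted) ** 2))

def cst_loss(task_loss, pred_orig, pred_distorted, alpha=1.0):
    # Total CST objective: ordinary task loss on the original input
    # plus the weighted stability term.
    return task_loss + stability_loss(pred_orig, pred_distorted, alpha)

def relative_robustness(auc_shifted, auc_baseline):
    # rAUC as defined in the abstract: AUC on a shifted-domain test set
    # minus AUC on the training-domain (baseline) test set.
    # Negative values indicate a drop in performance off the training domain.
    return auc_shifted - auc_baseline
```

A model whose predictions barely change under distortion incurs almost no stability penalty, which is the behavior CST rewards.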

Results

Across all test sets, CST models outperformed “No CST” models (AUC: 0.940–0.989 vs. 0.905–0.986, p < 1e-8) and showed improved robustness (rAUC: [−0.038, −0.003] vs. [−0.081, −0.002]). At the WSI level, CST models improved performance in 124 of the 142 WSIs. CST models also outperformed models trained with random on-the-fly data augmentation (DA) in all test sets (AUC difference: [0.002, 0.021], p < 1e-6).

Conclusion

CST offers a path to improving CNN performance without the need for more data and allows distortions to be customized to specific use cases. A Python implementation of CST is publicly available at https://github.com/TIGACenter/CST_v1.