AUTHOR=Wang Wei, Jiang Ran, Cui Ning, Li Qian, Yuan Feng, Xiao Zhifeng

TITLE=Semi-supervised vision transformer with adaptive token sampling for breast cancer classification

JOURNAL=Frontiers in Pharmacology

VOLUME=Volume 13 - 2022

YEAR=2022

URL=https://www.frontiersin.org/journals/pharmacology/articles/10.3389/fphar.2022.929755

DOI=10.3389/fphar.2022.929755

ISSN=1663-9812

ABSTRACT=Various imaging techniques combined with machine learning (ML) models have been utilized to build computer-aided diagnosis (CAD) systems for breast cancer (BC) detection and classification. The rise of deep learning models in recent years, represented by convolutional neural network (CNN) models, has pushed the accuracy of ML-based CAD systems to a new level that is comparable to that of human experts. Existing studies have explored a wide spectrum of CNN models for BC detection, with supervised learning as the mainstream approach. In this paper, we propose a semi-supervised learning framework based on Vision Transformer (ViT). ViT is a model that has been validated to outperform CNN models on numerous classification benchmarks, but its application in BC detection has been rare. The proposed method offers a custom semi-supervised learning procedure that unifies supervised and self-supervised training. In addition, the method employs an adaptive token sampling technique that strategically samples the most significant tokens from the input image, leading to an effective performance gain. We validate our method on two datasets, one of ultrasound images and one of histopathology images. Results demonstrate that our method consistently outperforms the CNN baselines on both learning tasks. The code repository of the project is available at https://github.com/FeiYee/Breast-area-TWO.