ORIGINAL RESEARCH article

Front. Med.
Sec. Pathology
Volume 11 - 2024 | doi: 10.3389/fmed.2024.1499393
This article is part of the Research Topic: Artificial Intelligence-Assisted Medical Imaging Solutions for Integrating Pathology and Radiology Automated Systems - Volume II

Multiscale Attention-over-Attention Network for Retinal Disease Recognition in OCT Radiology Images

Provisionally accepted
  • 1 Qassim University, Buraidah, Al-Qassim, Saudi Arabia
  • 2 Islamic University of Madinah, Medina, Saudi Arabia
  • 3 Department of Computer Engineering, Marwadi University, Rajkot, India
  • 4 Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Alkharj, Saudi Arabia

The final, formatted version of the article will be published soon.

    Retinal disease recognition from Optical Coherence Tomography (OCT) images plays a pivotal role in the early diagnosis and treatment of retinal conditions. However, previous approaches relied on single-scale features, often refined by stacked attention layers. This paper presents a novel deep learning-based Multiscale Feature Enhancement via a Dual Attention Network designed specifically for retinal disease recognition in OCT images. Our approach uses an EfficientNetB7 backbone to extract multiscale features from OCT images, providing a comprehensive representation of global and local retinal structures. To further refine these features, we propose a Pyramidal Attention mechanism that integrates Multi-Head Self-Attention (MHSA) with Dense Atrous Spatial Pyramid Pooling (DASPP), capturing long-range dependencies and contextual information at multiple scales. Additionally, Efficient Channel Attention (ECA) and Spatial Refinement modules are introduced to enhance channel-wise and spatial feature representations, enabling precise localization of retinal abnormalities. A comprehensive ablation study confirms that each integrated block and attention mechanism progressively improves overall performance. Extensive experiments on two benchmark datasets demonstrate the superiority of the proposed network over existing state-of-the-art methods. These findings underscore the potential of advanced attention mechanisms and multiscale processing for retinal disease recognition.
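
    As an illustration of the channel-attention idea named in the abstract, the sketch below shows the usual ECA pattern in plain Python: global average pooling per channel, a small 1D convolution across the channel axis, a sigmoid gate, and channel-wise rescaling. It is a minimal sketch and not the authors' implementation; the function name `eca`, the nested-list feature-map layout, and the kernel weights are assumptions made here for illustration, and in a trained network the kernel weights would be learned.

```python
import math


def eca(feature_map, kernel_weights):
    """Illustrative Efficient Channel Attention on a C x H x W nested list.

    feature_map: list of C channels, each a list of H rows of W floats.
    kernel_weights: odd-length list of 1D convolution weights shared by
        all channels (hypothetical values; learned in a real network).
    Returns (rescaled_feature_map, gates).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]

    # Local cross-channel interaction: 1D convolution over the channel
    # axis with zero padding, so the output keeps C entries.
    k = len(kernel_weights)
    pad = k // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    logits = [
        sum(kernel_weights[j] * padded[i + j] for j in range(k))
        for i in range(len(pooled))
    ]

    # Sigmoid turns each logit into a gate in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-z)) for z in logits]

    # Excite: rescale every value in a channel by that channel's gate.
    rescaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return rescaled, gates


# Toy input: 3 channels of size 2 x 2 filled with 0.0, 1.0 and 2.0.
toy = [[[float(c)] * 2 for _ in range(2)] for c in range(3)]
rescaled, gates = eca(toy, [0.0, 1.0, 0.0])
```

    With the identity-like kernel `[0.0, 1.0, 0.0]` used here, each gate depends only on its own channel's mean; a kernel with non-zero neighbours would let adjacent channels influence one another, which is the cross-channel interaction ECA is meant to capture.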

    Keywords: Retinal recognition, OCT imaging, attention mechanism, deep learning, medical imaging, Multi-level features

    Received: 20 Sep 2024; Accepted: 14 Oct 2024.

    Copyright: © 2024 A. Aloqalaa, M. Alenezi, Singh, Alrabiah, Habib, Islam and Daradkeh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Muhammad Islam, Qassim University, Buraidah, 52571, Al-Qassim, Saudi Arabia

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.