AUTHOR=Wang Zhaoran , Lim Gilbert , Ng Wei Yan , Tan Tien-En , Lim Jane , Lim Sing Hui , Foo Valencia , Lim Joshua , Sinisterra Laura Gutierrez , Zheng Feihui , Liu Nan , Tan Gavin Siew Wei , Cheng Ching-Yu , Cheung Gemmy Chui Ming , Wong Tien Yin , Ting Daniel Shu Wei TITLE=Synthetic artificial intelligence using generative adversarial network for retinal imaging in detection of age-related macular degeneration JOURNAL=Frontiers in Medicine VOLUME=10 YEAR=2023 URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2023.1184892 DOI=10.3389/fmed.2023.1184892 ISSN=2296-858X ABSTRACT=Introduction

Age-related macular degeneration (AMD) is one of the leading causes of vision impairment globally, and early detection is crucial to prevent vision loss. However, AMD screening is resource-dependent and demands experienced healthcare providers. Recently, deep learning (DL) systems have shown potential for effective detection of various eye diseases from retinal fundus images, but developing robust systems of this kind requires large datasets, whose availability can be limited by disease prevalence and patient privacy. In AMD, images of the advanced phenotype are often too scarce for DL analysis, a shortage that may be addressed by generating synthetic images with Generative Adversarial Networks (GANs). This study aims to develop GAN-synthesized fundus photos with AMD lesions and to assess the realness of these images with an objective scale.

Methods

To build our GAN models, we used a total of 125,012 fundus photos from a real-world non-AMD phenotypical dataset. StyleGAN2 and a human-in-the-loop (HITL) method were then applied to synthesize fundus images with AMD features. To objectively assess the quality of the synthesized images, we proposed a novel realness scale based on the frequency of broken vessels observed in the fundus photos. Four residents conducted two rounds of grading on 300 images to distinguish real from synthetic images, based first on their subjective impression and then on the objective scale.

Results and discussion

The introduction of HITL training increased the percentage of synthetic images with AMD lesions, despite the limited number of AMD images in the initial training dataset. Qualitatively, the synthesized images appeared realistic, in that our residents had limited ability to distinguish real from synthetic ones, as evidenced by an overall accuracy of 0.66 (95% CI: 0.61–0.66) and Cohen’s kappa of 0.320. For the non-referable AMD classes (no or early AMD), the accuracy was only 0.51. With the objective scale, the overall accuracy improved to 0.72. In conclusion, GAN models built with HITL training are capable of producing realistic-looking fundus images that could fool human experts, while our objective realness scale based on broken vessels can help identify synthetic fundus photos.
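
For readers who want to reproduce figures of this kind on their own grading data, the following is a minimal sketch of how an accuracy, a 95% confidence interval, and Cohen's kappa can be computed for a real-versus-synthetic grading task. It is not the study's analysis code. The function names and the toy labels are hypothetical, the Wilson score interval is an assumption (the abstract does not state which interval method was used), and the abstract does not specify which pair of label sequences the reported kappa compares.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of images whose graded label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%).

    Assumption: the source does not say which interval method it used.
    """
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def cohens_kappa(a, b):
    """Cohen's kappa: agreement between two label sequences beyond chance."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each sequence's marginal label frequencies.
    expected = sum((a.count(lab) / n) * (b.count(lab) / n) for lab in set(a) | set(b))
    if expected == 1:
        return 1.0  # both sequences use a single identical label
    return (observed - expected) / (1 - expected)


# Toy labels for illustration only (1 = synthetic, 0 = real); not study data.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
graded = [1, 1, 1, 0, 0, 0, 1, 0]

correct = sum(t == g for t, g in zip(truth, graded))
print("accuracy:", accuracy(truth, graded))
print("95% CI:", wilson_interval(correct, len(truth)))
print("kappa:", cohens_kappa(truth, graded))
```

A kappa near 0 indicates agreement no better than chance, so an accuracy close to 0.5 on a balanced real-versus-synthetic set, as reported for the non-referable classes, corresponds to graders who are effectively guessing.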