Emerging evidence suggests that artificial intelligence (AI) is useful in the diagnostic assessment and pre-treatment evaluation of thyroid eye disease (TED). This scoping review aims to (1) identify the extent of the available evidence, (2) provide an in-depth analysis of the AI research methodology of the included studies, and (3) identify knowledge gaps in research in this area.
This review was performed according to the 2020 Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. We quantified the diagnostic accuracy of AI models in TED assessment and appraised the quality of the included studies using the modified Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool.
A total of 13 studies were included in this review. The most common AI models used were convolutional neural networks (CNNs). The majority of the studies compared algorithm performance against that of healthcare professionals. When assessed with the modified QUADAS-2 tool, most studies were classified as low risk for overall risk of bias and applicability, although deficiencies were more frequent in the flow and timing domain of risk of bias.
Although the AI models showed high diagnostic accuracy in identifying features of TED relevant to disease assessment, the review also found deficiencies in study design that introduced bias and compromised applicability. Moving forward, the limitations and challenges inherent to machine learning should be addressed through improved, standardized guidance on study design and reporting, and through appropriate legislative frameworks.