
ORIGINAL RESEARCH article

Front. Vet. Sci.
Sec. Veterinary Imaging
Volume 12 - 2025 | doi: 10.3389/fvets.2025.1502790
This article is part of the Research Topic "Monitoring and Reducing Errors in Veterinary Radiology".

Comparison of radiological interpretation made by veterinary radiologists and state-of-the-art commercial AI software for canine and feline radiographic studies

Provisionally accepted
  • 1 University of Cologne, Cologne, Germany
  • 2 Max Planck Institute for Research on Collective Goods, Bonn, North Rhine-Westphalia, Germany
  • 3 University of Maryland, Baltimore, Maryland, United States
  • 4 Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, Scotland, United Kingdom
  • 5 University of Veterinary Medicine Hannover, Hanover, Lower Saxony, Germany

The final, formatted version of the article will be published soon.

    As expert medical diagnostics are scarce, especially in veterinary care, artificial intelligence (AI) is increasingly used as a remedy. AI promises to improve diagnostics or to provide accurate diagnostics at lower cost, thereby increasing access. We assessed whether the performance of a commonly used AI matched that of a typical radiologist and whether it could be relied upon. Second, we sought to identify the cases in which AI is most effective. Fifty canine and feline radiographic studies in DICOM format were anonymized, reported by 11 board-certified veterinary radiologists (ECVDI or ACVR), and processed with commercial, widely used AI software dedicated to small animal radiography (SignalRAY). The AI software uses a deep-learning algorithm and returns a coded abnormal or normal diagnosis for each finding in a study. The radiologists provided written reports in English. The findings in all reports were coded into categories matching the codes from the AI software and classified as normal or abnormal. Sensitivity, specificity, and accuracy were calculated for both the radiologists and the AI software. The variance in agreement between each radiologist and the AI software was measured to quantify the ambiguity of each radiological finding. The AI matched the best radiologist in accuracy, showing greater specificity but lower sensitivity than the humans. It outperformed the median radiologist overall in both low- and high-ambiguity cases. In high-ambiguity cases, the AI's accuracy remained high; however, it was less effective at detecting abnormalities while excelling at identifying normal findings. The study confirmed the AI's reliability, particularly in low-ambiguity scenarios, while highlighting its limitations in detecting abnormalities. Our findings suggest that AI performs almost as well as the best veterinary radiologist across all settings of descriptive radiographic findings.
However, its strengths lie more in confirming normality than in detecting abnormalities, and it does not provide differential diagnoses. Broader use of AI could therefore reliably increase diagnostic availability, but it still requires human input. Given the distinct strengths of human experts and AI, along with the differences in sensitivity, specificity, and performance in low- versus high-ambiguity settings, AI is more likely to complement than to replace human experts.
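The evaluation metrics described in the abstract can be sketched as follows. This is an illustrative sketch only, not the authors' analysis code: the function names and the binary data layout (1 = abnormal, 0 = normal, one label per coded finding) are assumptions, and "ambiguity" is rendered here as the variance of per-radiologist agreement with the AI's code for a finding, following the abstract's description.

```python
# Illustrative sketch of the metrics named in the abstract (not the authors' code).
# Assumed layout: each coded finding is 1 (abnormal) or 0 (normal).
from statistics import pvariance

def confusion_metrics(truth, pred):
    """Sensitivity, specificity, and accuracy from paired binary labels."""
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true-positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true-negative rate
    accuracy = (tp + tn) / len(truth)
    return sensitivity, specificity, accuracy

def finding_ambiguity(radiologist_codes, ai_code):
    """Variance of per-radiologist agreement (1 = agrees, 0 = disagrees)
    with the AI's code for a single finding: higher variance, more ambiguity."""
    agreement = [1 if code == ai_code else 0 for code in radiologist_codes]
    return pvariance(agreement)
```

For example, a finding the AI codes as abnormal (1) that three of four radiologists also code as abnormal yields an agreement vector of [1, 1, 0, 1] and a nonzero ambiguity, whereas unanimous agreement yields zero.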

    Keywords: Veterinary diagnostic imaging, artificial intelligence, machine learning, dog, cat

    Received: 27 Sep 2024; Accepted: 13 Jan 2025.

    Copyright: © 2025 Ndiaye, Cramton, Chernev, Ockenfels and Schwarz. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Yero Samuel Ndiaye, University of Cologne, Cologne, Germany

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.