The final, formatted version of the article will be published soon.
ORIGINAL RESEARCH article
Front. Artif. Intell.
Sec. Medicine and Public Health
Volume 8 - 2025
doi: 10.3389/frai.2025.1512910
Clinical validation of an artificial intelligence algorithm for classifying tuberculosis and pulmonary findings in chest radiographs
Provisionally accepted
- 1 Albert Einstein Israelite Hospital, São Paulo, São Paulo, Brazil
- 2 Universidade Federal de Goiás, Goiânia, Goiás, Brazil
- 3 Clemente Ferreira Institute, São Paulo, Brazil
- 4 Faculty of Medicine, University of São Paulo, São Paulo, São Paulo, Brazil
- 5 Medical Sciences Center Telehealth Center, Recife, Brazil
- 6 Diagnostic Imaging Research and Study Institute Foundation, São Paulo, Brazil
- 7 Goiano Federal Institute (IFGOIANO), Goiânia, Goiás, Brazil
Background: Chest X-ray (CXR) interpretation is critical in diagnosing various lung diseases. However, non-specialist physicians are often the first to read these images and frequently face challenges in interpreting them accurately. Artificial intelligence (AI) algorithms could be of great help, but evaluating them on real-world data is crucial to ensure their effectiveness in diverse healthcare settings. This study evaluates a deep learning algorithm designed for CXR interpretation, focusing on its utility for physicians who are not specialists in thoracic radiology.
Purpose: To assess the performance of a convolutional neural network (CNN)-based AI algorithm in interpreting CXRs and to compare it with a team of physicians, including thoracic radiologists, who served as the gold standard.
Methods: A retrospective study from January 2021 to July 2023 evaluated an algorithm with three independent models for Lung Abnormality, Radiological Findings, and Tuberculosis. The algorithm's performance was measured using accuracy, sensitivity, and specificity. Two groups of physicians validated the model: one with varying specialties and experience levels in interpreting chest radiographs (Group A) and another of board-certified thoracic radiologists (Group B). The study also assessed the agreement between the two groups on the algorithm's heatmap and its influence on their decisions.
Results: In the internal validation, the Lung Abnormality and Tuberculosis models achieved an AUC of 0.94, while the Radiological Findings model yielded a mean AUC of 0.84. In the external validation, using the ground truth generated by board-certified thoracic radiologists, the algorithm achieved higher sensitivity than physicians with varying experience levels in 6 of 11 classes. Furthermore, Group A physicians showed higher agreement than Group B with the algorithm's markings in specific lung regions (37.56% for Group A vs. 21.75% for Group B). Additionally, physicians declared that the algorithm did not influence their decisions in 93% of the cases.
Conclusion: This retrospective clinical validation study assesses an AI algorithm's effectiveness in interpreting CXRs. The results show that the algorithm's performance is comparable to that of Group A physicians, using the gold-standard analysis (Group B) as the reference. Notably, both groups reported that the algorithm had minimal influence on their decisions in most cases.
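The three metrics named in the Methods (accuracy, sensitivity, and specificity) can be illustrated for a single binary class with the short sketch below. It is not the authors' code: the function name `binary_metrics` and the toy labels are hypothetical, and the `reference` list merely stands in for a gold-standard reading such as the one Group B provided.

```python
from typing import Dict, Sequence


def binary_metrics(reference: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity, and specificity for one binary class.

    `reference` holds the gold-standard labels (1 = finding present),
    `predicted` holds the labels being evaluated. Both names are
    illustrative assumptions, not taken from the study.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for r, p in zip(reference, predicted) if r == 1 and p == 1)
    tn = sum(1 for r, p in zip(reference, predicted) if r == 0 and p == 0)
    fp = sum(1 for r, p in zip(reference, predicted) if r == 0 and p == 1)
    fn = sum(1 for r, p in zip(reference, predicted) if r == 1 and p == 0)

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN when a metric is undefined (e.g. no positive cases).
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
    }


# Toy data for illustration only; these are not results from the study.
reference = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = binary_metrics(reference, predicted)
print(metrics)
```

For a multi-class comparison such as the 11 classes mentioned in the Results, the same function could be applied once per class, treating that class as positive and all others as negative.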
Keywords: chest X-rays, artificial intelligence, deep learning, clinical validation, convolutional neural network
Received: 17 Oct 2024; Accepted: 16 Jan 2025.
Copyright: © 2025 Camargo, Ribeiro, Silva, Silva, Torres, Rodrigues, Santos, Filho, Rosa, Novaes, Massarutto, Landi, Yanata, Reis, Szarf, Netto and Paiva. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Guilherme Alberto Sousa Ribeiro, Albert Einstein Israelite Hospital, São Paulo, 05652-900, São Paulo, Brazil
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.