The purpose of this paper is to demonstrate a mechanism for deploying and validating an AI-based system for detecting abnormalities on chest X-ray scans at the Phu Tho General Hospital, Vietnam. We aim to investigate the performance of the system in real-world clinical settings and compare its effectiveness to the in-lab performance.
The AI system was directly integrated into the Hospital's Picture Archiving and Communication System (PACS) after being trained on a fixed annotated dataset from other sources. The system's performance was prospectively measured by matching and comparing the AI results with the radiology reports of 6,285 chest X-ray examinations extracted from the Hospital Information System (HIS) over the last 2 months of 2020. The normal/abnormal status of a radiology report was determined by a set of rules and served as the ground truth.
Our system achieves an F1 score—the harmonic average of the recall and the precision—of 0.653 (95% CI 0.635, 0.671) for detecting any abnormalities on chest X-rays. This corresponds to an accuracy of 79.6%, a sensitivity of 68.6%, and a specificity of 83.9%.
Computer-Aided Diagnosis (CAD) systems for chest radiographs using artificial intelligence (AI) have recently shown great potential as a second opinion for radiologists. However, the performances of such systems were mostly evaluated on a fixed dataset in a retrospective manner and, thus, far from the real performances in clinical practice. Despite a significant drop from the in-lab performance, our result establishes a reasonable level of confidence in applying such a system in real-life situations.