AUTHOR=Frood Russell , Willaime Julien M. Y. , Miles Brad , Chambers Greg , Al-Chalabi H’ssein , Ali Tamir , Hougham Natasha , Brooks Naomi , Petrides George , Naylor Matthew , Ward Daniel , Sulkin Tom , Chaytor Richard , Strouhal Peter , Patel Chirag , Scarsbrook Andrew F. TITLE=Comparative effectiveness of standard vs. AI-assisted PET/CT reading workflow for pre-treatment lymphoma staging: a multi-institutional reader study evaluation JOURNAL=Frontiers in Nuclear Medicine VOLUME=3 YEAR=2024 URL=https://www.frontiersin.org/journals/nuclear-medicine/articles/10.3389/fnume.2023.1327186 DOI=10.3389/fnume.2023.1327186 ISSN=2673-8880 ABSTRACT=Background

Fluorine-18 fluorodeoxyglucose (FDG)-positron emission tomography/computed tomography (PET/CT) is widely used for staging high-grade lymphoma, with the time to evaluate such studies varying depending on the complexity of the case. Integrating artificial intelligence (AI) within the reporting workflow has the potential to improve quality and efficiency. The aims of the present study were to evaluate the influence of an integrated research prototype segmentation tool, implemented within diagnostic PET/CT reading software, on the speed and quality of reporting by readers with variable levels of experience, and to assess the effect of the AI-assisted workflow on reader confidence and whether the tool influenced reporting behaviour.

Methods

Nine blinded reporters (three trainees, three junior consultants and three senior consultants) from three UK centres participated in a two-part reader study. A total of 15 lymphoma staging PET/CT scans were evaluated twice: first, using a standard PET/CT reporting workflow; then, after a 6-week gap, with AI assistance incorporating pre-segmentation of disease sites within the reading software. An even split of PET/CT segmentations with gold standard (GS), false-positive (FP) over-contour or false-negative (FN) under-contour was provided. Read duration was calculated using file logs, while report quality was independently assessed by two radiologists with >15 years of experience. Confidence in AI assistance and in identification of disease was assessed via online questionnaires for each case.

Results

There was a significant decrease in read time between non-AI and AI-assisted reads (median 15.0 vs. 13.3 min, p < 0.001). Sub-analysis confirmed this was true for both junior (14.5 vs. 12.7 min, p = 0.03) and senior consultants (15.1 vs. 12.2 min, p = 0.03) but not for trainees (18.1 vs. 18.0 min, p = 0.2). There was no significant difference in report quality between reads. AI assistance provided a significant increase in confidence of disease identification (p < 0.001). This held true when splitting the data into FN, GS and FP. In 19/88 cases, participants did not identify either FP (31.8%) or FN (11.4%) segmentations. This was significantly greater for trainees (13/30, 43.3%) than for junior (3/28, 10.7%, p = 0.05) and senior consultants (3/30, 10.0%, p = 0.05).
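
As a quick arithmetic check on the figures above, the following minimal sketch recomputes the reported percentages and the differences in median read time from the counts and medians given in this paragraph. The variable and function names are illustrative assumptions and are not taken from the study; the sketch does not reproduce the statistical tests, which are not specified here.

```python
# Counts reported in the Results paragraph:
# (cases where an FP/FN segmentation was not identified, cases with FP/FN segmentations).
missed = {
    "trainee": (13, 30),
    "junior consultant": (3, 28),
    "senior consultant": (3, 30),
}


def pct(numerator, denominator):
    """Return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Per-group rate of unidentified segmentation errors.
rates = {group: pct(k, n) for group, (k, n) in missed.items()}

# Pooled totals across the three experience groups.
total_missed = sum(k for k, _ in missed.values())
total_cases = sum(n for _, n in missed.values())
overall_rate = pct(total_missed, total_cases)

# Median read times in minutes as reported: (non-AI read, AI-assisted read).
median_minutes = {
    "all readers": (15.0, 13.3),
    "junior consultant": (14.5, 12.7),
    "senior consultant": (15.1, 12.2),
    "trainee": (18.1, 18.0),
}

# Difference in median read time, in minutes, rounded to one decimal place.
minutes_saved = {
    group: round(non_ai - ai, 1) for group, (non_ai, ai) in median_minutes.items()
}

for group, rate in rates.items():
    print(f"{group}: {rate}% of segmentation errors not identified")
print(f"pooled: {total_missed}/{total_cases} = {overall_rate}%")
for group, saved in minutes_saved.items():
    print(f"{group}: median read time lower by {saved} min with AI assistance")
```

The group counts sum to the pooled 19/88 quoted above, and the per-group percentages match the reported 43.3%, 10.7% and 10.0%.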

Conclusions

The study findings indicate that an AI-assisted workflow achieves performance comparable to that of the standard workflow, with a marginal improvement in reporting speed. Less experienced readers were more influenced by segmentation errors. An AI-assisted PET/CT reading workflow has the potential to increase reporting efficiency without adversely affecting quality, which could reduce costs and report turnaround times. These preliminary findings need to be confirmed in larger studies.