AUTHOR=Alabed Samer, Maiter Ahmed, Salehi Mahan, Mahmood Aqeeb, Daniel Sonali, Jenkins Sam, Goodlad Marcus, Sharkey Michael, Mamalakis Michail, Rakocevic Vera, Dwivedi Krit, Assadi Hosamadin, Wild Jim M., Lu Haiping, O’Regan Declan P., van der Geest Rob J., Garg Pankaj, Swift Andrew J.
TITLE=Quality of reporting in AI cardiac MRI segmentation studies – A systematic review and recommendations for future studies
JOURNAL=Frontiers in Cardiovascular Medicine
VOLUME=9
YEAR=2022
URL=https://www.frontiersin.org/journals/cardiovascular-medicine/articles/10.3389/fcvm.2022.956811
DOI=10.3389/fcvm.2022.956811
ISSN=2297-055X
ABSTRACT=Background

There has been a rapid increase in the number of Artificial Intelligence (AI) studies of cardiac MRI (CMR) segmentation aiming to automate image analysis. However, advancement and clinical translation in this field depend on researchers presenting their work in a transparent and reproducible manner. This systematic review aimed to evaluate the quality of reporting in AI studies involving CMR segmentation.

Methods

MEDLINE and EMBASE were searched for AI CMR segmentation studies in April 2022. Any fully automated AI method for segmentation of cardiac chambers, myocardium or scar on CMR was considered for inclusion. For each study, compliance with the Checklist for Artificial Intelligence in Medical Imaging (CLAIM) was assessed. The CLAIM criteria were grouped into study, dataset, model and performance description domains.

Results

209 studies published between 2012 and 2022 were included in the analysis. Studies were mainly published in technical journals (58%), with the majority (57%) published since 2019. Studies were from 37 different countries, with most from China (26%), the United States (18%) and the United Kingdom (11%). Short axis CMR images were most frequently used (70%), with the left ventricle the most commonly segmented cardiac structure (49%). Median compliance of studies with CLAIM was 67% (IQR 59–73%). Median compliance was highest for the model description domain (100%, IQR 80–100%) and lower for the study (71%, IQR 63–86%), dataset (63%, IQR 50–67%) and performance (60%, IQR 50–70%) description domains.
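The compliance figures above are medians with interquartile ranges over per-study percentages. As a rough illustration of that arithmetic only, the sketch below scores three made-up studies against a made-up checklist. The item IDs, the domain grouping, the toy data and the choice of quartile method are all assumptions for illustration; they are not the actual CLAIM items or the review's data, and the abstract does not state which quartile method was used.

```python
from statistics import quantiles

# Hypothetical item IDs grouped into the four domains named in the abstract.
# These are placeholders, NOT the real CLAIM checklist items.
DOMAINS = {
    "study": ["s1", "s2"],
    "dataset": ["d1", "d2"],
    "model": ["m1", "m2"],
    "performance": ["p1", "p2"],
}
ALL_ITEMS = [item for items in DOMAINS.values() for item in items]


def compliance(assessment, items):
    """Percentage of applicable items a study reports.

    `assessment` maps item ID -> True (reported), False (not reported)
    or None (not applicable; excluded from the denominator).
    """
    applicable = [assessment[i] for i in items if assessment.get(i) is not None]
    if not applicable:
        return None
    return 100.0 * sum(applicable) / len(applicable)


def median_iqr(values):
    """Return (median, Q1, Q3), assuming the 'inclusive' quartile method."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q2, q1, q3


# Toy assessments for three fictional studies.
studies = [
    dict.fromkeys(ALL_ITEMS, True),
    {"s1": True, "s2": False, "d1": True, "d2": False,
     "m1": True, "m2": True, "p1": False, "p2": False},
    {"s1": True, "s2": True, "d1": True, "d2": False,
     "m1": True, "m2": True, "p1": True, "p2": False},
]

overall = [compliance(s, ALL_ITEMS) for s in studies]
print("overall (median, Q1, Q3):", median_iqr(overall))
for name, items in DOMAINS.items():
    scores = [compliance(s, items) for s in studies]
    print(name, "(median, Q1, Q3):", median_iqr(scores))
```

Excluding not-applicable items from the denominator is one possible convention; a review could equally count them as non-compliant, which would lower the percentages.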

Conclusion

This systematic review highlights important gaps in the literature of CMR studies using AI. We identified key missing items, most strikingly poor description of the patients included in the training and validation of AI models and inadequate model failure analysis, that limit the transparency, reproducibility and hence validity of published AI studies. This review may support closer adherence to established frameworks for reporting standards and presents recommendations for improving the quality of reporting in this field.

Systematic Review Registration

www.crd.york.ac.uk/prospero/, identifier CRD42022279214.