AUTHOR=Benger Matthew, Wood David A., Kafiabadi Sina, Al Busaidi Aisha, Guilhem Emily, Lynch Jeremy, Townend Matthew, Montvila Antanas, Siddiqui Juveria, Gadapa Naveen, Barker Gareth, Ourselin Sebastian, Cole James H., Booth Thomas C. TITLE=Factors affecting the labelling accuracy of brain MRI studies relevant for deep learning abnormality detection JOURNAL=Frontiers in Radiology VOLUME=3 YEAR=2023 URL=https://www.frontiersin.org/journals/radiology/articles/10.3389/fradi.2023.1251825 DOI=10.3389/fradi.2023.1251825 ISSN=2673-8740 ABSTRACT=

Unlocking the vast potential of deep learning-based computer vision classification systems necessitates large datasets for model training. Natural Language Processing (NLP), involving automation of dataset labelling, represents a potential avenue to achieve this. However, many aspects of NLP for dataset labelling remain unvalidated. Expert radiologists manually labelled over 5,000 MRI head reports in order to develop a deep learning-based neuroradiology NLP report classifier. Our results demonstrate that binary labels (normal vs. abnormal) showed high rates of accuracy, even when only two MRI sequences (T2-weighted and those based on diffusion-weighted imaging) were employed as opposed to all sequences in an examination. Meanwhile, the accuracy of more specific labelling for multiple disease categories was variable and dependent on the category. Finally, resultant model performance was shown to be dependent on the expertise of the original labeller, with worse performance seen with non-expert vs. expert labellers.