AUTHOR=Sundaresan Vaanathi, Arthofer Christoph, Zamboni Giovanna, Dineen Robert A., Rothwell Peter M., Sotiropoulos Stamatios N., Auer Dorothee P., Tozer Daniel J., Markus Hugh S., Miller Karla L., Dragonu Iulius, Sprigg Nikola, Alfaro-Almagro Fidel, Jenkinson Mark, Griffanti Ludovica
TITLE=Automated Detection of Candidate Subjects With Cerebral Microbleeds Using Machine Learning
JOURNAL=Frontiers in Neuroinformatics
VOLUME=15
YEAR=2022
URL=https://www.frontiersin.org/journals/neuroinformatics/articles/10.3389/fninf.2021.777828
DOI=10.3389/fninf.2021.777828
ISSN=1662-5196
ABSTRACT=

Cerebral microbleeds (CMBs) appear as small, circular, well-defined hypointense lesions a few millimetres in size on T2*-weighted gradient recalled echo (T2*-GRE) images and appear enhanced on susceptibility-weighted images (SWI). Because of their small size, contrast variations, and the presence of mimics (e.g., blood vessels), CMBs are highly challenging to detect automatically. In large datasets (e.g., the UK Biobank dataset), exhaustively labelling CMBs manually is difficult and time-consuming. It would therefore be useful to preselect candidate CMB subjects so that manual labelling, which is essential for training and testing automated CMB detection tools on these datasets, can focus on them. In this work, we aim to detect CMB candidate subjects in a large dataset, UK Biobank, using a computationally light, machine learning-based pipeline. For our evaluation, we used three datasets with different intensity characteristics, acquired with different scanners: the UK Biobank dataset and two clinical datasets with different pathological conditions. We developed and evaluated our pipelines on different types of images, consisting of SWI or GRE images. We also used the UK Biobank dataset to compare our approach with alternative CMB preselection methods using non-imaging factors and/or imaging data. Finally, we evaluated the pipeline's generalisability across datasets. Our method provided subject-level detection accuracy above 80% on all the datasets (within-dataset results) and generalised well across datasets, maintaining an accuracy of over 80% even when evaluated across different modalities.