Introduction

AUTHOR=Duarte Kauê T. N., Sidhu Abhijot S., Barros Murilo C., Gobbi David G., McCreary Cheryl R., Saad Feryal, Camicioli Richard, Smith Eric E., Bento Mariana P., Frayne Richard

TITLE=Multi-stage semi-supervised learning enhances white matter hyperintensity segmentation

JOURNAL=Frontiers in Computational Neuroscience

VOLUME=18

YEAR=2024

URL=https://www.frontiersin.org/journals/computational-neuroscience/articles/10.3389/fncom.2024.1487877

DOI=10.3389/fncom.2024.1487877

ISSN=1662-5188

ABSTRACT=<sec><title>Introduction</title><p>White matter hyperintensities (WMHs) are frequently observed on magnetic resonance (MR) images in older adults, commonly appearing as areas of high signal intensity on fluid-attenuated inversion recovery (FLAIR) MR scans. Elevated WMH volumes are associated with a greater risk of dementia and stroke, even after accounting for vascular risk factors. Manual segmentation, while considered the ground truth, is both labor-intensive and time-consuming, limiting the generation of annotated WMH datasets. Un-annotated data are more readily available; however, the need for annotated data poses a challenge for developing supervised machine learning models.</p></sec><sec><title>Methods</title><p>To address this challenge, we implemented a multi-stage semi-supervised learning (M3SL) approach that first uses un-annotated data segmented by traditional processing methods (“bronze” and “silver” quality data) and then uses a smaller number of “gold”-standard annotations for model refinement. The M3SL approach enabled fine-tuning of the model weights with the gold-standard annotations. This approach was integrated into the training of a U-Net model for WMH segmentation. We used data from three scanner vendors (across more than five scanners) and from both cognitively normal (CN) adults and patient cohorts [with mild cognitive impairment (MCI) and Alzheimer's disease (AD)].</p></sec><sec><title>Results</title><p>We analyzed WMH segmentation performance across both scanner and clinical stage (CN, MCI, AD) factors. We compared our results to both conventional and transfer-learning deep learning methods and observed better generalization with M3SL across different datasets. We evaluated several metrics (<italic>F</italic>-measure, <italic>IoU</italic>, and Hausdorff distance) and found significant improvements with our method compared to the conventional (<italic>p</italic> &lt; 0.001) and transfer-learning (<italic>p</italic> &lt; 0.001) methods.</p></sec><sec><title>Discussion</title><p>These findings suggest that automated, non-machine-learning tools have a role in a multi-stage learning framework and can reduce the impact of limited annotated data, thereby enhancing model performance.</p></sec>
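
The abstract names the F-measure and IoU as overlap metrics but does not define them. The sketch below shows one common way to compute both for a pair of binary segmentation masks. It is an illustration only, not the authors' evaluation code: the function name f_measure_and_iou, the flattened-mask input, and the convention that two empty masks score 1.0 are all assumptions made here, and the toy masks are hypothetical.

```python
def f_measure_and_iou(pred, truth):
    """Return (F-measure, IoU) for two binary masks.

    Both masks are given as flat sequences of 0/1 (or False/True) voxel
    labels. This is an assumption of the sketch; a 3D mask would be
    flattened before being passed in.
    """
    pred = [bool(v) for v in pred]
    truth = [bool(v) for v in truth]
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")

    # Voxel-wise counts of true positives, false positives, false negatives.
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)

    # Assumed convention: two empty masks agree perfectly.
    if tp + fp + fn == 0:
        return 1.0, 1.0

    f_measure = 2 * tp / (2 * tp + fp + fn)  # equals the Dice coefficient for binary masks
    iou = tp / (tp + fp + fn)                # intersection over union (Jaccard index)
    return f_measure, iou


# Hypothetical 3x3 masks, flattened row by row (not data from the study).
truth_mask = [0, 1, 1,
              0, 1, 0,
              0, 0, 0]
pred_mask = [0, 1, 0,
             0, 1, 1,
             0, 0, 0]

f_score, iou_score = f_measure_and_iou(pred_mask, truth_mask)
print(f"F-measure = {f_score:.3f}, IoU = {iou_score:.3f}")
```

For binary masks the two metrics are monotonically related (F = 2·IoU / (1 + IoU)), so they rank segmentations identically. The Hausdorff distance, in contrast, measures boundary disagreement and is not covered by this sketch.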