AUTHOR=Bento Mariana, Fantini Irene, Park Justin, Rittner Leticia, Frayne Richard TITLE=Deep Learning in Large and Multi-Site Structural Brain MR Imaging Datasets JOURNAL=Frontiers in Neuroinformatics VOLUME=15 YEAR=2022 URL=https://www.frontiersin.org/journals/neuroinformatics/articles/10.3389/fninf.2021.805669 DOI=10.3389/fninf.2021.805669 ISSN=1662-5196 ABSTRACT=
Large, multi-site, heterogeneous brain imaging datasets are increasingly required for the training, validation, and testing of advanced deep learning (DL)-based automated tools, including structural magnetic resonance (MR) image-based diagnostic and treatment monitoring approaches. When several smaller datasets are assembled into a larger one, it is critical to understand the variability that differing acquisition and processing protocols introduce across the aggregated dataset (termed “batch effects”). Variation in the training dataset is important because it more closely reflects the true underlying data distribution and, thus, may enhance the overall generalizability of the tool. However, the impact of batch effects must be carefully evaluated to avoid undesirable consequences, such as reduced performance measures. Batch effects can arise from many sources, including differences in acquisition equipment, imaging techniques and parameters, and the processing methodologies applied. Their impact, both beneficial and adverse, must be considered when developing tools to ensure that their outputs are related to the proposed clinical or research question (