AUTHOR=Zhao Xinyu , Rangaprakash D. , Yuan Bowen , Denney Jr Thomas S. , Katz Jeffrey S. , Dretsch Michael N. , Deshpande Gopikrishna TITLE=Investigating the Correspondence of Clinical Diagnostic Grouping With Underlying Neurobiological and Phenotypic Clusters Using Unsupervised Machine Learning JOURNAL=Frontiers in Applied Mathematics and Statistics VOLUME=4 YEAR=2018 URL=https://www.frontiersin.org/journals/applied-mathematics-and-statistics/articles/10.3389/fams.2018.00025 DOI=10.3389/fams.2018.00025 ISSN=2297-4687 ABSTRACT=

Many brain-based disorders are traditionally diagnosed based on clinical interviews and behavioral assessments, which are recognized to be largely imperfect. Therefore, it is necessary to establish neuroimaging-based biomarkers to improve diagnostic precision. Resting-state functional magnetic resonance imaging (rs-fMRI) is a promising technique for the characterization and classification of various disorders. However, most of these classification methods are supervised, i.e., they require a priori clinical labels to guide classification. In this study, we adopted various unsupervised clustering methods using static and dynamic rs-fMRI connectivity measures to investigate whether the clinical diagnostic grouping of different disorders is grounded in underlying neurobiological and phenotypic clusters. To do so, we developed a general analysis pipeline for identifying different brain-based disorders using genetic algorithm-based feature selection and unsupervised clustering methods, and applied it to four datasets: three of them (ADNI, ADHD-200, and ABIDE) are publicly available, and the fourth (PTSD and PCS) was acquired in-house. Using these datasets, the effectiveness of the proposed pipeline was verified on different disorders: Attention Deficit Hyperactivity Disorder (ADHD), Alzheimer's Disease (AD), Autism Spectrum Disorder (ASD), Post-Traumatic Stress Disorder (PTSD), and Post-Concussion Syndrome (PCS). For ADHD and AD, the highest similarity was achieved between connectivity and phenotypic clusters, whereas for ASD and PTSD/PCS, the highest similarity was achieved between connectivity and clinical diagnostic clusters. For multi-site data (ABIDE and ADHD-200), we report site-specific results. We also report the effect of eliminating outlier subjects for all four datasets.
Overall, our results suggest that neurobiological and phenotypic biomarkers could potentially be used by clinicians as an aid, in addition to currently available clinical diagnostic standards, to improve diagnostic precision. The data and source code used in this work are publicly available at https://github.com/xinyuzhao/identification-of-brain-based-disorders.git.
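The comparison between cluster assignments and diagnostic groups described above can be illustrated with a short sketch. The abstract does not name the similarity measure that was used, so the adjusted Rand index below is an assumption chosen for illustration, and the function name and toy labels are hypothetical rather than taken from the published code.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions of the same subjects.

    Assumption: the adjusted Rand index stands in for the (unspecified)
    similarity measure between clusterings.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:  # fewer than two subjects: nothing to compare
        return 1.0
    # Subject pairs grouped together in both partitions, and in each one alone.
    pairs_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = pairs_a * pairs_b / total_pairs  # agreement expected by chance
    max_index = (pairs_a + pairs_b) / 2
    if max_index == expected:  # degenerate case, e.g. both partitions trivial
        return 1.0
    return (pairs_both - expected) / (max_index - expected)


# Hypothetical toy labels for six subjects (not data from the study).
diagnosis = ["control", "control", "control", "patient", "patient", "patient"]
connectivity_clusters = [1, 1, 1, 0, 0, 0]  # same grouping, different label names
print(adjusted_rand_index(diagnosis, connectivity_clusters))
```

Because the index depends only on which subjects are grouped together, the arbitrary numbering of clusters produced by an unsupervised method does not affect the score.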