AUTHOR=Vivar Gerome, Strobl Ralf, Grill Eva, Navab Nassir, Zwergal Andreas, Ahmadi Seyed-Ahmad

TITLE=Using Base-ml to Learn Classification of Common Vestibular Disorders on DizzyReg Registry Data

JOURNAL=Frontiers in Neurology

VOLUME=12

YEAR=2021

URL=https://www.frontiersin.org/journals/neurology/articles/10.3389/fneur.2021.681140

DOI=10.3389/fneur.2021.681140

ISSN=1664-2295

ABSTRACT=<p><bold>Background:</bold> Multivariable analyses (MVA) and machine learning (ML) applied to large datasets may have high potential to provide clinical decision support in neuro-otology and to reveal further avenues for vestibular research. To this end, we built base-ml, a comprehensive MVA/ML software tool, and applied it to three increasingly difficult clinical objectives in the differentiation of common vestibular disorders, using data from a large prospective clinical patient registry (DizzyReg).</p><p><bold>Methods:</bold> Base-ml features a full MVA/ML pipeline for classification of multimodal patient data, comprising tools for data loading and pre-processing; a stringent scheme for nested and stratified cross-validation including hyper-parameter optimization; a set of 11 classifiers, ranging from commonly used algorithms like logistic regression and random forests, to artificial neural network models, including a graph-based deep learning model which we recently proposed; a multi-faceted evaluation of classification metrics; and tools from the domain of “Explainable AI” that illustrate the input distribution and a statistical analysis of the most important features identified by multiple classifiers.</p><p><bold>Results:</bold> In the first clinical task, classification of bilateral vestibular failure (<italic>N</italic> = 66) vs. functional dizziness (<italic>N</italic> = 346) was possible with an accuracy of up to 92.5% (Random Forest). In the second task, primary functional dizziness (<italic>N</italic> = 151) vs. secondary functional dizziness (following an organic vestibular syndrome) (<italic>N</italic> = 204) was classifiable with an accuracy ranging from 56.5 to 64.2% (k-nearest neighbors/logistic regression).
The third task compared four episodic disorders: benign paroxysmal positional vertigo (<italic>N</italic> = 134), vestibular paroxysmia (<italic>N</italic> = 49), Menière disease (<italic>N</italic> = 142) and vestibular migraine (<italic>N</italic> = 215). Classification accuracy ranged between 25.9 and 50.4% (Naïve Bayes/Support Vector Machine). Recent (graph-) deep learning models classified well in all three tasks, but not significantly better than more traditional ML methods. Classifiers reliably identified clinically relevant features as most important for classification.</p><p><bold>Conclusion:</bold> The three clinical tasks yielded classification results that correlate with clinical intuition regarding the difficulty of diagnosis. It is favorable to apply an array of MVA/ML algorithms rather than a single one, to avoid under-estimation of classification accuracy. Base-ml provides systematic benchmarking of classifiers, with a standardized output of MVA/ML performance on clinical tasks. To reduce re-implementation effort, we provide base-ml as an open-source tool for the community.</p>