AUTHOR=Buhl Mareike, Akin Gülce, Saak Samira, Eysholdt Ulrich, Radeloff Andreas, Kollmeier Birger, Hildebrandt Andrea

TITLE=Expert validation of prediction models for a clinical decision-support system in audiology

JOURNAL=Frontiers in Neurology

VOLUME=Volume 13 - 2022

YEAR=2022

URL=https://www.frontiersin.org/journals/neurology/articles/10.3389/fneur.2022.960012

DOI=10.3389/fneur.2022.960012

ISSN=1664-2295

ABSTRACT=To support clinical decision-making in audiology, Common Audiological Functional Parameters (CAFPAs) were suggested as an interpretable intermediate representation of audiological information taken from various diagnostic sources within a clinical decision-support system (CDSS). Ten different CAFPAs were proposed to represent specific functional aspects of the human auditory system, namely hearing threshold, supra-threshold deficits, binaural hearing, neural processing, cognitive abilities, and a socio-economic component. CAFPAs were established as a viable basis for deriving audiological findings and treatment recommendations. It has also been demonstrated that model-predicted CAFPAs, obtained from machine learning models trained on expert-labeled patient cases, are sufficiently accurate to be included in a CDSS, but that further validation by experts is required.

The present study aimed to validate model-predicted CAFPAs based on previously unlabeled cases from the same data set. We ask to what extent domain experts agree with the model-predicted CAFPAs and whether potential disagreement can be understood in terms of patient characteristics. To this end, an expert survey was designed and administered to two highly experienced audiology specialists. They were asked to evaluate model-predicted CAFPAs and to estimate audiological findings, given audiological information about the patients that was presented to them simultaneously.

Results revealed strong relative agreement between the two experts and, importantly, between the experts and the model predictions for all CAFPAs except those related to neural processing and binaural hearing. However, the experts tended to score CAFPAs over a wider range of values and, on average across patients, with lower scores than the machine learning models. For the hearing threshold-associated CAFPA at frequencies below 0.75 kHz and for the cognitive CAFPA, not only the relative agreement but also the absolute agreement between machine and experts was very high. For those CAFPAs with an average difference between the model- and expert-estimated values, patient characteristics were predictive of the disagreement.

The findings are discussed in terms of how they can help to further improve model-predicted CAFPAs for incorporation into a CDSS for audiology.