Sound Decoding from Auditory Nerve Activity
1 Technische Universität München, Electrical Engineering and Information Technology, Germany
2 University of São Paulo, Brazil
In the inner ear, sounds are converted into discrete action potentials and sent to the central nervous system. This transformation is non-linear and entails massive information loss. Nevertheless, we can still hear and analyze sounds with high fidelity, because the crucial features of sounds are preserved in the auditory nerve signals.
Here we present a method that decodes sounds from a large population of simulated auditory nerve fibers (ANFs). We also apply the procedure to reconstruct sounds from a model of an impaired cochlea, which allows us to mimic how hearing-impaired subjects perceive sounds.
The problem of reconstructing stimuli from neural activity has usually been approached with relatively small numbers of spikes, typically by optimizing a linear filter with the reverse-correlation technique (Bialek et al. 1991). Our approach instead leverages the responses of a large population of ANFs (close to the number present in the human ear) together with non-linear reconstruction by an artificial neural network (ANN).
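For context, the classical linear approach can be sketched in a few lines of numpy: a decoding filter that maps a window of spike counts to the stimulus is fitted by least squares, which for a white-noise stimulus is the reverse-correlation estimate. All signals and parameters below are illustrative toy values, not those of the study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stimulus and encoder: white-noise stimulus, Poisson spikes whose
# rate is the rectified stimulus (purely illustrative, not an ANF model).
n, win = 5000, 20
stimulus = rng.standard_normal(n)
spikes = rng.poisson(np.clip(stimulus, 0.0, None))

# Design matrix: each row holds the spike counts in a window starting at
# the decoded sample, so the decoder can use spikes at and after time t.
X = np.stack([spikes[t:t + win] for t in range(n - win)])
y = stimulus[:n - win]

# Optimal linear decoding filter in the least-squares sense.
h, *_ = np.linalg.lstsq(X, y, rcond=None)
reconstruction = X @ h
```

Even with an optimal filter, such a linear decoder is fundamentally limited by the non-linear spike generation, which motivates the non-linear ANN decoder used here.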
We used the biophysical auditory-periphery model of Zilany et al. (2009), which we adapted to replicate the human hearing range and thresholds. The ANN was a multi-layer perceptron (MLP) with a single hidden layer. Its input was a 10 ms sliding window of multiple spike trains across 10 different characteristic frequencies; its output was a single sample of the reconstructed signal. With this setup we trained and tested the MLP on sounds below 2 kHz. The approach failed for frequencies above 2 kHz, however, because spike trains lack phase information (phase locking) in that range. We therefore developed a two-stage algorithm to reconstruct high-frequency signals: first, MLPs convert the spike trains to a spectrogram; second, the spectrogram is transformed into an acoustic signal by an iterative method (Decorsière et al. 2011). For the spike-train-to-spectrogram conversion we trained 51 MLPs. The input to each MLP was a 5 ms sliding window of multiple spike trains, and its output was one of the 51 frequency channels of the spectrogram; the characteristic frequencies of the input fibers corresponded to the frequency of the generated output channel.
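The first stage can be sketched in numpy under toy assumptions: binned spike counts from a handful of fibers are windowed and regressed onto one spectrogram channel. A fixed random hidden layer with a closed-form least-squares readout stands in here for a fully backprop-trained MLP; the fiber count, window length, and signals are illustrative, not the study's values.

```python
import numpy as np

rng = np.random.default_rng(1)

def sliding_windows(spikes, win):
    """(n_fibers, T) spike counts -> (T - win + 1, n_fibers * win) windows."""
    n_fibers, T = spikes.shape
    return np.stack([spikes[:, t:t + win].ravel() for t in range(T - win + 1)])

# Toy data: fibers whose firing rate follows one spectrogram channel.
T, win, n_fibers = 2000, 5, 8
channel = np.abs(np.sin(np.linspace(0.0, 20.0, T)))      # target channel
spikes = rng.poisson(2.0 * channel, size=(n_fibers, T))  # Poisson "ANFs"

X = sliding_windows(spikes, win)
y = channel[win - 1:]

# Single hidden layer with fixed random weights; the linear readout is
# fitted in closed form (a cheap stand-in for training the MLP).
n_hidden = 16
W1 = 0.1 * rng.standard_normal((X.shape[1], n_hidden))
b1 = rng.standard_normal(n_hidden)
H = np.tanh(X @ W1 + b1)
H1 = np.column_stack([H, np.ones(len(H))])               # bias column
w_out, *_ = np.linalg.lstsq(H1, y, rcond=None)
channel_hat = H1 @ w_out                                 # decoded channel
```

In the actual system, one such network is trained per spectrogram channel (51 in total), each fed by fibers whose characteristic frequencies match that channel.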
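The second stage can be approximated by a Griffin–Lim-style iteration, which estimates a waveform whose short-time spectral magnitude matches a target spectrogram. The study used the optimization approach of Decorsière et al. (2011); the sketch below is a simpler, widely known stand-in with toy FFT parameters.

```python
import numpy as np

def stft(x, nfft=64, hop=16):
    """Complex STFT with a Hann window (toy parameters)."""
    w = np.hanning(nfft)
    starts = range(0, len(x) - nfft + 1, hop)
    return np.stack([np.fft.rfft(w * x[s:s + nfft]) for s in starts])

def istft(S, nfft=64, hop=16):
    """Overlap-add inverse of `stft` with window-power normalization."""
    w = np.hanning(nfft)
    n = (len(S) - 1) * hop + nfft
    x, norm = np.zeros(n), np.zeros(n)
    for i, spec in enumerate(S):
        x[i * hop:i * hop + nfft] += w * np.fft.irfft(spec, nfft)
        norm[i * hop:i * hop + nfft] += w ** 2
    norm[norm == 0.0] = 1.0
    return x / norm

def invert_spectrogram(mag, iters=50, nfft=64, hop=16):
    """Iteratively estimate a signal whose STFT magnitude matches `mag`."""
    rng = np.random.default_rng(0)
    S = mag * np.exp(2j * np.pi * rng.random(mag.shape))  # random phases
    for _ in range(iters):
        x = istft(S, nfft, hop)
        S = mag * np.exp(1j * np.angle(stft(x, nfft, hop)))
    return istft(S, nfft, hop)
```

Feeding the channel magnitudes decoded by the 51 MLPs into such an inversion yields the reconstructed acoustic waveform.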
The system was trained with pure tones and a few seconds of speech samples. After training, we were able to generate sound files from the spike trains of a large population of nerve fibers. The reconstructed speech was clearly intelligible and of good perceptual quality. In addition, we reconstructed sounds from an impaired cochlear model and demonstrated how perception is degraded by outer-hair-cell loss.
Our reconstruction method is a valuable tool for evaluating how well speech is encoded in different models of the auditory system. It can also be used to illustrate acoustically the effects of hearing loss.
References:
Bialek, W., Rieke, F., de Ruyter van Steveninck, R., and Warland, D. (1991). Reading a neural code. Science, 252(5014):1854-1857.
Zilany, M. S. A., Bruce, I. C., Nelson, P. C., and Carney, L. H. (2009). A phenomenological model of the synapse between the inner hair cell and auditory nerve: long-term adaptation with power-law dynamics. The Journal of the Acoustical Society of America, 126(5):2390-2412.
Decorsière, R., Søndergaard, P. L., Buchholz, J., and Dau, T. (2011). Modulation filtering using an optimization approach to spectrogram reconstruction. In Proceedings of Forum Acusticum.
Acknowledgements
This work was supported by the German Federal Ministry of Education and Research within the Munich Bernstein Center of Computational Neuroscience (reference number: 01GQ1004B).
Keywords:
auditory nerve fibers,
inner ear model,
sound coding,
sound decoding
Conference:
Bernstein Conference 2012, Munich, Germany, 12 Sep - 14 Sep, 2012.
Presentation Type:
Poster
Topic:
Neural encoding and decoding
Citation:
Rudnicki M, Zuffo MK and Hemmert W (2012). Sound Decoding from Auditory Nerve Activity.
Front. Comput. Neurosci.
Conference Abstract:
Bernstein Conference 2012.
doi: 10.3389/conf.fncom.2012.55.00092
Copyright:
The abstracts in this collection have not been subject to any Frontiers peer review or checks, and are not endorsed by Frontiers.
They are made available through the Frontiers publishing platform as a service to conference organizers and presenters.
The copyright in the individual abstracts is owned by the author of each abstract or his/her employer unless otherwise stated.
Each abstract, as well as the collection of abstracts, are published under a Creative Commons CC-BY 4.0 (attribution) licence (https://creativecommons.org/licenses/by/4.0/) and may thus be reproduced, translated, adapted and be the subject of derivative works provided the authors and Frontiers are attributed.
For Frontiers’ terms and conditions please see https://www.frontiersin.org/legal/terms-and-conditions.
Received:
11 May 2012;
Published Online:
12 Sep 2012.
* Correspondence:
Dr. Marek Rudnicki, Technische Universität München, Electrical Engineering and Information Technology, Garching, 85748, Germany, marek.rudnicki@tum.de