ORIGINAL RESEARCH article

Front. Comput. Neurosci.
Volume 18 - 2024 | doi: 10.3389/fncom.2024.1464603
This article is part of the Research Topic Bridging Computation, Biophysics, Medicine, and Engineering in Neural Circuits

Dynamical predictive coding with reservoir computing performs noise-robust multi-sensory speech recognition

Provisionally accepted
  • Future University Hakodate, Hakodate, Japan

The final, formatted version of the article will be published soon.

    Multi-sensory integration is a perceptual process through which the brain synthesizes a unified perception by integrating inputs from multiple sensory modalities. A key issue is understanding how the brain performs multi-sensory integration using a common neural basis in the cortex. A cortical model based on reservoir computing has been proposed to elucidate the role of recurrent connectivity among cortical neurons in this process. Reservoir computing is well-suited for time series processing, such as speech recognition. This inquiry focuses on extending a reservoir computing-based cortical model to encompass multi-sensory integration within the cortex. This research introduces a dynamical model of multi-sensory speech recognition, leveraging predictive coding combined with reservoir computing. Predictive coding offers a framework for the hierarchical structure of the cortex. The model integrates reliability weighting, derived from the computational theory of multi-sensory integration, to adapt to multi-sensory time series processing. The model addresses a multi-sensory speech recognition task, which requires handling complex time series. We observed that the reservoir effectively recognizes speech by extracting time-contextual information and weighting sensory inputs according to sensory noise. These findings indicate that the dynamic properties of recurrent networks are applicable to multi-sensory time series processing, positioning reservoir computing as a suitable model for multi-sensory integration.
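
The abstract refers to reliability weighting, that is, weighting sensory inputs according to sensory noise. As a rough illustration only, the sketch below shows the textbook static form of this idea, in which each cue is weighted in proportion to the inverse of its noise variance. It is not the authors' dynamical model: the function names, the two-cue audio/visual setup, and the numeric values are all hypothetical and chosen for illustration.

```python
def reliability_weights(noise_variances):
    """Return weights proportional to inverse noise variance (they sum to 1)."""
    reliabilities = [1.0 / v for v in noise_variances]
    total = sum(reliabilities)
    return [r / total for r in reliabilities]


def integrate_cues(estimates, noise_variances):
    """Combine per-modality estimates using reliability weights.

    Returns the combined estimate and the variance of that estimate,
    assuming independent Gaussian noise on each cue (an assumption of
    this sketch, not something stated in the abstract).
    """
    weights = reliability_weights(noise_variances)
    combined = sum(w * x for w, x in zip(weights, estimates))
    combined_variance = 1.0 / sum(1.0 / v for v in noise_variances)
    return combined, combined_variance


# Hypothetical values: the audio cue is noisier than the visual cue.
audio_estimate, visual_estimate = 1.0, 3.0
audio_variance, visual_variance = 4.0, 1.0

weights = reliability_weights([audio_variance, visual_variance])
combined, combined_variance = integrate_cues(
    [audio_estimate, visual_estimate],
    [audio_variance, visual_variance],
)
print(weights, combined, combined_variance)
```

In this example the noisier audio cue receives the smaller weight, so the combined estimate lies closer to the visual estimate, and the combined variance is lower than that of either cue alone.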

    Keywords: multi-sensory integration, predictive coding, reservoir computing, speech recognition, nonlinear dynamics

    Received: 14 Jul 2024; Accepted: 05 Sep 2024.

    Copyright: © 2024 Yonemura and Katori. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Yuichi Katori, Future University Hakodate, Hakodate, Japan

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.