Event Abstract

A spiking neural network model of memory-based reinforcement learning

  • 1 Okinawa Institute of Science and Technology, Japan

Reinforcement learning frameworks have been widely used to model animal decision making in computational neuroscience. To elucidate biologically plausible implementations of reinforcement learning algorithms, several spiking neural network models have been proposed. However, most of these models cannot handle high-dimensional observations or exploit past observations, even though both capabilities are essential for learning in real environments. In this work, we propose a spiking neural network model of memory-based reinforcement learning that can solve partially observable Markov decision processes (POMDPs) with high-dimensional observations (see Figure).
The proposed model was inspired by the reinforcement learning framework of [1], referred to here as free-energy-based reinforcement learning (FERL). FERL has many desirable characteristics: the ability to handle high-dimensional observations and to form goal-directed internal representations; population coding of action values; and a Hebbian learning rule modulated by reward prediction errors [2]. While the original FERL was implemented as a restricted Boltzmann machine (RBM), we devised two extensions: replacing the binary stochastic nodes of the RBM with leaky integrate-and-fire neurons, and incorporating a working-memory architecture that implicitly retains temporal information about past observations.
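As a rough sketch of the FERL idea described above (not the exact model of this abstract), the action value Q(s, a) can be read out as the negative free energy of an RBM with the state–action pair clamped on the visible layer, and the weights can be updated by a Hebbian rule gated by the temporal-difference error. All dimensions, constants, and the random initialization below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions: binary state vector, one-hot action, hidden layer.
n_state, n_action, n_hidden = 8, 4, 16
W_s = rng.normal(0, 0.1, (n_state, n_hidden))   # state-to-hidden weights
W_a = rng.normal(0, 0.1, (n_action, n_hidden))  # action-to-hidden weights
c = np.zeros(n_hidden)                          # hidden biases

def negative_free_energy(s, a):
    """Q(s, a) estimate: -F(s, a) = sum_j log(1 + exp(c_j + s.W_s[:,j] + a.W_a[:,j]))
    (visible biases omitted for brevity)."""
    h_input = c + s @ W_s + a @ W_a
    return np.sum(np.log1p(np.exp(h_input)))

def q_values(s):
    # One action value per one-hot action, each a negative free energy.
    return np.array([negative_free_energy(s, np.eye(n_action)[i])
                     for i in range(n_action)])

# One TD(0)-style update with an assumed reward and next observation.
s = rng.integers(0, 2, n_state).astype(float)
a_idx = int(np.argmax(q_values(s)))             # greedy action
a = np.eye(n_action)[a_idx]
r = 1.0
s_next = rng.integers(0, 2, n_state).astype(float)

delta = r + 0.95 * q_values(s_next).max() - negative_free_energy(s, a)
h = 1.0 / (1.0 + np.exp(-(c + s @ W_s + a @ W_a)))  # hidden firing probabilities
W_s += 0.01 * delta * np.outer(s, h)  # Hebbian (pre x post), modulated by TD error
W_a += 0.01 * delta * np.outer(a, h)
```

The update is local: each weight change depends only on the activities of the two units it connects, scaled by a globally broadcast prediction-error signal, which is what makes the rule biologically appealing.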
Our model solved reinforcement learning tasks with high-dimensional, uncertain observations without prior knowledge of the environment, and all desirable characteristics of the FERL framework were preserved in this extension. The negative free energy properly encoded the action values, and the free energy estimated by the spiking neural network was highly correlated with that estimated by the original RBM. Finally, after reward-based learning, the activation patterns of the hidden neurons reflected the latent categories behind the high-dimensional observations in goal-oriented and action-dependent ways.
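The spiking substrate that replaces the RBM's binary stochastic nodes can be illustrated with a minimal leaky integrate-and-fire neuron. The abstract does not specify the model's constants, so all parameter values below are assumptions chosen only to produce repetitive firing:

```python
import numpy as np

def simulate_lif(input_current, dt=1e-3, tau=0.02, v_rest=-0.07,
                 v_thresh=-0.05, v_reset=-0.07, R=1e7):
    """Forward-Euler simulation of a leaky integrate-and-fire neuron.

    input_current: array of input currents (A), one per time step of size dt (s).
    Returns the list of spike times (s).
    """
    v = v_rest
    spikes = []
    for step, current in enumerate(input_current):
        # Membrane dynamics: leak toward rest plus driven depolarization.
        v += (-(v - v_rest) + R * current) * dt / tau
        if v >= v_thresh:
            spikes.append(step * dt)
            v = v_reset  # reset after a spike
    return spikes

# 200 ms of constant 3 nA input drives the neuron above threshold repeatedly.
spikes = simulate_lif(np.full(200, 3e-9))
print(len(spikes), "spikes in 200 ms")
```

In a rate-based reading, the firing probability of the RBM's hidden units maps onto the firing rate of such neurons, which is how the spiking network can approximate the free energy computed by the original RBM.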

1. Sallans, B. and Hinton, G. E. Using free energies to represent Q-values in a multiagent reinforcement learning task. Advances in Neural Information Processing Systems 13, 2001.
2. Otsuka, M., Yoshimoto, J. and Doya, K. Robust population coding in free-energy-based reinforcement learning. International Conference on Artificial Neural Networks (ICANN) 2008, Part I: 377–386.

Part of this study is the result of "Bioinformatics for brain sciences," carried out under the Strategic Research Program for Brain Sciences by the Ministry of Education, Culture, Sports, Science and Technology (MEXT) of Japan.
This work is also supported by the Strategic Programs for Innovative Research (SPIRE), MEXT, Japan.

Keywords: computational neuroscience, neural networks, spiking neural networks, reinforcement learning, decision making

Conference: 5th INCF Congress of Neuroinformatics, Munich, Germany, 10 Sep - 12 Sep, 2012.

Presentation Type: Poster

Topic: Neuroinformatics

Citation: Nakano T, Otsuka M, Yoshimoto J and Doya K (2014). A spiking neural network model of memory-based reinforcement learning. Front. Neuroinform. Conference Abstract: 5th INCF Congress of Neuroinformatics. doi: 10.3389/conf.fninf.2014.08.00040

Copyright: The abstracts in this collection have not been subject to any Frontiers peer review or checks, and are not endorsed by Frontiers. They are made available through the Frontiers publishing platform as a service to conference organizers and presenters.

The copyright in the individual abstracts is owned by the author of each abstract or his/her employer unless otherwise stated.

Each abstract, as well as the collection of abstracts, are published under a Creative Commons CC-BY 4.0 (attribution) licence (https://creativecommons.org/licenses/by/4.0/) and may thus be reproduced, translated, adapted and be the subject of derivative works provided the authors and Frontiers are attributed.

For Frontiers’ terms and conditions please see https://www.frontiersin.org/legal/terms-and-conditions.

Received: 21 Mar 2013; Published Online: 27 Feb 2014.

* Correspondence: Dr. Takashi Nakano, Okinawa Institute of Science and Technology, Okinawa, Japan, nakano@oist.jp