AUTHOR=Akella Ashlesha, Lin Chin-Teng TITLE=Time and Action Co-Training in Reinforcement Learning Agents JOURNAL=Frontiers in Control Engineering VOLUME=2 YEAR=2021 URL=https://www.frontiersin.org/journals/control-engineering/articles/10.3389/fcteg.2021.722092 DOI=10.3389/fcteg.2021.722092 ISSN=2673-6268 ABSTRACT=

In formation control, a robot (or an agent) learns to position itself in a particular spatial arrangement. In some scenarios, however, it is also vital to learn temporal alignment along with spatial alignment. An effective control system encompasses flexibility, precision, and timeliness. Existing reinforcement learning algorithms excel at learning to select an action given a state; however, executing the optimal action at the appropriate time remains challenging. Building a reinforcement learning agent that can learn an optimal time to act, along with an optimal action, can address this challenge. Neural networks in which timing relies on dynamic changes in the activity of a population of neurons have been shown to be a more effective representation of time. In this work, we trained a reinforcement learning agent to create its own representation of time using a neural network with a population of recurrently connected nonlinear firing-rate neurons. Trained with a reward-based recursive least squares algorithm, the agent learned to produce a neural trajectory that peaks at the “time-to-act”; thus, it learns “when” to act. Some control applications also require the agent to temporally scale its actions, so we additionally trained the agent to temporally scale its action for different speed inputs. Furthermore, given one state, the agent could learn to plan multiple future actions, that is, multiple times to act, without needing to observe a new state.
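
A minimal sketch may help make the timing mechanism concrete. The code below is not the authors' implementation: it trains the linear readout of a randomly connected nonlinear firing-rate RNN with standard recursive least squares in the FORCE style (the paper uses a reward-based RLS variant, and may also adapt recurrent weights), so that the readout traces a trajectory peaking at a chosen "time-to-act". All names and hyperparameters (N, g, tau, t_act, the Gaussian target) are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

N, dt, tau, g = 200, 1e-3, 10e-3, 1.5        # neurons, step (s), time constant (s), gain
T = int(1.0 / dt)                            # 1-second trial
t_act = 0.7                                  # assumed "time-to-act" (seconds)

W = g * rng.standard_normal((N, N)) / np.sqrt(N)   # fixed random recurrent weights
w_fb = rng.standard_normal(N)                      # feedback weights from the readout
w_out = np.zeros(N)                                # readout weights, trained by RLS
P = np.eye(N)                                      # RLS inverse-correlation matrix
x0 = 0.1 * rng.standard_normal(N)                  # fixed initial state for every trial

# Target trajectory: a bump peaking at t_act, so the trained
# readout "peaks at the time-to-act".
ts = np.arange(T) * dt
target = np.exp(-((ts - t_act) ** 2) / (2 * 0.05 ** 2))

for trial in range(30):
    x = x0.copy()
    for k in range(T):
        r = np.tanh(x)                        # nonlinear firing rates
        z = w_out @ r                         # scalar readout
        x += dt / tau * (-x + W @ r + w_fb * z)  # recurrent dynamics with feedback

        # RLS update pulling the readout toward the target bump.
        Pr = P @ r
        k_gain = Pr / (1.0 + r @ Pr)
        P -= np.outer(k_gain, Pr)
        w_out += (target[k] - z) * k_gain

# After training, z(t) peaks near t_act: the network has learned "when" to act.

Under the same assumptions, temporal scaling for different speed inputs could be sketched by feeding a speed signal as an additional input and stretching or compressing the target bump accordingly, so that one network covers a family of time-to-act values.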