AUTHOR=Akella Ashlesha, Lin Chin-Teng TITLE=Time and Action Co-Training in Reinforcement Learning Agents JOURNAL=Frontiers in Control Engineering VOLUME=Volume 2 - 2021 YEAR=2021 URL=https://www.frontiersin.org/journals/control-engineering/articles/10.3389/fcteg.2021.722092 DOI=10.3389/fcteg.2021.722092 ISSN=2673-6268 ABSTRACT=In formation control, a robot (or an agent) learns to align itself in a particular spatial arrangement. However, in some scenarios it is also vital to learn temporal alignment along with spatial alignment. An effective control system encompasses flexibility, precision, and timeliness. Existing reinforcement learning algorithms excel at learning to select an action given a state. However, executing an optimal action at an appropriate time remains challenging. Building a reinforcement learning agent that can learn an optimal time to act along with an optimal action can address this challenge. Neural networks in which timing relies on dynamic changes in the activity of a population of neurons have been shown to be a more effective representation of time. In this work, we trained a reinforcement learning agent to create its own representation of time using a neural network with a population of recurrently connected non-linear firing-rate neurons. Trained with a reward-based recursive least-squares algorithm, the agent learned to produce a neural trajectory that peaks at the ``time-to-act"; thus, it learns ``when" to act. A few control system applications also require the agent to temporally scale its actions. We trained the agent to temporally scale its action for different speed inputs. Further, given one state, the agent could learn to plan multiple future actions, i.e., multiple times to act, without needing to observe a new state.
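
The abstract describes a population of recurrently connected nonlinear firing-rate neurons whose trajectory peaks at the "time-to-act". The following is a minimal, hypothetical sketch of that setup, not the authors' code: it simulates standard firing-rate dynamics (tau * dx/dt = -x + W tanh(x)) with a random recurrent matrix and a random fixed linear readout (all sizes, gains, and the readout itself are assumptions), and takes the peak of the readout trajectory as the moment to act. The paper's reward-based recursive least-squares training, which would shape this trajectory so the peak lands at the desired time, is not implemented here.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100            # number of rate neurons (assumed)
dt, tau = 0.01, 0.1  # integration step and neuron time constant (assumed)
T = 200            # number of simulation steps
g = 1.5            # recurrent gain; g > 1 gives rich spontaneous dynamics

# Random recurrent weights, scaled so the spectral radius is roughly g
W = g * rng.standard_normal((N, N)) / np.sqrt(N)

x = 0.1 * rng.standard_normal(N)               # neuron activations
readout = rng.standard_normal(N) / np.sqrt(N)  # fixed linear readout (assumed)
z = np.empty(T)                                # scalar neural trajectory

for t in range(T):
    r = np.tanh(x)                    # nonlinear firing rates
    x = x + (dt / tau) * (-x + W @ r) # Euler step of the rate dynamics
    z[t] = readout @ r                # read out the population trajectory

# The agent would act when the trajectory peaks ("when" to act)
time_to_act = int(np.argmax(z))
```

In the paper's scheme, the reward signal and recursive least squares would adjust the network so that this peak occurs at the learned optimal time; temporal scaling for different speed inputs would correspond to stretching or compressing the same trajectory.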