AUTHOR=Barros Pablo, Bloem Anne C., Hootsmans Inge M., Opheij Lena M., Toebosch Romain H. A., Barakova Emilia, Sciutti Alessandra TITLE=You Were Always on My Mind: Introducing Chef’s Hat and COPPER for Personalized Reinforcement Learning JOURNAL=Frontiers in Robotics and AI VOLUME=8 YEAR=2021 URL=https://www.frontiersin.org/journals/robotics-and-ai/articles/10.3389/frobt.2021.669990 DOI=10.3389/frobt.2021.669990 ISSN=2296-9144 ABSTRACT=

Reinforcement learning simulation environments provide an important experimental test bed and facilitate data collection for developing AI-based robot applications. Most of them, however, focus on single-agent tasks, which limits their application to the development of social agents. This study proposes the Chef’s Hat simulation environment, which implements a multi-agent competitive card game that is a complete reproduction of the homonymous board game, designed to provoke competitive strategies and emotional responses in humans. The game was shown to be ideal for developing personalized reinforcement learning in an online, closed-loop learning scenario, as its state representation is extremely dynamic and directly related to each opponent’s actions. To adapt current reinforcement learning agents to this scenario, we also developed the COmPetitive Prioritized Experience Replay (COPPER) algorithm. With the help of COPPER and the Chef’s Hat simulation environment, we evaluated the following: (1) 12 experimental learning agents, trained via four different regimens (self-play, play against a naive baseline, PER, or COPPER) with three algorithms based on different state-of-the-art learning paradigms (PPO, DQN, and ACER), and two “dummy” baseline agents that take random actions; (2) the performance difference between COPPER and PER agents trained using the PPO algorithm and playing against different agents (PPO, DQN, and ACER) or all DQN agents; and (3) human performance when playing against two different collections of agents. Our experiments demonstrate that COPPER helps agents learn to adapt to different types of opponents, improving performance compared to offline learning models.
An additional contribution of the study is the formalization of the Chef’s Hat competitive game and the implementation of the Chef’s Hat Player Club, a collection of trained and assessed agents that enables the embedding of human competitive strategies in social, continual, and competitive reinforcement learning.