AUTHOR=Zehfroosh Ashkan, Tanner Herbert G.

TITLE=A Hybrid PAC Reinforcement Learning Algorithm for Human-Robot Interaction

JOURNAL=Frontiers in Robotics and AI

VOLUME=Volume 9 - 2022

YEAR=2022

URL=https://www.frontiersin.org/journals/robotics-and-ai/articles/10.3389/frobt.2022.797213

DOI=10.3389/frobt.2022.797213

ISSN=2296-9144

ABSTRACT=<p>This paper offers a new hybrid probably approximately correct (<sc>PAC</sc>) reinforcement learning (<sc>RL</sc>) algorithm for Markov decision processes (<sc>MDP</sc>s) that retains favorable features of both model-based and model-free methodologies. The designed algorithm, referred to as the Dyna-Delayed Q-learning (<sc>DDQ</sc>) algorithm, combines the model-free Delayed Q-learning and model-based R-max algorithms while outperforming both in most cases. The paper includes a <sc>PAC</sc> analysis of the <sc>DDQ</sc> algorithm and a derivation of its sample complexity. Numerical results support the claims about the new algorithm’s sample efficiency relative to its parents and to the best known <sc>PAC</sc> model-free and model-based algorithms in application. A real-world experimental implementation of <sc>DDQ</sc> in the context of pediatric motor rehabilitation facilitated by infant-robot interaction highlights the potential benefits of the reported method.</p>