AUTHOR=Abdi Amir H., Sagl Benedikt, Srungarapu Venkata P., Stavness Ian, Prisman Eitan, Abolmaesumi Purang, Fels Sidney TITLE=Characterizing Motor Control of Mastication With Soft Actor-Critic JOURNAL=Frontiers in Human Neuroscience VOLUME=14 YEAR=2020 URL=https://www.frontiersin.org/journals/human-neuroscience/articles/10.3389/fnhum.2020.00188 DOI=10.3389/fnhum.2020.00188 ISSN=1662-5161 ABSTRACT=

The human masticatory system is a complex functional unit comprising a multitude of skeletal components, muscles, soft tissues, and teeth. Muscle activation dynamics cannot be directly measured in live human subjects due to ethical, safety, and accessibility limitations. Therefore, estimation of muscle activations and their resultant forces is a longstanding and active area of research. Reinforcement learning (RL) is an adaptive learning strategy, inspired by behavioral psychology, that enables an agent to learn the dynamics of an unknown system via policy-driven exploration. The RL framework is a well-formulated closed-loop system in which high-capacity neural networks are trained with a reward feedback mechanism to learn relatively complex actuation patterns. In this work, we build on a deep RL algorithm, known as Soft Actor-Critic, to learn the inverse dynamics of a simulated masticatory system, i.e., to learn the activation patterns that drive the jaw to a desired location. The outcome of the proposed training procedure is a parametric neural model which acts as the brain of the biomechanical system. We demonstrate the model's ability to navigate the feasible three-dimensional (3D) envelope of motion with sub-millimeter accuracy. We also introduce a performance analysis platform consisting of a set of quantitative metrics to assess the functionality of a given simulated masticatory system. This platform assesses the range of motion, metabolic efficiency, agility of motion, symmetry of activations, and accuracy of reaching the desired target positions. We demonstrate how the model learns more metabolically efficient policies by integrating a force regularization term into the RL reward. We also demonstrate the inverse correlation between the metabolic efficiency of the models and their agility and range of motion. The presented masticatory model and the proposed RL training mechanism are valuable tools for the analysis of mastication and other biomechanical systems. We see this framework's potential in facilitating the functional analysis aspects of surgical treatment planning and in predicting rehabilitation performance in post-operative subjects.
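As a rough illustration of the reward shaping described in the abstract, the minimal Python sketch below combines a target-tracking term with a force (activation) regularization penalty that discourages metabolically costly policies. The function name, the quadratic effort proxy, and the penalty weight are assumptions made for illustration; they are not the paper's exact formulation.

```python
import numpy as np

def reaching_reward(jaw_pos, target_pos, activations, force_weight=0.01):
    """Illustrative RL reward for target reaching with force regularization.

    jaw_pos, target_pos : 3D positions of the jaw incisor point and its target.
    activations         : vector of muscle activation levels in [0, 1].
    force_weight        : assumed regularization weight (not the paper's value).
    """
    distance = np.linalg.norm(jaw_pos - target_pos)   # tracking error
    effort = np.sum(np.square(activations))           # proxy for metabolic cost
    return -distance - force_weight * effort
```

Under this kind of shaping, increasing the regularization weight would be expected to trade reaching agility and range of motion for lower activation effort, consistent with the inverse correlation reported in the abstract.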