AUTHOR=Ikeuchi Katsushi, Takamatsu Jun, Sasabuchi Kazuhiro, Wake Naoki, Kanehira Atsushi
TITLE=Applying learning-from-observation to household service robots: three task common-sense formulations
JOURNAL=Frontiers in Computer Science
VOLUME=Volume 6 - 2024
YEAR=2024
URL=https://www.frontiersin.org/journals/computer-science/articles/10.3389/fcomp.2024.1235239
DOI=10.3389/fcomp.2024.1235239
ISSN=2624-9898
ABSTRACT=Utilizing a robot in a new application requires the robot to be programmed each time. To reduce this programming effort, we have been developing "learning-from-observation (LfO)," a system that automatically generates robot programs by observing human demonstrations. Our previous research has been in the industrial domain; we now aim to expand the application field to the household-service domain. One of the main issues with introducing the LfO system into this domain is the cluttered environment, which makes it difficult to discern, when observing demonstrations, which movements of the human body parts, and which of their relationships with objects in the environment, are crucial for task execution. To overcome this issue, the system must share task common-sense with the human demonstrator so that it can focus on the demonstrator's relevant movements. Here, task common-sense is defined as the movements humans make almost unconsciously to streamline or optimize the execution of a series of tasks. In this paper, we extract and define three types of task common-sense (semi-conscious movements) that should be focused on when observing demonstrations of household tasks, and we propose representations to describe them. Specifically, the paper proposes Labanotation to describe whole-body movements with respect to the environment, contact-webs to describe hand-finger movements with respect to the tool being grasped, and physical and semantic constraints to describe the movements of the hand holding the tool with respect to the environment.
Based on these representations, the paper formulates task models, i.e., machine-independent robot programs that indicate what to do and where to do it. In this design process, the necessary and sufficient set of task models to be prepared in the task-model library is determined according to the following criteria: for grasping tasks, by the classification of contact-webs according to the purpose of the grasp; for manipulation tasks, by the possible transitions between states defined by either physical or semantic constraints. A skill-agent library is also prepared, collecting the skill agents that correspond to the task models. The paper explains the task encoder and task decoder used to execute the task models on robot hardware and shows how the system works through several example scenes.
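The abstract's pipeline (task models carrying what-to-do and where-to-do, decoded by agents drawn from a skill-agent library) can be sketched minimally as follows. This is an illustrative sketch only, not the authors' implementation: the task names (`power-grasp`, `translate`), parameter keys, and the `decode` function are all hypothetical stand-ins for the paper's machinery.

```python
from dataclasses import dataclass, field

# Hypothetical machine-independent task model: "what-to-do" names the
# task type (e.g., a grasp class from contact-web classification, or a
# constraint-state transition); "where-to-do" holds the scene parameters
# observed from the demonstration.
@dataclass
class TaskModel:
    what_to_do: str
    where_to_do: dict = field(default_factory=dict)

# Skill agents (plain functions here) produce robot-specific behavior;
# a real agent would drive the hardware rather than return a string.
def power_grasp(params):
    return f"power-grasp {params.get('object', '?')}"

def translate(params):
    return f"move {params.get('object', '?')} to {params.get('goal', '?')}"

# Skill-agent library: one agent per task type in the task-model library.
SKILL_AGENT_LIBRARY = {
    "power-grasp": power_grasp,
    "translate": translate,
}

def decode(task_sequence):
    """Task decoder: dispatch each task model to its skill agent."""
    return [SKILL_AGENT_LIBRARY[t.what_to_do](t.where_to_do)
            for t in task_sequence]

# A two-step sequence a task encoder might emit from a demonstration.
tasks = [
    TaskModel("power-grasp", {"object": "cup"}),
    TaskModel("translate", {"object": "cup", "goal": "sink"}),
]
print(decode(tasks))  # → ['power-grasp cup', 'move cup to sink']
```

The point of the separation is the one the abstract makes: the task-model sequence is hardware-independent, while only the skill-agent library changes per robot.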