AUTHOR=Utimula Keishu, Hayaschi Ken-taro, Bihl Trevor J., Hongo Kenta, Maezono Ryo

TITLE=Using reinforcement learning to autonomously identify sources of error for agents in group missions

JOURNAL=Frontiers in Control Engineering

VOLUME=Volume 5 - 2024

YEAR=2024

URL=https://www.frontiersin.org/journals/control-engineering/articles/10.3389/fcteg.2024.1402621

DOI=10.3389/fcteg.2024.1402621

ISSN=2673-6268

ABSTRACT=When agents are deployed to execute a mission through collective behavior, accidental malfunctions commonly occur in some of them. It is challenging to determine whether such a malfunction is due to an actuator failure or a sensor issue based solely on interactions with the affected agent. Humans, however, know how to tell the two apart by inducing a group behavior in which other agents collide with the suspected malfunctioning agent: by monitoring the presence or absence of a positional change, we can identify whether it is the actuator (position changed) or the sensor (position unchanged) that is broken. We have developed an artificial intelligence that can autonomously deploy such "information acquisition strategies through collective behavior" using machine learning. In such problems, the goal is to plan collective actions that produce differences between the hypotheses for the state (e.g., actuator or sensor). Only a few of the possible collective behavior patterns lead to distinguishing between hypotheses. The evaluation function used to maximize the difference between hypotheses is therefore sparse, with flat values across most of the domain. Gradient-based optimization methods are ineffective for this, and reinforcement learning becomes a viable alternative. Applied to this maximization problem, our reinforcement learning surprisingly found the optimal solution, which resulted in collective actions that involve collisions to differentiate the causes. Subsequent collective behaviors, reflecting this situation awareness, appeared to involve other agents assisting the malfunctioning agent.