AUTHOR=Remazeilles Anthony , Dominguez Alfonso , Barralon Pierre , Torres-Pardo Adriana , Pinto David , Aller Felix , Mombaur Katja , Conti Roberto , Saccares Lorenzo , Thorsteinsson Freygardur , Prinsen Erik , Cantón Alberto , Castilla Javier , Sanz-Morère Clara B. , Tornero Jesús , Torricelli Diego TITLE=Making Bipedal Robot Experiments Reproducible and Comparable: The Eurobench Software Approach JOURNAL=Frontiers in Robotics and AI VOLUME=9 YEAR=2022 URL=https://www.frontiersin.org/journals/robotics-and-ai/articles/10.3389/frobt.2022.951663 DOI=10.3389/frobt.2022.951663 ISSN=2296-9144 ABSTRACT=

This study describes the software methodology designed for the systematic benchmarking of bipedal systems through the computation of performance indicators from data collected during an experimentation stage. Under the umbrella of the European project Eurobench, we collected approximately 30 protocols with related testbeds and scoring algorithms, aiming to characterize the performance of humanoids, exoskeletons, and/or prostheses under different conditions. The main challenge addressed in this study is the standardization of the scoring process to permit a systematic benchmark of the experiments. The complexity of this process is mainly due to the lack of consistency in how experimental data are stored and organized, how the inputs and outputs of benchmarking algorithms are defined, and how these algorithms are implemented. We propose a simple but efficient methodology for preparing scoring algorithms that ensures the reproducibility and replicability of results. This methodology mainly constrains the interface of the software and lets engineers develop their metrics in the language of their choice. Continuous integration and deployment tools are then used to verify the replicability of the software and to generate, through dockerization, an executable instance that is independent of the language. This article presents this methodology and points to all the metrics and documentation repositories designed with this policy in Eurobench. Applying this approach to other protocols and metrics would ease the reproduction, replication, and comparison of experiments.
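
To make the idea of a constrained software interface more concrete, the following is a minimal sketch of what a scoring program with a fixed calling convention could look like. It is not taken from the Eurobench repositories: the function names (`compute_indicator`, `run`, `main`), the CSV column `value`, the output file name `pi_mean_value.json`, and the JSON layout are all hypothetical choices made for illustration. The only point it tries to show is that the caller needs to know just two things, an input file and an output folder, whatever happens inside the metric.

```python
import argparse
import csv
import json
from pathlib import Path


def compute_indicator(input_csv: Path) -> float:
    """Hypothetical performance indicator: mean of the 'value' column."""
    with input_csv.open(newline="") as handle:
        values = [float(row["value"]) for row in csv.DictReader(handle)]
    if not values:
        raise ValueError(f"no samples found in {input_csv}")
    return sum(values) / len(values)


def run(input_csv: Path, output_dir: Path) -> Path:
    """Compute the indicator and store it in the output folder."""
    output_dir.mkdir(parents=True, exist_ok=True)
    score = compute_indicator(input_csv)
    # Assumed result layout: one small file per indicator.
    output_file = output_dir / "pi_mean_value.json"
    output_file.write_text(json.dumps({"type": "scalar", "value": score}))
    return output_file


def main(argv=None) -> int:
    """Fixed interface: <input_csv> <output_dir>, nothing else."""
    parser = argparse.ArgumentParser(
        description="Illustrative scoring program with a fixed interface."
    )
    parser.add_argument("input_csv", type=Path, help="recorded experiment data")
    parser.add_argument("output_dir", type=Path, help="folder for the scores")
    args = parser.parse_args(argv)
    print(run(args.input_csv, args.output_dir))
    return 0
```

Calling `main(["trial_01.csv", "results"])` (file names are placeholders) would write the score into `results/`. Because the contract is limited to the arguments and the output folder, the same call could in principle be wrapped in a container image and checked in a continuous integration job against a reference input and expected output, which is the kind of language-independent verification the abstract describes.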