AUTHOR=Zhou Zhongchao , Lu Yuxi , Tortós Pablo Enrique , Qin Ruian , Kokubu Shota , Matsunaga Fuko , Xie Qiaolian , Yu Wenwei TITLE=Addressing data imbalance in Sim2Real: ImbalSim2Real scheme and its application in finger joint stiffness self-sensing for soft robot-assisted rehabilitation JOURNAL=Frontiers in Bioengineering and Biotechnology VOLUME=12 YEAR=2024 URL=https://www.frontiersin.org/journals/bioengineering-and-biotechnology/articles/10.3389/fbioe.2024.1334643 DOI=10.3389/fbioe.2024.1334643 ISSN=2296-4185 ABSTRACT=

The simulation-to-reality (sim2real) problem commonly arises when models trained in simulation are deployed in real-world scenarios, especially when real-world data are scarce and the dataset is therefore heavily imbalanced toward simulation data. Although the cycle-consistent generative adversarial network (CycleGAN) has shown promise in addressing some sim2real issues, it is limited under data imbalance by the lower capacity of the discriminator and the indeterminacy of the learned sim2real mapping. To overcome these problems, we propose the imbalanced Sim2Real scheme (ImbalSim2Real). Unlike CycleGAN, the ImbalSim2Real scheme splits the dataset into paired and unpaired data for two-fold training. For unpaired data, discriminator-enhanced samples are incorporated to further narrow the solution space of the discriminator and thereby strengthen it. For paired data, a targeted regression loss term is integrated to enforce a specific, quantitative mapping and further reduce the solution space of the generator. The ImbalSim2Real scheme was validated through numerical experiments, which demonstrated its superiority over conventional sim2real methods. In addition, as an application of the proposed scheme, we designed a finger joint stiffness self-sensing framework, in which the validation loss for estimating real-world finger joint stiffness was reduced by roughly 41% compared with a supervised learning method trained on scarce real-world data and by 56% relative to a CycleGAN trained on the imbalanced dataset. The proposed scheme and framework are potentially applicable to bio-signal estimation tasks that face an imbalanced sim2real problem.
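As a rough illustration of the two-fold training idea, the sketch below combines a CycleGAN-style objective on unpaired simulation samples with a regression term on the scarce paired samples. It is a minimal sketch under stated assumptions, not the authors' implementation: the least-squares adversarial form, the L1 cycle term, the weights `lambda_cyc` and `lambda_reg`, and all function names are hypothetical choices made for illustration, and the discriminator-enhanced samples mentioned in the abstract are omitted because the abstract does not specify how they are constructed.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Mapping = Callable[[Vector], List[float]]


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def mse(a: Vector, b: Vector) -> float:
    """Mean squared error between two equal-length vectors."""
    return mean([(x - y) ** 2 for x, y in zip(a, b)])


def l1(a: Vector, b: Vector) -> float:
    """Mean absolute error between two equal-length vectors."""
    return mean([abs(x - y) for x, y in zip(a, b)])


def generator_loss(
    g_sim2real: Mapping,                    # hypothetical sim -> real generator
    g_real2sim: Mapping,                    # hypothetical real -> sim generator
    d_real: Callable[[Vector], float],      # hypothetical discriminator score in [0, 1]
    unpaired_sim: Sequence[Vector],
    paired_sim: Sequence[Vector],
    paired_real: Sequence[Vector],
    lambda_cyc: float = 10.0,               # assumed weight, not from the paper
    lambda_reg: float = 1.0,                # assumed weight, not from the paper
) -> float:
    """Illustrative generator objective: adversarial + cycle + paired regression."""
    fake_real = [g_sim2real(x) for x in unpaired_sim]

    # Adversarial term (least-squares form, an assumption): the generator
    # is rewarded when the discriminator scores its outputs as real (1.0).
    adv = mean([(d_real(y) - 1.0) ** 2 for y in fake_real])

    # Cycle-consistency term on unpaired data: sim -> real -> sim should
    # reproduce the original simulation sample.
    cyc = mean([l1(g_real2sim(y), x) for x, y in zip(unpaired_sim, fake_real)])

    # Regression term on paired data: the translated simulation sample
    # should match its real-world counterpart quantitatively.
    reg = mean([mse(g_sim2real(x), y) for x, y in zip(paired_sim, paired_real)])

    return adv + lambda_cyc * cyc + lambda_reg * reg


# Toy usage: identity generators and a discriminator that always answers
# "real", so only the paired regression term contributes to the loss.
identity = lambda v: list(v)
always_real = lambda v: 1.0
loss = generator_loss(
    identity, identity, always_real,
    unpaired_sim=[[0.0, 1.0]],
    paired_sim=[[0.0, 0.0]],
    paired_real=[[1.0, 1.0]],
    lambda_reg=2.0,
)
print(loss)  # 2.0
```

In this toy setup the adversarial and cycle terms vanish, which makes the role of the paired term visible: without it, any mapping that fools the discriminator and is invertible would score equally well, which is one way to read the "indeterminacy of the learned sim2real mapping" described above.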