Graph Convolutional Networks for Multi-modal Robotic Martial Arts Leg Pose Recognition

Yu, Yuexiao

doi:10.3389/fnbot.2024.1520983

ORIGINAL RESEARCH article

Front. Neurorobot.

Volume 18 - 2024 | doi: 10.3389/fnbot.2024.1520983

This article is part of the Research Topic Recent Advances in Image Fusion and Quality Improvement for Cyber-Physical Systems, Volume III

Provisionally accepted

Yuexiao Yu*

Hubei University of Science and Technology, Xianning, China

The final, formatted version of the article will be published soon.

Accurate recognition of martial arts leg poses is essential for applications in sports analytics, rehabilitation, and human-computer interaction. Traditional pose recognition models, which rely on sequential or convolutional approaches, often struggle to capture the complex spatial-temporal dependencies inherent in martial arts movements. These methods cannot effectively model the nuanced dynamics of joint interactions and temporal progression, which limits their generalization when recognizing complex actions. To address these challenges, we propose PoseGCN, a Graph Convolutional Network (GCN)-based model that integrates spatial, temporal, and contextual features in a single framework. PoseGCN combines three components: spatial-temporal graph encoding to capture joint motion dynamics, an action-specific attention mechanism that weights joints according to their relevance to the action context, and a self-supervised pretext task that improves temporal robustness and continuity. Experimental results on four benchmark datasets (Kinetics-700, Human3.6M, NTU RGB+D, and UTD-MHAD) demonstrate that PoseGCN outperforms existing models, achieving state-of-the-art accuracy and F1 scores. These findings highlight the model's capacity to generalize across diverse datasets and capture fine-grained pose details, showing its potential for complex pose recognition tasks. The proposed framework offers a robust solution for precise action recognition and paves the way for future developments in multi-modal pose analysis.
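To make the first two components named in the abstract more concrete, the following is a minimal sketch, not the article's implementation. It assumes a hypothetical 7-joint leg skeleton, uses the common symmetric normalization D^{-1/2}(A + I)D^{-1/2} for the adjacency matrix, replaces learned temporal convolution with a plain 3-frame average, and models attention as a softmax over per-joint scores. All function names, the joint layout, and the edge list are illustrative assumptions; the self-supervised pretext task is not sketched.

```python
import math

# Hypothetical 7-joint leg skeleton (an assumption, not taken from the article):
# 0 pelvis, 1 left hip, 2 left knee, 3 left ankle,
# 4 right hip, 5 right knee, 6 right ankle
EDGES = [(0, 1), (1, 2), (2, 3), (0, 4), (4, 5), (5, 6)]
NUM_JOINTS = 7


def normalized_adjacency(num_joints, edges):
    """Return D^{-1/2} (A + I) D^{-1/2} as a nested list."""
    # Start from the identity so every joint has a self-loop.
    a = [[1.0 if i == j else 0.0 for j in range(num_joints)]
         for i in range(num_joints)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0  # undirected bone between joints i and j
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_joints)]
            for i in range(num_joints)]


def spatial_aggregate(a_hat, frame):
    """Parameter-free graph convolution step: mix each joint with its neighbours."""
    dims = len(frame[0])
    return [[sum(a_hat[i][k] * frame[k][d] for k in range(len(frame)))
             for d in range(dims)]
            for i in range(len(frame))]


def temporal_smooth(frames):
    """Average each frame with its immediate neighbours in time (up to 3 frames)."""
    num_joints, dims = len(frames[0]), len(frames[0][0])
    out = []
    for t in range(len(frames)):
        window = frames[max(0, t - 1):t + 2]
        out.append([[sum(f[j][d] for f in window) / len(window)
                     for d in range(dims)]
                    for j in range(num_joints)])
    return out


def joint_attention(scores):
    """Softmax over per-joint relevance scores (numerically stabilized)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Usage: two frames of made-up 2D joint coordinates.
a_hat = normalized_adjacency(NUM_JOINTS, EDGES)
frames = [[[float(j), 0.0] for j in range(NUM_JOINTS)],
          [[float(j), 1.0] for j in range(NUM_JOINTS)]]
encoded = temporal_smooth([spatial_aggregate(a_hat, f) for f in frames])
weights = joint_attention([0.0] * NUM_JOINTS)  # equal scores give uniform weights
```

In a trained model the aggregation would be followed by a learned weight matrix and nonlinearity, and the attention scores would be predicted from the action context rather than supplied by hand.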

Keywords: martial arts pose recognition, spatial-temporal graph encoding, graph convolutional networks, action-specific attention, self-supervised learning

Received: 01 Nov 2024; Accepted: 19 Nov 2024.

Copyright: © 2024 Yu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Yuexiao Yu, Hubei University of Science and Technology, Xianning, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
