ORIGINAL RESEARCH article

Front. Neurorobot.
Volume 18 - 2024 | doi: 10.3389/fnbot.2024.1443177
This article is part of the Research Topic: Towards a Novel Paradigm in Brain-Inspired Computer Vision

TL-CStrans Net: A vision robot for table tennis player action recognition driven via CS-Transformer

Provisionally accepted
Ma Libo 1, Tong Yan 2*
  • 1 Guangdong Polytechnic of Environmental Protection Engineering, Foshan 528216, Guangdong, China
  • 2 Hunan Labor And Human Resources Vocational College, Changsha 410100, Hunan, China

The final, formatted version of the article will be published soon.

    Robotics technology is increasingly used in sports training and competition. Existing methods for player action recognition rely mainly on image or video data and make little use of textual information. To address this gap, we propose TL-CStrans Net, a vision robot for table tennis player action recognition driven by a CS-Transformer. It is a multimodal approach that combines the CS-Transformer, CLIP, and transfer learning to integrate visual and textual information. First, we employ the CS-Transformer as the neural computing backbone; it processes the visual information extracted from table tennis game scenes, enabling accurate stroke recognition. Second, we introduce the CLIP model, which combines computer vision and natural language processing. CLIP jointly learns representations of images and text, thereby aligning the visual and textual modalities. Finally, to reduce training and computational requirements, we apply transfer learning: pre-trained CS-Transformer and CLIP models, which have already acquired knowledge from related domains, are adapted to the table tennis stroke recognition task. Experimental results show that TL-CStrans Net performs strongly on table tennis stroke recognition. This work promotes the application of multimodal robotics technology in sports and helps bridge neural computing, computer vision, and neuroscience.
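
    The image-text alignment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it only shows the general CLIP-style idea of scoring one image embedding against several text-prompt embeddings by cosine similarity and turning the scores into probabilities with a softmax. The stroke labels, the three-dimensional toy embeddings, the function names, and the temperature value are all assumptions made for illustration; in a real system the embeddings would come from pre-trained image and text encoders.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_stroke(image_embedding, text_embeddings, temperature=0.07):
    """Return a probability per label from image-text cosine similarities.

    `temperature` is an assumed hyperparameter; smaller values sharpen
    the resulting distribution.
    """
    image = l2_normalize(image_embedding)
    labels = list(text_embeddings)
    logits = []
    for label in labels:
        text = l2_normalize(text_embeddings[label])
        cosine = sum(a * b for a, b in zip(image, text))
        logits.append(cosine / temperature)
    return dict(zip(labels, softmax(logits)))


# Hypothetical toy embeddings standing in for encoder outputs.
text_embeddings = {
    "forehand drive": [0.9, 0.1, 0.0],
    "backhand push": [0.1, 0.9, 0.1],
    "serve": [0.0, 0.2, 0.9],
}
image_embedding = [0.8, 0.2, 0.1]

probs = classify_stroke(image_embedding, text_embeddings)
best = max(probs, key=probs.get)
print(best)
```

    In this toy setup the image vector lies closest in direction to the "forehand drive" prompt, so that label receives the highest probability. Under transfer learning, one would keep such pre-trained encoders largely fixed and adapt only a small part of the model to the stroke recognition data.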

    Keywords: neural computing, computer vision, neuroscience, multi-modal robot, table tennis stroke recognition

    Received: 03 Jun 2024; Accepted: 22 Jul 2024.

    Copyright: © 2024 Libo and Yan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Tong Yan, Hunan Labor And Human Resources Vocational College, Changsha 410100, Hunan, China

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.