AUTHOR=Zhang Wenli , Zhao Tingsong , Zhang Jianyi , Wang Yufei TITLE=LST-EMG-Net: Long short-term transformer feature fusion network for sEMG gesture recognition JOURNAL=Frontiers in Neurorobotics VOLUME=17 YEAR=2023 URL=https://www.frontiersin.org/journals/neurorobotics/articles/10.3389/fnbot.2023.1127338 DOI=10.3389/fnbot.2023.1127338 ISSN=1662-5218 ABSTRACT=

With the development of signal analysis technology and artificial intelligence, surface electromyography (sEMG) signal gesture recognition is widely used in rehabilitation therapy, human-computer interaction, and other fields. Deep learning has gradually become the mainstream technology for gesture recognition. It is necessary to consider the characteristics of the surface EMG signal when constructing the deep learning model. The surface electromyography signal is an information carrier that can reflect neuromuscular activity. Under the same circumstances, a longer signal segment contains more information about muscle activity, and a shorter segment contains less information about muscle activity. Thus, signals with longer segments are suitable for recognizing gestures that mobilize complex muscle activity, and signals with shorter segments are suitable for recognizing gestures that mobilize simple muscle activity. However, current deep learning models usually extract features from single-length signal segments. This can easily cause a mismatch between the amount of information in the features and the information needed to recognize gestures, which is not conducive to improving the accuracy and stability of recognition. Therefore, in this article, we develop a long short-term transformer feature fusion network (referred to as LST-EMG-Net) that considers the differences in the timing lengths of EMG segments required for the recognition of different gestures. LST-EMG-Net imports multichannel sEMG datasets into a long short-term encoder. The encoder extracts the sEMG signals’ long short-term features. Finally, we successfully fuse the features using a feature cross-attention module and output the gesture category. We evaluated LST-EMG-Net on multiple datasets based on sparse channels and high density. It reached 81.47, 88.24, and 98.95% accuracy on Ninapro DB2E2, DB5E3 partial gesture, and CapgMyo DB-c, respectively. Following the experiment, we demonstrated that LST-EMG-Net could increase the accuracy and stability of various gesture identification and recognition tasks better than existing networks.