1
Dai Q, Wong Y, Kankanhali M, Li X, Geng W. Improved Network and Training Scheme for Cross-Trial Surface Electromyography (sEMG)-Based Gesture Recognition. Bioengineering (Basel) 2023; 10:1101. [PMID: 37760203 PMCID: PMC10525369 DOI: 10.3390/bioengineering10091101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Revised: 09/14/2023] [Accepted: 09/18/2023] [Indexed: 09/29/2023] Open
Abstract
To enhance the performance of surface electromyography (sEMG)-based gesture recognition, we propose a novel network-agnostic two-stage training scheme, called sEMGPoseMIM, that produces trial-invariant representations to be aligned with corresponding hand movements via cross-modal knowledge distillation. In the first stage, an sEMG encoder is trained via cross-trial mutual information maximization using the sEMG sequences sampled from the same time step but different trials in a contrastive learning manner. In the second stage, the learned sEMG encoder is fine-tuned with the supervision of gesture and hand movements in a knowledge-distillation manner. In addition, we propose a novel network called sEMGXCM as the sEMG encoder. Comprehensive experiments on seven sparse multichannel sEMG databases are conducted to demonstrate the effectiveness of the training scheme sEMGPoseMIM and the network sEMGXCM, which achieves an average improvement of +1.3% on the sparse multichannel sEMG databases compared to the existing methods. Furthermore, the comparison between training sEMGXCM and other existing networks from scratch shows that sEMGXCM outperforms the others by an average of +1.5%.
Affiliation(s)
- Qingfeng Dai
  - College of Computer Science and Technology, Faculty of Computer, Zhejiang University, Hangzhou 310058, China
- Yongkang Wong
  - School of Computing, National University of Singapore, 21 Lower Kent Ridge Rd, Singapore 119077, Singapore
- Mohan Kankanhali
  - School of Computing, National University of Singapore, 21 Lower Kent Ridge Rd, Singapore 119077, Singapore
- Xiangdong Li
  - College of Computer Science and Technology, Faculty of Computer, Zhejiang University, Hangzhou 310058, China
2
Montazerin M, Rahimian E, Naderkhani F, Atashzar SF, Yanushkevich S, Mohammadi A. Transformer-based hand gesture recognition from instantaneous to fused neural decomposition of high-density EMG signals. Sci Rep 2023; 13:11000. [PMID: 37419881 PMCID: PMC10329032 DOI: 10.1038/s41598-023-36490-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/29/2022] [Accepted: 06/05/2023] [Indexed: 07/09/2023] Open
Abstract
Designing efficient and labor-saving prosthetic hands requires powerful hand gesture recognition algorithms that can achieve high accuracy with limited complexity and latency. In this context, the paper proposes a Compact Transformer-based Hand Gesture Recognition framework, referred to as CT-HGR, which employs a vision transformer network to conduct hand gesture recognition using high-density surface EMG (HD-sEMG) signals. Taking advantage of the attention mechanism incorporated into transformer architectures, the proposed CT-HGR framework overcomes major constraints associated with most existing deep learning models, such as model complexity, the need for feature engineering, the inability to consider both temporal and spatial information of HD-sEMG signals, and the need for a large number of training samples. The attention mechanism in the proposed model identifies similarities among different data segments with a greater capacity for parallel computation and addresses memory limitations when dealing with inputs of large sequence lengths. CT-HGR can be trained from scratch without any need for transfer learning and can simultaneously extract both temporal and spatial features of HD-sEMG data. Additionally, the CT-HGR framework can perform instantaneous recognition using an sEMG image spatially composed from HD-sEMG signals. A variant of CT-HGR is also designed to incorporate microscopic neural drive information in the form of Motor Unit Spike Trains (MUSTs) extracted from HD-sEMG signals using Blind Source Separation (BSS). This variant is combined with its baseline version via a hybrid architecture to evaluate the potential of fusing macroscopic and microscopic neural drive information. The utilized HD-sEMG dataset involves 128 electrodes that collect the signals related to 65 isometric hand gestures of 20 subjects. The proposed CT-HGR framework is applied to window sizes of 31.25, 62.5, 125, and 250 ms of this dataset using 32, 64, and 128 electrode channels. Our results are obtained via 5-fold cross-validation by first applying the proposed framework to the dataset of each subject separately and then averaging the accuracies over all subjects. The average accuracy over all participants using 32 electrodes and a window size of 31.25 ms is 86.23%, which gradually increases to 91.98% for 128 electrodes and a window size of 250 ms. CT-HGR achieves an accuracy of 89.13% for instantaneous recognition based on a single frame of HD-sEMG image. The proposed model is statistically compared with a 3D Convolutional Neural Network (CNN) and two different variants of Support Vector Machine (SVM) and Linear Discriminant Analysis (LDA) models. The accuracy results for each of these models are reported together with their precision, recall, F1 score, required memory, and train/test times. The results corroborate the effectiveness of the proposed CT-HGR framework compared to its counterparts.
Affiliation(s)
- Mansooreh Montazerin
  - Department of Electrical and Computer Engineering, Concordia University, Montreal, QC, Canada
- Elahe Rahimian
  - Concordia Institute for Information Systems Engineering, Concordia University, Montreal, QC, Canada
- Farnoosh Naderkhani
  - Concordia Institute for Information Systems Engineering, Concordia University, Montreal, QC, Canada
- S Farokh Atashzar
  - Departments of Electrical and Computer Engineering, Mechanical and Aerospace Engineering, New York University (NYU), New York, 10003, NY, USA
  - NYU Center for Urban Science and Progress (CUSP), NYU WIRELESS, New York University (NYU), New York, 10003, NY, USA
- Svetlana Yanushkevich
  - Biometric Technologies Laboratory, Department of Electrical and Software Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada
- Arash Mohammadi
  - Department of Electrical and Computer Engineering, Concordia University, Montreal, QC, Canada
  - Concordia Institute for Information Systems Engineering, Concordia University, Montreal, QC, Canada
3
People with chronic low back pain display spatial alterations in high-density surface EMG-torque oscillations. Sci Rep 2022; 12:15178. [PMID: 36071134 PMCID: PMC9452584 DOI: 10.1038/s41598-022-19516-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2021] [Accepted: 08/30/2022] [Indexed: 11/08/2022] Open
Abstract
We quantified the relationship between spatial oscillations in surface electromyographic (sEMG) activity and trunk-extension torque in individuals with and without chronic low back pain (CLBP), during two submaximal isometric lumbar extension tasks at 20% and 50% of their maximal voluntary torque. High-density sEMG (HDsEMG) signals were recorded from the lumbar erector spinae (ES) with a 64-electrode grid, and torque signals were recorded with an isokinetic dynamometer. Coherence and cross-correlation analyses were applied between the filtered interference HDsEMG and torque signals for each submaximal contraction. Principal component analysis was used to reduce dimensionality of HDsEMG data and improve the HDsEMG-based torque estimation. sEMG-torque coherence was quantified in the δ(0–5 Hz) frequency bandwidth. Regional differences in sEMG-torque coherence were also evaluated by creating topographical coherence maps. sEMG-torque coherence in the δ band and sEMG-torque cross-correlation increased with the increase in torque in the controls but not in the CLBP group (p = 0.018, p = 0.030 respectively). As torque increased, the CLBP group increased sEMG-torque coherence in more cranial ES regions, while the opposite was observed for the controls (p = 0.043). Individuals with CLBP show reductions in sEMG-torque relationships possibly due to the use of compensatory strategies and regional adjustments of ES-sEMG oscillatory activity.
4
Montazerin M, Zabihi S, Rahimian E, Mohammadi A, Naderkhani F. ViT-HGR: Vision Transformer-based Hand Gesture Recognition from High Density Surface EMG Signals. Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2022; 2022:5115-5119. [PMID: 36086242 DOI: 10.1109/embc48229.2022.9871489] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
Recently, there has been a surge of interest in applying Deep Learning (DL) models to autonomously perform hand gesture recognition using surface Electromyogram (sEMG) signals. Many of the existing DL models are, however, designed to be applied to sparse sEMG signals. Furthermore, due to their complex structure, these models typically face memory constraints and require long training times and a large number of training samples, so there is a need to resort to data augmentation and/or transfer learning. In this paper, for the first time (to the best of our knowledge), we investigate and design a Vision Transformer (ViT) based architecture to perform hand gesture recognition from High Density sEMG (HD-sEMG) signals. Intuitively speaking, we capitalize on the recent breakthrough role of the transformer architecture in tackling different complex problems, together with its potential for greater input parallelization via its attention mechanism. The proposed Vision Transformer-based Hand Gesture Recognition (ViT-HGR) framework can overcome the aforementioned training time problems and can accurately classify a large number of hand gestures from scratch without any need for data augmentation and/or transfer learning. The efficiency of the proposed ViT-HGR framework is evaluated using a recently released HD-sEMG dataset consisting of 65 isometric hand gestures. Our experiments with a 64-sample (31.25 ms) window size yield an average test accuracy of 84.62 ± 3.07%, with only 78,210 learnable parameters in the model. The compact structure of the proposed ViT-HGR framework (i.e., its significantly reduced number of trainable parameters) shows great potential for practical application in prosthetic control.
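The window size quoted above, 64 samples for 31.25 ms, implies a sampling rate of 64 / 0.03125 s = 2048 Hz. The short sketch below reproduces that arithmetic and shows one simple way to cut a recording into fixed-length windows. The non-overlapping step, the channel count of 8, and the function name are arbitrary choices for illustration and are not taken from the paper.

```python
import numpy as np

WINDOW_SAMPLES = 64      # from the abstract
WINDOW_MS = 31.25        # from the abstract

# Sampling rate implied by "64-sample (31.25 ms)".
fs_hz = WINDOW_SAMPLES / (WINDOW_MS / 1000.0)


def sliding_windows(signal: np.ndarray, window: int, step: int) -> np.ndarray:
    """Split a (samples, channels) array into (n_windows, window, channels)."""
    n_windows = (signal.shape[0] - window) // step + 1
    return np.stack([signal[i * step:i * step + window]
                     for i in range(n_windows)])


# One second of placeholder data; 8 channels is an arbitrary stand-in.
recording = np.zeros((int(fs_hz), 8))
windows = sliding_windows(recording, WINDOW_SAMPLES, step=WINDOW_SAMPLES)
```

At this rate, one second of signal yields 32 non-overlapping windows; an overlapping step would produce more.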
5
Bai D, Liu T, Han X, Yi H. Application Research on Optimization Algorithm of sEMG Gesture Recognition Based on Light CNN+LSTM Model. Cyborg and Bionic Systems 2021; 2021:9794610. [PMID: 36285146 PMCID: PMC9494710 DOI: 10.34133/2021/9794610] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/12/2021] [Accepted: 09/29/2021] [Indexed: 12/02/2022] Open
Abstract
Deep learning-based gesture recognition from surface electromyography (sEMG) plays an increasingly important role in human-computer interaction. To ensure high accuracy in multistate muscle action recognition and to allow the trained model to be deployed on an embedded chip with small storage space, this paper presents a feature model construction and optimization method based on a multichannel sEMG amplification unit. The feature model is built from multidimensional sequential sEMG images by combining a convolutional neural network with a long short-term memory (LSTM) network to solve the problem of multistate sEMG signal recognition. The experimental results show that, under the same network structure, using the fast Fourier transform and root mean square of the sEMG signal as feature data yields a good recognition rate: the recognition accuracy for complex gestures is 91.40% with a model size of 1 MB. The model can therefore still control the artificial hand accurately while remaining small.
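The two features named in the abstract, root mean square and the fast Fourier transform, can be computed per window as in the sketch below. This is a generic NumPy illustration; the window length, the test signal, and the function names are assumptions, and the paper's exact preprocessing is not described in the abstract.

```python
import numpy as np


def rms_feature(window: np.ndarray) -> np.ndarray:
    """Root mean square per channel for a (samples, channels) window."""
    return np.sqrt(np.mean(window ** 2, axis=0))


def fft_magnitude(window: np.ndarray) -> np.ndarray:
    """One-sided FFT magnitude spectrum per channel, shape (samples//2 + 1, channels)."""
    return np.abs(np.fft.rfft(window, axis=0))


# Hypothetical single-channel window: a unit sine with 8 full cycles in 256 samples.
n_samples, cycles = 256, 8
t = np.arange(n_samples)
window = np.sin(2 * np.pi * cycles * t / n_samples)[:, None]

rms = rms_feature(window)            # close to 1/sqrt(2) for a unit sine
spectrum = fft_magnitude(window)     # peak lands in frequency bin 8
peak_bin = int(np.argmax(spectrum[:, 0]))
```

RMS summarizes the amplitude of each channel, while the magnitude spectrum shows where its energy lies in frequency; stacking both per channel gives a compact feature vector for each window.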
Affiliation(s)
- Dianchun Bai
  - School of Electrical Engineering, Shenyang University of Technology, Shenyang 110870, China
  - Department of Mechanical Engineering and Intelligent Systems, University of Electro-Communications, Tokyo 182-8585, Japan
- Tie Liu
  - School of Electrical Engineering, Shenyang University of Technology, Shenyang 110870, China
- Xinghua Han
  - School of Electrical Engineering, Shenyang University of Technology, Shenyang 110870, China
- Hongyu Yi
  - School of Electrical Engineering, Shenyang University of Technology, Shenyang 110870, China
6
Zhou Y, Chen C, Cheng M, Alshahrani Y, Franovic S, Lau E, Xu G, Ni G, Cavanaugh JM, Muh S, Lemos S. Comparison of machine learning methods in sEMG signal processing for shoulder motion recognition. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2021.102577] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 02/08/2023]
7
Zhu M, Zhang H, Wang X, Wang X, Yang Z, Wang C, Samuel OW, Chen S, Li G. Towards optimizing electrode configurations for silent speech recognition based on high-density surface electromyography. J Neural Eng 2021; 18. [DOI: 10.1088/1741-2552/abca14] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/22/2020] [Accepted: 11/12/2020] [Indexed: 11/11/2022]
Abstract
Objective. Silent speech recognition (SSR) based on surface electromyography (sEMG) is an attractive non-acoustic modality of human-machine interfaces that converts neuromuscular electrophysiological signals into computer-readable textual messages. The speaking process involves complex neuromuscular activities spanning a large area over the facial and neck muscles, so the locations of the sEMG electrodes considerably affect the performance of the SSR system. However, most previous studies used only a limited number of electrodes that were placed empirically without prior quantitative analysis, resulting in uncertainty and unreliability of the SSR outcomes. Approach. In this study, high-density sEMG was used to provide a full representation of the articulatory muscle activities so that the optimal electrode configuration for SSR could be systematically explored. A total of 120 closely spaced electrodes were placed on the facial and neck muscles to collect the high-density sEMG signals for classifying ten digits (0–9) silently spoken in both English and Chinese. The sequential forward selection algorithm was adopted to explore the optimal electrode configurations. Main Results. The results showed that the classification accuracy increased rapidly and then saturated as the number of selected electrodes increased from 1 to 120. Using only ten optimal electrodes achieved a classification accuracy of 86% for English and 94% for Chinese, whereas as many as 40 non-optimized electrodes were required to obtain comparable accuracies. Also, the optimally selected electrodes seemed to be mostly distributed on the neck rather than the facial region, and more electrodes were required for English recognition to achieve the same accuracy. Significance. The findings of this study can provide useful guidelines on electrode placement for developing a clinically feasible SSR system and implementing a promising human-machine interface approach, especially for patients with speaking difficulties.
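Sequential forward selection, the algorithm named in the abstract, greedily adds whichever candidate most improves a score until the desired subset size is reached. The sketch below shows that loop in generic form. The scoring function, the per-electrode weights, and all names are hypothetical stand-ins; in the study the score would be a classification accuracy, whose details the abstract does not give.

```python
from typing import Callable, List, Sequence


def sequential_forward_selection(candidates: Sequence[int],
                                 score: Callable[[List[int]], float],
                                 k: int) -> List[int]:
    """Greedily pick up to k candidates, each time adding the one that
    yields the highest score for the enlarged subset."""
    selected: List[int] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda c: score(selected + [c]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy example: made-up usefulness weights for four electrodes.
toy_weights = {0: 0.1, 1: 0.9, 2: 0.5, 3: 0.7}


def toy_score(subset: List[int]) -> float:
    """Placeholder for a cross-validated classification accuracy."""
    return sum(toy_weights[e] for e in subset)


chosen = sequential_forward_selection(list(toy_weights), toy_score, k=2)
```

Because the search is greedy, it evaluates far fewer subsets than an exhaustive search over 120 electrodes would, at the cost of not guaranteeing a globally optimal configuration.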
8
Lw-CNN-Based Myoelectric Signal Recognition and Real-Time Control of Robotic Arm for Upper-Limb Rehabilitation. Computational Intelligence and Neuroscience 2020; 2020:8846021. [PMID: 33456452 PMCID: PMC7785339 DOI: 10.1155/2020/8846021] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/30/2020] [Revised: 11/22/2020] [Accepted: 12/15/2020] [Indexed: 11/18/2022]
Abstract
Deep-learning models can perform feature extraction and high-level abstraction of raw myoelectric signals without manual feature selection. In this study, raw surface myoelectric signals are processed with a deep model to investigate the feasibility of recognizing upper-limb motion intents and of real-time control of auxiliary equipment for upper-limb rehabilitation training. Surface myoelectric signals are collected for six upper-limb motions from eight subjects. A light-weight convolutional neural network (Lw-CNN) and a support vector machine (SVM) model are designed for myoelectric signal pattern recognition, and the offline and online performance of the two models is compared. Across all subjects, the average offline accuracy is (90 ± 5)% for the Lw-CNN and (82.5 ± 3.5)% for the SVM, compared with (84 ± 6)% for the Lw-CNN and (79 ± 4)% for the SVM in online testing. The robotic arm control accuracy is (88.5 ± 5.5)%. Significance analysis shows no significant correlation (p = 0.056) among real-time control, offline testing, and online testing. The Lw-CNN model performs well in the recognition of upper-limb motion intents and can realize real-time control of a commercial robotic arm.
9
Zhang X, Li X, Tang X, Chen X, Chen X, Zhou P. Spatial filtering for enhanced high-density surface electromyographic examination of neuromuscular changes and its application to spinal cord injury. J Neuroeng Rehabil 2020; 17:160. [PMID: 33272283 PMCID: PMC7713033 DOI: 10.1186/s12984-020-00786-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/12/2020] [Accepted: 11/11/2020] [Indexed: 12/13/2022] Open
Abstract
Background: Spatial filtering of multi-channel signals is considered an effective pre-processing approach for improving signal-to-noise ratio. The use of spatial filtering for preprocessing high-density (HD) surface electromyogram (sEMG) helps to extract critical spatial information, but its application to non-invasive examination of neuromuscular changes has not been well investigated. Methods: To evaluate how spatial filtering can facilitate examination of muscle paralysis, three different spatial filtering methods are presented, using the principal component analysis (PCA) algorithm, the non-negative matrix factorization (NMF) algorithm, and a combination of both, respectively. Their performance was evaluated in terms of diagnostic power, through HD-sEMG clustering index (CI) analysis of neuromuscular changes in paralyzed muscles following spinal cord injury (SCI). Results: The experimental results showed that: (1) the CI analysis of conventional single-channel sEMG can reveal complex neuromuscular changes in paralyzed muscles following SCI, and its diagnostic power has been confirmed to be characterized by the variance of Z scores; (2) the diagnostic power was highly dependent on the location of the sEMG recording channel, and directly averaging the CI diagnostic indicators over channels reached only a medium level of diagnostic power; (3) the use of either the PCA-based or the NMF-based filtering method yielded greater diagnostic power, and their combination enhanced the diagnostic power significantly. Conclusions: This study not only presents an essential preprocessing approach for improving the diagnostic power of HD-sEMG, but also helps to develop a standard sEMG preprocessing pipeline, thus promoting its widespread application.
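The abstract does not describe how the PCA-based spatial filter is constructed. As a generic illustration of the idea, the sketch below projects multi-channel data onto its leading principal components and reconstructs it, which suppresses activity that is not shared across channels. The function name, the number of retained components, and the synthetic data are all assumptions and should not be read as the authors' method.

```python
import numpy as np


def pca_spatial_filter(x: np.ndarray, n_components: int) -> np.ndarray:
    """Keep only the top principal components of a (samples, channels) array.

    Returns an array of the same shape, reconstructed from the retained
    components. Illustrative sketch, not the paper's filter design.
    """
    mean = x.mean(axis=0, keepdims=True)
    centered = x - mean
    # Rows of vt are the principal directions in channel space.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    return centered @ basis.T @ basis + mean


# Synthetic example: one shared source on 8 channels plus independent noise.
rng = np.random.default_rng(0)
source = rng.normal(size=500)
mixing = rng.normal(size=8)
clean = np.outer(source, mixing)
noisy = clean + 0.1 * rng.normal(size=clean.shape)

filtered = pca_spatial_filter(noisy, n_components=1)
error_before = float(np.sum((noisy - clean) ** 2))
error_after = float(np.sum((filtered - clean) ** 2))
```

In this toy setting the filtered signal is closer to the clean one than the noisy input is, because most of the channel-independent noise lies outside the retained component.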
Affiliation(s)
- Xu Zhang
  - School of Information Science and Technology, University of Science and Technology of China, Hefei, 230027, Anhui, China
- Xinhui Li
  - School of Information Science and Technology, University of Science and Technology of China, Hefei, 230027, Anhui, China
- Xiao Tang
  - School of Information Science and Technology, University of Science and Technology of China, Hefei, 230027, Anhui, China
- Xun Chen
  - School of Information Science and Technology, University of Science and Technology of China, Hefei, 230027, Anhui, China
- Xiang Chen
  - School of Information Science and Technology, University of Science and Technology of China, Hefei, 230027, Anhui, China
- Ping Zhou
  - Institute of Rehabilitation Engineering, University of Rehabilitation, Qingdao, 266024, Shandong, China