1. Wang Z, Huang W, Qi Z, Yin S. MS-CLSTM: Myoelectric Manipulator Gesture Recognition Based on Multi-Scale Feature Fusion CNN-LSTM Network. Biomimetics (Basel) 2024; 9:784. [PMID: 39727788] [PMCID: PMC11727569] [DOI: 10.3390/biomimetics9120784]
Abstract
Surface electromyography (sEMG) signals reflect the local electrical activity of muscle fibers and the synergistic action of the overall muscle group, making them useful for gesture control of myoelectric manipulators. In recent years, deep learning methods have increasingly been applied to sEMG gesture recognition because of their powerful automatic feature extraction capabilities. sEMG signals contain rich local details and global patterns, but single-scale convolutional networks are limited in their ability to capture both comprehensively, which restricts model performance. This paper proposes a deep learning model based on multi-scale feature fusion, MS-CLSTM (MS Block-ResCBAM-Bi-LSTM). The MS Block extracts local details, global patterns, and inter-channel correlations in sEMG signals using convolutional kernels of different scales. The ResCBAM, which integrates CBAM and Simple-ResNet, enhances attention to key gesture information while alleviating the overfitting issues common in small-sample datasets. Experimental results demonstrate that the MS-CLSTM model achieves recognition accuracies of 86.66% and 83.27% on the Ninapro DB2 and DB4 datasets, respectively, and that accuracy reaches 89% in real-time myoelectric manipulator gesture prediction experiments. The proposed model exhibits superior performance in sEMG gesture recognition tasks, offering an effective solution for prosthetic hand control, robotic control, and other human-computer interaction applications.
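To make the multi-scale idea concrete, the sketch below shows a minimal multi-scale convolution block in PyTorch. It illustrates the general technique, not the authors' MS-CLSTM code; the kernel sizes (3/7/15), channel counts, and window length are all assumptions.

```python
# Minimal sketch of a multi-scale convolution block for windowed sEMG,
# loosely following the MS Block idea; all sizes are illustrative.
import torch
import torch.nn as nn

class MultiScaleBlock(nn.Module):
    def __init__(self, in_ch=12, out_ch=32):
        super().__init__()
        # Parallel branches with small, medium, and large kernels capture
        # local detail, mid-range structure, and global patterns.
        self.branches = nn.ModuleList([
            nn.Conv1d(in_ch, out_ch, kernel_size=k, padding=k // 2)
            for k in (3, 7, 15)
        ])
        # A 1x1 convolution mixes channels, modeling inter-channel correlation.
        self.mix = nn.Conv1d(3 * out_ch, out_ch, kernel_size=1)
        self.act = nn.ReLU()

    def forward(self, x):           # x: (batch, channels, time)
        feats = [b(x) for b in self.branches]
        return self.act(self.mix(torch.cat(feats, dim=1)))

x = torch.randn(8, 12, 200)         # 8 windows, 12 electrodes, 200 samples
print(MultiScaleBlock()(x).shape)   # torch.Size([8, 32, 200])
```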
Affiliation(s)
- Ziyi Wang
- School of Materials Science and Engineering, Central South University of Forestry and Technology, Changsha 410004, China
- Wenjing Huang
- School of Mechanical and Intelligent Manufacturing, Central South University of Forestry and Technology, Changsha 410004, China
- Zikang Qi
- School of Materials Science and Engineering, Central South University of Forestry and Technology, Changsha 410004, China
- Shuolei Yin
- School of Materials Science and Engineering, Central South University of Forestry and Technology, Changsha 410004, China
2. Su T, Tan X, Jiang X, Liu X, Hu B, Dai C. A Dynamic Balanced Single-Source Domain Generalization Model for Cross-Posture Myoelectric Control. IEEE Trans Neural Syst Rehabil Eng 2024; PP:255-265. [PMID: 40030684] [DOI: 10.1109/tnsre.2024.3521229]
Abstract
Electromyography (EMG)-based human-computer interaction (HCI) through wearable devices frequently encounters variability in body postures, which can modify the amplitude and frequency features of surface EMG (sEMG) signals. This variability often results in reduced gesture recognition accuracy. To enhance the robustness of sEMG-based gesture interfaces, mitigating the effects of body position variability is essential. In this paper, we propose a Dynamic Balanced Single-Source Domain Generalization (DBSS-DG) transfer learning framework that uses sEMG data from only one posture as the source domain for model training yet maintains good performance under different body postures as target domains. Validation was performed on an sEMG dataset from 16 subjects across four postures: standing, sitting, walking, and lying. With standing as the source domain, the model achieved gesture recognition accuracies of 90.79 ± 0.09%, 88.78 ± 0.06%, and 90.87 ± 0.1% for sitting, walking, and lying as the target domains, respectively, producing an average improvement of 4.71% over non-transfer learning approaches. Furthermore, the performance of our model exceeded that of many well-known single-source domain generalization methods, establishing its effectiveness in practical applications.
3. Xu T, Zhao K, Hu Y, Li L, Wang W, Wang F, Zhou Y, Li J. Transferable non-invasive modal fusion-transformer (NIMFT) for end-to-end hand gesture recognition. J Neural Eng 2024; 21:026034. [PMID: 38565124] [DOI: 10.1088/1741-2552/ad39a5]
Abstract
Objective. Recent studies have shown that integrating inertial measurement unit (IMU) signals with surface electromyography (sEMG) signals can greatly improve hand gesture recognition (HGR) performance in applications such as prosthetic control and rehabilitation training. However, current deep learning models for multimodal HGR encounter difficulties in invasive modal fusion, complex feature extraction from heterogeneous signals, and limited inter-subject model generalization. To address these challenges, this study aims to develop an end-to-end and inter-subject transferable model that utilizes non-invasively fused sEMG and acceleration (ACC) data. Approach. The proposed non-invasive modal fusion-transformer (NIMFT) model utilizes 1D-convolutional neural network-based patch embedding for local information extraction and employs a multi-head cross-attention (MCA) mechanism to non-invasively integrate sEMG and ACC signals, stabilizing the variability induced by sEMG. The proposed architecture undergoes detailed ablation studies after hyperparameter tuning. Transfer learning is employed by fine-tuning a pre-trained model on new subjects, and a comparative analysis is performed between the fine-tuned and subject-specific models. Additionally, the performance of NIMFT is compared to state-of-the-art fusion models. Main results. The NIMFT model achieved recognition accuracies of 93.91%, 91.02%, and 95.56% on the three action sets in the Ninapro DB2 dataset. The proposed embedding method and MCA outperformed the traditional invasive modal fusion transformer by 2.01% (embedding) and 1.23% (fusion), respectively. In comparison to subject-specific models, the fine-tuned model exhibited the highest average accuracy improvement of 2.26%, achieving a final accuracy of 96.13%. Moreover, the NIMFT model demonstrated superiority in terms of accuracy, recall, precision, and F1-score compared to the latest modal fusion models of similar scale. Significance. NIMFT is a novel end-to-end HGR model that utilizes a non-invasive MCA mechanism to integrate long-range intermodal information effectively. Compared to recent modal fusion models, it demonstrates superior performance in inter-subject experiments and offers higher training efficiency and accuracy through transfer learning than subject-specific approaches.
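The fusion step can be illustrated with PyTorch's built-in multi-head attention. The sketch below shows sEMG tokens attending to ACC tokens; it is a generic cross-attention example under assumed dimensions, not the NIMFT implementation.

```python
# Minimal sketch of cross-attention fusion of sEMG and ACC token sequences,
# in the spirit of an MCA mechanism; embed_dim and num_heads are assumptions.
import torch
import torch.nn as nn

emb = 64
cross_attn = nn.MultiheadAttention(embed_dim=emb, num_heads=4, batch_first=True)

# Patch embeddings as produced by 1D-CNN stems (random stand-ins here).
emg_tokens = torch.randn(8, 25, emb)   # (batch, sEMG patches, embedding)
acc_tokens = torch.randn(8, 25, emb)   # (batch, ACC patches, embedding)

# sEMG tokens query the ACC stream: each sEMG patch attends to all ACC
# patches, letting the steadier ACC signal stabilize the fused features.
fused, attn_weights = cross_attn(query=emg_tokens, key=acc_tokens,
                                 value=acc_tokens)
print(fused.shape)  # torch.Size([8, 25, 64])
```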
Affiliation(s)
- Tianxiang Xu
- School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, People's Republic of China
- The Engineering Research Center of Intelligent Theranostics Technology and Instruments, Ministry of Education, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, People's Republic of China
- Kunkun Zhao
- School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, People's Republic of China
- The Engineering Research Center of Intelligent Theranostics Technology and Instruments, Ministry of Education, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, People's Republic of China
- Yuxiang Hu
- School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, People's Republic of China
- The Engineering Research Center of Intelligent Theranostics Technology and Instruments, Ministry of Education, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, People's Republic of China
- Liang Li
- School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, People's Republic of China
- The Engineering Research Center of Intelligent Theranostics Technology and Instruments, Ministry of Education, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, People's Republic of China
- Wei Wang
- School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, People's Republic of China
- The Engineering Research Center of Intelligent Theranostics Technology and Instruments, Ministry of Education, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, People's Republic of China
- Fulin Wang
- School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, People's Republic of China
- Nanjing PANDA Electronics Equipment Co., Ltd, Nanjing 210033, People's Republic of China
- Yuxuan Zhou
- School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, People's Republic of China
- The Engineering Research Center of Intelligent Theranostics Technology and Instruments, Ministry of Education, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, People's Republic of China
- Jianqing Li
- School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, People's Republic of China
- The Engineering Research Center of Intelligent Theranostics Technology and Instruments, Ministry of Education, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, People's Republic of China
4. Gu Y, Oku H, Todoh M. American Sign Language Recognition and Translation Using Perception Neuron Wearable Inertial Motion Capture System. Sensors (Basel) 2024; 24:453. [PMID: 38257544] [PMCID: PMC10819960] [DOI: 10.3390/s24020453]
Abstract
Sign language serves as a natural communication method within the deaf community. In the study of sign language recognition through wearable sensors, the data sources are limited and the data acquisition process is complex. This research aims to collect an American Sign Language dataset with a wearable inertial motion capture system and realize the recognition and end-to-end translation of sign language sentences with deep learning models. In this work, a dataset consisting of 300 commonly used sentences is gathered from 3 volunteers. The recognition network mainly consists of three components: a convolutional neural network, bi-directional long short-term memory, and connectionist temporal classification. The model achieves accuracy rates of 99.07% in word-level evaluation and 97.34% in sentence-level evaluation. The translation network is an encoder-decoder model mainly based on long short-term memory with global attention. The word error rate of end-to-end translation is 16.63%. The proposed method has the potential to recognize more sign language sentences with reliable inertial data from the device.
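A minimal sketch of the recognition pipeline described above (CNN feature extractor, bi-directional LSTM, CTC loss) is given below; layer sizes, the sensor feature count, and the vocabulary size are assumptions, not the paper's configuration.

```python
# Minimal sketch of a CNN + BiLSTM + CTC recognition pipeline; all sizes
# and sequence lengths are illustrative only.
import torch
import torch.nn as nn

class CnnBlstmCtc(nn.Module):
    def __init__(self, in_ch=36, hidden=128, vocab=300):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(in_ch, 64, kernel_size=5, padding=2), nn.ReLU())
        self.blstm = nn.LSTM(64, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, vocab + 1)  # +1 for the CTC blank

    def forward(self, x):                  # x: (batch, sensors, time)
        h = self.cnn(x).transpose(1, 2)    # -> (batch, time, features)
        h, _ = self.blstm(h)
        return self.fc(h).log_softmax(-1)  # (batch, time, vocab+1)

model = CnnBlstmCtc()
x = torch.randn(4, 36, 120)                # 4 sequences of motion features
log_probs = model(x).transpose(0, 1)       # CTC expects (time, batch, classes)
targets = torch.randint(1, 301, (4, 10))   # dummy word-label sequences
loss = nn.CTCLoss(blank=0)(log_probs, targets,
                           torch.full((4,), 120, dtype=torch.long),
                           torch.full((4,), 10, dtype=torch.long))
print(loss.item())
```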
Affiliation(s)
- Yutong Gu
- Faculty of Informatics, Gunma University, Kiryu 3768515, Japan
- Graduate School of Engineering, Hokkaido University, Sapporo 0608628, Japan
- Hiromasa Oku
- Faculty of Informatics, Gunma University, Kiryu 3768515, Japan
- Masahiro Todoh
- Faculty of Engineering, Hokkaido University, Sapporo 0608628, Japan
5. Duan S, Wu L, Liu A, Chen X. Alignment-Enhanced Interactive Fusion Model for Complete and Incomplete Multimodal Hand Gesture Recognition. IEEE Trans Neural Syst Rehabil Eng 2023; 31:4661-4671. [PMID: 37983152] [DOI: 10.1109/tnsre.2023.3335101]
Abstract
Hand gesture recognition (HGR) based on surface electromyogram (sEMG) and accelerometer (ACC) signals is increasingly attractive; fusion strategies are crucial for performance yet remain challenging. Currently, neural network-based fusion methods have achieved superior performance. Nevertheless, these methods typically fuse sEMG and ACC either in the early or late stages, overlooking the integration of cross-modal hierarchical information within each individual hidden layer, thus inducing inefficient inter-modal fusion. To this end, we propose a novel Alignment-Enhanced Interactive Fusion (AiFusion) model, which achieves effective fusion via a progressive hierarchical fusion strategy. Notably, AiFusion can flexibly perform both complete and incomplete multimodal HGR. Specifically, AiFusion contains two unimodal branches and a cascaded transformer-based multimodal fusion branch. The fusion branch is first designed to adequately characterize modality-interactive knowledge by adaptively capturing inter-modal similarity and fusing hierarchical features from all branches layer by layer. Then, the modality-interactive knowledge is aligned with that of each unimodality using cross-modal supervised contrastive learning and online distillation from the embedding and probability spaces, respectively. These alignments further promote fusion quality and refine modality-specific representations. Finally, the recognition outcomes are determined by the available modalities, thus handling the incomplete multimodal HGR problem frequently encountered in real-world scenarios. Experimental results on five public datasets demonstrate that AiFusion outperforms most state-of-the-art benchmarks in complete multimodal HGR. Impressively, it also surpasses the unimodal baselines in the challenging incomplete multimodal HGR. The proposed AiFusion provides a promising solution for effective and robust multimodal HGR-based interfaces.
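One ingredient that lends itself to a short illustration is the cross-modal supervised contrastive alignment. The sketch below implements a simplified version in which same-class sEMG and ACC embeddings are pulled together; it is an illustrative loss under assumed shapes, not the exact AiFusion objective.

```python
# Simplified sketch of a cross-modal supervised contrastive alignment loss:
# sEMG and ACC embeddings of the same gesture class are pulled together.
import torch
import torch.nn.functional as F

def cross_modal_supcon(z_emg, z_acc, labels, tau=0.1):
    z_emg = F.normalize(z_emg, dim=1)          # (batch, dim) unit vectors
    z_acc = F.normalize(z_acc, dim=1)
    logits = z_emg @ z_acc.t() / tau           # scaled cosine similarities
    pos = labels.unsqueeze(0) == labels.unsqueeze(1)  # same-class mask
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    # Average the log-probability over all cross-modal positives per anchor.
    return -(log_prob * pos).sum(1).div(pos.sum(1)).mean()

z1, z2 = torch.randn(16, 64), torch.randn(16, 64)  # dummy embeddings
y = torch.randint(0, 4, (16,))                     # dummy gesture labels
print(cross_modal_supcon(z1, z2, y).item())
```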
6. Ben Haj Amor A, El Ghoul O, Jemni M. Sign Language Recognition Using the Electromyographic Signal: A Systematic Literature Review. Sensors (Basel) 2023; 23:8343. [PMID: 37837173] [PMCID: PMC10574929] [DOI: 10.3390/s23198343]
Abstract
The analysis and recognition of sign languages are currently active fields of research focused on sign recognition. Various approaches differ in terms of analysis methods and the devices used for sign acquisition. Traditional methods rely on video analysis or spatial positioning data calculated using motion capture tools. In contrast to these conventional recognition and classification approaches, electromyogram (EMG) signals, which measure muscle electrical activity, offer a potential technology for detecting gestures. These EMG-based approaches have recently gained attention due to their advantages, which prompted us to conduct a comprehensive study of the methods, approaches, and projects utilizing EMG sensors for sign language handshape recognition. In this paper, we provide an overview of the sign language recognition field through a literature review, offering an in-depth examination of the most significant techniques, categorized by methodology. The survey discusses the progress and challenges of sign language recognition systems based on surface electromyography (sEMG) signals. These systems have shown promise but face issues such as sEMG data variability and sensor placement; multiple sensors enhance reliability and accuracy. Machine learning, including deep learning, is used to address these challenges. Common classifiers in sEMG-based sign language recognition include SVM, ANN, CNN, KNN, HMM, and LSTM. While SVM and ANN are widely used, random forest and KNN have shown better performance in some cases, and a multilayer perceptron neural network achieved perfect accuracy in one study. CNN, often paired with LSTM, ranks as the third most popular classifier and can achieve exceptional accuracy, reaching up to 99.6% when utilizing both EMG and IMU data. LSTM is highly regarded for handling sequential dependencies in EMG signals, making it a critical component of sign language recognition systems. In summary, the survey highlights the prevalence of SVM and ANN classifiers but also suggests the effectiveness of alternatives such as random forests and KNNs. LSTM emerges as the most suitable algorithm for capturing sequential dependencies and improving gesture recognition in EMG-based sign language recognition systems.
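As a concrete illustration of how such classifiers are typically compared, the sketch below cross-validates an SVM, a random forest, and a KNN on synthetic stand-ins for sEMG features; real studies would use handcrafted features such as mean absolute value or waveform length.

```python
# Quick comparison of classifiers commonly reported in this literature
# (SVM, random forest, KNN) on synthetic stand-ins for sEMG features.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# 600 windows x 32 features, 10 handshape classes (synthetic placeholder).
X, y = make_classification(n_samples=600, n_features=32, n_informative=16,
                           n_classes=10, random_state=0)
models = {
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "RandomForest": RandomForestClassifier(n_estimators=200, random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```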
Affiliation(s)
- Oussama El Ghoul
- Mada—Assistive Technology Center Qatar, Doha P.O. Box 24230, Qatar
- Mohamed Jemni
- Arab League Educational, Cultural, and Scientific Organization, Tunis 1003, Tunisia
7. Xu M, Chen X, Sun A, Zhang X, Chen X. A Novel Event-Driven Spiking Convolutional Neural Network for Electromyography Pattern Recognition. IEEE Trans Biomed Eng 2023; 70:2604-2615. [PMID: 37030849] [DOI: 10.1109/tbme.2023.3258606]
Abstract
Electromyography (EMG) pattern recognition is an important technology for prosthesis control, human-computer interaction, and other applications. However, its practical application is hampered by poor accuracy and robustness due to electrode shift caused by repeated wearing of the signal acquisition device. Moreover, user acceptance is low due to the heavy training burden imposed by traditional methods' need for large amounts of training data. To explore the advantages of spiking neural networks (SNNs) in addressing the poor robustness and heavy training burden of EMG pattern recognition, this study proposes and implements a spiking convolutional neural network (SCNN) composed of cyclic convolutional neural network (CNN) and fully connected modules. High-density surface electromyography (HD-sEMG) signals collected from 6 gestures of 10 subjects at 6 electrode positions serve as the research data. Compared with a CNN of the same structure, a CNN-long short-term memory network (CNN-LSTM), a linear discriminant analysis classifier (LDA) with a linear kernel, and a spiking multilayer perceptron (spiking MLP), the accuracy of the SCNN is 50.69%, 33.92%, 32.94%, and 9.41% higher in the small-sample training experiment and 6.50%, 4.23%, 28.73%, and 2.57% higher in the electrode shift experiment, respectively. In addition, the power consumption of the SCNN is about 1/93 that of the CNN. The advantages of the proposed framework in alleviating user training burden, mitigating the adverse effect of electrode shifts, and reducing power consumption make it very meaningful for promoting the development of user-friendly real-time myoelectric control systems.
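The event-driven computation at the heart of an SCNN can be illustrated with a leaky integrate-and-fire (LIF) layer. The sketch below is a generic LIF forward pass in PyTorch with assumed decay and threshold values, not the paper's SCNN.

```python
# Minimal leaky integrate-and-fire (LIF) layer, illustrating the event-driven
# computation behind spiking networks; not the paper's SCNN implementation.
import torch

def lif_forward(currents, beta=0.9, threshold=1.0):
    """currents: (time, batch, features) input currents; returns spike trains."""
    mem = torch.zeros_like(currents[0])      # membrane potential state
    spikes = []
    for i_t in currents:                     # iterate over time steps
        mem = beta * mem + i_t               # leaky integration
        spk = (mem >= threshold).float()     # fire where threshold is crossed
        mem = mem - spk * threshold          # soft reset after a spike
        spikes.append(spk)
    return torch.stack(spikes)

inp = torch.rand(50, 4, 16) * 0.3            # weak random input currents
out = lif_forward(inp)
print(out.shape, out.mean().item())          # spike tensor and mean firing rate
```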
8. Li J, Li K, Zhang J, Cao J. Continuous Motion Estimation of Knee Joint Based on a Parameter Self-Updating Mechanism Model. Bioengineering (Basel) 2023; 10:1028. [PMID: 37760130] [PMCID: PMC10525850] [DOI: 10.3390/bioengineering10091028]
Abstract
Estimation of continuous motion of human joints using surface electromyography (sEMG) signals plays a critical part in intelligent rehabilitation. Traditional methods use sEMG signals as inputs to build regression or biomechanical models that estimate continuous joint motion variables. However, it is challenging to accurately estimate continuous joint motion in new subjects due to the non-stationarity and individual differences of sEMG signals, which greatly limits the generalisability of such methods. In this paper, a continuous motion estimation model for the human knee joint with a parameter self-updating mechanism, based on the fusion of particle swarm optimization (PSO) and a deep belief network (DBN), is proposed. Given the original sEMG signals of different subjects, the method adaptively optimizes the parameters of the DBN model and completes the optimal reconstruction of the signal feature structure in high-dimensional space to achieve optimal estimation of continuous joint motion. Extensive experiments were conducted on knee joint motions. The results suggest that the average root mean square errors (RMSEs) of the proposed method were 9.42° and 7.36°, respectively, better than the results obtained by common neural networks. This finding lays a foundation for human-robot interaction (HRI) in exoskeleton robots based on sEMG signals.
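The PSO half of the mechanism can be sketched in a few lines. Below, a bare-bones particle swarm minimizes a stand-in objective; in the paper's setting the objective would be the validation error of the DBN as a function of its parameters.

```python
# Bare-bones particle swarm optimization over two parameters, sketching the
# PSO half of the PSO-DBN idea; the quadratic objective is a stand-in for
# the validation RMSE minimized when tuning the DBN.
import numpy as np

rng = np.random.default_rng(0)

def objective(p):                 # stand-in for validation RMSE(params)
    return (p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2

n, dim, w, c1, c2 = 20, 2, 0.7, 1.5, 1.5
pos = rng.uniform(-5, 5, (n, dim))
vel = np.zeros((n, dim))
pbest, pbest_val = pos.copy(), np.array([objective(p) for p in pos])
gbest = pbest[pbest_val.argmin()].copy()

for _ in range(100):
    r1, r2 = rng.random((n, dim)), rng.random((n, dim))
    # Velocity blends inertia, pull toward personal best, and global best.
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    vals = np.array([objective(p) for p in pos])
    improved = vals < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmin()].copy()

print(gbest)                      # converges near the optimum (3, -1)
```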
Affiliation(s)
- Jiayi Li
- School of Mechanical Engineering, Hebei University of Technology, Tianjin 300401, China
- Kexiang Li
- School of Mechanical and Materials Engineering, North China University of Technology, Beijing 100144, China
- Jianhua Zhang
- School of Mechanical Engineering, University of Science and Technology Beijing, Beijing 100083, China
- Jian Cao
- School of Mechanical Engineering, Hebei University of Technology, Tianjin 300401, China
9. Faisal MAA, Abir FF, Ahmed MU, Ahad MAR. Exploiting domain transformation and deep learning for hand gesture recognition using a low-cost dataglove. Sci Rep 2022; 12:21446. [PMID: 36509815] [PMCID: PMC9743107] [DOI: 10.1038/s41598-022-25108-2]
Abstract
Hand gesture recognition is one of the most widely explored areas of the human-computer interaction domain. Although various modalities of hand gesture recognition have been explored over the last three decades, the field has recently attained renewed momentum thanks to the availability of hardware and deep learning algorithms. In this paper, we evaluate the effectiveness of a low-cost dataglove for classifying hand gestures in the light of deep learning. We have developed a cost-effective dataglove using five flex sensors, an inertial measurement unit, and a powerful microcontroller for onboard processing and wireless connectivity. We have collected data from 25 subjects for 24 static and 16 dynamic American Sign Language gestures to validate our system. Moreover, we propose a novel Spatial Projection Image-based technique for dynamic hand gesture recognition and explore a parallel-path neural network architecture for handling multimodal data more effectively. Our method produced an F1-score of 82.19% for static gestures and 97.35% for dynamic gestures under a leave-one-out cross-validation approach. Overall, this study demonstrates the promising performance of a generalized hand gesture recognition technique. The dataset used in this work has been made publicly available.
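The leave-one-out protocol used for validation can be sketched with scikit-learn's LeaveOneGroupOut, treating each subject as a group. The data, classifier, and feature dimensions below are synthetic placeholders, not the published glove dataset.

```python
# Sketch of a leave-one-subject-out evaluation protocol with a macro
# F1-score, using synthetic data in place of the glove recordings.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(0)
X = rng.normal(size=(250, 11))            # 5 flex + 6 IMU features (assumed)
y = rng.integers(0, 24, 250)              # 24 static gesture labels
subjects = np.repeat(np.arange(25), 10)   # 25 subjects, 10 samples each

scores = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=subjects):
    clf = RandomForestClassifier(random_state=0).fit(X[train_idx], y[train_idx])
    pred = clf.predict(X[test_idx])
    scores.append(f1_score(y[test_idx], pred, average="macro", zero_division=0))

print(f"mean macro F1 over held-out subjects: {np.mean(scores):.3f}")
```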
Affiliation(s)
- Md. Ahasan Atick Faisal
- Department of Electrical and Electronic Engineering, University of Dhaka, Dhaka 1000, Bangladesh
- Farhan Fuad Abir
- Department of Electrical and Electronic Engineering, University of Dhaka, Dhaka 1000, Bangladesh
- Mosabber Uddin Ahmed
- Department of Electrical and Electronic Engineering, University of Dhaka, Dhaka 1000, Bangladesh
- Md Atiqur Rahman Ahad
- Department of Computer Science and Digital Technologies, University of East London, London, UK
10. Dwivedi A, Groll H, Beckerle P. A Systematic Review of Sensor Fusion Methods Using Peripheral Bio-Signals for Human Intention Decoding. Sensors (Basel) 2022; 22:6319. [PMID: 36080778] [PMCID: PMC9460678] [DOI: 10.3390/s22176319]
Abstract
Humans learn about the environment by interacting with it. With the increasing use of computer and virtual applications as well as robotic and prosthetic devices, there is a need for intuitive interfaces that allow the user to have an embodied interaction with the devices they are controlling. Muscle-machine interfaces (MuMIs) can provide an intuitive solution by decoding human intentions from myoelectric activations. Several different methods can be utilized to develop MuMIs, such as electromyography, ultrasonography, mechanomyography, and near-infrared spectroscopy. In this paper, we analyze the advantages and disadvantages of different myography methods by reviewing myography fusion methods. In a systematic review following the PRISMA guidelines, we identify and analyze studies that employ the fusion of different sensors and myography techniques, while also considering interface wearability. We also explore the properties of different fusion techniques in decoding user intentions. The fusion of electromyography, ultrasonography, mechanomyography, and near-infrared spectroscopy, as well as other sensing methods such as inertial measurement units and optical sensing, has been of continuous interest over the last decade, with the main focus on decoding user intention for the upper limb. From the systematic review, it can be concluded that the fusion of two or more myography methods leads to better performance in decoding a user's intention. Furthermore, promising sensor fusion techniques for different applications were identified based on the existing literature.
Affiliation(s)
- Anany Dwivedi
- Chair of Autonomous Systems and Mechatronics, Department of Electrical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91052 Erlangen, Germany
- Helen Groll
- Chair of Autonomous Systems and Mechatronics, Department of Electrical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91052 Erlangen, Germany
- Philipp Beckerle
- Chair of Autonomous Systems and Mechatronics, Department of Electrical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91052 Erlangen, Germany
- Department of Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91052 Erlangen, Germany
11. Zhou Z, Tam VWL, Lam EY. A Portable Sign Language Collection and Translation Platform with Smart Watches Using a BLSTM-Based Multi-Feature Framework. Micromachines (Basel) 2022; 13:333. [PMID: 35208457] [PMCID: PMC8877205] [DOI: 10.3390/mi13020333]
Abstract
Continuous sign language recognition (CSLR) using different types of sensors to precisely recognize sign language in real time is a very challenging but important research direction in sensor technology. Many previous methods are vision-based, with computationally intensive algorithms processing large numbers of image/video frames possibly contaminated with noise, which can result in a large translation delay. Gesture-based CSLR relying on hand movement data captured by wearable devices, by contrast, may require fewer computational resources and less translation time, making it more efficient for instant translation during real-world communication. However, the limited amount of information provided by wearable sensors often affects the overall performance of such systems. To tackle this issue, we propose a bidirectional long short-term memory (BLSTM)-based multi-feature framework for conducting gesture-based CSLR precisely with two smart watches. In this framework, multiple sets of input features are extracted from the collected gesture data to provide a diverse spectrum of valuable information to the underlying BLSTM model. To demonstrate the effectiveness of the proposed framework, we test it on an extremely challenging and radically new dataset of Hong Kong Sign Language (HKSL), in which hand movement data are collected from 6 individual signers for 50 different sentences. The experimental results reveal that the proposed framework attains a much lower word error rate than other existing machine learning or deep learning approaches for gesture-based CSLR. Based on this framework, we further propose a portable sign language collection and translation platform, which simplifies the collection of gesture-based sign language datasets and recognizes sign language from smart watch data in real time, helping to break the communication barrier for sign language users.
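The word error rate reported above is the standard Levenshtein-distance-based metric; a minimal reference implementation is sketched below.

```python
# Word error rate (WER), the metric used to evaluate CSLR systems:
# Levenshtein distance between word sequences over the reference length.
def wer(reference: list[str], hypothesis: list[str]) -> float:
    r, h = reference, hypothesis
    # d[i][j] = edit distance between first i reference and j hypothesis words
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / len(r)

print(wer("i want to drink water".split(), "i want drink water".split()))  # 0.2
```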
12. Yu M, Jia J, Xue C, Yan G, Guo Y, Liu Y. A review of sign language recognition research. J Intell Fuzzy Syst 2022. [DOI: 10.3233/jifs-210050]
Abstract
Sign language is the primary way of communication between hard-of-hearing and hearing people, and sign language recognition helps promote the better integration of deaf and hard-of-hearing people into society. We reviewed 95 studies on sign language recognition technology published from 1993 to 2021, analyzing and comparing algorithms across three levels: gesture, isolated word, and continuous sentence recognition. We also traced the evolution of sign language acquisition equipment and summarized the datasets and evaluation criteria used in sign language recognition research. Finally, the main technology trends are discussed and future challenges are analyzed.
Affiliation(s)
- Ming Yu
- Department of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin, China
- Jingli Jia
- Department of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin, China
- Cuihong Xue
- Technical College for Deaf, Tianjin University of Technology, Tianjin, China
- Gang Yan
- Department of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin, China
- Yingchun Guo
- Department of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin, China
- Yuehao Liu
- Department of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin, China
13. Cornerstone network with feature extractor: a metric-based few-shot model for Chinese natural sign language. Appl Intell 2021. [DOI: 10.1007/s10489-020-02170-9]
14. Wei W, Hong H, Wu X. A Hierarchical View Pooling Network for Multichannel Surface Electromyography-Based Gesture Recognition. Comput Intell Neurosci 2021; 2021:6591035. [PMID: 34484323] [PMCID: PMC8413066] [DOI: 10.1155/2021/6591035]
Abstract
Hand gesture recognition based on surface electromyography (sEMG) plays an important role in the field of biomedical and rehabilitation engineering. Recently, there has been remarkable progress in gesture recognition using high-density surface electromyography (HD-sEMG) recorded by sensor arrays. On the other hand, robust gesture recognition using multichannel sEMG recorded by sparsely placed sensors remains a major challenge. In the context of multiview deep learning, this paper presents a hierarchical view pooling network (HVPN) framework, which improves multichannel sEMG-based gesture recognition by learning not only view-specific deep features but also view-shared deep features from hierarchically pooled multiview feature spaces. Extensive intrasubject and intersubject evaluations were conducted on the large-scale noninvasive adaptive prosthetics (NinaPro) database to comprehensively evaluate the proposed HVPN framework. Results showed that when using 200 ms sliding windows to segment data, the proposed framework achieved intrasubject gesture recognition accuracies of 88.4%, 85.8%, 68.2%, 72.9%, and 90.3% and intersubject gesture recognition accuracies of 84.9%, 82.0%, 65.6%, 70.2%, and 88.9% on the first five subdatabases of NinaPro, respectively, outperforming state-of-the-art methods.
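The 200 ms sliding-window segmentation mentioned in the results can be sketched as follows; the step size and sampling rate are assumptions, as the abstract does not specify them.

```python
# Sketch of segmenting a multichannel sEMG recording into 200 ms sliding
# windows; the 100 ms step and 2 kHz sampling rate are assumptions.
import numpy as np

def sliding_windows(emg, fs=2000, win_ms=200, step_ms=100):
    """emg: (time, channels) array -> (n_windows, win_samples, channels)."""
    win, step = int(fs * win_ms / 1000), int(fs * step_ms / 1000)
    starts = range(0, emg.shape[0] - win + 1, step)
    return np.stack([emg[s:s + win] for s in starts])

emg = np.random.randn(10000, 12)          # 5 s of 12-channel sEMG at 2 kHz
windows = sliding_windows(emg)
print(windows.shape)                      # (49, 400, 12)
```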
Affiliation(s)
- Wentao Wei
- School of Design Arts and Media, Nanjing University of Science and Technology, Nanjing, Jiangsu, China
- Hong Hong
- School of Electronic and Optical Engineering, Nanjing University of Science and Technology, Nanjing, Jiangsu, China
- Xiaoli Wu
- School of Design Arts and Media, Nanjing University of Science and Technology, Nanjing, Jiangsu, China
15. Jiang S, Kang P, Song X, Lo B, Shull P. Emerging Wearable Interfaces and Algorithms for Hand Gesture Recognition: A Survey. IEEE Rev Biomed Eng 2021; 15:85-102. [PMID: 33961564] [DOI: 10.1109/rbme.2021.3078190]
Abstract
Hands are vital in a wide range of fundamental daily activities, and neurological diseases that impede hand function can significantly affect quality of life. Wearable hand gesture interfaces hold promise to restore and assist hand function and to enhance human-human and human-computer communication. The purpose of this review is to synthesize current novel sensing interfaces and algorithms for hand gesture recognition, and the scope of applications covers rehabilitation, prosthesis control, sign language recognition, and human-computer interaction. Results showed that electrical, dynamic, acoustical/vibratory, and optical sensing were the primary input modalities in gesture recognition interfaces. Two categories of algorithms were identified: 1) classification algorithms for predefined, fixed hand poses and 2) regression algorithms for continuous finger and wrist joint angles. Conventional machine learning algorithms, including linear discriminant analysis, support vector machines, random forests, and non-negative matrix factorization, have been widely used for a variety of gesture recognition applications, and deep learning algorithms have more recently been applied to further facilitate the complex relationship between sensor signals and multi-articulated hand postures. Future research should focus on increasing recognition accuracy with larger hand gesture datasets, improving reliability and robustness for daily use outside of the laboratory, and developing softer, less obtrusive interfaces.
16. Rim B, Sung NJ, Min S, Hong M. Deep Learning in Physiological Signal Data: A Survey. Sensors (Basel) 2020; 20:E969. [PMID: 32054042] [PMCID: PMC7071412] [DOI: 10.3390/s20040969]
Abstract
Deep learning (DL), a successful approach for discriminative and generative tasks, has recently proved its high potential in 2D medical image analysis; however, physiological data in the form of 1D signals have yet to fully benefit from this approach for the desired medical tasks. Therefore, in this paper we survey the latest scientific research on deep learning in physiological signal data such as electromyogram (EMG), electrocardiogram (ECG), electroencephalogram (EEG), and electrooculogram (EOG). We found 147 papers published between January 2018 and October 2019 inclusive from various journals and publishers. The objective of this paper is to conduct a detailed study to comprehend, categorize, and compare the key parameters of the deep-learning approaches that have been used in physiological signal analysis for various medical applications. The key parameters we review are the input data type, deep-learning task, deep-learning model, training architecture, and dataset sources; these are the main parameters that affect system performance. We taxonomize the research works using deep-learning methods in physiological signal analysis based on: (1) a physiological signal data perspective, such as data modality and medical application; and (2) a deep-learning concept perspective, such as training architecture and dataset sources.
Affiliation(s)
- Beanbonyka Rim
- Department of Computer Science, Soonchunhyang University, Asan 31538, Korea
- Nak-Jun Sung
- Department of Computer Science, Soonchunhyang University, Asan 31538, Korea
- Sedong Min
- Department of Medical IT Engineering, Soonchunhyang University, Asan 31538, Korea
- Min Hong
- Department of Computer Software Engineering, Soonchunhyang University, Asan 31538, Korea