1
Xie Y, Yan Y, Li Y. The use of artificial intelligence-based Siamese neural network in personalized guidance for sports dance teaching. Sci Rep 2025;15:12112. PMID: 40204919; PMCID: PMC11982401; DOI: 10.1038/s41598-025-96462-0. Open access.
Abstract
This work aims to explore an accurate and effective method for recognizing dance movement features, providing precise personalized guidance for sports dance teaching. First, a human skeletal graph is constructed. A graph convolutional network (GCN) is employed to extract features from the nodes (joints) and edges (bone connections) in the graph structure, capturing both spatial relationships and temporal dynamics between joints. The GCN generates effective motion representations by aggregating the features of each node and its neighboring nodes. A dance movement recognition model combining GCN and a Siamese neural network (SNN) is proposed. The GCN module is responsible for extracting spatial features from the skeletal graph, while the SNN module evaluates the similarity between different skeletal sequences by comparing their features. The SNN employs a twin network structure, where two identical and parameter-sharing feature extraction networks process two input samples and calculate their distance or similarity in a high-dimensional feature space. The model is trained and validated on the COCO dataset. The results show that the proposed GCN-SNN model achieves an accuracy of 96.72% and an F1 score of 86.55%, significantly outperforming other comparison models. This work not only provides an efficient and intelligent personalized guidance method for sports dance teaching but also opens new avenues for the application of artificial intelligence in the education sector.
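As a concrete reading of the GCN + SNN pipeline this abstract describes, the sketch below pairs a small graph-convolutional encoder with a parameter-sharing Siamese comparison and a contrastive loss. The layer sizes, the normalized-adjacency handling, the temporal mean-pooling, and the margin are illustrative assumptions, not the authors' published configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SkeletonGCN(nn.Module):
    """Aggregates each joint's features with its neighbors' via A_hat X W."""
    def __init__(self, num_joints, in_dim, hid_dim, out_dim, adjacency):
        super().__init__()
        # Symmetrically normalized adjacency with self-loops (fixed, not learned).
        a = adjacency + torch.eye(num_joints)
        d = a.sum(dim=1).rsqrt().diag()
        self.register_buffer("a_hat", d @ a @ d)
        self.w1 = nn.Linear(in_dim, hid_dim)
        self.w2 = nn.Linear(hid_dim, out_dim)

    def forward(self, x):                        # x: (batch, frames, joints, in_dim)
        x = F.relu(self.a_hat @ self.w1(x))      # spatial aggregation per frame
        x = F.relu(self.a_hat @ self.w2(x))
        return x.mean(dim=1).flatten(1)          # pool over frames, then flatten joints

class SiameseGCN(nn.Module):
    """Two inputs go through ONE shared encoder; similarity = feature distance."""
    def __init__(self, encoder):
        super().__init__()
        self.encoder = encoder                   # parameter sharing: one module, two calls

    def forward(self, seq_a, seq_b):
        return F.pairwise_distance(self.encoder(seq_a), self.encoder(seq_b))

def contrastive_loss(dist, same, margin=1.0):
    # Pull matching pairs together; push mismatched pairs beyond the margin.
    return (same * dist.pow(2) + (1 - same) * F.relu(margin - dist).pow(2)).mean()

adj = torch.zeros(17, 17)                        # e.g., 17 COCO keypoints; fill with bone edges
model = SiameseGCN(SkeletonGCN(17, 2, 32, 16, adj))
dist = model(torch.randn(4, 30, 17, 2), torch.randn(4, 30, 17, 2))  # 4 pairs, 30 frames
```

A small distance then indicates that a learner's movement closely matches a reference movement, which is where the personalized-guidance signal would come from.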
Affiliation(s)
- Yi Xie
- Xinyu University, Xinyu 338004, Jiangxi Province, China
- Yao Yan
- Beijing Institute of Graphic Communication, Beijing 102600, China
- Yuwei Li
- China Civil Affairs University, Beijing 102600, China
2
Tian H, Gong W, Li W, Qian Y. PASTFNet: a paralleled attention spatio-temporal fusion network for micro-expression recognition. Med Biol Eng Comput 2024;62:1911-1924. PMID: 38413518; DOI: 10.1007/s11517-024-03041-y.
Abstract
Micro-expressions (MEs) play an important role in revealing a person's genuine emotions, which has made micro-expression recognition (MER) a significant research focus in recent years. Most recent work recognizes MEs using the spatial and temporal information of video clips. However, because of their short duration and subtle intensity, capturing the spatio-temporal features of micro-expressions remains challenging. To improve recognition performance, this paper presents a novel paralleled dual-branch attention-based spatio-temporal fusion network (PASTFNet). The spatial branch jointly extracts short- and long-range spatial relationships. Inspired by the composite convolutional neural network (CNN) and long short-term memory (LSTM) architecture for temporal modeling, we propose a novel attention-based multi-scale feature fusion network (AMFNet) to encode the features of sequential frames: by integrating attention with multi-scale feature fusion, it learns more expressive facial-detail features, and a subsequent aggregation block acquires the temporal features. Finally, the features learned by the two branches are fused to accomplish expression recognition. Experiments on two MER datasets (CASME II and SAMM) show that the PASTFNet model achieves promising ME recognition performance compared with other methods.
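For readers who want to see the CNN + LSTM composite with attention-based aggregation in code, here is a minimal sketch in that spirit; the backbone, dimensions, and attention form are assumptions and not the paper's AMFNet.

```python
import torch
import torch.nn as nn

class TemporalBranch(nn.Module):
    def __init__(self, feat_dim=64, hid_dim=128):
        super().__init__()
        self.cnn = nn.Sequential(              # per-frame spatial encoder
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, feat_dim))
        self.lstm = nn.LSTM(feat_dim, hid_dim, batch_first=True)
        self.attn = nn.Linear(hid_dim, 1)      # scores each time step

    def forward(self, clip):                   # clip: (batch, frames, 1, H, W)
        b, t = clip.shape[:2]
        f = self.cnn(clip.flatten(0, 1)).view(b, t, -1)   # frame-wise features
        h, _ = self.lstm(f)                               # temporal modeling
        w = torch.softmax(self.attn(h), dim=1)            # attention over frames
        return (w * h).sum(dim=1)              # attention-weighted aggregation

clip = torch.randn(2, 16, 1, 64, 64)           # 2 clips of 16 grayscale frames
print(TemporalBranch()(clip).shape)            # torch.Size([2, 128])
```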
Affiliation(s)
- Haichen Tian
- School of Information Science and Engineering, Xinjiang University, Urumqi, China
- Weijun Gong
- School of Information Science and Engineering, Xinjiang University, Urumqi, China
- Wei Li
- School of Software, Xinjiang University, Urumqi, China
- Yurong Qian
- School of Information Science and Engineering, Xinjiang University, Urumqi, China
- School of Software, Xinjiang University, Urumqi, China
- Key Laboratory of Signal Detection and Processing in Xinjiang Uygur Autonomous Region, Urumqi, China
3
Pan H, Yang H, Xie L, Wang Z. Multi-scale fusion visual attention network for facial micro-expression recognition. Front Neurosci 2023;17:1216181. PMID: 37575295; PMCID: PMC10412924; DOI: 10.3389/fnins.2023.1216181. Open access.
Abstract
Introduction: Micro-expressions are involuntary facial muscle movements that reveal the genuine emotions a person tries to hide. In response to the challenge of their low intensity, recent studies have attempted to locate localized areas of facial muscle movement; however, this ignores the feature redundancy caused by inaccurate localization of the regions of interest (ROIs).
Methods: This paper proposes a novel multi-scale fusion visual attention network (MFVAN), which learns multi-scale local attention weights to mask regions of redundant features. Specifically, the model extracts multi-scale features of the apex frame in micro-expression video clips with convolutional neural networks. The attention mechanism focuses on the weights of local-region features in the multi-scale feature maps. We then mask redundant regions in the multi-scale features and fuse the local features with high attention weights for micro-expression recognition. Self-supervision and transfer learning reduce the influence of individual identity attributes and increase the robustness of the multi-scale feature maps. Finally, the multi-scale classification loss, the mask loss, and an identity-attribute-removal loss jointly optimize the model.
Results: The proposed MFVAN is evaluated on the SMIC, CASME II, SAMM, and 3DB-Combined datasets and achieves state-of-the-art performance. The experimental results show that focusing on local regions at multiple scales contributes to micro-expression recognition.
Discussion: MFVAN is the first model to combine image generation with visual attention mechanisms to address the combined challenge of individual identity-attribute interference and low-intensity facial muscle movements. The model also reveals the impact of individual attributes on the localization of local ROIs.
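A rough sketch of the multi-scale attention-and-masking idea from the Methods paragraph follows; the pooling scales, the sigmoid attention, and thresholding at the per-scale mean weight are illustrative choices, not the MFVAN specification.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleAttnMask(nn.Module):
    def __init__(self, channels=32, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        self.attn = nn.ModuleList([nn.Conv2d(channels, 1, kernel_size=1) for _ in scales])

    def forward(self, feat):                       # feat: (B, C, H, W) apex-frame features
        fused = []
        for pool, attn in zip(self.scales, self.attn):
            f = F.avg_pool2d(feat, pool)           # one spatial scale
            w = torch.sigmoid(attn(f))             # local attention weights in [0, 1]
            mask = (w >= w.mean(dim=(2, 3), keepdim=True)).float()  # drop low-weight regions
            fused.append((w * mask * f).mean(dim=(2, 3)))           # keep high-attention features
        return torch.cat(fused, dim=1)             # (B, C * num_scales)

print(MultiScaleAttnMask()(torch.randn(2, 32, 28, 28)).shape)       # torch.Size([2, 96])
```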
Affiliation(s)
- Hang Pan
- Department of Computer Science, Changzhi University, Changzhi, China
- Hongling Yang
- Department of Computer Science, Changzhi University, Changzhi, China
- Lun Xie
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
- Zhiliang Wang
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
4
Fu C, Yang W, Chen D, Wei F. AM3F-FlowNet: Attention-Based Multi-Scale Multi-Branch Flow Network. Entropy (Basel) 2023;25:1064. PMID: 37510012; PMCID: PMC10378207; DOI: 10.3390/e25071064.
Abstract
Micro-expressions are the small, brief facial expression changes that humans momentarily show during emotional experiences, and their data annotation is complicated, which leads to the scarcity of micro-expression data. To extract salient and distinguishing features from a limited dataset, we propose an attention-based multi-scale, multi-modal, multi-branch flow network that thoroughly learns the motion information of micro-expressions by exploiting the attention mechanism and the complementary properties of different kinds of optical-flow information. First, we extract optical-flow information (horizontal optical flow, vertical optical flow, and optical strain) from the onset and apex frames of micro-expression videos, and each branch learns one kind of optical-flow information separately. Second, we propose a multi-scale fusion module that extracts richer and more stable feature representations, using spatial attention to focus on locally important information at each scale. Then, we design a multi-optical-flow feature reweighting module that adaptively selects features for each optical flow via channel attention. Finally, to better integrate the information of the three branches and to alleviate the uneven distribution of micro-expression samples, we introduce a logarithmically adjusted prior-knowledge weighting loss. This loss weights the prediction scores of samples from different categories to mitigate the negative impact of category imbalance during classification. The effectiveness of the proposed model is demonstrated through extensive experiments and feature visualization on three benchmark datasets (CASME II, SAMM, and SMIC), and its performance is comparable to that of state-of-the-art methods.
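The "logarithmically adjusted prior-knowledge weighting loss" reads like a logit-adjusted cross-entropy; the sketch below shows that standard formulation under the assumption that it matches the paper's intent (the exact form may differ, and the temperature tau is an added knob).

```python
import torch
import torch.nn.functional as F

def logit_adjusted_ce(logits, targets, class_counts, tau=1.0):
    # Classes with many samples get their scores pushed down by log(prior),
    # so rare classes are not drowned out during training.
    prior = class_counts / class_counts.sum()
    adjusted = logits + tau * torch.log(prior)     # broadcast over the batch
    return F.cross_entropy(adjusted, targets)

logits = torch.randn(8, 3)                         # 8 samples, 3 ME classes
targets = torch.randint(0, 3, (8,))
counts = torch.tensor([250., 90., 30.])            # imbalanced class counts
print(logit_adjusted_ce(logits, targets, counts))
```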
Affiliation(s)
- Chenghao Fu
- School of Information Science and Engineering, Xinjiang University, Urumqi 830017, China
- Wenzhong Yang
- School of Information Science and Engineering, Xinjiang University, Urumqi 830017, China
- Xinjiang Key Laboratory of Multilingual Information Technology, Xinjiang University, Urumqi 830017, China
- Danny Chen
- School of Information Science and Engineering, Xinjiang University, Urumqi 830017, China
- Fuyuan Wei
- School of Information Science and Engineering, Xinjiang University, Urumqi 830017, China
5
Temporal augmented contrastive learning for micro-expression recognition. Pattern Recognit Lett 2023. DOI: 10.1016/j.patrec.2023.02.003.
6
A Survey of Micro-expression Recognition Methods Based on LBP, Optical Flow and Deep Learning. Neural Process Lett 2023. DOI: 10.1007/s11063-022-11123-x.
7
Wei M, Zong Y, Jiang X, Lu C, Liu J. Micro-Expression Recognition Using Uncertainty-Aware Magnification-Robust Networks. Entropy (Basel) 2022;24:1271. PMID: 36141156; PMCID: PMC9498083; DOI: 10.3390/e24091271.
Abstract
A micro-expression (ME) is a kind of involuntary facial expression that commonly occurs with subtle intensity. Accurately recognizing MEs, a.k.a. micro-expression recognition (MER), has a number of potential applications, e.g., interrogation and clinical diagnosis, so the subject has received a high level of attention among researchers in the affective computing and pattern recognition communities. In this paper, we propose a straightforward and effective deep learning method called uncertainty-aware magnification-robust networks (UAMRN) for MER, which addresses two key issues in MER: the low intensity of MEs and the imbalance of ME samples. Specifically, to better distinguish subtle ME movements, we reconstruct a new sequence by magnifying the ME intensity. Furthermore, a sparse self-attention (SSA) block rectifies standard self-attention with locality-sensitive hashing (LSH), suppressing the artefacts generated during magnification. For the class-imbalance problem, we guide network optimization by the confidence of the estimation, so that samples from rare classes are allotted greater uncertainty and thus trained more carefully. Experiments on three public ME databases, i.e., CASME II, SAMM and SMIC-HS, demonstrate improvement over recent state-of-the-art MER methods.
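One plausible reading of the confidence-guided optimization described here is a per-sample loss reweighted by predictive uncertainty; the sketch below uses entropy as the uncertainty measure, which is an assumption rather than the paper's exact rule.

```python
import math
import torch
import torch.nn.functional as F

def uncertainty_weighted_ce(logits, targets):
    probs = torch.softmax(logits, dim=1)
    # Per-sample predictive entropy, normalized so weights fall in [1, 2].
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)
    weights = 1.0 + entropy / math.log(logits.size(1))
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    return (weights.detach() * per_sample).mean()   # weights guide, not backpropagated

print(uncertainty_weighted_ce(torch.randn(8, 3), torch.randint(0, 3, (8,))))
```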
Affiliation(s)
- Mengting Wei
- Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, Nanjing 210096, China
- School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
- Yuan Zong
- Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, Nanjing 210096, China
- School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
- Xingxun Jiang
- Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, Nanjing 210096, China
- School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
- Cheng Lu
- Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, Nanjing 210096, China
- School of Information Science and Engineering, Southeast University, Nanjing 210096, China
- Jiateng Liu
- Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, Nanjing 210096, China
- School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
8
Late Fusion-Based Video Transformer for Facial Micro-Expression Recognition. Appl Sci (Basel) 2022. DOI: 10.3390/app12031169.
Abstract
In this article, we propose a novel model for facial micro-expression (FME) recognition. The proposed model is built around a transformer, an architecture that has recently been adopted in computer vision but had not previously been applied to FME recognition. Because a transformer requires far more training data than a convolutional neural network, we use motion features such as optical flow, together with late fusion, to compensate for the scarcity of FME data. The proposed method was verified and evaluated on the SMIC and CASME II datasets. Our approach achieved state-of-the-art (SOTA) performance of 0.7447 and 73.17% on SMIC in terms of unweighted F1 score (UF1) and accuracy (Acc.), respectively, which are 0.31 and 1.8% higher than the previous SOTA. Furthermore, a UF1 of 0.7106 and an Acc. of 70.68% were obtained on CASME II, which are comparable with the SOTA.
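To illustrate late fusion of motion-feature streams with a transformer encoder, here is a minimal sketch; the patching, model width, depth, and the choice to average logits are assumed values, not the paper's configuration.

```python
import torch
import torch.nn as nn

class StreamTransformer(nn.Module):
    """One input stream (e.g., one optical-flow component) -> class logits."""
    def __init__(self, num_patches=49, patch_dim=64, d_model=128, n_classes=3):
        super().__init__()
        self.proj = nn.Linear(patch_dim, d_model)
        self.pos = nn.Parameter(torch.zeros(1, num_patches, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, patches):                 # patches: (B, num_patches, patch_dim)
        z = self.encoder(self.proj(patches) + self.pos)
        return self.head(z.mean(dim=1))          # mean-pooled tokens -> logits

def late_fuse(streams, models):
    # Late fusion: run each stream through its own model, average the logits.
    return torch.stack([m(s) for m, s in zip(models, streams)]).mean(dim=0)

flows = [torch.randn(2, 49, 64) for _ in range(2)]   # e.g., horizontal + vertical flow
nets = [StreamTransformer() for _ in range(2)]
print(late_fuse(flows, nets).shape)                  # torch.Size([2, 3])
```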