1
Chen T, Ma Y, Pan Z, Wang W, Yu J. Fusion of multi-scale feature extraction and adaptive multi-channel graph neural network for 12-lead ECG classification. Comput Methods Programs Biomed 2025; 265:108725. [PMID: 40184850 DOI: 10.1016/j.cmpb.2025.108725] [Received: 08/01/2024] [Revised: 03/14/2025] [Accepted: 03/14/2025] [Indexed: 04/07/2025]
Abstract
BACKGROUND AND OBJECTIVE The 12-lead electrocardiography (ECG) is a widely used diagnostic method in clinical practice for cardiovascular diseases. The potential correlation between interlead signals is an important reference for clinical diagnosis but is often overlooked by most deep learning methods. Although graph neural networks can capture the associations between leads through edge topology, the complex correlations inherent in 12-lead ECG may involve edge topology, node features, or their combination. METHODS In this study, we propose a multi-scale adaptive graph fusion network (MSAGFN) model, which fuses multi-scale feature extraction and adaptive multi-channel graph neural network (AMGNN) for 12-lead ECG classification. The proposed MSAGFN model first extracts multi-scale features individually from 12 leads and then utilizes these features as nodes to construct feature graphs and topology graphs. To efficiently capture the most correlated information from the feature graphs and topology graphs, AMGNN iteratively performs a series of graph operations to learn the final graph-level representations for prediction. Moreover, we incorporate consistency and disparity constraints into our model to further refine the learned features. RESULTS Our model was validated on the PTB-XL dataset, achieving an area under the receiver operating characteristic curve score of 0.937, mean accuracy of 0.894, and maximum F1 score of 0.815. These results surpass the corresponding metrics of state-of-the-art methods. Additionally, we conducted ablation studies to further demonstrate the effectiveness of our model. CONCLUSIONS Our study demonstrates that, in 12-lead ECG classification, by constructing topology graphs based on physiological relationships and feature graphs based on lead feature relationships, and effectively integrating them, we can fully explore and utilize the complementary characteristics of the two graph structures. 
By combining these structures, we construct a comprehensive data view, significantly enhancing the feature representation and classification accuracy.
Affiliation(s)
- Teng Chen
  - College of Computer Science & Technology, Qingdao University, Qingdao 266071, PR China.
- Yumei Ma
  - College of Computer Science & Technology, Qingdao University, Qingdao 266071, PR China.
- Zhenkuan Pan
  - College of Computer Science & Technology, Qingdao University, Qingdao 266071, PR China.
- Weining Wang
  - College of Computer Science & Technology, Qingdao University, Qingdao 266071, PR China.
- Jinpeng Yu
  - School of Automation, Qingdao University, Qingdao 266071, PR China.
2
Vos G, Ebrahimpour M, van Eijk L, Sarnyai Z, Rahimi Azghadi M. Stress monitoring using low-cost electroencephalogram devices: A systematic literature review. Int J Med Inform 2025; 198:105859. [PMID: 40056845 DOI: 10.1016/j.ijmedinf.2025.105859] [Received: 02/27/2024] [Revised: 02/27/2025] [Accepted: 03/01/2025] [Indexed: 03/10/2025]
Abstract
INTRODUCTION The use of low-cost, consumer-grade wearable health monitoring devices has become increasingly prevalent in mental health research, including stress studies. While cortisol response magnitude remains the gold standard for stress assessment, an expanding body of research employs low-cost EEG devices as primary tools for recording biomarker data, often combined with wrist and ring-based wearables. However, the technical variability among low-cost EEG devices, particularly in sensor count and placement according to the 10-20 Electrode Placement System, poses challenges for reproducibility in study outcomes. OBJECTIVE This review aims to provide an overview of the growing application of low-cost EEG devices and machine learning techniques for assessing brain function, with a focus on stress detection. It also highlights the strengths and weaknesses of various machine learning methods commonly used in stress research, and evaluates the reproducibility of reported findings along with sensor count and placement importance. METHODS A comprehensive review was conducted of published studies utilizing EEG devices for stress detection and their associated machine learning approaches. Searches were performed across databases including Scopus, Google Scholar, ScienceDirect, Nature, and PubMed, yielding 69 relevant articles for analysis. The selected studies were synthesized into four thematic categories: stress assessment using EEG, low-cost EEG devices, datasets for EEG-based stress measurement, and machine learning techniques for EEG-based stress analysis. For machine learning-focused studies, validation and reproducibility methods were critically assessed. Study quality was evaluated and scored using the IJMEDI checklist. RESULTS The review identified several studies employing low-cost EEG devices to monitor brain activity during stress and relaxation phases, with many reporting high predictive accuracy using various machine learning validation techniques. 
However, only 54% of the studies included health screening prior to experimentation, and 58% were categorized as low-powered due to limited sample sizes. Additionally, few studies validated their results using an independent validation set or cortisol response as a correlating biomarker, and there was no consensus on data pre-processing and sensor placement as key contributors to improving model generalization and accuracy. CONCLUSION Low-cost consumer-grade wearable devices, including EEG and wrist-based monitors, are increasingly utilized in stress-related research, offering promising avenues for non-invasive biomarker monitoring. However, significant gaps remain in standardizing EEG signal processing and sensor placement, both of which are critical for enhancing model generalization and accuracy. Furthermore, the limited use of independent validation sets and cortisol response as correlating biomarkers highlights the need for more robust validation methodologies. Future research should focus on addressing these limitations and establishing consensus on data pre-processing techniques and sensor configurations to improve the reliability and reproducibility of findings in this growing field.
Affiliation(s)
- Gideon Vos
  - College of Science and Engineering, James Cook University, James Cook Dr, Townsville, 4811, QLD, Australia
- Maryam Ebrahimpour
  - College of Science and Engineering, James Cook University, James Cook Dr, Townsville, 4811, QLD, Australia
- Liza van Eijk
  - College of Health Care Sciences, James Cook University, James Cook Dr, Townsville, 4811, QLD, Australia
- Zoltan Sarnyai
  - College of Public Health, Medical, and Vet Sciences, James Cook University, James Cook Dr, Townsville, 4811, QLD, Australia
- Mostafa Rahimi Azghadi
  - College of Science and Engineering, James Cook University, James Cook Dr, Townsville, 4811, QLD, Australia
3
You L, Zhong T, He E, Liu X, Zhong Q. Cross-subject affective analysis based on dynamic brain functional networks. Front Hum Neurosci 2025; 19:1445763. [PMID: 40297263 PMCID: PMC12034672 DOI: 10.3389/fnhum.2025.1445763] [Received: 06/08/2024] [Accepted: 03/18/2025] [Indexed: 04/30/2025] Open
Abstract
Introduction Emotion recognition is crucial in facilitating human-computer emotional interaction. To enhance the credibility and realism of emotion recognition, researchers have turned to physiological signals, particularly EEG signals, as they directly reflect cerebral cortex activity. However, due to inter-subject variability and non-smoothness of EEG signals, the generalization performance of models across subjects remains a challenge. Methods In this study, we proposed a novel approach that combines time-frequency analysis and brain functional networks to construct dynamic brain functional networks using sliding time windows. This integration of the time, frequency, and spatial domains helps to capture features effectively, reducing inter-individual differences and improving model generalization performance. To construct brain functional networks, we employed mutual information to quantify the correlation between EEG channels and set appropriate thresholds. We then extracted three network attribute features (global efficiency, local efficiency, and local clustering coefficients) to achieve emotion classification based on dynamic brain network features. Results The proposed method was evaluated on the DEAP dataset through subject-dependent (trial-independent), subject-independent, and subject- and trial-independent experiments along both valence and arousal dimensions. The results demonstrate that our dynamic brain functional network outperforms the static brain functional network in all three experimental cases. High classification accuracies of 90.89% and 91.17% in the valence and arousal dimensions, respectively, were achieved in the subject-independent experiments based on the dynamic brain functional network, leading to significant advancements in EEG-based emotion recognition.
In addition, experiments on individual brain regions showed that the left and right temporal lobes focus on processing individual-specific emotional information, whereas the remaining brain regions process basic emotional information.
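The pipeline described in this abstract (sliding windows, mutual-information adjacency, thresholding, network attributes) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the histogram-based mutual information estimator, the bin count, the threshold value, and all function names are assumptions, and only global efficiency is computed (local efficiency and clustering coefficients are omitted).

```python
import math
from collections import deque

def mutual_information(x, y, bins=4):
    """Histogram estimate of mutual information (nats) between two equal-length series.
    The equal-width binning is an assumption; the abstract does not name an estimator."""
    def digitize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # constant series fall into a single bin
        return [min(int((v - lo) / width), bins - 1) for v in values]
    bx, by = digitize(x), digitize(y)
    n = len(x)
    joint, px, py = {}, {}, {}
    for a, b in zip(bx, by):
        joint[(a, b)] = joint.get((a, b), 0) + 1
        px[a] = px.get(a, 0) + 1
        py[b] = py.get(b, 0) + 1
    # sum over observed cells of p(a,b) * log(p(a,b) / (p(a) p(b)))
    return sum((c / n) * math.log(c * n / (px[a] * py[b])) for (a, b), c in joint.items())

def adjacency(window, threshold):
    """Binary adjacency: connect two channels when their mutual information reaches the threshold."""
    n = len(window)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if mutual_information(window[i], window[j]) >= threshold:
                adj[i][j] = adj[j][i] = 1
    return adj

def global_efficiency(adj):
    """Mean inverse shortest-path length over ordered node pairs; unreachable pairs contribute 0."""
    n = len(adj)
    total = 0.0
    for src in range(n):
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search on the unweighted graph
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for node, d in dist.items() if node != src)
    return total / (n * (n - 1))

def dynamic_features(channels, win, step, threshold):
    """One global-efficiency value per sliding window over multi-channel data."""
    n_samples = len(channels[0])
    feats = []
    for start in range(0, n_samples - win + 1, step):
        window = [ch[start:start + win] for ch in channels]
        feats.append(global_efficiency(adjacency(window, threshold)))
    return feats

# Toy data: two identical channels and one constant channel (hypothetical values).
channels = [[0, 1, 2, 3] * 2, [0, 1, 2, 3] * 2, [5] * 8]
features = dynamic_features(channels, win=4, step=2, threshold=0.5)
```

In this toy case only the two identical channels are connected, so each window yields the same efficiency value; with real EEG the sequence of per-window values is what would make the network "dynamic".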
Affiliation(s)
- Lifeng You
  - School of Physics, South China Normal University, Guangzhou, China
- Tianyu Zhong
  - School of Social Sciences, Nanyang Technological University, Singapore, Singapore
- Erheng He
  - School of Physics, South China Normal University, Guangzhou, China
- Xuejie Liu
  - School of Electronic Science and Engineering (School of Microelectronics), South China Normal University, Foshan, China
- Qinghua Zhong
  - School of Electronic Science and Engineering (School of Microelectronics), South China Normal University, Foshan, China
4
Zhong L, Xu M, Li J, Bai Z, Ji H, Liu L, Jin L. From Micro to Meso: A Data-Driven Mesoscopic Region Division Method Based on Functional Connectivity for EEG-Based Driver Fatigue Detection. IEEE J Biomed Health Inform 2025; 29:2603-2616. [PMID: 40030270 DOI: 10.1109/jbhi.2024.3504847] [Indexed: 04/05/2025]
Abstract
The integration of EEG signals and deep learning methods is emerging as an effective approach for brain fatigue detection, particularly utilizing Graph Neural Networks (GNNs) that excel in capturing complex electrode relationships. A significant challenge within GNNs is the construction of an effective adjacency matrix that enhances spatial information learning. Concurrently, electrode aggregation in EEG has emerged as a pivotal area of research. However, conventional partitioning methods depend on task-specific prior knowledge, limiting their generalizability across diverse tasks. To address this issue, we propose a novel mesoscopic region division approach for EEG-based driver fatigue detection, leveraging inherent data characteristics and a functional connectivity-based GNN. This method adopts a two-stage approach: initially, micro-electrodes exhibiting similar functional connectivity relationships are grouped into a "mesoscopic region"; subsequently, all micro-electrodes in the same group are aggregated into virtual meso-electrodes, and the fatigue state is classified based on the functional connectivity between them. Applied to a public driver fatigue detection dataset, our approach surpasses existing state-of-the-art methods in performance. Additionally, interpretive analysis provides micro and mesoscopic insights into brain regions and neuronal connections associated with alert and fatigued states.
5
Imtiaz MN, Khan N. Enhanced cross-dataset electroencephalogram-based emotion recognition using unsupervised domain adaptation. Comput Biol Med 2025; 184:109394. [PMID: 39549531 DOI: 10.1016/j.compbiomed.2024.109394] [Received: 08/08/2024] [Revised: 10/07/2024] [Accepted: 11/07/2024] [Indexed: 11/18/2024]
Abstract
Emotion recognition holds great promise in healthcare and in the development of affect-sensitive systems such as brain-computer interfaces (BCIs). However, the high cost of labeled data and significant differences in electroencephalogram (EEG) signals among individuals limit the cross-domain application of EEG-based emotion recognition models. Addressing cross-dataset scenarios poses greater challenges due to changes in subject demographics, recording devices, and stimuli presented. To tackle these challenges, we propose an improved method for classifying EEG-based emotions across domains with different distributions. We propose a Gradual Proximity-guided Target Data Selection (GPTDS) technique, which gradually selects reliable target domain samples for training based on their proximity to the source clusters and the model's confidence in predicting them. This approach avoids negative transfer caused by diverse and unreliable samples. Additionally, we introduce a cost-effective test-time augmentation (TTA) technique named Prediction Confidence-aware Test-Time Augmentation (PC-TTA). Traditional TTA methods often face substantial computational burden, limiting their practical utility. By applying TTA only when necessary, based on the model's predictive confidence, our approach improves the model's performance during inference while minimizing computational costs compared to traditional TTA approaches. Experiments on the DEAP and SEED datasets demonstrate that our method outperforms state-of-the-art approaches, achieving accuracies of 67.44% when trained on DEAP and tested on SEED, and 59.68% vice versa, with improvements of 7.09% and 6.07% over the baseline. It excels in detecting both positive and negative emotions, highlighting its effectiveness for practical emotion recognition in healthcare applications. Moreover, our proposed PC-TTA technique reduces computational time by a factor of 15 compared to traditional full TTA approaches.
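The idea of applying test-time augmentation only when the model is unsure can be sketched as below. This is an illustrative sketch under stated assumptions, not the PC-TTA method itself: the abstract does not specify the confidence criterion, the augmentations, or the aggregation rule, so the maximum-probability threshold, the scaling augmentations, the plain averaging, and the `toy_model` stand-in are all hypothetical.

```python
import math

def toy_model(sample):
    """Hypothetical stand-in classifier returning [p_positive, p_negative]."""
    score = sum(sample) / len(sample)
    p = 1.0 / (1.0 + math.exp(-score))
    return [p, 1.0 - p]

def predict_with_confidence_aware_tta(model, sample, augmentations, threshold=0.9):
    """Return (probabilities, used_tta).

    A single forward pass is used when the top probability reaches the threshold
    (an assumed confidence criterion). Otherwise the prediction is averaged over
    the original sample and its augmented views.
    """
    probs = model(sample)
    if max(probs) >= threshold:
        return probs, False  # confident: skip the extra forward passes
    views = [probs] + [model(aug(sample)) for aug in augmentations]
    averaged = [sum(col) / len(views) for col in zip(*views)]
    return averaged, True

# Hypothetical augmentations: mild amplitude scaling.
augmentations = [
    lambda s: [1.1 * v for v in s],
    lambda s: [0.9 * v for v in s],
]

easy = predict_with_confidence_aware_tta(toy_model, [5.0, 5.0, 5.0], augmentations)
hard = predict_with_confidence_aware_tta(toy_model, [0.2, 0.2], augmentations)
```

The computational saving comes from the early return: confident samples cost one forward pass, while only uncertain samples pay for one pass per augmentation.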
Affiliation(s)
- Md Niaz Imtiaz
  - Department of Electrical, Computer and Biomedical Engineering, Toronto Metropolitan University, 350 Victoria St, Toronto, ON M5B 2K3, Canada.
- Naimul Khan
  - Department of Electrical, Computer and Biomedical Engineering, Toronto Metropolitan University, 350 Victoria St, Toronto, ON M5B 2K3, Canada.
6
Garcia-Moreno FM, Badenes-Sastre M, Expósito F, Rodriguez-Fortiz MJ, Bermudez-Edo M. EEG headbands vs caps: How many electrodes do I need to detect emotions? The case of the MUSE headband. Comput Biol Med 2025; 184:109463. [PMID: 39608032 DOI: 10.1016/j.compbiomed.2024.109463] [Received: 02/12/2024] [Revised: 11/19/2024] [Accepted: 11/20/2024] [Indexed: 11/30/2024]
Abstract
BACKGROUND In the realm of emotion detection, comfort and portability play crucial roles in enhancing user experiences. However, few works study the reduction in the number of electrodes used to detect emotions, and none of them compare the location of these electrodes with a commercial low-cost headband. METHODS This work explores the potential of wearable EEG devices, specifically the Muse S headband, for emotion classification in terms of valence and arousal. We conducted a direct comparison between the Muse S, with only four electrodes, and the DEAP dataset, which was recorded with 32 electrodes in a more intrusive headset. DEAP is a benchmark dataset built from emotions elicited by music. Our methodology focused on utilizing raw data and extracting four common frequency ranges. In particular, we selected from DEAP the 4 electrodes that are similar to those in the Muse S. Additionally, we created a dataset using the Muse S, where we segmented the complete video into fixed-size temporal windows. Our 4-electrode dataset uses film clips to elicit emotions, classified according to the Self-Assessment Manikin. RESULTS Our findings indicate that the Muse S, despite its limited electrode count, can effectively discriminate between high and low valence/arousal emotions with accuracy comparable to that obtained with all the DEAP electrodes. The Gamma band emerged as particularly effective for valence detection. Using a Muse device and raw data, the best performance achieved a G-Mean only 1-2% lower than that of the DEAP dataset, demonstrating that comparable results can be obtained with a simplified setup. CONCLUSIONS While the Muse S did not reach DEAP in terms of outcomes, it proved to be a viable, lower-cost, less intrusive alternative that is adaptable for everyday use. The dataset created for this study is publicly available at https://doi.org/10.5281/zenodo.8431451.
Affiliation(s)
- Francisco M Garcia-Moreno
  - Department of Software Engineering, Computer Science School, University of Granada, Granada, Spain; Research Centre for Information and Communication Technologies (CITIC-UGR), University of Granada, Granada, Spain
- Maria Jose Rodriguez-Fortiz
  - Department of Software Engineering, Computer Science School, University of Granada, Granada, Spain; Research Centre for Information and Communication Technologies (CITIC-UGR), University of Granada, Granada, Spain
- Maria Bermudez-Edo
  - Department of Software Engineering, Computer Science School, University of Granada, Granada, Spain; Research Centre for Information and Communication Technologies (CITIC-UGR), University of Granada, Granada, Spain
7
Oka H, Ono K, Panagiotis A. Attention-Based PSO-LSTM for Emotion Estimation Using EEG. Sensors (Basel) 2024; 24:8174. [PMID: 39771907 PMCID: PMC11679865 DOI: 10.3390/s24248174] [Received: 11/12/2024] [Revised: 12/13/2024] [Accepted: 12/18/2024] [Indexed: 01/11/2025]
Abstract
Recent advances in emotion recognition through Artificial Intelligence (AI) have demonstrated potential applications in various fields (e.g., healthcare, advertising, and driving technology), with electroencephalogram (EEG)-based approaches demonstrating superior accuracy compared to facial or vocal methods due to their resistance to intentional manipulation. This study presents a novel approach to enhance EEG-based emotion estimation accuracy by emphasizing temporal features and efficient parameter space exploration. We propose a model combining Long Short-Term Memory (LSTM) with an attention mechanism to highlight temporal features in EEG data while optimizing LSTM parameters through Particle Swarm Optimization (PSO). The attention mechanism assigns weights to LSTM hidden states, and PSO dynamically optimizes the key parameters, including the number of units, batch size, and dropout rate. We evaluate the model's performance on the DEAP and SEED datasets, which serve as benchmark datasets for EEG-based emotion estimation research. For the DEAP dataset, we conduct a four-class classification of combinations of high and low valence and arousal states. For the SEED dataset, we perform a three-class classification of negative, neutral, and positive emotions. The proposed model achieves an accuracy of 0.9409 on the DEAP dataset, surpassing the previous state-of-the-art accuracy of 0.9100 reported by Lin et al. The model attains an accuracy of 0.9732 on the SEED dataset, one of the highest accuracies among the related research. These results demonstrate that integrating the attention mechanism with PSO significantly improves the accuracy of EEG-based emotion estimation, contributing to the advancement of emotion recognition technology.
Affiliation(s)
- Hayato Oka
  - Master’s Program in Information and Computer Science, Doshisha University, Kyoto 610-0394, Japan
- Keiko Ono
  - Department of Intelligent Information Engineering and Sciences, Doshisha University, Kyoto 610-0394, Japan
- Adamidis Panagiotis
  - Department of Information and Electronic Engineering, International Hellenic University, 57001 Thessaloniki, Greece
8
Ranaut A, Khandnor P, Chand T. Identification of autism spectrum disorder using electroencephalography and machine learning: a review. J Neural Eng 2024; 21:061006. [PMID: 39580816 DOI: 10.1088/1741-2552/ad9681] [Received: 05/27/2024] [Accepted: 11/24/2024] [Indexed: 11/26/2024]
Abstract
Autism Spectrum Disorder (ASD) is a neurodevelopmental condition characterized by communication barriers, societal disengagement, and monotonous actions. Traditional diagnostic methods for ASD rely on clinical observations and behavioural assessments, which are time-consuming. In recent years, researchers have focused mainly on the early diagnosis of ASD due to the unavailability of recognised causes and the lack of permanent curative solutions. Electroencephalography (EEG) research in ASD offers insight into the neural dynamics of affected individuals. This comprehensive review examines the unique integration of EEG, machine learning, and statistical analysis for ASD identification, highlighting the promise of an interdisciplinary approach for enhancing diagnostic precision. The comparative analysis of publicly available EEG datasets for ASD, along with local data acquisition methods and their technicalities, is presented in this paper. This study also compares preprocessing techniques, and feature extraction methods, followed by classification models and statistical analysis which are discussed in detail. In addition, it briefly touches upon comparisons with other modalities to contextualize the extensiveness of ASD research. Moreover, by outlining research gaps and future directions, this work aims to catalyse further exploration in the field, with the main goal of facilitating more efficient and effective early identification methods that may be helpful to the lives of ASD individuals.
Affiliation(s)
- Anamika Ranaut
  - Department of Computer Science and Engineering, Punjab Engineering College, Chandigarh, India
- Padmavati Khandnor
  - Department of Computer Science and Engineering, Punjab Engineering College, Chandigarh, India
- Trilok Chand
  - Department of Computer Science and Engineering, Punjab Engineering College, Chandigarh, India
9
Meng M, Xu B, Ma Y, Gao Y, Luo Z. STGAT-CS: spatio-temporal-graph attention network based channel selection for MI-based BCI. Cogn Neurodyn 2024; 18:3663-3678. [PMID: 39712131 PMCID: PMC11655804 DOI: 10.1007/s11571-024-10154-5] [Received: 01/23/2024] [Revised: 06/25/2024] [Accepted: 07/10/2024] [Indexed: 12/24/2024] Open
Abstract
Brain-computer interfaces (BCIs) based on the motor imagery paradigm typically utilize multi-channel electroencephalogram (EEG) recordings to ensure accurate capture of physiological phenomena. However, excessive channels often contain redundant information and noise, which can significantly degrade BCI performance. Although there have been numerous studies on EEG channel selection, most of them require manual feature extraction, and the extracted features struggle to fully represent the useful information in EEG signals. In this paper, we propose a spatio-temporal-graph attention network for channel selection (STGAT-CS) of EEG signals. We consider the EEG channels and their inter-channel connectivity as a graph and treat the channel selection problem as a node classification problem on the graph. We leverage the multi-head attention mechanism of the graph attention network to dynamically capture topological relationships between nodes and update node features accordingly. Additionally, we introduce one-dimensional convolution to automatically extract temporal features from each channel in the original EEG signal, thereby obtaining more comprehensive spatiotemporal characteristics. In the classification tasks of the BCI Competition III Dataset IVa and BCI Competition IV Dataset I, STGAT-CS achieved average accuracies of 91.5% and 85.4%, respectively, demonstrating the effectiveness of the proposed method.
Affiliation(s)
- Ming Meng
  - School of Automation, Hangzhou Dianzi University, Hangzhou, 310018 Zhejiang China
- Bin Xu
  - School of Automation, Hangzhou Dianzi University, Hangzhou, 310018 Zhejiang China
- Yuliang Ma
  - School of Automation, Hangzhou Dianzi University, Hangzhou, 310018 Zhejiang China
- Yunyuan Gao
  - School of Automation, Hangzhou Dianzi University, Hangzhou, 310018 Zhejiang China
- Zhizeng Luo
  - School of Automation, Hangzhou Dianzi University, Hangzhou, 310018 Zhejiang China
10
Hu F, Wang F, Bi J, An Z, Chen C, Qu G, Han S. HASTF: a hybrid attention spatio-temporal feature fusion network for EEG emotion recognition. Front Neurosci 2024; 18:1479570. [PMID: 39469033 PMCID: PMC11513351 DOI: 10.3389/fnins.2024.1479570] [Received: 08/12/2024] [Accepted: 09/30/2024] [Indexed: 10/30/2024] Open
Abstract
Introduction EEG-based emotion recognition has gradually become a new research direction, known as affective Brain-Computer Interface (aBCI), which has huge application potential in human-computer interaction and neuroscience. However, extracting spatio-temporal fusion features from complex EEG signals and building a learning method with high recognition accuracy and strong interpretability remain challenging. Methods In this paper, we propose a hybrid attention spatio-temporal feature fusion network for EEG-based emotion recognition. First, we designed a spatial attention feature extractor capable of merging shallow and deep features to extract spatial information and adaptively select crucial features under different emotional states. Then, a temporal feature extractor based on the multi-head attention mechanism is integrated to perform spatio-temporal feature fusion and achieve emotion recognition. Finally, we visualize the extracted spatial attention features using feature maps, further analyzing key channels corresponding to different emotions and subjects. Results Our method outperforms the current state-of-the-art methods on two public datasets, SEED and DEAP. The recognition accuracies are 99.12% ± 1.25% (SEED), 98.93% ± 1.45% (DEAP-arousal), and 98.57% ± 2.60% (DEAP-valence). We also conduct ablation experiments, using statistical methods to analyze the impact of each module on the final result. The spatial attention features reveal that emotion-related neural patterns indeed exist, which is consistent with conclusions in the field of neurology. Discussion The experimental results show that our method can effectively extract and fuse spatial and temporal information. It has excellent recognition performance and strong robustness, performing stably across different datasets and experimental environments for emotion recognition.
Affiliation(s)
- Fangzhou Hu
  - Faculty of Robot Science and Engineering, Northeastern University, Shenyang, China
- Fei Wang
  - Faculty of Robot Science and Engineering, Northeastern University, Shenyang, China
- Jinying Bi
  - Faculty of Robot Science and Engineering, Northeastern University, Shenyang, China
- Zida An
  - Faculty of Robot Science and Engineering, Northeastern University, Shenyang, China
- Chao Chen
  - Faculty of Robot Science and Engineering, Northeastern University, Shenyang, China
- Gangguo Qu
  - Faculty of Robot Science and Engineering, Northeastern University, Shenyang, China
- Shuai Han
  - Department of Neurosurgery, Shengjing Hospital of China Medical University, Shenyang, China
11
Goshvarpour A, Goshvarpour A. EEG emotion recognition based on an innovative information potential index. Cogn Neurodyn 2024; 18:2177-2191. [PMID: 39555291 PMCID: PMC11564503 DOI: 10.1007/s11571-024-10077-1] [Received: 06/21/2023] [Revised: 01/16/2024] [Accepted: 01/28/2024] [Indexed: 11/19/2024] Open
Abstract
The recent exceptional demand for emotion recognition systems in clinical and non-medical applications has attracted the attention of many researchers. Since the brain is the primary object of understanding emotions and responding to them, electroencephalogram (EEG) signal analysis is one of the most popular approaches in affect classification. Previously, different approaches have been presented to benefit from brain connectivity information. We envisioned analyzing the interactions between brain electrodes with the information potential and providing a new index to quantify the connectivity matrix. The current study proposed a simple measure based on the cross-information potential between pairs of EEG electrodes to characterize emotions. This measure was tested for different EEG frequency bands to realize which EEG waves could be fruitful in recognizing emotions. Support vector machine and k-nearest neighbor (kNN) were implemented to classify four emotion categories based on two-dimensional valence and arousal space. Experimental results on the Database for Emotion Analysis using Physiological signals revealed a maximum accuracy of 90.14%, a sensitivity of 89.71%, and an F-score of 94.57% using kNN. The gamma frequency band obtained the highest recognition rates. Furthermore, low valence-low arousal was classified more effectively than other classes.
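For orientation, the cross-information potential from information-theoretic learning is commonly estimated with Gaussian Parzen windows as $\hat{V}(X,Y) = \frac{1}{NM}\sum_{i}\sum_{j} G_{\sigma\sqrt{2}}(x_i - y_j)$. The sketch below computes that standard estimator for every pair of channels. It is an assumption-laden illustration: the abstract does not define the paper's index, so the kernel width `sigma`, the function names, and the use of the raw estimator as a connectivity entry are hypothetical.

```python
import math

def cross_information_potential(x, y, sigma=1.0):
    """Parzen-window estimate of the cross-information potential of two sample sets.

    Convolving two Gaussian kernels of width sigma gives a Gaussian of variance
    2 * sigma**2, which is the kernel evaluated on every pairwise difference.
    """
    variance = 2.0 * sigma ** 2
    norm = 1.0 / math.sqrt(2.0 * math.pi * variance)
    total = sum(math.exp(-(a - b) ** 2 / (2.0 * variance)) for a in x for b in y)
    return norm * total / (len(x) * len(y))

def connectivity_matrix(channels, sigma=1.0):
    """Symmetric matrix of pairwise cross-information potentials between channels."""
    n = len(channels)
    return [
        [cross_information_potential(channels[i], channels[j], sigma) for j in range(n)]
        for i in range(n)
    ]

# Toy channels (hypothetical values): the first two overlap, the third is far away.
channels = [[0.0, 0.1, -0.1], [0.05, 0.0, -0.05], [10.0, 10.1, 9.9]]
matrix = connectivity_matrix(channels)
```

Channels whose amplitude distributions overlap produce larger entries than channels whose samples lie far apart, which is the property a connectivity index built on this quantity would rely on.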
Affiliation(s)
- Atefeh Goshvarpour
  - Department of Biomedical Engineering, Faculty of Electrical Engineering, Sahand University of Technology, Tabriz, Iran
- Ateke Goshvarpour
  - Department of Biomedical Engineering, Imam Reza International University, Mashhad, Razavi Khorasan Iran
  - Health Technology Research Center, Imam Reza International University, Mashhad, Razavi Khorasan Iran
12
Li C, Wang F, Zhao Z, Wang H, Schuller BW. Attention-Based Temporal Graph Representation Learning for EEG-Based Emotion Recognition. IEEE J Biomed Health Inform 2024; 28:5755-5767. [PMID: 38696290 DOI: 10.1109/jbhi.2024.3395622] [Indexed: 05/04/2024]
Abstract
Due to the objectivity of emotional expression in the central nervous system, EEG-based emotion recognition can effectively reflect humans' internal emotional states. In recent years, convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have made significant strides in extracting local features and temporal dependencies from EEG signals. However, CNNs ignore spatial distribution information from EEG electrodes; moreover, RNNs may encounter issues such as exploding/vanishing gradients and high time consumption. To address these limitations, we propose an attention-based temporal graph representation network (ATGRNet) for EEG-based emotion recognition. First, a hierarchical attention mechanism is introduced to integrate feature representations from both frequency bands and channels ordered by priority in EEG signals. Second, a graph convolutional neural network with a top-k operation is utilized to capture internal relationships between EEG electrodes under different emotion patterns. Third, a residual-based graph readout mechanism is applied to accumulate the node-level EEG feature representations into graph-level representations. Finally, the obtained graph-level representations are fed into a temporal convolutional network (TCN) to extract the temporal dependencies between EEG frames. We evaluated our proposed ATGRNet on the SEED, DEAP and FACED datasets. The experimental findings show that the proposed ATGRNet surpasses the state-of-the-art graph-based methods for EEG-based emotion recognition.
13
An N, Gao Z, Li W, Cao F, Wang W, Xu W, Wang C, Xiang M, Gao Y, Wang D, Yu D, Ning X. Source localization comparison and combination of OPM-MEG and fMRI to detect sensorimotor cortex responses. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 254:108292. [PMID: 38936152 DOI: 10.1016/j.cmpb.2024.108292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 04/29/2024] [Accepted: 06/16/2024] [Indexed: 06/29/2024]
Abstract
BACKGROUND AND OBJECTIVES The exploration of various neuroimaging techniques has become a focal point of neuroscience research. Magnetoencephalography based on optically pumped magnetometers (OPM-MEG) has shown significant potential to be the next generation of functional neuroimaging, with the advantages of high signal intensity and flexible sensor arrangement. In this study, we constructed a 31-channel OPM-MEG system and performed a preliminary comparison of the temporal and spatial relationship between magnetic responses measured by OPM-MEG and blood-oxygen-level-dependent signals detected by functional magnetic resonance imaging (fMRI) during a grasping task. METHODS For OPM-MEG, the β-band (15-30 Hz) oscillatory activities can be reliably detected across multiple subjects and multiple session runs. To effectively localize the inhibitory oscillatory activities, a source power-spectrum ratio-based imaging method was proposed. This approach was compared with conventional source imaging methods, such as minimum norm-type and beamformer methods, and was applied in OPM-MEG source analysis. Subsequently, the spatial and temporal responses at the source level between OPM-MEG and fMRI were analyzed. RESULTS The effectiveness of the proposed method was confirmed through simulations compared to benchmark methods. Our demonstration revealed an average spatial separation of 10.57 ± 4.41 mm between the localization results of OPM-MEG and fMRI across four subjects. Furthermore, the fMRI-constrained OPM-MEG localization results indicated a more focused imaging extent. CONCLUSIONS Taken together, the performance exhibited by OPM-MEG positions it as a potential instrument for functional surgery assessment.
Affiliation(s)
- Nan An
- Key Laboratory of Ultra-Weak Magnetic Field Measurement Technology, Ministry of Education, School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing, 100191, China; Hangzhou Institute of Extremely-weak Magnetic Field Major National Science and Technology Infrastructure, Hangzhou, 310051, China
- Zhenfeng Gao
- Key Laboratory of Ultra-Weak Magnetic Field Measurement Technology, Ministry of Education, School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing, 100191, China; Hangzhou Institute of Extremely-weak Magnetic Field Major National Science and Technology Infrastructure, Hangzhou, 310051, China; Zhejiang Provincial Key Laboratory of Ultra-Weak Magnetic-Field Space and Applied Technology, Hangzhou Innovation Institute, Beihang University, Hangzhou, 310051, China
- Wen Li
- Key Laboratory of Ultra-Weak Magnetic Field Measurement Technology, Ministry of Education, School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing, 100191, China; Hangzhou Institute of Extremely-weak Magnetic Field Major National Science and Technology Infrastructure, Hangzhou, 310051, China; Zhejiang Provincial Key Laboratory of Ultra-Weak Magnetic-Field Space and Applied Technology, Hangzhou Innovation Institute, Beihang University, Hangzhou, 310051, China
- Fuzhi Cao
- Key Laboratory of Ultra-Weak Magnetic Field Measurement Technology, Ministry of Education, School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing, 100191, China; Hangzhou Institute of Extremely-weak Magnetic Field Major National Science and Technology Infrastructure, Hangzhou, 310051, China; School of Engineering Medicine, Beihang University, Beijing, 100191, China; Zhejiang Provincial Key Laboratory of Ultra-Weak Magnetic-Field Space and Applied Technology, Hangzhou Innovation Institute, Beihang University, Hangzhou, 310051, China.
- Wenli Wang
- Key Laboratory of Ultra-Weak Magnetic Field Measurement Technology, Ministry of Education, School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing, 100191, China; Hangzhou Institute of Extremely-weak Magnetic Field Major National Science and Technology Infrastructure, Hangzhou, 310051, China; Zhejiang Provincial Key Laboratory of Ultra-Weak Magnetic-Field Space and Applied Technology, Hangzhou Innovation Institute, Beihang University, Hangzhou, 310051, China
- Weinan Xu
- Key Laboratory of Ultra-Weak Magnetic Field Measurement Technology, Ministry of Education, School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing, 100191, China; Hangzhou Institute of Extremely-weak Magnetic Field Major National Science and Technology Infrastructure, Hangzhou, 310051, China; Zhejiang Provincial Key Laboratory of Ultra-Weak Magnetic-Field Space and Applied Technology, Hangzhou Innovation Institute, Beihang University, Hangzhou, 310051, China
- Chunhui Wang
- Key Laboratory of Ultra-Weak Magnetic Field Measurement Technology, Ministry of Education, School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing, 100191, China; Hangzhou Institute of Extremely-weak Magnetic Field Major National Science and Technology Infrastructure, Hangzhou, 310051, China; Zhejiang Provincial Key Laboratory of Ultra-Weak Magnetic-Field Space and Applied Technology, Hangzhou Innovation Institute, Beihang University, Hangzhou, 310051, China
- Min Xiang
- Key Laboratory of Ultra-Weak Magnetic Field Measurement Technology, Ministry of Education, School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing, 100191, China; Hangzhou Institute of Extremely-weak Magnetic Field Major National Science and Technology Infrastructure, Hangzhou, 310051, China; Zhejiang Provincial Key Laboratory of Ultra-Weak Magnetic-Field Space and Applied Technology, Hangzhou Innovation Institute, Beihang University, Hangzhou, 310051, China; Hefei National Laboratory, Hefei, 230088, China
- Yang Gao
- Key Laboratory of Ultra-Weak Magnetic Field Measurement Technology, Ministry of Education, School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing, 100191, China; Hangzhou Institute of Extremely-weak Magnetic Field Major National Science and Technology Infrastructure, Hangzhou, 310051, China
- Dawei Wang
- Shandong Key Laboratory: Magnetic Field-free Medicine & Functional Imaging, Qilu hospital of Shandong University, Jinan, 250014, China
- Dexin Yu
- Shandong Key Laboratory: Magnetic Field-free Medicine & Functional Imaging, Qilu hospital of Shandong University, Jinan, 250014, China
- Xiaolin Ning
- Key Laboratory of Ultra-Weak Magnetic Field Measurement Technology, Ministry of Education, School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing, 100191, China; Hangzhou Institute of Extremely-weak Magnetic Field Major National Science and Technology Infrastructure, Hangzhou, 310051, China; Zhejiang Provincial Key Laboratory of Ultra-Weak Magnetic-Field Space and Applied Technology, Hangzhou Innovation Institute, Beihang University, Hangzhou, 310051, China; Hefei National Laboratory, Hefei, 230088, China
14
Yang K, Yao Z, Zhang K, Xu J, Zhu L, Cheng S, Zhang J. Automatically Extracting and Utilizing EEG Channel Importance Based on Graph Convolutional Network for Emotion Recognition. IEEE J Biomed Health Inform 2024; 28:4588-4598. [PMID: 38776202 DOI: 10.1109/jbhi.2024.3404146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2024]
Abstract
Graph convolutional networks (GCNs) based on the brain network have been widely used for EEG emotion recognition. However, most studies train their models directly without considering network dimensionality reduction beforehand. In fact, some nodes and edges carry invalid or even interfering information for the current task, so it is necessary to reduce the network dimension and extract the core network. To address the problem of extracting and utilizing the core network, two models are proposed: a core network extraction model (CWGCN) based on channel weighting and a graph convolutional network, and a graph convolutional network model (CCSR-GCN) based on channel convolution and style-based recalibration for emotion recognition. The CWGCN model automatically extracts the core network and the channel importance parameter in a data-driven manner. The CCSR-GCN model uses the output information of the CWGCN model to identify the emotion state. The experimental results on SEED show that: 1) core network extraction can help improve the performance of the GCN model; 2) the CWGCN and CCSR-GCN models achieve better results than the currently popular methods. The idea and its implementation in this paper provide a novel and successful perspective for the application of GCN in brain network analysis for other specific tasks.
15
Yu H, Xiong X, Zhou J, Qian R, Sha K. CATM: A Multi-Feature-Based Cross-Scale Attentional Convolutional EEG Emotion Recognition Model. SENSORS (BASEL, SWITZERLAND) 2024; 24:4837. [PMID: 39123882 PMCID: PMC11314657 DOI: 10.3390/s24154837] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2024] [Revised: 07/21/2024] [Accepted: 07/23/2024] [Indexed: 08/12/2024]
Abstract
Existing emotion recognition methods fail to make full use of the time-, frequency-, and spatial-domain information in EEG signals, which leads to low EEG emotion classification accuracy. To address this problem, this paper proposes a multi-feature, multi-frequency band-based cross-scale attention convolutional model (CATM). The model is mainly composed of a cross-scale attention module, a frequency-space attention module, a feature transition module, a temporal feature extraction module, and a depth classification module. First, the cross-scale attentional convolution module extracts spatial features at different scales from the preprocessed EEG signals; then, the frequency-space attention module assigns higher weights to important channels and spatial locations; next, the temporal feature extraction module extracts temporal features of the EEG signals; and, finally, the depth classification module categorizes the EEG signals into emotions. We evaluated the proposed method on the DEAP dataset, obtaining accuracies of 99.70% and 99.74% in the valence and arousal binary classification experiments, respectively; the accuracy in the valence-arousal four-class experiment was 97.27%. In addition, considering applications with fewer channels, we also conducted 5-channel experiments, in which the binary classification accuracies for valence and arousal were 97.96% and 98.11%, respectively, and the valence-arousal four-class accuracy was 92.86%. The experimental results show that the proposed method outperforms other recent methods and also achieves better results in few-channel experiments.
Affiliation(s)
- Jianhua Zhou
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; (H.Y.); (X.X.); (R.Q.); (K.S.)
16
Li C, Pun SH, Li JW, Chen F. EEG-based Emotion Recognition using Graph Attention Network with Dual-Branch Attention Module. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2024; 2024:1-4. [PMID: 40038966 DOI: 10.1109/embc53108.2024.10782334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2025]
Abstract
EEG reveals human brain activity related to emotion and has become an important aspect of affective computing. In this study, we developed a novel approach, namely DAM-GAT, which incorporates a dual-branch attention module (DAM) into a graph attention network (GAT) for EEG-based emotion recognition. This method uses the GAT to capture the local features of emotional EEG signals. To enhance the EEG features that are important for emotion recognition, the proposed method also includes a DAM that calculates weights considering both channel and frequency information. Additionally, the relationship between EEG channels is determined using the phase-locking value (PLV) connectivity of the corresponding EEG signals. On the SEED datasets, the proposed approach provided an accuracy of up to 94.63% for emotion recognition, demonstrating its impressive performance compared with other existing methods.
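The phase-locking value connectivity mentioned in the abstract above can be illustrated with a short sketch. It assumes the common definition of PLV as the magnitude of the time-averaged unit phasor of the instantaneous phase difference, with phases taken from an FFT-based analytic signal. The function names and the (channels, samples) layout are assumptions, and any band-pass filtering the paper applies beforehand is not reproduced here.

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal along the last axis (Hilbert-transform construction)."""
    x = np.asarray(x, dtype=float)
    n = x.shape[-1]
    spectrum = np.fft.fft(x, axis=-1)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep the DC term once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist term once
        weights[1:n // 2] = 2.0           # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights, axis=-1)

def plv_matrix(eeg):
    """PLV between every pair of channels for eeg of shape (n_channels, n_samples)."""
    eeg = np.asarray(eeg, dtype=float)
    phase = np.angle(analytic_signal(eeg))
    unit = np.exp(1j * phase)             # unit phasors, one row per channel
    # Entry (i, j) sums exp(i * (phase_i - phase_j)) over time.
    return np.abs(unit @ unit.conj().T) / eeg.shape[-1]
```

The resulting symmetric matrix has values in [0, 1] and could serve as the adjacency structure of a channel graph; how DAM-GAT thresholds or normalizes it is not specified in the abstract.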
17
Samal P, Hashmi MF. An improved empirical mode decomposition method with ensemble classifiers for analysis of multichannel EEG in BCI emotion recognition. Comput Methods Biomech Biomed Engin 2024:1-24. [PMID: 38920119 DOI: 10.1080/10255842.2024.2369257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 06/12/2024] [Indexed: 06/27/2024]
Abstract
Emotion recognition using EEG is difficult because the signals' unstable behavior, caused by the brain's complex neuronal activity, makes it hard to extract the underlying patterns. Therefore, to analyse the signal more efficiently, this article uses a hybrid model based on the IEMD-KW-Ens (Improved Empirical Mode Decomposition-Kruskal Wallis-Ensemble classifiers) technique. An IEMD-based technique is proposed to interpret EEG signals by adding an improved sifting stopping criterion with a median filter to obtain optimally decomposed EEG signals for further processing. A mixture of time, frequency, and non-linear features is extracted to construct the feature vector. Afterward, feature selection is conducted using the KW test to remove insignificant features from the feature set. Emotions in the three-dimensional model are then classified with two kinds of classifier, i.e., machine learning-based RUSBoosted trees and a deep learning-based convolutional neural network (CNN), on the DEAP and DREAMER datasets, and the outcomes are evaluated for the valence, arousal, and dominance classes. The findings demonstrate that the hybrid model can successfully classify emotions in multichannel EEG signals. The decomposition approach is also instructive for improving the model's utility in emotional computing.
Affiliation(s)
- Priyadarsini Samal
- Department of Electronics and Communication Engineering, National Institute of Technology, Warangal, Telangana, India
- Mohammad Farukh Hashmi
- Department of Electronics and Communication Engineering, National Institute of Technology, Warangal, Telangana, India
18
Wang L, Wang S, Jin B, Wei X. GC-STCL: A Granger Causality-Based Spatial-Temporal Contrastive Learning Framework for EEG Emotion Recognition. ENTROPY (BASEL, SWITZERLAND) 2024; 26:540. [PMID: 39056903 PMCID: PMC11275820 DOI: 10.3390/e26070540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2024] [Revised: 06/06/2024] [Accepted: 06/17/2024] [Indexed: 07/28/2024]
Abstract
EEG signals capture information through multi-channel electrodes and hold promising prospects for human emotion recognition. However, the presence of high levels of noise and the diverse nature of EEG signals pose significant challenges, leading to potential overfitting issues that further complicate the extraction of meaningful information. To address this issue, we propose a Granger causality-based spatial-temporal contrastive learning framework, which significantly enhances the ability to capture EEG signal information by modeling rich spatial-temporal relationships. Specifically, in the spatial dimension, we employ a sampling strategy to select positive sample pairs from individuals watching the same video. Subsequently, a Granger causality test is utilized to enhance graph data and construct potential causality for each channel. Finally, a residual graph convolutional neural network is employed to extract features from EEG signals and compute the spatial contrast loss. In the temporal dimension, we first apply a frequency domain noise reduction module for data enhancement on each time series. Then, we introduce the Granger-Former model to capture the time domain representation and calculate the time contrast loss. We conduct extensive experiments on two publicly available emotion recognition datasets (DEAP and SEED), achieving a 1.65% improvement on the DEAP dataset and a 1.55% improvement on the SEED dataset compared to state-of-the-art unsupervised models. Our method outperforms benchmark methods in terms of prediction accuracy as well as interpretability.
Affiliation(s)
- Lei Wang
- School of Software Technology, Dalian University of Technology, Dalian 116024, China;
- Siming Wang
- School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China;
- Bo Jin
- School of Innovation and Entrepreneurship, Dalian University of Technology, Dalian 116024, China
- Xiaopeng Wei
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
19
Zhang X, Cheng X, Liu H. TPRO-NET: an EEG-based emotion recognition method reflecting subtle changes in emotion. Sci Rep 2024; 14:13491. [PMID: 38866813 PMCID: PMC11169376 DOI: 10.1038/s41598-024-62990-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Accepted: 05/23/2024] [Indexed: 06/14/2024]
Abstract
Emotion recognition based on the electroencephalogram (EEG) has been applied in various fields, including human-computer interaction and healthcare. However, for the popular Valence-Arousal-Dominance emotion model, researchers often classify the dimensions into high and low categories, which cannot reflect subtle changes in emotion. Furthermore, there are issues with the design of EEG features and the efficiency of the transformer. To address these issues, we have designed TPRO-NET, a neural network that takes differential entropy and enhanced differential entropy features as input and outputs emotion categories through convolutional layers and improved transformer encoders. For our experiments, we categorized the emotions in the DEAP dataset into 8 classes and those in the DREAMER dataset into 5 classes. On the DEAP and DREAMER datasets, TPRO-NET achieved average accuracy rates of 97.63%/97.47%/97.88% and 98.18%/98.37%/98.40%, respectively, on the Valence/Arousal/Dominance dimensions for the subject-dependent experiments. Compared to other advanced methods, TPRO-NET demonstrates superior performance.
Affiliation(s)
- Xinyi Zhang
- School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China
- Suzhou Institute of Biomedical Engineering and Technology, China Academy of Science, Suzhou, 215163, China
- Xiankai Cheng
- School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China.
- Suzhou Institute of Biomedical Engineering and Technology, China Academy of Science, Suzhou, 215163, China.
- Hui Liu
- Cognitive Systems Lab, University of Bremen, Bremen, Germany.
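The differential entropy input feature named in the TPRO-NET abstract above is commonly computed under a Gaussian assumption, which gives a closed form in the signal variance. The sketch below shows only that standard formula; the function name is an assumption, band-pass filtering is assumed to have been done upstream, and the paper's "enhanced differential entropy" variant is not defined in the abstract and is therefore not implemented.

```python
import numpy as np

def differential_entropy(segment):
    """Differential entropy of a Gaussian-distributed segment, in nats.

    Uses 0.5 * ln(2 * pi * e * variance) along the last axis, so an input of
    shape (n_channels, n_samples) yields one value per channel.
    """
    segment = np.asarray(segment, dtype=float)
    variance = np.var(segment, axis=-1)
    return 0.5 * np.log(2.0 * np.pi * np.e * variance)
```

Because the feature depends only on the variance, doubling a signal's amplitude raises its differential entropy by ln 2, which is one reason the feature is usually computed separately per frequency band.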
20
Wu Z, Guo K, Luo E, Wang T, Wang S, Yang Y, Zhu X, Ding R. Medical long-tailed learning for imbalanced data: Bibliometric analysis. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 247:108106. [PMID: 38452661 DOI: 10.1016/j.cmpb.2024.108106] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 02/15/2023] [Revised: 02/24/2024] [Accepted: 02/26/2024] [Indexed: 03/09/2024]
Abstract
BACKGROUND In the last decade, long-tail learning has become a popular research focus in deep learning applications in medicine. However, no scientometric reports have provided a systematic overview of this scientific field. We utilized bibliometric techniques to identify and analyze the literature on long-tailed learning in deep learning applications in medicine and investigate research trends, core authors, and core journals. We expanded our understanding of the primary components and principal methodologies of long-tail learning research in the medical field. METHODS Web of Science was utilized to collect all articles on long-tailed learning in medicine published until December 2023. The suitability of all retrieved titles and abstracts was evaluated. For bibliometric analysis, all numerical data were extracted. CiteSpace was used to create clustered and visual knowledge graphs based on keywords. RESULTS A total of 579 articles met the evaluation criteria. Over the last decade, the annual number of publications and citation frequency both showed significant growth, following a power-law and exponential trend, respectively. Noteworthy contributors to this field include Husanbir Singh Pannu, Fadi Thabtah, and Talha Mahboob Alam, while leading journals such as IEEE ACCESS, COMPUTERS IN BIOLOGY AND MEDICINE, IEEE TRANSACTIONS ON MEDICAL IMAGING, and COMPUTERIZED MEDICAL IMAGING AND GRAPHICS have emerged as pivotal platforms for disseminating research in this area. The core of long-tailed learning research within the medical domain is encapsulated in six principal themes: deep learning for imbalanced data, model optimization, neural networks in image analysis, data imbalance in health records, CNN in diagnostics and risk assessment, and genetic information in disease mechanisms. CONCLUSION This study summarizes recent advancements in applying long-tail learning to deep learning in medicine through bibliometric analysis and visual knowledge graphs. It explains new trends, sources, core authors, journals, and research hotspots. This field has shown great promise in medical deep learning research, and our findings will provide pertinent and valuable insights for future research and clinical practice.
Affiliation(s)
- Zheng Wu
- School of Information Engineering, Hunan University of Science and Engineering, Yongzhou 425199, China.
- Kehua Guo
- School of Computer Science and Engineering, Central South University, Changsha 410083, China.
- Entao Luo
- School of Information Engineering, Hunan University of Science and Engineering, Yongzhou 425199, China.
- Tian Wang
- BNU-UIC Institute of Artificial Intelligence and Future Networks, Beijing Normal University (BNU Zhuhai), Zhuhai, China.
- Shoujin Wang
- Data Science Institute, University of Technology Sydney, Sydney, Australia.
- Yi Yang
- Department of Computer Science, Northeastern Illinois University, Chicago, IL 60625, USA.
- Xiangyuan Zhu
- School of Computer Science and Engineering, Central South University, Changsha 410083, China.
- Rui Ding
- School of Computer Science and Engineering, Central South University, Changsha 410083, China.
21
Çelebi M, Öztürk S, Kaplan K. An emotion recognition method based on EWT-3D-CNN-BiLSTM-GRU-AT model. Comput Biol Med 2024; 169:107954. [PMID: 38183705 DOI: 10.1016/j.compbiomed.2024.107954] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2023] [Revised: 12/28/2023] [Accepted: 01/01/2024] [Indexed: 01/08/2024]
Abstract
Emotion recognition has become a significant study area in recent years because of its use in brain-machine interaction (BMI). Addressing the robustness problem of emotion classification is one of the most basic ways to improve the quality of emotion recognition systems. One of the two main branches of these approaches extracts features through manual engineering; the other is the artificial intelligence approach, which infers features from EEG data. This study proposes a novel method that considers the characteristic behavior of EEG recordings and is based on the artificial intelligence method. The EEG signal is a noisy signal with a non-stationary and non-linear form. Using the Empirical Wavelet Transform (EWT) signal decomposition method, the signal's frequency components are obtained. Then, frequency-based, linear, and non-linear features are extracted. The resulting features are mapped to the 2-D axis according to the positions of the EEG electrodes. By merging these 2-D images, 3-D images are constructed. In this way, the multichannel frequency content of the EEG recordings and their spatial and temporal relationships are combined. Lastly, a 3-D deep learning framework was constructed, combining a convolutional neural network (CNN), bidirectional long short-term memory (BiLSTM), and a gated recurrent unit (GRU) with self-attention (AT). This model is named EWT-3D-CNN-BiLSTM-GRU-AT. As a result, we have created a framework that combines generated handcrafted features with cascaded state-of-the-art deep learning models. The framework is evaluated on the DEAP recordings using the person-independent approach. The experimental findings demonstrate that the developed model can achieve classification accuracies of 90.57% and 90.59% for the valence and arousal axes, respectively, on the DEAP database. Compared with existing cutting-edge emotion classification models, the proposed framework exhibits superior results for classifying human emotions.
Affiliation(s)
- Muharrem Çelebi
- Electronics and Communication Engineering, Kocaeli University, Kocaeli, 41001, Turkey.
- Sıtkı Öztürk
- Electronics and Communication Engineering, Kocaeli University, Kocaeli, 41001, Turkey.
- Kaplan Kaplan
- Software Engineering, Kocaeli University, Kocaeli, 41001, Turkey.
22
Wu M, Ouyang R, Zhou C, Sun Z, Li F, Li P. A study on the combination of functional connection features and Riemannian manifold in EEG emotion recognition. Front Neurosci 2024; 17:1345770. [PMID: 38287990 PMCID: PMC10823003 DOI: 10.3389/fnins.2023.1345770] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 12/26/2023] [Indexed: 01/31/2024]
Abstract
INTRODUCTION Affective computing is the core that makes human-computer interfaces (HCI) more intelligent, and electroencephalogram (EEG) based emotion recognition is one of its primary research orientations. In the field of brain-computer interfaces, the Riemannian manifold is a highly robust and effective method. However, the requirement that the features be symmetric positive definite (SPD) limits its application. METHODS In the present work, we introduced the Laplace matrix to transform the functional connection features, i.e., phase locking value (PLV), Pearson correlation coefficient (PCC), spectral coherence (COH), and mutual information (MI), into positive semi-definite form, and the max operator to ensure that the transformed features are positive definite. Then an SPD network is employed to extract the deep spatial information, and a fully connected layer is employed to validate the effectiveness of the extracted features. In particular, a decision-layer fusion strategy is utilized to achieve more accurate and stable recognition results, and the differences in classification performance between feature combinations are studied. Moreover, the optimal threshold value applied to the functional connection features is also studied. RESULTS The public emotional dataset SEED is adopted to test the proposed method with a subject-dependent cross-validation strategy. The average accuracies for the four features indicate that PCC outperforms the other three features. The proposed model achieves its best accuracy of 91.05% for the fusion of PLV, PCC, and COH, followed by the fusion of all four features with an accuracy of 90.16%. DISCUSSION The experimental results demonstrate that the optimal thresholds for the four functional connection features remain relatively stable within a fixed interval. In conclusion, the experimental results demonstrate the effectiveness of the proposed method.
Affiliation(s)
- Minchao Wu
- Anhui Province Key Laboratory of Multimodal Cognitive Computation, School of Computer Science and Technology, Anhui University, Hefei, China
- Key Laboratory of Flight Techniques and Flight Safety, Civil Aviation Flight University of China, Guanghan, China
- Rui Ouyang
- Anhui Province Key Laboratory of Multimodal Cognitive Computation, School of Computer Science and Technology, Anhui University, Hefei, China
- Chang Zhou
- Anhui Province Key Laboratory of Multimodal Cognitive Computation, School of Computer Science and Technology, Anhui University, Hefei, China
- Zitong Sun
- Anhui Province Key Laboratory of Multimodal Cognitive Computation, School of Computer Science and Technology, Anhui University, Hefei, China
- Fan Li
- Key Laboratory of Flight Techniques and Flight Safety, Civil Aviation Flight University of China, Guanghan, China
- Ping Li
- Anhui Province Key Laboratory of Multimodal Cognitive Computation, School of Computer Science and Technology, Anhui University, Hefei, China
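The Laplace-matrix step described in the abstract above (turning a functional connectivity matrix into something an SPD network can accept) can be sketched as follows. This is one possible reading, not the authors' implementation: it assumes the graph Laplacian L = D - W of a non-negative symmetric connectivity matrix W, which is positive semi-definite, and it interprets the "max operator" as clamping eigenvalues to a small floor `eps`. The function name, the symmetrization, and the value of `eps` are all assumptions.

```python
import numpy as np

def laplacian_spd(conn, eps=1e-6):
    """Map a connectivity matrix to a symmetric positive definite matrix.

    Assumed procedure: take magnitudes, symmetrize, drop self-connections,
    form the graph Laplacian, then floor its eigenvalues at eps.
    """
    w = np.abs(np.asarray(conn, dtype=float))
    w = 0.5 * (w + w.T)                       # enforce symmetry
    np.fill_diagonal(w, 0.0)                  # no self-loops
    laplacian = np.diag(w.sum(axis=1)) - w    # L = D - W, positive semi-definite
    values, vectors = np.linalg.eigh(laplacian)
    values = np.maximum(values, eps)          # assumed form of the "max operator"
    return (vectors * values) @ vectors.T     # V diag(values) V^T
```

The Laplacian of any graph has at least one zero eigenvalue, so some correction of this kind is needed before the matrix can be treated as a point on the SPD manifold; the thresholding of connectivity values that the abstract also mentions is not shown here.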
| |
|
23
|
Lin K, Zhang L, Cai J, Sun J, Cui W, Liu G. DSE-Mixer: A pure multilayer perceptron network for emotion recognition from EEG feature maps. J Neurosci Methods 2024; 401:110008. [PMID: 37967671 DOI: 10.1016/j.jneumeth.2023.110008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2023] [Revised: 09/20/2023] [Accepted: 11/09/2023] [Indexed: 11/17/2023]
Abstract
BACKGROUND Decoding emotions from brain maps is a challenging task. Convolutional neural networks (CNNs) are commonly used to process EEG feature maps. However, due to its local bias, a CNN cannot efficiently utilize the global spatial information of EEG signals, which limits the accuracy of emotion recognition. NEW METHODS We design the Dual-scale EEG-Mixer (DSE-Mixer) model for EEG feature map processing. Its brain region mixer layer and electrode mixer layer are designed to fuse EEG information at different spatial scales. Within each mixer layer, alternately mixing the rows and columns of the input table enables cross-regional and cross-channel communication of EEG information. In addition, a channel attention mechanism is introduced to adaptively learn the importance of each channel. RESULTS On the DEAP dataset, the DSE-Mixer model achieved a binary classification accuracy of 95.19% for arousal and 95.22% for valence. For the four-class classification across valence and arousal, the accuracies were HVHA: 92.12%, HVLA: 89.77%, LVLA: 93.35%, and LVHA: 92.63%. On the SEED dataset, the average recognition accuracy for the three emotions (positive, negative, and neutral) is 93.69%. COMPARISON WITH EXISTING METHODS In emotion recognition research based on the DEAP and SEED datasets, DSE-Mixer achieved high-ranking performance. Compared to two models commonly used in computer vision, CNN and Vision Transformer (ViT), DSE-Mixer achieved significantly higher classification accuracy while requiring much lower computational complexity. CONCLUSIONS DSE-Mixer provides a novel, compact brain map processing model that demonstrates outstanding performance in emotion recognition.
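The alternating row and column mixing mentioned in this abstract follows the general MLP-Mixer pattern, which can be sketched as below. This is a generic illustration under stated assumptions, not the DSE-Mixer architecture itself: the table shape (4 rows by 6 columns), the hidden width, the random weights, and the names `mlp` and `mixer_block` are all hypothetical, and layer normalization and the channel attention mechanism are omitted for brevity.

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def mlp(x, w1, w2):
    # two-layer perceptron applied along the last axis of x
    return gelu(x @ w1) @ w2

def mixer_block(table, row_weights, col_weights):
    """One mixing step on a (rows, cols) table with residual connections."""
    # Transpose so the MLP acts along the row axis: information flows across rows.
    mixed = table + mlp(table.T, *row_weights).T
    # The MLP now acts along the column axis: information flows across columns.
    return mixed + mlp(mixed, *col_weights)

rng = np.random.default_rng(0)
rows, cols, hidden = 4, 6, 8                   # hypothetical table and hidden sizes
table = rng.standard_normal((rows, cols))
row_weights = (rng.standard_normal((rows, hidden)), rng.standard_normal((hidden, rows)))
col_weights = (rng.standard_normal((cols, hidden)), rng.standard_normal((hidden, cols)))
out = mixer_block(table, row_weights, col_weights)
```

Because each MLP spans a full row or column, every output cell can depend on every input cell after one block, in contrast to the local receptive field of a convolution.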
Affiliation(s)
- Kai Lin
  - College of Instrumentation and Electrical Engineering, Jilin University, Changchun, 130000, Jilin, China.
- Linhang Zhang
  - College of Instrumentation and Electrical Engineering, Jilin University, Changchun, 130000, Jilin, China.
- Jing Cai
  - College of Instrumentation and Electrical Engineering, Jilin University, Changchun, 130000, Jilin, China.
- Jiaqi Sun
  - College of Instrumentation and Electrical Engineering, Jilin University, Changchun, 130000, Jilin, China.
- Wenjie Cui
  - College of Instrumentation and Electrical Engineering, Jilin University, Changchun, 130000, Jilin, China.
- Guangda Liu
  - College of Instrumentation and Electrical Engineering, Jilin University, Changchun, 130000, Jilin, China.
|
24
|
Li JW, Lin D, Che Y, Lv JJ, Chen RJ, Wang LJ, Zeng XX, Ren JC, Zhao HM, Lu X. An innovative EEG-based emotion recognition using a single channel-specific feature from the brain rhythm code method. Front Neurosci 2023; 17:1221512. [PMID: 37547144 PMCID: PMC10397731 DOI: 10.3389/fnins.2023.1221512] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Accepted: 06/30/2023] [Indexed: 08/08/2023] Open
Abstract
Introduction Efficiently recognizing emotions is a critical pursuit in brain-computer interface (BCI) research, as it has many applications in intelligent healthcare services. In this work, an innovative approach inspired by the genetic code in bioinformatics, which utilizes brain rhythm code features consisting of δ, θ, α, β, or γ, is proposed for electroencephalography (EEG)-based emotion recognition. Methods These features are first extracted using the sequencing technique. After evaluating them with four conventional machine learning classifiers, an optimal channel-specific feature that produces the highest accuracy in each emotional case is identified, so that emotion recognition from minimal data is realized. By doing so, the complexity of emotion recognition can be significantly reduced, making it more achievable for practical hardware setups. Results The best classification accuracies achieved for the DEAP and MAHNOB datasets range from 83% to 92%, and for the SEED dataset the best accuracy is 78%. The experimental results are impressive, considering the minimal data employed. Further investigation of the optimal features shows that their representative channels are primarily in the frontal region, and the associated rhythmic characteristics are of multiple kinds. Additionally, individual differences are found, as the optimal feature varies across subjects. Discussion Compared to previous studies, this work provides insights into designing portable devices, as a single electrode is sufficient to achieve satisfactory performance. Consequently, it would advance the understanding of brain rhythms and offers an innovative solution for classifying EEG signals in diverse BCI applications, including emotion recognition.
Affiliation(s)
- Jia Wen Li
  - School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
  - Engineering Research Center of Big Data Application in Private Health Medicine, Fujian Province University, Putian, China
  - Hubei Province Key Laboratory of Intelligent Information Processing and Real-Time Industrial System, Wuhan University of Science and Technology, Wuhan, China
  - Guangxi Key Lab of Multi-Source Information Mining and Security, Guangxi Normal University, Guilin, China
- Di Lin
  - Engineering Research Center of Big Data Application in Private Health Medicine, Fujian Province University, Putian, China
  - New Engineering Industry College, Putian University, Putian, China
- Yan Che
  - Engineering Research Center of Big Data Application in Private Health Medicine, Fujian Province University, Putian, China
  - New Engineering Industry College, Putian University, Putian, China
- Ju Jian Lv
  - School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
- Rong Jun Chen
  - School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
- Lei Jun Wang
  - School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
- Xian Xian Zeng
  - School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
- Jin Chang Ren
  - School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
  - National Subsea Centre, Robert Gordon University, Aberdeen, United Kingdom
- Hui Min Zhao
  - School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
- Xu Lu
  - School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
|
25
|
Shi Y, Li Y, Koike Y. Sparse Logistic Regression-Based EEG Channel Optimization Algorithm for Improved Universality across Participants. Bioengineering (Basel) 2023; 10:664. [PMID: 37370595 DOI: 10.3390/bioengineering10060664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Revised: 05/25/2023] [Accepted: 05/26/2023] [Indexed: 06/29/2023] Open
Abstract
Electroencephalogram (EEG) channel optimization can reduce redundant information and improve EEG decoding accuracy by selecting the most informative channels. This article aims to investigate the universality regarding EEG channel optimization in terms of how well the selected EEG channels can be generalized to different participants. In particular, this study proposes a sparse logistic regression (SLR)-based EEG channel optimization algorithm using a non-zero model parameter ranking method. The proposed channel optimization algorithm was evaluated in both individual analysis and group analysis using the raw EEG data, compared with the conventional channel selection method based on the correlation coefficients (CCS). The experimental results demonstrate that the SLR-based EEG channel optimization algorithm not only filters out most redundant channels (filters 75-96.9% of channels) with a 1.65-5.1% increase in decoding accuracy, but it can also achieve a satisfactory level of decoding accuracy in the group analysis by employing only a few (2-15) common EEG electrodes, even for different participants. The proposed channel optimization algorithm can realize better universality for EEG decoding, which can reduce the burden of EEG data acquisition and enhance the real-world application of EEG-based brain-computer interface (BCI).
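The idea of ranking channels by non-zero model parameters can be sketched as follows. The abstract does not specify how the sparse logistic regression is fitted, so this sketch swaps in a plain L1-penalized logistic regression trained by proximal gradient descent, which is an assumption and may differ from the authors' SLR. The data are synthetic, with one scalar feature per channel, and the names `fit_l1_logistic` and `rank_channels` and all hyperparameters are hypothetical.

```python
import numpy as np

def fit_l1_logistic(x, y, lam=0.05, lr=0.1, n_iter=500):
    """L1-penalized logistic regression via proximal gradient descent."""
    n, d = x.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))   # predicted probabilities
        w = w - lr * (x.T @ (p - y) / n)         # gradient step on the log-loss
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)  # soft-threshold: zeroes small weights
        b = b - lr * np.mean(p - y)              # intercept is not penalized
    return w, b

def rank_channels(w):
    """Return indices of non-zero weights, largest magnitude first."""
    nonzero = np.flatnonzero(w)
    return nonzero[np.argsort(-np.abs(w[nonzero]))]

# Synthetic data: 16 "channels", of which only channels 2 and 5 carry label information.
rng = np.random.default_rng(0)
x = rng.standard_normal((400, 16))
logit = 3.0 * x[:, 2] - 3.0 * x[:, 5]
y = (rng.random(400) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

w, b = fit_l1_logistic(x, y)
ranked = rank_channels(w)
```

Channels whose weights are driven exactly to zero drop out of the ranking, which is how a sparse model performs channel selection and classification in a single fit.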
Affiliation(s)
- Yuxi Shi
  - School of Engineering, Tokyo Institute of Technology, Yokohama 226-8503, Japan
- Yuanhao Li
  - School of Engineering, Tokyo Institute of Technology, Yokohama 226-8503, Japan
- Yasuharu Koike
  - Institute of Innovative Research, Tokyo Institute of Technology, Yokohama 226-8503, Japan
|
26
|
Zhao Y, Zeng H, Zheng H, Wu J, Kong W, Dai G. A bidirectional interaction-based hybrid network architecture for EEG cognitive recognition. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 238:107593. [PMID: 37209578 DOI: 10.1016/j.cmpb.2023.107593] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/10/2023] [Revised: 05/04/2023] [Accepted: 05/08/2023] [Indexed: 05/22/2023]
Abstract
BACKGROUND AND OBJECTIVE Extracting cognitive representation and computational representation information simultaneously from electroencephalography (EEG) data and constructing corresponding information interaction models can effectively improve the recognition capability of brain cognitive status. However, due to the huge gap in the interaction between the two types of information, existing studies have yet to consider the advantages of the interaction of both. METHODS This paper introduces a novel architecture named the bidirectional interaction-based hybrid network (BIHN) for EEG cognitive recognition. BIHN consists of two networks: a cognitive-based network named CogN (e.g., graph convolution network, GCN; capsule network, CapsNet) and a computing-based network named ComN (e.g., EEGNet). CogN is responsible for extracting cognitive representation features from EEG data, while ComN is responsible for extracting computational representation features. Additionally, a bidirectional distillation-based coadaptation (BDC) algorithm is proposed to facilitate information interaction between CogN and ComN to realize the coadaptation of the two networks through bidirectional closed-loop feedback. RESULTS Cross-subject cognitive recognition experiments were performed on the Fatigue-Awake EEG dataset (FAAD, 2-class classification) and SEED dataset (3-class classification), and hybrid network pairs of GCN + EEGNet and CapsNet + EEGNet were verified. The proposed method achieved average accuracies of 78.76% (GCN + EEGNet) and 77.58% (CapsNet + EEGNet) on FAAD and 55.38% (GCN + EEGNet) and 55.10% (CapsNet + EEGNet) on SEED, outperforming the hybrid networks without the bidirectional interaction strategy. CONCLUSIONS Experimental results show that BIHN can achieve superior performance on two EEG datasets and enhance the ability of both CogN and ComN in EEG processing as well as cognitive recognition. We also validated its effectiveness with different hybrid network pairs. 
The proposed method could greatly promote the development of brain-computer collaborative intelligence.
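Bidirectional distillation between two networks can be illustrated with the generic mutual-distillation pattern, in which each network minimizes its own classification loss plus a divergence toward the other network's softened predictions. The abstract does not give the details of the BDC algorithm, so the loss form, the weight `alpha`, the `temperature`, and all function names below are assumptions for illustration only.

```python
import numpy as np

def log_softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)           # subtract the row max for numerical stability
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def cross_entropy(logits, labels):
    return -log_softmax(logits)[np.arange(len(labels)), labels].mean()

def kl_to_peer(student_logits, peer_logits, temperature):
    """KL(peer || student) on temperature-softened class distributions."""
    log_s = log_softmax(student_logits, temperature)
    log_p = log_softmax(peer_logits, temperature)
    return (np.exp(log_p) * (log_p - log_s)).sum(axis=1).mean()

def bidirectional_losses(cog_logits, com_logits, labels, alpha=0.5, temperature=2.0):
    """Each network is pulled toward the labels and toward its peer's predictions."""
    loss_cog = cross_entropy(cog_logits, labels) + alpha * kl_to_peer(cog_logits, com_logits, temperature)
    loss_com = cross_entropy(com_logits, labels) + alpha * kl_to_peer(com_logits, cog_logits, temperature)
    return loss_cog, loss_com

# Synthetic logits for 5 trials and 3 classes from the two hypothetical networks.
rng = np.random.default_rng(0)
cog_logits = rng.standard_normal((5, 3))
com_logits = rng.standard_normal((5, 3))
labels = np.array([0, 2, 1, 1, 0])
loss_cog, loss_com = bidirectional_losses(cog_logits, com_logits, labels)
```

In an actual training loop, the peer's logits would normally be treated as constants when updating each network, so that the two updates form a closed feedback loop rather than a single joint objective.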
Affiliation(s)
- Yue Zhao
  - School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
- Hong Zeng
  - School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China; Key Laboratory of Brain Machine Collaborative Intelligence of Zhejiang Province, Hangzhou 310018, China
- Haohao Zheng
  - School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
- Jing Wu
  - School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
- Wanzeng Kong
  - School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China; Key Laboratory of Brain Machine Collaborative Intelligence of Zhejiang Province, Hangzhou 310018, China
- Guojun Dai
  - School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China; Key Laboratory of Brain Machine Collaborative Intelligence of Zhejiang Province, Hangzhou 310018, China
|