1
Huang Y, Chen Y, Xu S, Wu D, Wu X. Self-Supervised Learning with Adaptive Frequency-Time Attention Transformer for Seizure Prediction and Classification. Brain Sci 2025; 15:382. [PMID: 40309845 PMCID: PMC12025975 DOI: 10.3390/brainsci15040382]
Abstract
BACKGROUND In deep learning-based epilepsy prediction and classification, enhancing the extraction of electroencephalogram (EEG) features is crucial for improving model accuracy. Traditional supervised learning methods rely on large, finely annotated datasets, limiting the feasibility of large-scale training. Recently, self-supervised learning approaches using masking-and-reconstruction strategies have emerged, reducing dependence on labeled data. However, these methods are vulnerable to the inherent noise and signal degradation in EEG data, which diminish the robustness of feature extraction and overall model performance. METHODS We proposed a self-supervised Transformer network enhanced with Adaptive Frequency-Time Attention (AFTA) that learns robust EEG feature representations from unlabeled data within a masking-and-reconstruction framework. Specifically, we pretrained the Transformer network with self-supervised learning and subsequently fine-tuned the pretrained model on downstream tasks such as seizure prediction and classification. To mitigate the impact of the inherent noise in EEG signals and enhance feature extraction, we incorporated AFTA into the Transformer architecture. AFTA contains an Adaptive Frequency Filtering Module (AFFM) that performs adaptive global and local filtering in the frequency domain; this module is combined with temporal attention mechanisms, strengthening the model's self-supervised learning capabilities. RESULTS Our method consistently outperformed state-of-the-art approaches across the TUSZ, TUAB, and TUEV datasets, achieving the highest AUROC (0.891), balanced accuracy (0.8002), weighted F1-score (0.8038), and Cohen's kappa (0.6089). These results validate its robustness, generalization, and effectiveness in seizure detection and classification on diverse EEG datasets.
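The masking-and-reconstruction objective described in this abstract can be illustrated with a minimal sketch. This is not the paper's AFTA Transformer: `mask_and_reconstruct_loss` and its trivial "reconstructor" (visible patches passed through, masked patches left at zero) are hypothetical stand-ins that only show how masked EEG patches are hidden from the model and then scored against the originals.

```python
import numpy as np

def mask_and_reconstruct_loss(eeg, mask_ratio=0.5, rng=None):
    """Toy masking-and-reconstruction objective for self-supervised
    pretraining: zero out a random subset of time patches and score a
    reconstruction against the original with mean squared error.
    eeg: (n_patches, patch_dim) array of EEG patches."""
    rng = np.random.default_rng(rng)
    n_patches = eeg.shape[0]
    n_masked = int(round(mask_ratio * n_patches))
    masked_idx = rng.choice(n_patches, size=n_masked, replace=False)
    visible = eeg.copy()
    visible[masked_idx] = 0.0   # the encoder only sees unmasked patches
    reconstruction = visible    # stand-in for the Transformer's output
    # as in masked autoencoders, the loss is computed on masked patches only
    return float(np.mean((reconstruction[masked_idx] - eeg[masked_idx]) ** 2))
```

In the paper the reconstructor would be the pretrained Transformer itself; here the loss simply measures how much masked signal was lost.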
Affiliation(s)
- Xunyi Wu
- Department of Neurology, Huashan Hospital, Fudan University, Shanghai 200040, China; (Y.H.); (Y.C.); (S.X.); (D.W.)
2
Andronache C, Curǎvale D, Nicolae IE, Neacşu AA, Nicolae G, Ivanovici M. Tackling the possibility of extracting a brain digital fingerprint based on personal hobbies predilection. Front Neurosci 2025; 19:1487175. [PMID: 40143846 PMCID: PMC11937079 DOI: 10.3389/fnins.2025.1487175]
Abstract
In an attempt to create more familiar brain-machine interaction for biometric authentication applications, we investigated the efficiency of using users' personal hobbies, interests, and memory collections. This approach creates a unique and pleasant experience that can later be utilized within an authentication protocol. This paper presents a new EEG dataset recorded while subjects watch images of popular hobbies, pictures with no point of interest, and images of great personal significance. In addition, we propose several applications that can be tackled with our newly collected dataset: our study showcases four tasks and obtains state-of-the-art-level results on all of them, namely emotion classification, category classification, an authorization process, and person identification. Our experiments show great potential for using the EEG response to hobby visualization for people authentication. We present preliminary results for using predilection for personal hobbies, as measured by EEG, to identify people, and we propose a novel authorization-process paradigm using electroencephalograms. Code and dataset are available here.
Affiliation(s)
- Cristina Andronache
- Sigma Laboratory, CAMPUS Institute, National University of Science and Technology Politehnica Bucharest, Bucharest, Romania
- Dan Curǎvale
- Sigma Laboratory, CAMPUS Institute, National University of Science and Technology Politehnica Bucharest, Bucharest, Romania
- Irina E. Nicolae
- Sigma Laboratory, CAMPUS Institute, National University of Science and Technology Politehnica Bucharest, Bucharest, Romania
- Ana A. Neacşu
- Sigma Laboratory, CAMPUS Institute, National University of Science and Technology Politehnica Bucharest, Bucharest, Romania
- Georgian Nicolae
- Sigma Laboratory, CAMPUS Institute, National University of Science and Technology Politehnica Bucharest, Bucharest, Romania
- Mihai Ivanovici
- Faculty of Electrical Engineering and Computer Science, Electronics and Computers Department, Transilvania University, Brasov, Romania
3
Xu M, Jiao J, Chen D, Ding Y, Chen Q, Wu J, Gu P, Pan Y, Peng X, Xiao N, Yang B, Li Q, Guo J. REI-Net: A Reference Electrode Standardization Interpolation Technique Based 3D CNN for Motor Imagery Classification. IEEE J Biomed Health Inform 2025; 29:2136-2147. [PMID: 40030217 DOI: 10.1109/jbhi.2024.3498916]
Abstract
High-quality scalp EEG datasets are extremely valuable for motor imagery (MI) analysis. However, due to differences in electrode size and montage, datasets inevitably suffer channel information loss, posing a significant challenge for MI decoding. A 2D representation that focuses on the time domain may lose the spatial information in EEG, while a 3D representation based on topography may suffer from channel loss and introduce noise through padding. In this paper, we propose a framework called the Reference Electrode Standardization Interpolation Network (REI-Net). Through interpolation of the 3D representation, REI-Net retains the temporal information of 2D scalp EEG while improving the spatial resolution within a given montage. Additionally, to overcome the data variability caused by individual differences, transfer learning is employed to enhance decoding robustness. Our approach achieves promising performance on two widely recognized MI datasets, with accuracies of 77.99% on BCI-C IV-2a and 63.94% on Kaya2018, outperforming state-of-the-art methods with more accurate and robust results.
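As a rough illustration of the general idea behind montage interpolation (reconstructing a missing electrode from its neighbours on the scalp), here is a minimal sketch. The inverse-distance weighting and the function name `interpolate_channel` are our assumptions for illustration, not REI-Net's actual standardization scheme.

```python
import numpy as np

def interpolate_channel(signals, positions, target_pos, k=3):
    """Estimate the signal at a target scalp location from the k
    nearest recorded channels, weighted by inverse distance.
    signals: (n_channels, n_samples); positions: (n_channels, 2)
    2D montage coordinates; target_pos: (2,) location to fill in."""
    d = np.linalg.norm(positions - target_pos, axis=1)
    nearest = np.argsort(d)[:k]          # k closest electrodes
    w = 1.0 / (d[nearest] + 1e-9)        # inverse-distance weights
    w = w / w.sum()                      # normalize to sum to 1
    return w @ signals[nearest]          # weighted average time series
```

A channel missing from one dataset's montage could be filled in this way before stacking recordings into a common 3D representation.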
4
Jia X, Chen J, Liu K, Wang Q, He J. Multimodal depression detection based on an attention graph convolution and transformer. Math Biosci Eng 2025; 22:652-676. [PMID: 40083285 DOI: 10.3934/mbe.2025024]
Abstract
Traditional depression detection methods typically rely on single-modal data, but such approaches are limited by individual differences, noise interference, and emotional fluctuations. To address the low accuracy of single-modal depression detection and the poor fusion of multimodal features from electroencephalogram (EEG) and speech signals, we propose a multimodal depression detection model based on EEG and speech signals, named the multi-head attention GCN_ViT (MHA-GCN_ViT). This approach leverages deep learning techniques, including graph convolutional networks (GCNs) and vision transformers (ViTs), to extract and fuse the frequency-domain features and spatiotemporal characteristics of EEG signals with the frequency-domain features of speech signals. First, a discrete wavelet transform (DWT) was used to extract wavelet features from 29 channels of EEG signals. These features served as node attributes; the Pearson correlation coefficients between channels were then used to construct an adjacency matrix representing the brain network structure, which was fed into a GCN for deep feature learning. A multi-head attention mechanism was introduced to enhance the GCN's capacity to represent brain networks. Using a short-time Fourier transform (STFT), we extracted 2D spectral features of the EEG signals and mel-spectrogram features of the speech signals, both of which were further processed with a ViT to obtain deep features. Finally, the multiple features from the EEG and speech spectrograms were fused at the decision level for depression classification. Five-fold cross-validation on the MODMA dataset yielded an accuracy, precision, recall, and F1-score of 89.03%, 90.16%, 89.04%, and 88.83%, respectively, indicating a significant improvement in multimodal depression detection performance. Furthermore, MHA-GCN_ViT demonstrated robust performance in depression detection and broad applicability, with potential for extension to multimodal detection tasks in other psychological and neurological disorders.
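The graph-construction step this abstract describes (per-channel node features, Pearson correlation between channels as edge weights) is easy to sketch. The absolute value, the optional threshold, and the zeroed diagonal below are our simplifying assumptions; the paper may weight or sparsify the graph differently.

```python
import numpy as np

def build_adjacency(features, threshold=0.0):
    """Build a brain-network adjacency matrix from per-channel features.
    features: (n_channels, n_features) array, one row per EEG channel.
    Edge weight = |Pearson correlation| between channel feature rows."""
    adj = np.corrcoef(features)      # (n_channels, n_channels) correlations
    adj = np.abs(adj)                # keep magnitude of the relationship
    adj[adj < threshold] = 0.0       # optionally sparsify weak edges
    np.fill_diagonal(adj, 0.0)       # no self-loops
    return adj
```

The resulting matrix would then serve as the graph structure consumed by the GCN layers.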
Affiliation(s)
- Xiaowen Jia
- College of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an, Shaanxi, China
- Jingxia Chen
- College of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an, Shaanxi, China
- Kexin Liu
- College of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an, Shaanxi, China
- Qian Wang
- College of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an, Shaanxi, China
- Jialing He
- College of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an, Shaanxi, China
5
Vafaei E, Hosseini M. Transformers in EEG Analysis: A Review of Architectures and Applications in Motor Imagery, Seizure, and Emotion Classification. Sensors (Basel) 2025; 25:1293. [PMID: 40096020 PMCID: PMC11902326 DOI: 10.3390/s25051293]
Abstract
Transformers have rapidly influenced research across various domains. With their superior capability to encode long sequences, they have demonstrated exceptional performance, outperforming existing machine learning methods. There has been a rapid increase in the development of transformer-based models for EEG analysis, and the high volume of recently published papers highlights the need for further studies exploring transformer architectures, key components, and the models employed in EEG studies. This paper explores four major transformer architectures: the Time Series Transformer, the Vision Transformer, the Graph Attention Transformer, and hybrid models, along with their variants in recent EEG analysis. We categorize transformer-based EEG studies according to the most frequent applications: motor imagery classification, emotion recognition, and seizure detection. This paper also highlights the challenges of applying transformers to EEG datasets and reviews data augmentation and transfer learning as potential solutions explored in recent years. Finally, we provide a summarized comparison of the most recently reported results. We hope this paper serves as a roadmap for researchers interested in employing transformer architectures in EEG analysis.
Affiliation(s)
- Elnaz Vafaei
- Department of Psychology, Northeastern University, Boston, MA 02115, USA
- Mohammad Hosseini
- Department of Biomedical Engineering, Science and Research Branch, Islamic Azad University, Tehran 1477893855, Iran
6
Feng X, Guo Z, Kwong S. ID3RSNet: cross-subject driver drowsiness detection from raw single-channel EEG with an interpretable residual shrinkage network. Front Neurosci 2025; 18:1508747. [PMID: 39844854 PMCID: PMC11751225 DOI: 10.3389/fnins.2024.1508747]
Abstract
Accurate monitoring of drowsy driving through electroencephalography (EEG) can effectively reduce traffic accidents. Developing a calibration-free drowsiness detection system from single-channel EEG alone is very challenging due to the non-stationarity of EEG signals, the heterogeneity among individuals, and the limited information available relative to multi-channel EEG. Although deep learning-based approaches can effectively decode EEG signals, most deep learning models lack interpretability due to their black-box nature. To address these issues, we propose a novel interpretable residual shrinkage network, ID3RSNet, for cross-subject driver drowsiness detection using single-channel EEG signals. First, a base feature extractor captures the essential EEG frequency features. To enhance discriminative feature learning, a residual shrinkage building unit with an attention mechanism performs adaptive feature recalibration and soft-threshold denoising inside the residual network, achieving automatic feature extraction. In addition, a fully connected layer with weight freezing is utilized to suppress the negative influence of individual neurons on classification. With the global average pooling (GAP) layer incorporated into the residual shrinkage network, we introduce an EEG-based Class Activation Map (ECAM) interpretability method that enables visualization of sample-wise learned patterns to explain the model's decisions. Extensive experimental results demonstrate that the proposed method achieves superior classification performance and finds neurophysiologically plausible evidence for its classifications.
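The soft-threshold denoising at the heart of a residual shrinkage unit has a simple closed form: shrink every feature toward zero by a threshold and zero out anything smaller in magnitude. In ID3RSNet the threshold is produced per feature by an attention branch; in this sketch it is passed in as a plain argument, which is our simplification.

```python
import numpy as np

def soft_threshold(x, tau):
    """Soft thresholding: sign(x) * max(|x| - tau, 0).
    Small-magnitude entries (treated as noise) are set to zero;
    larger entries are shrunk toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)
```

Inside the residual block this output is added back to the identity path, so only the denoised residual is learned.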
Affiliation(s)
- Xiao Feng
- School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing, China
- Henan High-speed Railway Operation and Maintenance Engineering Research Center, Zhengzhou, Henan, China
- Zhongyuan Guo
- College of Electronic and Information Engineering, Southwest University, Chongqing, China
- Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
- Sam Kwong
- School of Data Science, Lingnan University, Hong Kong SAR, China
7
Liu Z, Zhao J. Leveraging deep learning for robust EEG analysis in mental health monitoring. Front Neuroinform 2025; 18:1494970. [PMID: 39829439 PMCID: PMC11739345 DOI: 10.3389/fninf.2024.1494970]
Abstract
Introduction Mental health monitoring utilizing EEG analysis has garnered notable interest due to the non-invasive characteristics and rich temporal information encoded in EEG signals, which are indicative of cognitive and emotional conditions. Conventional methods for EEG-based mental health evaluation often depend on manually crafted features or basic machine learning approaches, such as support vector classifiers or shallow neural networks. Despite their potential, these approaches often fall short in capturing the intricate spatiotemporal relationships within EEG data, leading to lower classification accuracy and poor adaptability across populations and mental health scenarios. Methods To overcome these limitations, we introduce the EEG Mind-Transformer, an innovative deep learning architecture composed of a Dynamic Temporal Graph Attention Mechanism (DT-GAM), a Hierarchical Graph Representation and Analysis (HGRA) module, and a Spatial-Temporal Fusion Module (STFM). The DT-GAM dynamically extracts temporal dependencies within EEG data, while the HGRA models the brain's hierarchical structure to capture both localized and global interactions among brain regions. The STFM synthesizes spatial and temporal elements into a comprehensive representation of EEG signals. Results and discussion Our empirical results confirm that the EEG Mind-Transformer significantly surpasses conventional approaches, achieving an accuracy of 92.5%, a recall of 91.3%, an F1-score of 90.8%, and an AUC of 94.2% across several datasets. These findings underline the model's robustness and generalizability to diverse mental health conditions. Moreover, the EEG Mind-Transformer not only pushes the boundaries of state-of-the-art EEG-based mental health monitoring but also offers meaningful insights into the brain functions underlying mental disorders, solidifying its value for both research and clinical settings.
Affiliation(s)
- Zixiang Liu
- Anhui Vocational College of Grain Engineering, Hefei, China
8
Lu W, Xia L, Tan TP, Ma H. CIT-EmotionNet: convolution interactive transformer network for EEG emotion recognition. PeerJ Comput Sci 2024; 10:e2610. [PMID: 39896395 PMCID: PMC11784834 DOI: 10.7717/peerj-cs.2610]
Abstract
Emotion recognition is a significant research problem in affective computing with many potential areas of application. One approach to emotion recognition uses electroencephalogram (EEG) signals to identify a person's emotion. However, effectively using the global and local features of EEG signals to improve recognition performance remains a challenge. In this study, we propose a novel Convolution Interactive Transformer Network for EEG emotion recognition, CIT-EmotionNet, which efficiently integrates the global and local features of EEG signals. We convert the raw EEG signals into spatial-spectral representations, which serve as the model inputs. The model integrates a convolutional neural network (CNN) and a Transformer within a single framework in a parallel manner. We propose a Convolution Interactive Transformer module that facilitates the interaction and fusion of the local and global features extracted by the CNN and Transformer, respectively, thereby improving the average accuracy of emotion recognition. The proposed CIT-EmotionNet outperforms state-of-the-art methods, achieving average recognition accuracies of 98.57% and 92.09% on two publicly available datasets, SEED and SEED-IV, respectively.
Affiliation(s)
- Wei Lu
- Henan High-Speed Railway Operation and Maintenance Engineering Research Center, Zhengzhou Railway Vocational and Technical College, Zhengzhou, Henan, China
- School of Computer Sciences, Universiti Sains Malaysia, USM, Pulau Pinang, Malaysia
- Zhengzhou University Industrial Technology Research Institute, Zhengzhou, Henan, China
- Lingnan Xia
- Henan High-Speed Railway Operation and Maintenance Engineering Research Center, Zhengzhou Railway Vocational and Technical College, Zhengzhou, Henan, China
- Tien Ping Tan
- School of Computer Sciences, Universiti Sains Malaysia, USM, Pulau Pinang, Malaysia
- Hua Ma
- Henan High-Speed Railway Operation and Maintenance Engineering Research Center, Zhengzhou Railway Vocational and Technical College, Zhengzhou, Henan, China
- Zhengzhou University Industrial Technology Research Institute, Zhengzhou, Henan, China
9
Zhang X, Landsness EC, Miao H, Chen W, Tang MJ, Brier LM, Culver JP, Lee JM, Anastasio MA. Attention-based CNN-BiLSTM for sleep state classification of spatiotemporal wide-field calcium imaging data. J Neurosci Methods 2024; 411:110250. [PMID: 39151658 PMCID: PMC12104862 DOI: 10.1016/j.jneumeth.2024.110250]
Abstract
BACKGROUND Wide-field calcium imaging (WFCI) with genetically encoded calcium indicators allows for spatiotemporal recording of neuronal activity in mice. When applied to the study of sleep, WFCI data are manually scored into the sleep states of wakefulness, non-REM (NREM), and REM by use of adjunct EEG and EMG recordings. However, this process is time-consuming, invasive, and often suffers from low inter- and intra-rater reliability. Therefore, an automated sleep state classification method that operates on spatiotemporal WFCI data is desired. NEW METHOD A hybrid network architecture consisting of a convolutional neural network (CNN) to extract spatial features of image frames and a bidirectional long short-term memory network (BiLSTM) with an attention mechanism to identify temporal dependencies among different time points was proposed to classify WFCI data into wakefulness, NREM, and REM sleep. RESULTS Sleep states were classified with an accuracy of 84% and a Cohen's κ of 0.64. Gradient-weighted class activation maps revealed that the frontal region of the cortex carried more importance when classifying WFCI data into NREM sleep, while the posterior area contributed most to the identification of wakefulness. The attention scores indicated that the proposed network focuses on short- and long-range temporal dependencies in a state-specific manner. COMPARISON WITH EXISTING METHODS On a held-out, repeated 3-hour WFCI recording, the CNN-BiLSTM achieved a κ of 0.67, comparable to the κ of 0.65 for human EEG/EMG-based scoring. CONCLUSIONS The CNN-BiLSTM effectively classifies sleep states from spatiotemporal WFCI data and will enable broader application of WFCI in sleep research.
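The attention mechanism over per-frame BiLSTM features reduces to score, softmax, and weighted-sum pooling. A minimal sketch follows; the scoring vector `w` would normally be learned, and using a plain dot-product score is our assumption about the exact form, not necessarily the paper's.

```python
import numpy as np

def attention_pool(h, w):
    """Pool a sequence of per-frame features into one context vector.
    h: (T, d) hidden states (e.g. BiLSTM outputs per image frame);
    w: (d,) scoring vector. Returns (context, attention_weights)."""
    scores = h @ w                          # one scalar score per time step
    scores = scores - scores.max()          # shift for numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()   # softmax weights
    return alpha @ h, alpha                 # weighted sum (d,), weights (T,)
```

The weights `alpha` are exactly the per-timestep attention scores the abstract inspects to see which frames drive each sleep-state decision.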
Affiliation(s)
- Xiaohui Zhang
- Department of Bioengineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Eric C Landsness
- Department of Neurology, Washington University School of Medicine, St. Louis, MO 63110, USA
- Hanyang Miao
- Department of Neurology, Washington University School of Medicine, St. Louis, MO 63110, USA
- Wei Chen
- Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Michelle J Tang
- Department of Neurology, Washington University School of Medicine, St. Louis, MO 63110, USA
- Lindsey M Brier
- Department of Radiology, Washington University School of Medicine, St. Louis, MO 63110, USA
- Joseph P Culver
- Department of Radiology, Washington University School of Medicine, St. Louis, MO 63110, USA; Department of Biomedical Engineering, Washington University School of Engineering, St. Louis, MO 63130, USA; Department of Electrical and Systems Engineering, Washington University School of Engineering, St. Louis, MO 63130, USA; Department of Physics, Washington University School of Arts and Science, St. Louis, MO 63130, USA
- Jin-Moo Lee
- Department of Neurology, Washington University School of Medicine, St. Louis, MO 63110, USA; Department of Radiology, Washington University School of Medicine, St. Louis, MO 63110, USA; Department of Biomedical Engineering, Washington University School of Engineering, St. Louis, MO 63130, USA
- Mark A Anastasio
- Department of Bioengineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA.
10
Lee MH, Shomanov A, Begim B, Kabidenova Z, Nyssanbay A, Yazici A, Lee SW. EAV: EEG-Audio-Video Dataset for Emotion Recognition in Conversational Contexts. Sci Data 2024; 11:1026. [PMID: 39300129 PMCID: PMC11413008 DOI: 10.1038/s41597-024-03838-4]
Abstract
Understanding emotional states is pivotal for the development of next-generation human-machine interfaces. Human behavior in social interactions engages psycho-physiological processes influenced by perceptual inputs; therefore, efforts to comprehend brain functions and human behavior could catalyze the development of AI models with human-like attributes. In this study, we introduce a multimodal emotion dataset comprising 30-channel electroencephalography (EEG), audio, and video recordings from 42 participants. Each participant engaged in a cue-based conversation scenario eliciting five distinct emotions: neutral, anger, happiness, sadness, and calmness. Each participant contributed 200 interactions, encompassing both listening and speaking, for a cumulative total of 8,400 interactions across all participants. We evaluated the baseline emotion recognition performance of each modality using established deep neural network (DNN) methods. The Emotion in EEG-Audio-Visual (EAV) dataset is the first public dataset to incorporate these three primary modalities for emotion recognition within a conversational context. We anticipate that this dataset will make significant contributions to the modeling of the human emotional process, from both fundamental neuroscience and machine learning viewpoints.
Affiliation(s)
- Min-Ho Lee
- Nazarbayev University, Department of Computer Science, Astana, 010000, Republic of Kazakhstan
- Adai Shomanov
- Nazarbayev University, Department of Computer Science, Astana, 010000, Republic of Kazakhstan
- Balgyn Begim
- Nazarbayev University, Department of Computer Science, Astana, 010000, Republic of Kazakhstan
- Zhuldyz Kabidenova
- Nazarbayev University, Department of Computer Science, Astana, 010000, Republic of Kazakhstan
- Aruna Nyssanbay
- Nazarbayev University, Department of Computer Science, Astana, 010000, Republic of Kazakhstan
- Adnan Yazici
- Nazarbayev University, Department of Computer Science, Astana, 010000, Republic of Kazakhstan
- Seong-Whan Lee
- Korea University, Department of Artificial Intelligence, Seoul, 02841, Republic of Korea.
11
Ma W, Zheng Y, Li T, Li Z, Li Y, Wang L. A comprehensive review of deep learning in EEG-based emotion recognition: classifications, trends, and practical implications. PeerJ Comput Sci 2024; 10:e2065. [PMID: 38855206 PMCID: PMC11157589 DOI: 10.7717/peerj-cs.2065]
Abstract
Emotion recognition utilizing EEG signals has emerged as a pivotal component of human-computer interaction. In recent years, with the relentless advancement of deep learning techniques, deep learning has assumed a prominent role in analyzing EEG signals for emotion recognition, and its application in this context carries profound practical implications. Although many modeling approaches and several review articles have scrutinized this domain, its developments have yet to be comprehensively and precisely classified and summarized. Existing classifications are somewhat coarse, with insufficient attention given to potential applications. Therefore, this article systematically classifies recent developments in EEG-based emotion recognition, providing researchers with a lucid understanding of the field's various trajectories and methodologies, and elucidates why distinct directions necessitate distinct modeling approaches. In conclusion, this article synthesizes and dissects the practical significance of EEG signals in emotion recognition, emphasizing promising avenues for future application.
Affiliation(s)
- Weizhi Ma
- School of Information Science and Technology, North China University of Technology, Beijing, China
- Yujia Zheng
- School of Information Science and Technology, North China University of Technology, Beijing, China
- Tianhao Li
- School of Information Science and Technology, North China University of Technology, Beijing, China
- Zhengping Li
- School of Information Science and Technology, North China University of Technology, Beijing, China
- Ying Li
- School of Information Science and Technology, North China University of Technology, Beijing, China
- Lijun Wang
- School of Information Science and Technology, North China University of Technology, Beijing, China
12
Zuo Q, Li R, Shi B, Hong J, Zhu Y, Chen X, Wu Y, Guo J. U-shaped convolutional transformer GAN with multi-resolution consistency loss for restoring brain functional time-series and dementia diagnosis. Front Comput Neurosci 2024; 18:1387004. [PMID: 38694950 PMCID: PMC11061376 DOI: 10.3389/fncom.2024.1387004]
Abstract
Introduction The blood oxygen level-dependent (BOLD) signal derived from functional neuroimaging is commonly used in brain network analysis and dementia diagnosis. Missing BOLD signal data may lead to poor performance and misinterpretation of findings when analyzing neurological disease, yet few studies have focused on restoring brain functional time-series data. Methods In this paper, a novel U-shaped convolutional transformer GAN (UCT-GAN) model is proposed to restore missing brain functional time-series data. The proposed model leverages the power of generative adversarial networks (GANs) while incorporating a U-shaped architecture to effectively capture hierarchical features in the restoration process. In addition, multi-level temporal-correlated attention and convolutional sampling in the transformer-based generator are devised to capture the global and local temporal features of the missing time series and associate their long-range relationships with the other brain regions. Furthermore, by introducing a multi-resolution consistency loss, the proposed model promotes the learning of diverse temporal patterns and maintains consistency across temporal resolutions, thus effectively restoring complex brain functional dynamics. Results We evaluated our model on the public Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, and our experiments demonstrate that the proposed model outperforms existing methods in terms of both quantitative metrics and qualitative assessments. The model's ability to preserve the underlying topological structure of brain functional networks during restoration is a particularly notable achievement. Conclusion Overall, the proposed model offers a promising solution for restoring brain functional time series and contributes to the advancement of neuroscience research by providing enhanced tools for disease analysis and interpretation.
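A multi-resolution consistency loss of the kind this abstract names can be sketched as an MSE summed over progressively coarsened copies of the restored and reference series. The average-pooling-by-2 operator and the level count below are our assumptions for illustration; the abstract does not specify the paper's exact construction.

```python
import numpy as np

def multires_consistency_loss(x_hat, x, levels=3):
    """Compare a restored series x_hat against a reference x at several
    temporal resolutions: accumulate MSE, then halve the resolution by
    average-pooling pairs of samples, and repeat. Both inputs are
    1-D arrays (or arrays whose last axis is time) of equal shape."""
    loss = 0.0
    for _ in range(levels):
        loss += float(np.mean((x_hat - x) ** 2))
        if x.shape[-1] < 2:
            break                        # cannot pool any further
        n = (x.shape[-1] // 2) * 2       # drop a trailing odd sample
        x = x[..., :n].reshape(*x.shape[:-1], -1, 2).mean(-1)
        x_hat = x_hat[..., :n].reshape(*x_hat.shape[:-1], -1, 2).mean(-1)
    return loss
```

Summing across resolutions penalizes both sample-level errors and slower-timescale drift in the restored signal.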
Affiliation(s)
- Qiankun Zuo
- Hubei Key Laboratory of Digital Finance Innovation, Hubei University of Economics, Wuhan, Hubei, China
- School of Information Engineering, Hubei University of Economics, Wuhan, Hubei, China
- Hubei Internet Finance Information Engineering Technology Research Center, Hubei University of Economics, Wuhan, Hubei, China
- Ruiheng Li
- Hubei Key Laboratory of Digital Finance Innovation, Hubei University of Economics, Wuhan, Hubei, China
- School of Information Engineering, Hubei University of Economics, Wuhan, Hubei, China
- Binghua Shi
- Hubei Key Laboratory of Digital Finance Innovation, Hubei University of Economics, Wuhan, Hubei, China
- School of Information Engineering, Hubei University of Economics, Wuhan, Hubei, China
- Jin Hong
- Medical Research Institute, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, China
- Yanfei Zhu
- School of Foreign Languages, Sun Yat-sen University, Guangzhou, China
- Xuhang Chen
- Faculty of Science and Technology, University of Macau, Taipa, Macao SAR, China
- Yixian Wu
- School of Mechanical Engineering, Beijing Institute of Petrochemical Technology, Beijing, China
- Jia Guo
- Hubei Key Laboratory of Digital Finance Innovation, Hubei University of Economics, Wuhan, Hubei, China
- School of Information Engineering, Hubei University of Economics, Wuhan, Hubei, China
- Hubei Internet Finance Information Engineering Technology Research Center, Hubei University of Economics, Wuhan, Hubei, China
|
13
|
Wang C, Xiao Z, Xu Y, Zhang Q, Chen J. A novel approach for ASD recognition based on graph attention networks. Front Comput Neurosci 2024; 18:1388083. [PMID: 38659616 PMCID: PMC11039788 DOI: 10.3389/fncom.2024.1388083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2024] [Accepted: 04/02/2024] [Indexed: 04/26/2024] Open
Abstract
Early detection and diagnosis of Autism Spectrum Disorder (ASD) can significantly improve the quality of life of affected individuals. Identifying ASD from brain functional connectivity (FC) is challenging because subjects' fMRI data are highly heterogeneous across sites. Meanwhile, deep learning algorithms are effective for ASD identification but lack interpretability. In this paper, a novel approach for ASD recognition based on graph attention networks is proposed. Specifically, we treat each region of interest (ROI) of the subjects as a node, perform wavelet decomposition of the BOLD signal in each ROI to extract wavelet features, and use these features along with the mean and variance of the BOLD signal as node features, with the optimized FC matrix as the adjacency matrix. We then employ the self-attention mechanism to capture long-range dependencies among features. To enhance interpretability, node-selection pooling layers are designed to determine the importance of each ROI for prediction. The proposed framework is applied to fMRI data of children (younger than 12 years old) from the Autism Brain Imaging Data Exchange datasets. Promising results demonstrate superior performance compared to recent similar studies, and the detected ROIs correspond closely with those reported in previous studies, offering good interpretability.
Affiliation(s)
- Canhua Wang
- School of Computer, Jiangxi University of Chinese Medicine, Nanchang, China
- Zhiyong Xiao
- School of Electronic & Information Engineering, Jiangxi Institute of Economic Administrators, Nanchang, China
- Yilu Xu
- School of Software, Jiangxi Agricultural University, Nanchang, China
- Qi Zhang
- Department of Medical Imaging, Affiliated Hospital of Jiangxi University of Chinese Medicine, Nanchang, China
- Jingfang Chen
- Department of Medical Imaging, The Second Affiliated Hospital of Nanchang University, Nanchang, China
|