1
Chen J, Han J, Su P, Zhou G. Framework for Groove Rating in Exercise-Enhancing Music Based on a CNN-TCN Architecture with Integrated Entropy Regularization and Pooling. Entropy (Basel, Switzerland) 2025; 27:317. [PMID: 40149241 PMCID: PMC11941122 DOI: 10.3390/e27030317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2024] [Revised: 02/17/2025] [Accepted: 02/26/2025] [Indexed: 03/29/2025]
Abstract
Groove, a complex aspect of music perception, plays a crucial role in eliciting emotional and physical responses from listeners. However, accurately quantifying and predicting groove remains challenging due to its intricate acoustic features. To address this, we propose a novel framework for groove rating that integrates Convolutional Neural Networks (CNNs) with Temporal Convolutional Networks (TCNs), enhanced by entropy regularization and entropy-pooling techniques. Our approach processes audio files into Mel-spectrograms, which are analyzed by a CNN for feature extraction and by a TCN to capture long-range temporal dependencies, enabling precise groove-level prediction. Experimental results show that our CNN-TCN framework significantly outperforms benchmark methods in predictive accuracy. The integration of entropy pooling and regularization is critical, with their omission leading to notable reductions in $R^2$ values. Our method also surpasses the performance of CNN and other machine-learning models, including long short-term memory (LSTM) networks and support vector machine (SVM) variants. This study establishes a strong foundation for the automated assessment of musical groove, with potential applications in music education, therapy, and composition. Future research will focus on expanding the dataset, enhancing model generalization, and exploring additional machine-learning techniques to further elucidate the factors influencing groove perception.
Affiliation(s)
- Jiangang Chen
  - College of Sports and Health Sciences, Xi’an Physical Education University, Xi’an 710068, China
  - School of P. E and Sports, Beijing Normal University, Beijing 100875, China
- Junbo Han
  - School of P. E and Sports, Beijing Normal University, Beijing 100875, China
- Pei Su
  - School of P. E and Sports, Beijing Normal University, Beijing 100875, China
- Gaoquan Zhou
  - School of P. E and Sports, Beijing Normal University, Beijing 100875, China
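Note on the Chen et al. abstract: the TCN stage is described as capturing long-range temporal dependencies. The abstract does not give the architecture, so the following is only a minimal sketch of the generic building block of a TCN, a dilated causal 1-D convolution, written in NumPy. The function names `causal_dilated_conv1d` and `receptive_field` are hypothetical, and nothing here reproduces the authors' model or their entropy regularization and entropy pooling.

```python
import numpy as np


def causal_dilated_conv1d(x, kernel, dilation):
    """Compute y[t] = sum_k kernel[k] * x[t - k * dilation].

    Samples before t = 0 are treated as zero, so the output at time t
    never depends on future inputs (the "causal" property).
    """
    x = np.asarray(x, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    pad = (len(kernel) - 1) * dilation
    padded = np.concatenate([np.zeros(pad), x])
    y = np.zeros_like(x)
    for k, w in enumerate(kernel):
        start = pad - k * dilation  # shift that aligns x[t - k * dilation] with y[t]
        y += w * padded[start:start + len(x)]
    return y


def receptive_field(kernel_size, dilations):
    """Number of input steps seen by one output step of a stack of such layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


if __name__ == "__main__":
    frames = [1.0, 2.0, 3.0, 4.0]  # stand-in for one feature channel over time
    print(causal_dilated_conv1d(frames, kernel=[1.0, 1.0], dilation=2))
    print(receptive_field(kernel_size=2, dilations=[1, 2, 4]))
```

Doubling the dilation at each layer makes the receptive field grow exponentially with depth, which is the usual argument for using a TCN on long sequences.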
2
Elnaggar K, El-Gayar MM, Elmogy M. Depression Detection and Diagnosis Based on Electroencephalogram (EEG) Analysis: A Systematic Review. Diagnostics (Basel) 2025; 15:210. [PMID: 39857094 PMCID: PMC11765027 DOI: 10.3390/diagnostics15020210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2024] [Revised: 01/03/2025] [Accepted: 01/07/2025] [Indexed: 01/27/2025]
Abstract
Background: Mental disorders are disturbances of brain functions that cause cognitive, affective, volitional, and behavioral functions to be disrupted to varying degrees. One of these disorders is depression, a significant factor contributing to the increase in suicide cases worldwide. Consequently, depression has become a significant public health issue globally. Electroencephalogram (EEG) data can be utilized to diagnose major depressive disorder (MDD), offering valuable insights into the pathophysiological mechanisms underlying mental disorders and enhancing the understanding of MDD. Methods: This survey emphasizes the critical role of EEG in advancing artificial intelligence (AI)-driven approaches for depression diagnosis. By focusing on studies that integrate EEG with machine learning (ML) and deep learning (DL) techniques, we systematically analyze methods utilizing EEG signals to identify depression biomarkers. The survey highlights advancements in EEG preprocessing, feature extraction, and model development, showcasing how these approaches enhance the diagnostic precision, scalability, and automation of depression detection. Results: This survey is distinguished from prior reviews by addressing their limitations and providing researchers with valuable insights for future studies. It offers a comprehensive comparison of ML and DL approaches utilizing EEG and an overview of the five key steps in depression detection. The survey also presents existing datasets for depression diagnosis and critically analyzes their limitations. Furthermore, it explores future directions and challenges, such as enhancing diagnostic robustness with data augmentation techniques and optimizing EEG channel selection for improved accuracy. The potential of transfer learning and encoder-decoder architectures to leverage pre-trained models and enhance diagnostic performance is also discussed. Advancements in feature extraction methods for automated depression diagnosis are highlighted as avenues for improving ML and DL model performance. Additionally, integrating Internet of Things (IoT) devices with EEG for continuous mental health monitoring and distinguishing between different types of depression are identified as critical research areas. Finally, the review emphasizes improving the reliability and predictability of computational intelligence-based models to advance depression diagnosis. Conclusions: This study will serve as a well-organized and helpful reference for researchers working on detecting depression using EEG signals and provide insights into the future directions outlined above, guiding further advancements in the field.
Affiliation(s)
- Kholoud Elnaggar
  - Information Technology Department, Faculty of Computers and Information, Mansoura University, Mansoura 35516, Egypt
- Mostafa M. El-Gayar
  - Information Technology Department, Faculty of Computers and Information, Mansoura University, Mansoura 35516, Egypt
  - Department of Computer Science, Arab East Colleges, Riyadh 11583, Saudi Arabia
- Mohammed Elmogy
  - Information Technology Department, Faculty of Computers and Information, Mansoura University, Mansoura 35516, Egypt
3
Zhang T, Yan X, Chen X, Mao Y. XCF-LSTMSATNet: A Classification Approach for EEG Signals Evoked by Dynamic Random Dot Stereograms. IEEE Trans Neural Syst Rehabil Eng 2025; PP:502-513. [PMID: 40030938 DOI: 10.1109/tnsre.2025.3529991] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2025]
Abstract
Stereovision is the visual perception of depth derived from the integration of two slightly different images from each eye, enabling understanding of three-dimensional space. This capability is deeply intertwined with cognitive brain functions. To explore the impact of stereograms with varied motions on brain activity, we collected electroencephalography (EEG) signals evoked by Dynamic Random Dot Stereograms (DRDS). To effectively classify the EEG signals induced by DRDS, we introduced a novel hybrid neural network model, XCF-LSTMSATNet, which integrates an XGBoost Channel Feature Optimization Module with EEGNet and an LSTM Self-Attention (LSTMSAT) Module. Initially, in the channel selection phase, XGBoost is employed for preliminary classification and feature weight analysis, which can enhance our channel selection strategy. Following this, EEGNet employs deep convolutional layers to extract spatial features, while separable convolutions are subsequently used to derive high-dimensional spatial-temporal features. Meanwhile, the LSTMSAT Module, with its capability to learn long-term dependencies in time-series signals, is deployed to capture temporal continuity information. The incorporation of the self-attention mechanism further amplifies the model's ability to grasp long-distance dependencies and enables dynamic weight allocation to the extracted features. Finally, both temporal and spatial features are integrated into the classification module, enabling precise prediction across three categories of EEG signals. The proposed XCF-LSTMSATNet was extensively tested on both a custom dataset and the public datasets SRDA and SRDB. The results demonstrate that the model exhibits solid classification performance across all three datasets, effectively showcasing its robustness and generalization capabilities.
4
Xia X, Cheng Y, Zhang Z, Hua Z, Wang Q, Shi Y, Men H. Advancing research on odor-induced sweetness enhancement: A EEG local-global fusion transformer network for sweetness quantification combined with EEG technology. Food Chem 2025; 463:141533. [PMID: 39388878 DOI: 10.1016/j.foodchem.2024.141533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2024] [Revised: 09/18/2024] [Accepted: 10/02/2024] [Indexed: 10/12/2024]
Abstract
Reducing sugar intake is crucial for health, and odor sweetening enhances food enjoyment and quality perception. Current research relies on subjective manual sensory evaluations, which are poorly reproducible. Traditional methods also fail to capture dynamic neural responses to odor-induced sweetness. We propose an electroencephalogram local-global fusion transformer network (EEG-LGFNet) model to decode this impact objectively. Electroencephalogram data were collected from 16 subjects under different odor and sucrose stimuli. The model captures complex neural signals by integrating local and global feature extraction mechanisms. Its performance was validated across three time windows, demonstrating efficacy over various temporal ranges. Analysis of the coefficient of determination across brain regions confirmed the importance of the frontal, central, and parietal areas in sweetness perception. The EEG-LGFNet model excelled in quantifying odor-enhanced sweetness, significantly outperforming state-of-the-art models. This research offers new insights into odor sweetening, with applications in food development, personalized nutrition, and neuroscience.
Affiliation(s)
- Xiuxin Xia
  - School of Automation Engineering, Northeast Electric Power University, Jilin 132012, China
- Yatao Cheng
  - School of Automation Engineering, Northeast Electric Power University, Jilin 132012, China
- Zhuo Zhang
  - College of Computer Science, National University of Defense Technology, Changsha 410073, China
- Zhijie Hua
  - School of Automation Engineering, Northeast Electric Power University, Jilin 132012, China
- Qun Wang
  - School of Automation Engineering, Northeast Electric Power University, Jilin 132012, China
- Yan Shi
  - School of Automation Engineering, Northeast Electric Power University, Jilin 132012, China
- Hong Men
  - School of Automation Engineering, Northeast Electric Power University, Jilin 132012, China
5
Sam A, Boostani R, Hashempour S, Taghavi M, Sanei S. Depression Identification Using EEG Signals via a Hybrid of LSTM and Spiking Neural Networks. IEEE Trans Neural Syst Rehabil Eng 2023; 31:4725-4737. [PMID: 37995160 DOI: 10.1109/tnsre.2023.3336467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2023]
Abstract
Depression severity can be classified into distinct phases based on Beck Depression Inventory (BDI) test scores, a subjective questionnaire. However, quantitative assessment of depression may be attained through the examination and categorization of electroencephalography (EEG) signals. Spiking neural networks (SNNs), as the third generation of neural networks, incorporate biologically realistic algorithms, making them ideal for mimicking internal brain activities while processing EEG signals. This study introduces a novel framework that, for the first time, combines an SNN architecture and a long short-term memory (LSTM) structure to model the brain's underlying structures during different stages of depression and effectively classify individual depression levels using raw EEG signals. By employing a brain-inspired SNN model, our research provides fresh perspectives and advances knowledge of the neurological mechanisms underlying different levels of depression. The methodology employed in this study includes the utilization of the spike-timing-dependent plasticity (STDP) learning rule within a 3-dimensional brain-template structured SNN model. Furthermore, it encompasses the tasks of classifying and predicting individual outcomes, visually representing the structural alterations in the brain linked to the anticipated outcomes, and offering interpretations of the findings. Notably, our method achieves exceptional accuracy in classification, with average rates of 98% and 96% for eyes-closed and eyes-open states, respectively. These results significantly outperform state-of-the-art deep learning methods.
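Note on the Sam et al. abstract: the method relies on the STDP learning rule. The abstract does not state which STDP variant or constants were used, so the following is only a minimal sketch of the common pair-based form, in which a synapse is strengthened when the presynaptic spike precedes the postsynaptic spike and weakened otherwise. The function name `stdp_delta_w` and every constant (`a_plus`, `a_minus`, `tau_plus`, `tau_minus`) are illustrative assumptions, not values from the paper.

```python
import math


def stdp_delta_w(t_pre, t_post, a_plus=0.01, a_minus=0.012,
                 tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP weight change for one pre/post spike pair (times in ms).

    dt = t_post - t_pre
      dt > 0 (pre fires first):  potentiation,  a_plus  * exp(-dt / tau_plus)
      dt < 0 (post fires first): depression,   -a_minus * exp( dt / tau_minus)
    All constants are illustrative assumptions.
    """
    dt = t_post - t_pre
    if dt > 0:
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0


if __name__ == "__main__":
    weight = 0.5
    # Hypothetical spike pairs: (presynaptic time, postsynaptic time) in ms.
    for t_pre, t_post in [(10.0, 15.0), (40.0, 32.0), (60.0, 61.0)]:
        weight += stdp_delta_w(t_pre, t_post)
    print(weight)
```

The exponential windows mean that only spike pairs separated by a few tens of milliseconds change the weight appreciably.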
6
Luo G, Rao H, An P, Li Y, Hong R, Chen W, Chen S. Exploring Adaptive Graph Topologies and Temporal Graph Networks for EEG-Based Depression Detection. IEEE Trans Neural Syst Rehabil Eng 2023; 31:3947-3957. [PMID: 37773916 DOI: 10.1109/tnsre.2023.3320693] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 10/01/2023]
Abstract
In recent years, Graph Neural Networks (GNNs) based on deep learning techniques have achieved promising results in EEG-based depression detection tasks but still have some limitations. Firstly, most existing GNN-based methods use pre-computed graph adjacency matrices, which ignore the differences in brain networks between individuals. Additionally, methods based on graph-structured data do not consider the temporal dependency information of brain networks. To address these issues, we propose a deep learning algorithm that explores adaptive graph topologies and temporal graph networks for EEG-based depression detection. Specifically, we designed an Adaptive Graph Topology Generation (AGTG) module that can adaptively model the real-time connectivity of the brain networks, revealing differences between individuals. In addition, we designed a Graph Convolutional Gated Recurrent Unit (GCGRU) module to capture the temporal dynamical changes of brain networks. To further explore the differential features between depressed and healthy individuals, we adopt a Graph Topology-based Max-Pooling (GTMP) module to extract graph representation vectors accurately. We conduct a comparative analysis with several advanced algorithms on both a public dataset and our own dataset. The results reveal that our final model achieves the highest Area Under the Receiver Operating Characteristic Curve (AUROC) on both datasets, with values of 83% and 99%, respectively. Furthermore, we perform extensive validation experiments demonstrating our proposed method's effectiveness and advantages. Finally, we present a comprehensive discussion on the differences in brain networks between healthy and depressed individuals based on the outputs of our final model's AGTG and GTMP modules.
7
Hag A, Al-Shargie F, Handayani D, Asadi H. Mental Stress Classification Based on Selected Electroencephalography Channels Using Correlation Coefficient of Hjorth Parameters. Brain Sci 2023; 13:1340. [PMID: 37759941 PMCID: PMC10527440 DOI: 10.3390/brainsci13091340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 09/11/2023] [Accepted: 09/13/2023] [Indexed: 09/29/2023]
Abstract
Electroencephalography (EEG) signals offer invaluable insights into diverse activities of the human brain, including the intricate physiological and psychological responses associated with mental stress. A major challenge, however, is accurately identifying mental stress while mitigating the limitations associated with a large number of EEG channels. Such limitations encompass computational complexity, potential overfitting, and the prolonged setup time for electrode placement, all of which can hinder practical applications. To address these challenges, this study presents the novel CCHP method, aimed at identifying and ranking commonly optimal EEG channels based on their sensitivity to the mental stress state. This method's uniqueness lies in its ability not only to find common channels, but also to prioritize them according to their responsiveness to stress, ensuring consistency across subjects and making it potentially transformative for real-world applications. From our rigorous examinations, eight channels emerged as universally optimal in detecting stress variances across participants. Leveraging features from the time, frequency, and time-frequency domains of these channels, and employing machine learning algorithms, notably RLDA, SVM, and KNN, our approach achieved a remarkable accuracy of 81.56% with the SVM algorithm, outperforming existing methodologies. The implications of this research are profound, offering a stepping stone toward the development of real-time stress detection devices and, consequently, enabling clinicians to make more informed therapeutic decisions based on comprehensive brain activity monitoring.
Affiliation(s)
- Ala Hag
  - School of Computer Science & Engineering, Taylor’s University, Jalan Taylors, Subang Jaya 47500, Selangor, Malaysia
- Fares Al-Shargie
  - Institute for Intelligent Systems Research and Innovation, Deakin University, Geelong, VIC 3216, Australia
- Dini Handayani
  - Department of Electrical Engineering, Abu Dhabi University, Abu Dhabi P.O. Box 59911, United Arab Emirates
- Houshyar Asadi
  - Computer Science Department, KICT, International Islamic University Malaysia, Kuala Lumpur 53100, Selangor, Malaysia
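Note on the Hag et al. abstract: the channel ranking is built on Hjorth parameters. As a minimal sketch, the three standard Hjorth parameters (activity, mobility, complexity) of a single channel can be computed as follows. This shows only the textbook definitions; how the CCHP method correlates and ranks them across channels is not described in the abstract and is not reproduced here. The function name `hjorth_parameters` is hypothetical.

```python
import numpy as np


def hjorth_parameters(signal):
    """Return (activity, mobility, complexity) of a 1-D signal.

    activity   = var(x)
    mobility   = sqrt(var(x') / var(x))
    complexity = mobility(x') / mobility(x)
    Derivatives are approximated by first differences.
    """
    x = np.asarray(signal, dtype=float)
    dx = np.diff(x)
    ddx = np.diff(dx)
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / activity)
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity


if __name__ == "__main__":
    t = np.arange(1000) / 1000.0
    channel = np.sin(2 * np.pi * 5 * t)  # synthetic 5 Hz test signal, not EEG data
    print(hjorth_parameters(channel))
```

For a pure sinusoid the complexity is close to 1, which makes a convenient sanity check before applying the function to real recordings.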
8
Cheng C, Zhang Y, Liu L, Liu W, Feng L. Multi-Domain Encoding of Spatiotemporal Dynamics in EEG for Emotion Recognition. IEEE J Biomed Health Inform 2023; 27:1342-1353. [PMID: 37015504 DOI: 10.1109/jbhi.2022.3232497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/29/2022]
Abstract
A common goal of such studies is to map emotional states encoded in the electroencephalogram (EEG) into 2-dimensional arousal-valence scores. This remains challenging because each emotion has its own spatial structure and dynamic dependence across distinct time segments of the EEG signals. This paper aims to model human dynamic emotional behavior by considering the location connectivity and context dependency of brain electrodes. Thus, we designed a hybrid EEG modeling method that mainly adopts the attention mechanism, combining a multi-domain spatial transformer (MST) module and a dynamic temporal transformer (DTT) module, named MSDTTs. Specifically, the MST module extracts single-domain and cross-domain features from different brain regions and fuses them into multi-domain spatial features. Meanwhile, the temporal dynamic excitation (TDE) is inserted into the multi-head convolutional transformer to form the DTT module. These two blocks work together to activate and extract the emotion-related dynamic temporal features within the DTT module. Furthermore, we place the convolutional mapping into the transformer structure to mine the static context features among the keyframes. Overall results show that a high classification accuracy of 98.91%/0.14% was obtained in the $\beta$ frequency band of the DEAP dataset, and 97.52%/0.12% and 96.70%/0.26% were obtained in the $\gamma$ frequency band of the SEED and SEED-IV datasets. Empirical experiments indicate that our proposed method can achieve remarkable results in comparison with state-of-the-art algorithms.
9
Zhang B, Wei D, Yan G, Lei T, Cai H, Yang Z. Feature-level fusion based on spatial-temporal of pervasive EEG for depression recognition. Computer Methods and Programs in Biomedicine 2022; 226:107113. [PMID: 36103735 DOI: 10.1016/j.cmpb.2022.107113] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/29/2022] [Revised: 08/23/2022] [Accepted: 09/04/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND AND OBJECTIVE In view of the characteristics of depression, such as high prevalence, high disability rate, high fatality rate, and high recurrence rate, early identification and early intervention are the most effective methods to prevent irreversible damage to brain function over time. The traditional method of depression recognition based on questionnaires and interviews is time-consuming and labor-intensive, and heavily depends on the doctor's subjective experience. Therefore, accurate, convenient and effective recognition of depression has important social value and scientific significance. METHODS This paper proposes a depression recognition framework based on feature-level fusion of spatial-temporal pervasive electroencephalography (EEG). Time series EEG data were collected by a portable three-electrode EEG acquisition instrument and mapped to a spatial complex network called a visibility graph (VG). Then temporal EEG features and spatial VG metric features were extracted and selected. Based on the correlation between features and categories, the differences in the contribution of individual features are explored, and different contribution coefficients are assigned to different features as the data basis of feature-level fusion to ensure the diversity of data. A cascade forest model based on three different decision forests is designed to realize efficient depression recognition using spatial-temporal feature-level fusion data. RESULTS Experimental data were obtained from 26 depressed patients and 29 healthy controls (HC). The results of multiple control experiments show that, compared with single-type features, feature-level fusion without contribution coefficients, and independent classifiers, the spatial-temporal feature-level fusion method with contribution coefficients has a stronger ability to recognize depression, and the highest accuracy is 92.48%. CONCLUSION The feature-level fusion method provides an effective computer-aided tool for rapid clinical diagnosis of depression.
Affiliation(s)
- Bingtao Zhang
  - School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
  - School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China
- Dan Wei
  - School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
- Guanghui Yan
  - School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
- Tao Lei
  - School of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an 710021, China
- Haishu Cai
  - School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China
- Zhifei Yang
  - School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
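Note on the Zhang et al. abstract: the EEG time series is mapped to a visibility graph (VG). The abstract does not say which VG variant was used, so the following is only a minimal sketch of the natural visibility graph, in which two samples are linked when the straight line between them passes above every sample in between. The function names `visibility_graph_edges` and `node_degrees` are hypothetical, and node degree is shown only as one example of a graph metric.

```python
def visibility_graph_edges(series):
    """Return the edge set of the natural visibility graph of a sequence.

    Samples i < j are connected when every intermediate sample k satisfies
    series[k] < series[j] + (series[i] - series[j]) * (j - k) / (j - i),
    i.e. it lies strictly below the line joining samples i and j.
    Sample indices are used as time stamps (uniform sampling assumed).
    """
    n = len(series)
    edges = set()
    for i in range(n - 1):
        for j in range(i + 1, n):
            visible = all(
                series[k] < series[j] + (series[i] - series[j]) * (j - k) / (j - i)
                for k in range(i + 1, j)
            )
            if visible:
                edges.add((i, j))
    return edges


def node_degrees(n, edges):
    """Degree of each of the n nodes, a simple example of a VG metric."""
    degree = [0] * n
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    return degree


if __name__ == "__main__":
    samples = [3.0, 1.0, 2.0, 4.0, 1.5]  # made-up values, not EEG data
    edges = visibility_graph_edges(samples)
    print(sorted(edges))
    print(node_degrees(len(samples), edges))
```

This brute-force construction checks every pair and is cubic in the worst case, which is acceptable for short windows but would need a faster algorithm for long recordings.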
10
Cheng C, Liu Y, You B, Zhou Y, Gao F, Yang L, Dai Y. Multilevel Feature Learning Method for Accurate Interictal Epileptiform Spike Detection. IEEE Trans Neural Syst Rehabil Eng 2022; 30:2506-2516. [PMID: 35877795 DOI: 10.1109/tnsre.2022.3193666] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/07/2022]
Abstract
Interictal epileptiform spikes (referred to as spikes), which last only 20 to 200 ms in electroencephalograms, can provide a reliable evidence-based indicator for clinical seizure type diagnosis. Recent feature representation approaches focus either on concrete-level or on abstract-level information mining of the spike, thus demonstrating suboptimal detection performance. Additionally, existing abstract-level information mining methods for the spike based on deep learning networks have not realized an effective feature representation of the long-term dependent distinguishing information within similar waveform cycles caused by morphological heterogeneity, which affects detection performance. Thus, a multilevel feature learning method for accurate spike detection was proposed in this study. Specifically, the common mimetic properties of the spike are first inferred from the concrete-level spatio-temporal-frequency multidomain information using multidomain feature extractors. Then, the effective feature representation of long-term dependent distinguishing information within similar waveform cycles caused by morphological heterogeneity is captured using a temporal convolutional network. Finally, the spatio-temporal-frequency multidomain long-term dependent feature representation of the spike is calculated by fusing the concrete- and abstract-level feature representations in an element-wise manner. The experimental results indicate that the proposed method can achieve an accuracy of 90.62±1.38%, sensitivity of 90.38±1.52%, specificity of 91.00±1.60%, precision of 90.33±4.71%, and a false detection rate of 0.148±0.020 min⁻¹, which are better than the results obtained using the concrete- or abstract-level feature representation alone. Additionally, the detection results indicate that the proposed method avoids the subjectivity and inefficiency of visual inspection, and it enables highly accurate detection of the spike.
11
Research on the MEG of Depression Patients Based on Multivariate Transfer Entropy. Computational Intelligence and Neuroscience 2022; 2022:7516627. [PMID: 35909866 PMCID: PMC9328977 DOI: 10.1155/2022/7516627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2022] [Revised: 05/27/2022] [Accepted: 06/03/2022] [Indexed: 11/17/2022]
Abstract
The pathogenesis of depression is complex, and current medical diagnosis relies on a single type of assessment. Patients with severe depression may even have great physical pain and suicidal tendencies. Magnetoencephalography (MEG) offers ultrahigh spatiotemporal resolution and safety, making it a good medical means for the diagnosis of depression. In this paper, a multivariate transfer entropy algorithm is used to study the MEG of depression. The analysis is divided into multichannel combinations within the same brain region and between different brain regions, and the multivariate transfer entropy of patients with depression and healthy controls is calculated under different EEG signal frequency bands. Finally, the significance of the difference between the two groups of experimental samples is verified with an independent-sample t-test. The experimental results show that, for the same combination of brain channels, the multivariate transfer entropy in the depression group is generally lower than that in the healthy control group, and the difference is most evident in the γ frequency band and largest in the frontal region.
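Note on the multivariate transfer entropy abstract: the abstract does not describe the estimator, so the following is only a minimal sketch of the simpler bivariate transfer entropy between two discretized sequences with a history length of one sample, using plug-in (counting) probabilities. It is not the multivariate algorithm of the paper. The function name `transfer_entropy` is hypothetical, and real MEG signals would first need to be discretized, which is not shown.

```python
import math
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in transfer entropy (bits) from source to target, history length 1.

    TE = sum over (y_next, y_prev, x_prev) of
         p(y_next, y_prev, x_prev)
         * log2( p(y_next | y_prev, x_prev) / p(y_next | y_prev) )
    Both inputs are equal-length lists of discrete symbols.
    """
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    total = sum(triples.values())
    count_yx = Counter()  # counts of (y_prev, x_prev)
    count_yy = Counter()  # counts of (y_next, y_prev)
    count_y = Counter()   # counts of y_prev
    for (y_next, y_prev, x_prev), c in triples.items():
        count_yx[(y_prev, x_prev)] += c
        count_yy[(y_next, y_prev)] += c
        count_y[y_prev] += c
    te = 0.0
    for (y_next, y_prev, x_prev), c in triples.items():
        p_given_both = c / count_yx[(y_prev, x_prev)]
        p_given_self = count_yy[(y_next, y_prev)] / count_y[y_prev]
        te += (c / total) * math.log2(p_given_both / p_given_self)
    return te


if __name__ == "__main__":
    x = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]  # made-up binary sequence
    y = [0] + x[:-1]                    # y copies x with a one-step delay
    print(transfer_entropy(x, y))       # information flowing from x to y
    print(transfer_entropy([0] * 10, y))  # a constant source carries none
```

Counting estimators of this kind are biased for short sequences, so group comparisons on real data would normally use longer recordings or a bias-corrected estimator.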