1. Samaee M, Yazdi M, Massicotte D. Multi-modal signal integration for enhanced sleep stage classification: Leveraging EOG and 2-channel EEG data with advanced feature extraction. Artif Intell Med 2025;166:103152. [PMID: 40334525] [DOI: 10.1016/j.artmed.2025.103152]
Abstract
This paper introduces an innovative approach to sleep stage classification, leveraging a multi-modal signal integration framework encompassing Electrooculography (EOG) and two-channel electroencephalography (EEG) data. We explore the utility of various feature extraction techniques, including Short-Time Fourier Transform (STFT), Wavelet Transform, and raw signal processing, alongside the utilization of neural networks as feature extractors. This unique combination allows us to harness the benefits of traditional feature extraction methods while capitalizing on the power of neural networks to enhance classification performance. Our comprehensive classifier evaluation encompasses a range of models, including Long Short-Term Memory (LSTM) networks and XGBoost. Remarkably, our results reveal exceptional performance with the XGBoost classifier, achieving an overall accuracy of 84.57% and a macro-F1 score of 78.21% on the Sleep-EDF expanded dataset, and an overall accuracy of 86.02% and a macro-F1 score of 81.96% on the ISRUC-Sleep dataset. Class-specific accuracies highlight its proficiency, particularly in detecting wake and N2 stages, solidifying its superiority among the classifiers tested. This amalgamation of feature sets, complemented by Principal Component Analysis (PCA) for dimensionality reduction, underscores its significance in yielding top-tier classification outcomes. The integration of traditional feature extraction methods with neural networks as feature extractors creates a robust and comprehensive system for sleep stage classification, offering the advantages of both approaches to enhance the accuracy and reliability of the results.
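As a rough illustration of the pipeline this abstract describes, the sketch below extracts STFT band-power features from multi-channel 30-s epochs, reduces them with PCA, and trains an XGBoost classifier. All names, sizes, and hyperparameters are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch (not the authors' code): STFT log-power features from
# 30-s EOG + 2-channel EEG epochs, PCA reduction, XGBoost classification.
import numpy as np
from scipy.signal import stft
from sklearn.decomposition import PCA
from xgboost import XGBClassifier

FS = 100  # assumed sampling rate (Hz)

def epoch_features(epoch):
    """epoch: (n_channels, 3000) array -> flat STFT log-power feature vector."""
    feats = []
    for ch in epoch:
        _, _, Z = stft(ch, fs=FS, nperseg=256)
        power = np.abs(Z) ** 2
        feats.append(np.log1p(power.mean(axis=1)))  # mean power per frequency bin
    return np.concatenate(feats)

# Toy stand-ins: 200 epochs of (EOG + 2 EEG) channels, stage labels 0..4.
rng = np.random.default_rng(0)
X_epochs = rng.standard_normal((200, 3, 3000))
y = rng.integers(0, 5, 200)

X = np.stack([epoch_features(e) for e in X_epochs])
X = PCA(n_components=30).fit_transform(X)        # dimensionality reduction
clf = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
clf.fit(X, y)
print(clf.predict(X[:5]))
```

In the paper, such hand-crafted features are additionally combined with neural-network-derived features before classification.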
Affiliation(s)
- Mahdi Samaee
- Signal and Image Processing Laboratory, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran
- Mehran Yazdi
- Signal and Image Processing Laboratory, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran
- Daniel Massicotte
- Laboratory of Signal and System Integration, Department of Electrical and Computer Engineering, Université du Québec à Trois-Rivières, Trois-Rivières, Canada
2. Ren Z, Ma J, Ding Y. FlexibleSleepNet: A Model for Automatic Sleep Stage Classification Based on Multi-Channel Polysomnography. IEEE J Biomed Health Inform 2025;29:3488-3501. [PMID: 40030855] [DOI: 10.1109/jbhi.2025.3525626]
Abstract
In the task of automatic sleep stage classification, deep learning models often face the challenge of balancing temporal-spatial feature extraction with computational complexity. To address this issue, this study introduces FlexibleSleepNet, a lightweight convolutional neural network architecture designed around the Adaptive Feature Extraction (AFE) Module and Scale-Varying Compression (SVC) Module. Through multi-channel polysomnography data input and preprocessing, FlexibleSleepNet utilizes the AFE Module to capture intra-channel features and employs the SVC Module for channel feature compression and dimension expansion. The collaborative work of these modules enables the network to effectively capture temporal-spatial dependencies between channels. Additionally, the network extracts feature maps through four distinct stages, each at a different receptive field scale, culminating in precise sleep stage classification via a classification module. This study conducted k-fold cross-validation on three different databases: SleepEDF-20, SleepEDF-78, and SHHS. Experimental results show that FlexibleSleepNet demonstrates superior classification performance, achieving classification accuracies of 86.9% and 87.6% on the SleepEDF-20 and SHHS datasets, respectively. It performs particularly well on the SleepEDF-78 dataset, reaching a classification accuracy of 87.0%. It also significantly enhances computational efficiency while maintaining low computational complexity.
3. Cong Z, Zhao M, Gao H, Lou M, Zheng G, Wang Z, Wang X, Yan C, Ling L, Li J, Liu C. BiTS-SleepNet: An Attention-Based Two-Stage Temporal-Spectral Fusion Model for Sleep Staging With Single-Channel EEG. IEEE J Biomed Health Inform 2025;29:3366-3376. [PMID: 40030778] [DOI: 10.1109/jbhi.2024.3523908]
Abstract
Automated sleep staging is crucial for assessing sleep quality and diagnosing sleep-related diseases. Single-channel EEG has attracted significant attention due to its portability and accessibility. Most existing automated sleep staging methods often emphasize temporal information and neglect spectral information, the relationship between sleep stage contextual features, and transition rules between sleep stages. To overcome these obstacles, this paper proposes an attention-based two stage temporal-spectral fusion model (BiTS-SleepNet). The BiTS-SleepNet stage 1 network consists of a dual-stream temporal-spectral feature extractor branch and a temporal-spectral feature fusion module based on the cross-attention mechanism. These blocks are designed to autonomously extract and integrate the temporal and spectral features of EEG signals, leveraging temporal-spectral fusion information to discriminate between different sleep stages. The BiTS-SleepNet stage 2 network includes a feature context learning module (FCLM) based on Bi-GRU and a transition rules learning module (TRLM) based on the Conditional Random Field (CRF). The FCLM optimizes preliminary sleep stage results from the stage 1 network by learning dependencies between features of multiple adjacent stages. The TRLM additionally employs transition rules to optimize overall outcomes. We evaluated the BiTS-SleepNet on three public datasets: Sleep-EDF-20, Sleep-EDF-78, and SHHS, achieving accuracies of 88.50%, 85.09%, and 87.01%, respectively. The experimental results demonstrate that BiTS-SleepNet achieves competitive performance in comparison to recently published methods. This highlights its promise for practical applications.
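The transition-rules idea behind the TRLM can be pictured as decoding the stage-1 network's per-epoch probabilities against a stage-transition matrix. The sketch below is a generic Viterbi illustration of that mechanism, not the paper's CRF implementation; the transition matrix here is an assumed toy example.

```python
# Sketch of CRF-style transition-rule decoding (illustrative, not the paper's
# code): Viterbi over per-epoch stage probabilities with a transition matrix.
import numpy as np

def viterbi(log_emit, log_trans):
    """log_emit: (T, S) per-epoch log-probabilities; log_trans: (S, S)."""
    T, S = log_emit.shape
    dp = log_emit[0].copy()
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = dp[:, None] + log_trans      # rows: previous stage, cols: next
        back[t] = scores.argmax(axis=0)
        dp = scores.max(axis=0) + log_emit[t]
    path = [int(dp.argmax())]
    for t in range(T - 1, 0, -1):             # follow back-pointers
        path.append(back[t, path[-1]])
    return path[::-1]

rng = np.random.default_rng(1)
probs = rng.dirichlet(np.ones(5), size=20)        # toy stage-1 network outputs
trans = np.full((5, 5), 0.05) + np.eye(5) * 0.75  # favour staying in a stage
print(viterbi(np.log(probs), np.log(trans)))
```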
4. Chen J, Fan X, Ge R, Xiao J, Wang R, Ma W, Li Y. Towards interpretable sleep stage classification with a multi-stream fusion network. BMC Med Inform Decis Mak 2025;25:164. [PMID: 40229774] [PMCID: PMC11998347] [DOI: 10.1186/s12911-025-02995-9]
Abstract
Sleep stage classification is a significant measure in assessing sleep quality and diagnosing sleep disorders. Many researchers have investigated automatic sleep stage classification methods and achieved promising results. However, these methods ignore the fusion of heterogeneous spatial-temporal and spectral-temporal features across multi-channel sleep monitoring signals. In this study, we propose an interpretable multi-stream fusion network, named MSF-SleepNet, for sleep stage classification. Specifically, we employ Chebyshev graph convolution and temporal convolution to obtain the spatial-temporal features from body-topological information of sleep monitoring signals. Meanwhile, we utilize a short-time Fourier transform and gated recurrent unit to learn the spectral-temporal features from sleep monitoring signals. After fusing the spatial-temporal and spectral-temporal features, we use a contrastive learning scheme to enhance the differences in feature patterns of sleep monitoring signals across various sleep stages. Finally, LIME is employed to improve the interpretability of MSF-SleepNet. Experimental results on the ISRUC-S1 and ISRUC-S3 datasets show that MSF-SleepNet achieves competitive results and is superior to its state-of-the-art counterparts on most performance metrics.
Affiliation(s)
- Jingrui Chen
- Department of Information Management, Guangdong Justice Police Vocational College, Guangzhou, Guangdong, 510520, China
- Xiaomao Fan
- College of Big Data and Internet, Shenzhen Technology University, Shenzhen, Guangdong, 518055, China
- Ruiquan Ge
- School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, China
- Jing Xiao
- School of Computer Science, South China Normal University, Guangzhou, Guangdong, 510631, China
- Ruxin Wang
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, 518055, China
- Wenjun Ma
- School of Computer Science, South China Normal University, Guangzhou, Guangdong, 510631, China
- Ye Li
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, 518055, China
5. Massie F, Vits S, Verbraecken J, Bergmann J. The evaluation of a novel single-lead biopotential device for home sleep testing. Sleep 2025;48:zsae248. [PMID: 39441980] [PMCID: PMC11985384] [DOI: 10.1093/sleep/zsae248]
Abstract
STUDY OBJECTIVES This paper reports on the clinical evaluation of the sleep staging performance of a novel single-lead biopotential device. METHODS One hundred and thirty-three patients suspected of obstructive sleep apnea were included in a multi-site cohort. All patients underwent polysomnography and received the study device, a single-lead biopotential measurement device attached to the forehead. Clinical endpoint parameters were selected to evaluate the device's ability to determine sleep stages. Finally, the device's performance was compared to the clinical study results of comparable devices. RESULTS Concurrent PSG and study device data were successfully acquired for 106 of the 133 included patients. The results demonstrated significant similarity in overall sleep staging performance (five-stage Cohen's kappa of 0.70) to the best-performing reduced-lead biopotential device to which it was compared (five-stage Cohen's kappa of 0.73). Unlike the comparator devices, the study device reported a higher Cohen's kappa for rapid eye movement (REM) sleep (0.78) than for N3 (0.61), which can be explained by its particular measuring electrode placement (diagonally across the lateral cross-section of the eye). This placement was optimized to ensure the polarity of rapid eye movements could be adequately captured, enhancing the capacity to discriminate between N3 and REM sleep when using only a single-lead setup. CONCLUSIONS The results of this study demonstrate the feasibility of incorporating a single-lead biopotential extension in a reduced-channel home sleep apnea testing setup. Such incorporation could narrow the functionality gap between reduced-channel home sleep testing and in-lab polysomnography without compromising the patient's ease of use and comfort. CLINICAL TRIALS NCT05035992, A Validation Study of the NightOwl Head Sensor, https://clinicaltrials.gov/ct2/show/NCT05035992.
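For context, the five-stage Cohen's kappa reported above measures chance-corrected agreement between the device's hypnogram and the PSG reference. A minimal computation, with hypothetical label arrays, looks like this:

```python
# Illustrative five-stage Cohen's kappa between PSG reference scoring and a
# device's scoring (labels are hypothetical toy data).
from sklearn.metrics import cohen_kappa_score

stages = ["W", "N1", "N2", "N3", "REM"]
psg    = ["W", "N2", "N2", "N3", "REM", "N2", "W", "N1"]   # reference epochs
device = ["W", "N2", "N2", "REM", "REM", "N2", "W", "N2"]  # device epochs
# kappa = (p_observed - p_chance) / (1 - p_chance)
print(cohen_kappa_score(psg, device, labels=stages))
```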
Affiliation(s)
- Frederik Massie
- Natural Interaction Lab, Thom Building, Department of Engineering, University of Oxford, Oxford, UK
- Steven Vits
- Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium
- Johan Verbraecken
- Department of Pulmonary Medicine and Multidisciplinary Sleep Disorders Centre, Antwerp University Hospital, Edegem, Belgium and Research Group LEMP, Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium
- Jeroen Bergmann
- Natural Interaction Lab, Thom Building, Department of Engineering, University of Oxford, Oxford, UK
- Department of Technology and Innovation, University of Southern Denmark, Odense, Denmark
6. Huang Y, Chen Y, Xu S, Wu D, Wu X. Self-Supervised Learning with Adaptive Frequency-Time Attention Transformer for Seizure Prediction and Classification. Brain Sci 2025;15:382. [PMID: 40309845] [PMCID: PMC12025975] [DOI: 10.3390/brainsci15040382]
Abstract
BACKGROUND In deep learning-based epilepsy prediction and classification, enhancing the extraction of electroencephalogram (EEG) features is crucial for improving model accuracy. Traditional supervised learning methods rely on large, detailed annotated datasets, limiting the feasibility of large-scale training. Recently, self-supervised learning approaches using masking-and-reconstruction strategies have emerged, reducing dependence on labeled data. However, these methods are vulnerable to inherent noise and signal degradation in EEG data, which diminishes feature extraction robustness and overall model performance. METHODS In this study, we proposed a self-supervised learning Transformer network enhanced with Adaptive Frequency-Time Attention (AFTA) for learning robust EEG feature representations from unlabeled data, utilizing a masking-and-reconstruction framework. Specifically, we pretrained the Transformer network using a self-supervised learning approach, and subsequently fine-tuned the pretrained model for downstream tasks like seizure prediction and classification. To mitigate the impact of inherent noise in EEG signals and enhance feature extraction capabilities, we incorporated AFTA into the Transformer architecture. AFTA incorporates an Adaptive Frequency Filtering Module (AFFM) to perform adaptive global and local filtering in the frequency domain. This module was then integrated with temporal attention mechanisms, enhancing the model's self-supervised learning capabilities. RESULTS Our method achieved exceptional performance in EEG analysis tasks, consistently outperforming state-of-the-art approaches across the TUSZ, TUAB, and TUEV datasets and achieving the highest AUROC (0.891), balanced accuracy (0.8002), weighted F1-score (0.8038), and Cohen's kappa (0.6089). These results validate its robustness, generalization, and effectiveness in seizure detection and classification tasks on diverse EEG datasets.
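The masking-and-reconstruction strategy referenced here reduces to a simple recipe: hide random time points and train a network to fill them in. The sketch below shows one generic pretraining step with a stand-in convolutional encoder, not the paper's AFTA Transformer; all sizes are toy assumptions.

```python
# Minimal masking-and-reconstruction pretraining step (generic sketch).
import torch
import torch.nn as nn

B, C, T = 8, 1, 1000                   # batch, channels, samples (toy sizes)
encoder = nn.Sequential(nn.Conv1d(C, 16, 7, padding=3), nn.GELU(),
                        nn.Conv1d(16, C, 7, padding=3))  # stand-in network
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

x = torch.randn(B, C, T)
mask = torch.rand(B, 1, T) < 0.3       # mask ~30% of time points
x_masked = x.masked_fill(mask, 0.0)

recon = encoder(x_masked)
loss = ((recon - x)[mask.expand_as(x)] ** 2).mean()  # MSE on masked region only
opt.zero_grad()
loss.backward()
opt.step()
print(float(loss))
```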
Affiliation(s)
- Xunyi Wu
- Department of Neurology, Huashan Hospital, Fudan University, Shanghai 200040, China; (Y.H.); (Y.C.); (S.X.); (D.W.)
7. Yang W, Wang X, Qi W, Wang W. LGFormer: integrating local and global representations for EEG decoding. J Neural Eng 2025;22:026042. [PMID: 40138736] [DOI: 10.1088/1741-2552/adc5a3]
Abstract
Objective. Electroencephalography (EEG) decoding is challenging because of its temporal variability and low signal-to-noise ratio, which complicate the extraction of meaningful information from signals. Although convolutional neural networks (CNNs) effectively extract local features from EEG signals, they are constrained by restricted receptive fields. In contrast, transformers excel at capturing global dependencies through self-attention mechanisms but often require extensive training data and computational resources, which limits their efficiency on EEG datasets with limited samples. Approach. In this paper, we propose LGFormer, a hybrid network designed to efficiently learn both local and global representations for EEG decoding. LGFormer employs a deep attention module to extract global information from EEG signals, dynamically adjusting the focus of CNNs. Subsequently, LGFormer incorporates a local-enhanced transformer, combining the strengths of CNNs and transformers to achieve multiscale perception from local to global. Despite integrating multiple advanced techniques, LGFormer maintains a lightweight design and training efficiency. Main results. LGFormer achieves state-of-the-art performance within 200 training epochs across four public datasets, including motor imagery, cognitive workload, and error-related negativity decoding tasks. Additionally, we propose a novel spatial and temporal attention visualization method, revealing that LGFormer captures discriminative spatial and temporal features, enhancing model interpretability and providing insights into its decision-making process. Significance. In summary, LGFormer demonstrates superior performance while maintaining high training efficiency across different tasks, highlighting its potential as a versatile and practical model for EEG decoding.
Affiliation(s)
- Wenjie Yang
- CAS Key Laboratory of Space Manufacturing Technology, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100094, People's Republic of China
- University of Chinese Academy of Sciences, Beijing 100049, People's Republic of China
- Xingfu Wang
- CAS Key Laboratory of Space Manufacturing Technology, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100094, People's Republic of China
- University of Chinese Academy of Sciences, Beijing 100049, People's Republic of China
- Wenxia Qi
- CAS Key Laboratory of Space Manufacturing Technology, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100094, People's Republic of China
- University of Chinese Academy of Sciences, Beijing 100049, People's Republic of China
- Wei Wang
- CAS Key Laboratory of Space Manufacturing Technology, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100094, People's Republic of China
- University of Chinese Academy of Sciences, Beijing 100049, People's Republic of China
8. Zhong Y, Zhou W, Tao L. Source-free time series domain adaptation with wavelet-based multi-scale temporal imputation. Neural Netw 2025;188:107428. [PMID: 40184865] [DOI: 10.1016/j.neunet.2025.107428]
Abstract
Recent works on source-free domain adaptation (SFDA) for time series reveal the effectiveness of learning domain-invariant temporal dynamics on improving the cross-domain performance of the model. However, existing SFDA methods for time series mainly focus on modeling the original sequence, lacking the utilization of the multi-scale properties of time series. This may result in insufficient extraction of domain-invariant temporal patterns. Furthermore, previous multi-scale analysis methods typically ignore important frequency domain information during multi-scale division, leading to the limited ability for multi-scale time series modeling. To this end, we propose LEMON, a novel SFDA method for time series with wavelet-based multi-scale temporal imputation. It utilizes the discrete wavelet transform to decompose a time series into multiple scales, each with a distinct time-frequency resolution and specific frequency range, enabling full-spectrum utilization. To effectively transfer multi-scale temporal dynamics from the source domain to the target domain, we introduce a multi-scale temporal imputation module which assigns a deep neural network to perform the temporal imputation task on the sequence at each scale, learning scale-specific domain-invariant information. We further design an energy-based multi-scale weighting strategy, which adaptively integrates information from multiple scales based on the frequency distribution of the input data to improve the transfer performance of the model. Extensive experiments on three real-world time series datasets demonstrate that LEMON significantly outperforms the state-of-the-art methods, achieving an average improvement of 4.45% in accuracy and 6.29% in MF1-score.
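The decomposition step is straightforward with PyWavelets. The sketch below shows a four-level discrete wavelet transform yielding one coefficient sequence per scale; the per-scale imputation networks and energy-based weighting are omitted, and the 'db4' mother wavelet is an assumption, not the paper's stated choice.

```python
# Multi-scale decomposition of a time series via the discrete wavelet
# transform (illustrative sketch using PyWavelets).
import numpy as np
import pywt

rng = np.random.default_rng(0)
x = rng.standard_normal(3000)                     # one univariate time series

coeffs = pywt.wavedec(x, wavelet="db4", level=4)  # [cA4, cD4, cD3, cD2, cD1]
for name, c in zip(["A4", "D4", "D3", "D2", "D1"], coeffs):
    print(name, c.shape)                          # one scale per frequency band

x_rec = pywt.waverec(coeffs, wavelet="db4")       # full-spectrum reconstruction
print(np.allclose(x, x_rec[: len(x)]))            # perfect reconstruction
```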
Affiliation(s)
- Yingyi Zhong
- School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, 100876, China
- Wen'an Zhou
- School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, 100876, China
- Liwen Tao
- School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, 100876, China
9. Deng G, Niu M, Luo Y, Rao S, Xie J, Yu Z, Liu W, Zhao S, Pan G, Li X, Deng W, Guo W, Li T, Jiang H. A Unified Flexible Large Polysomnography Model for Sleep Staging and Mental Disorder Diagnosis. medRxiv [Preprint] 2025:2024.12.11.24318815. [PMID: 39711704] [PMCID: PMC11661386] [DOI: 10.1101/2024.12.11.24318815]
Abstract
Sleep quality is vital to human health, yet automated sleep staging faces challenges in cross-center generalization due to data scarcity and domain gaps. Traditional scoring is labor-intensive, while deep learning models often fail to generalize across datasets. Here, we present LPSGM, a unified and flexible large polysomnography (PSG) model designed to enhance cross-center generalization in sleep staging and enable fine-tuning for disease diagnosis. Trained on 220,500 hours of PSG data from 16 public datasets, LPSGM integrates domain-adaptive learning and supports variable-channel configurations, achieving performance comparable to models trained directly on target-center data. In a prospective clinical study, LPSGM matches expert-level accuracy with lower variability. When fine-tuned, it attains 88.01% accuracy in narcolepsy detection and 100% in depression detection. These results establish LPSGM as a scalable, plug-and-play solution for automated PSG analysis, bridging the gap between sleep staging and clinical deployment.
Affiliation(s)
- Guifeng Deng
- Affiliated Mental Health Center & Hangzhou Seventh People’s Hospital, School of Brain Science and Brain Medicine, and Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China
- College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou, 310058, China
- Mengfan Niu
- Affiliated Mental Health Center & Hangzhou Seventh People’s Hospital, School of Brain Science and Brain Medicine, and Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China
- Yuxi Luo
- School of Biomedical Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518100, China
- Shuying Rao
- Affiliated Mental Health Center & Hangzhou Seventh People’s Hospital, School of Brain Science and Brain Medicine, and Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China
- College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou, 310058, China
- Junyi Xie
- Affiliated Mental Health Center & Hangzhou Seventh People’s Hospital, School of Brain Science and Brain Medicine, and Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China
- Zhenghe Yu
- Affiliated Mental Health Center & Hangzhou Seventh People’s Hospital, School of Brain Science and Brain Medicine, and Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China
- Wenjuan Liu
- Affiliated Mental Health Center & Hangzhou Seventh People’s Hospital, School of Brain Science and Brain Medicine, and Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China
- Sha Zhao
- MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Lab of Brain-Machine Intelligence, Zhejiang University, Hangzhou 311121, China
- Gang Pan
- MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Lab of Brain-Machine Intelligence, Zhejiang University, Hangzhou 311121, China
- Xiaojing Li
- Affiliated Mental Health Center & Hangzhou Seventh People’s Hospital, School of Brain Science and Brain Medicine, and Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China
- MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Lab of Brain-Machine Intelligence, Zhejiang University, Hangzhou 311121, China
- Wei Deng
- Affiliated Mental Health Center & Hangzhou Seventh People’s Hospital, School of Brain Science and Brain Medicine, and Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China
- MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Lab of Brain-Machine Intelligence, Zhejiang University, Hangzhou 311121, China
- Wanjun Guo
- Affiliated Mental Health Center & Hangzhou Seventh People’s Hospital, School of Brain Science and Brain Medicine, and Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China
- MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Lab of Brain-Machine Intelligence, Zhejiang University, Hangzhou 311121, China
- Tao Li
- Affiliated Mental Health Center & Hangzhou Seventh People’s Hospital, School of Brain Science and Brain Medicine, and Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China
- MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Lab of Brain-Machine Intelligence, Zhejiang University, Hangzhou 311121, China
- Haiteng Jiang
- Affiliated Mental Health Center & Hangzhou Seventh People’s Hospital, School of Brain Science and Brain Medicine, and Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China
- MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Lab of Brain-Machine Intelligence, Zhejiang University, Hangzhou 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou 310058, China
10. Xu X, Zhang B, Xu T, Tang J. An Effective and Interpretable Sleep Stage Classification Approach Using Multi-Domain Electroencephalogram and Electrooculogram Features. Bioengineering (Basel) 2025;12:286. [PMID: 40150750] [PMCID: PMC11939799] [DOI: 10.3390/bioengineering12030286]
Abstract
Accurate sleep staging is critical for assessing sleep quality and diagnosing sleep disorders. Recent research efforts on automated sleep staging have focused on complex deep learning architectures that have achieved modest improvements in classification accuracy but have limited real-world applicability due to the complexity of model training and deployment and a lack of interpretability. This paper presents an effective and interpretable sleep staging scheme that follows a classical machine learning pipeline. Multi-domain features were extracted from preprocessed electroencephalogram (EEG) signals, and novel electrooculogram (EOG) features were created to characterize different sleep stages. A two-step feature selection strategy combining F-score pre-filtering and XGBoost feature ranking was designed to select the most discriminating feature subset, which was then fed into an XGBoost model for sleep stage classification. Through a rigorous double-cross-validation procedure, our approach achieved competitive classification performance on the public Sleep-EDF dataset (accuracy 87.0%, F1-score 86.6%, Kappa coefficient 0.81) compared with the state-of-the-art deep learning methods and provided interpretability through feature importance analysis. These promising results demonstrate the effectiveness of the proposed sleep staging model and show its potential in practical applications due to its low complexity, interpretability, and transparency.
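A minimal sketch of the two-step selection strategy follows, assuming generic feature matrices; the sizes and k values are illustrative, not the paper's settings.

```python
# Two-step feature selection sketch: ANOVA F-score pre-filtering, then
# XGBoost importance ranking (illustrative sizes and thresholds).
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 120))   # 120 multi-domain EEG/EOG features
y = rng.integers(0, 5, 300)           # five sleep stages

# Step 1: keep the 60 features with the highest F-scores.
pre = SelectKBest(f_classif, k=60).fit(X, y)
X_pre = pre.transform(X)

# Step 2: rank the survivors by XGBoost importance and keep the top 20.
model = XGBClassifier(n_estimators=100, max_depth=3).fit(X_pre, y)
top20 = np.argsort(model.feature_importances_)[::-1][:20]
X_sel = X_pre[:, top20]
print(X_sel.shape)
```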
Affiliation(s)
- Tingting Xu
- School of Communication and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China; (X.X.); (B.Z.); (J.T.)
11. Li H, Wang Y, Fu P. A Novel Multi-Dynamic Coupled Neural Mass Model of SSVEP. Biomimetics (Basel) 2025;10:171. [PMID: 40136825] [PMCID: PMC11940536] [DOI: 10.3390/biomimetics10030171]
Abstract
Steady-state visual evoked potential (SSVEP)-based brain-computer interfaces (BCIs) leverage high-speed neural synchronization to visual flicker stimuli for efficient device control. While SSVEP-BCIs minimize user training requirements, their dependence on physical EEG recordings introduces challenges, such as inter-subject variability, signal instability, and experimental complexity. To overcome these limitations, this study proposes a novel neural mass model for SSVEP simulation by integrating frequency response characteristics with dual-region coupling mechanisms. Specific parallel linear transformation functions were designed based on SSVEP frequency responses, and weight coefficient matrices were determined according to the frequency band energy distribution under different visual stimulation frequencies in the pre-recorded SSVEP signals. A coupled neural mass model was constructed by establishing connections between occipital and parietal regions, with parameters optimized through particle swarm optimization to accommodate individual differences and neuronal density variations. Experimental results demonstrate that the model achieved a high-precision simulation of real SSVEP signals across multiple stimulation frequencies (10 Hz, 11 Hz, and 12 Hz), with maximum errors decreasing from 2.2861 to 0.8430 as frequency increased. The effectiveness of the model was further validated through the real-time control of an Arduino car, where simulated SSVEP signals were successfully classified by the advanced FPF-net model and mapped to control commands. This research not only advances our understanding of SSVEP neural mechanisms but also releases the user from the brain-controlled coupling system, thus providing a practical framework for developing more efficient and reliable BCI-based systems.
Affiliation(s)
- Hongqi Li
- School of Software, Northwestern Polytechnical University, Xi’an 710072, China
- Yangtze River Delta Research Institute, Northwestern Polytechnical University, Taicang 215400, China
- Yujuan Wang
- School of Software, Northwestern Polytechnical University, Xi’an 710072, China
- Yangtze River Delta Research Institute, Northwestern Polytechnical University, Taicang 215400, China
- Peirong Fu
- Huawei Technologies Co., Ltd., Shenzhen 310051, China
12. Phang CR, Hirata A. Explainable multiscale temporal convolutional neural network model for sleep stage detection based on electroencephalogram activities. J Neural Eng 2025;22:026010. [PMID: 39983236] [DOI: 10.1088/1741-2552/adb90c]
Abstract
Objective. Humans spend a significant portion of their lives in sleep (an essential driver of body metabolism). Moreover, as sleep deprivation can cause various health complications, it is crucial to develop an automatic sleep stage detection model to facilitate the tedious manual labeling process. Notably, recently proposed sleep staging algorithms lack model explainability and still require performance improvement. Approach. We implemented multiscale neurophysiology-mimicking kernels to capture sleep-related electroencephalogram (EEG) activities at varying frequencies and temporal lengths; the implemented model was named 'multiscale temporal convolutional neural network (MTCNN)'. Further, we evaluated its performance using an open-source dataset (Sleep-EDF Database Expanded, comprising 153 days of polysomnogram data). Main results. By investigating the learned kernel weights, we observed that MTCNN detected the EEG activities specific to each sleep stage, such as the characteristic frequencies, K-complexes, and sawtooth waves. Furthermore, regarding the characterization of these neurophysiologically significant features, MTCNN demonstrated an overall accuracy (OAcc) of 91.12% and a Cohen kappa coefficient of 0.86 in the cross-subject paradigm. Notably, it demonstrated an OAcc of 88.24% and a Cohen kappa coefficient of 0.80 in the leave-few-days-out analysis. Our MTCNN model also outperformed existing deep learning models in sleep stage classification even when it was trained with only 16% of the total EEG data, achieving an OAcc of 85.62% and a Cohen kappa coefficient of 0.75 on the remaining 84% of testing data. Significance. The proposed MTCNN enables model explainability and can be trained with a smaller amount of data, which benefits real-world application because large amounts of training data are often not readily available.
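The multiscale-kernel idea can be illustrated with a bank of parallel 1-D convolutions whose kernel lengths span different EEG time scales. The layout below is an assumed simplification, not the authors' MTCNN; channel counts and kernel sizes are illustrative.

```python
# Sketch of a multiscale temporal kernel bank: parallel 1-D convolutions with
# kernel lengths matched to different EEG rhythms, concatenated channel-wise.
import torch
import torch.nn as nn

class MultiscaleBlock(nn.Module):
    def __init__(self, in_ch=1, out_ch=8, kernel_sizes=(25, 51, 101, 201)):
        super().__init__()
        # At 100 Hz, a 201-sample kernel spans ~2 s (slow waves), while a
        # 25-sample kernel spans ~0.25 s (faster activity).
        self.branches = nn.ModuleList(
            [nn.Conv1d(in_ch, out_ch, k, padding=k // 2) for k in kernel_sizes]
        )

    def forward(self, x):                 # x: (batch, in_ch, time)
        return torch.cat([b(x) for b in self.branches], dim=1)

x = torch.randn(4, 1, 3000)               # four 30-s single-channel epochs
print(MultiscaleBlock()(x).shape)         # -> torch.Size([4, 32, 3000])
```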
Affiliation(s)
- Chun-Ren Phang
- Department of Electrical and Mechanical Engineering, and the Center of Biomedical Physics and Information Technology, Nagoya Institute of Technology, Gokiso-cho, Showa-ku, Nagoya 466-8555 Aichi, Japan
- DNAKE BCI Lab, Brain-Computer Interaction Business Unit, DNAKE (Xiamen) Intelligent Technology CO., LTD, Xiamen, Fujian, People's Republic of China
- Akimasa Hirata
- Department of Electrical and Mechanical Engineering, and the Center of Biomedical Physics and Information Technology, Nagoya Institute of Technology, Gokiso-cho, Showa-ku, Nagoya 466-8555 Aichi, Japan
13. Liu R, Wen S, Xing Y. An integrated approach for advanced vehicle classification. PLoS One 2025;20:e0318530. [PMID: 39965022] [PMCID: PMC11835343] [DOI: 10.1371/journal.pone.0318530]
Abstract
This study is dedicated to addressing the trade-off between receptive field size and computational efficiency in low-level vision. Convolutional neural networks (CNNs) usually expand the receptive field by adding layers or dilated filtering, which often leads to high computational costs. Although dilated filtering was introduced to reduce the computational burden, the resulting receptive field is only a sparse, checkerboard-like sampling of the input image due to the gridding effect. To better trade off receptive field size against computational efficiency, a new multilevel discrete wavelet CNN model (DWAN) is proposed in this paper. The DWAN introduces a four-level discrete wavelet transform into the convolutional neural network architecture and combines it with a Convolutional Block Attention Module (CBAM) to efficiently capture multiscale feature information. By reducing the size of the feature maps in the shrinkage subnetwork, DWAN achieves wider receptive field coverage while maintaining a smaller computational cost, thus improving the performance and efficiency of visual tasks. In addition, this paper validates the DWAN model on an image classification task targeting fine-grained automobile categories. Significant performance gains are observed by training and testing the DWAN architecture that includes CBAM. The DWAN model can identify and accurately classify subtle features and differences in automotive images, resulting in better classification results for fine-grained automobile categories. This validation further demonstrates the effectiveness and robustness of the DWAN model in vision tasks and lays a solid foundation for its generalization to practical applications.
Affiliation(s)
- Rui Liu
- College of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu, Sichuan, China
- Shiyuan Wen
- College of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu, Sichuan, China
- Yufei Xing
- College of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu, Sichuan, China
14. Wan C, Nnamdi MC, Shi W, Smith B, Purnell C, Wang MD. Advancing Sleep Disorder Diagnostics: A Transformer-Based EEG Model for Sleep Stage Classification and OSA Prediction. IEEE J Biomed Health Inform 2025;29:878-886. [PMID: 40030422] [DOI: 10.1109/jbhi.2024.3512616]
Abstract
Sleep disorders, particularly Obstructive Sleep Apnea (OSA), have a considerable effect on an individual's health and quality of life. Accurate sleep stage classification and prediction of OSA are crucial for timely diagnosis and effective management of sleep disorders. In this study, we develop a sequential network that enhances sleep stage classification by incorporating self-attention mechanisms and Conditional Random Fields (CRF) into a deep learning model comprising multi-kernel Convolutional Neural Networks (CNNs) and Transformer-based encoders. The self-attention mechanism enables the model to focus on the most discriminative features extracted from single-channel electroencephalography (EEG) recordings, while the CRF module captures the temporal dependencies between sleep stages, improving the model's ability to learn more plausible sleep stage sequences. Moreover, we explore the relationship between sleep stages and OSA severity by utilizing the predicted sleep stage features to train various regression models for Apnea-Hypopnea Index (AHI) prediction. Our experiments demonstrate an improved sleep stage classification performance of 78.7%, particularly on datasets with diverse AHI values, and highlight the potential of leveraging sleep stage information for monitoring OSA. By employing advanced deep learning techniques, we thoroughly explore the intricate relationship between sleep stages and sleep apnea, laying the foundation for more precise and automated diagnostics of sleep disorders.
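For background, the AHI targeted by the regression models is simply the number of apnea and hypopnea events per hour of sleep, with standard clinical severity cut-offs at 5, 15, and 30 events/hour:

```python
# AHI computation and the standard OSA severity bands (background sketch).
def ahi(n_apneas: int, n_hypopneas: int, total_sleep_hours: float) -> float:
    return (n_apneas + n_hypopneas) / total_sleep_hours

def osa_severity(ahi_value: float) -> str:
    if ahi_value < 5:
        return "normal"
    if ahi_value < 15:
        return "mild"
    if ahi_value < 30:
        return "moderate"
    return "severe"

print(ahi(40, 25, 6.5))                 # 10.0 events/hour
print(osa_severity(ahi(40, 25, 6.5)))   # 'mild'
```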
15. Pan J, Yu Y, Li M, Wei W, Chen S, Zheng H, He Y, Li Y. A Multimodal Consistency-Based Self-Supervised Contrastive Learning Framework for Automated Sleep Staging in Patients With Disorders of Consciousness. IEEE J Biomed Health Inform 2025;29:1320-1332. [PMID: 39471113] [DOI: 10.1109/jbhi.2024.3487657]
Abstract
Sleep is a fundamental human activity, and automated sleep staging holds considerable investigational potential. Despite numerous deep learning methods proposed for sleep staging that exhibit notable performance, several challenges remain unresolved, including inadequate representation and generalization capabilities, limitations in multimodal feature extraction, the scarcity of labeled data, and the restricted practical application for patients with disorders of consciousness (DOC). This paper proposes MultiConsSleepNet, a multimodal consistency-based sleep staging network. This network comprises a unimodal feature extractor and a multimodal consistency feature extractor, aiming to explore universal representations of electroencephalograms (EEGs) and electrooculograms (EOGs) and extract the consistency of intra- and intermodal features. Additionally, self-supervised contrastive learning strategies are designed for unimodal and multimodal consistency learning to address the current situation in clinical practice, where high-quality labeled data are difficult to obtain but unlabeled data are abundant. This effectively alleviates the model's dependence on labeled data and improves its generalizability for effective transfer to DOC patients. Experimental results on three publicly available datasets demonstrate that MultiConsSleepNet achieves state-of-the-art performance in sleep staging with limited labeled data and effectively utilizes unlabeled data, enhancing its practical applicability. Furthermore, the proposed model yields promising results on a self-collected DOC dataset, offering a novel perspective for sleep staging research in patients with DOC.
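A common way to realize such multimodal consistency objectives is an InfoNCE-style contrastive loss in which matched EEG/EOG epochs form positive pairs and all other pairs in the batch are negatives. The sketch below shows that generic loss only, not the paper's exact training scheme; embedding sizes are toy assumptions.

```python
# Generic InfoNCE contrastive loss over paired EEG/EOG epoch embeddings.
import torch
import torch.nn.functional as F

def info_nce(z_eeg, z_eog, temperature=0.1):
    z_eeg = F.normalize(z_eeg, dim=1)
    z_eog = F.normalize(z_eog, dim=1)
    logits = z_eeg @ z_eog.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(z_eeg.size(0))      # positives on the diagonal
    return F.cross_entropy(logits, targets)

z_eeg = torch.randn(16, 64)   # toy embeddings from an EEG encoder
z_eog = torch.randn(16, 64)   # toy embeddings from an EOG encoder
print(float(info_nce(z_eeg, z_eog)))
```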
16. Pei Y, Xu J, Yu F, Zhang L, Luo W. WaveSleepNet: An Interpretable Network for Expert-Like Sleep Staging. IEEE J Biomed Health Inform 2025;29:1371-1382. [PMID: 40030379] [DOI: 10.1109/jbhi.2024.3498871]
Abstract
Although deep learning algorithms have proven their efficiency in automatic sleep staging, their "black-box" nature has limited their clinical adoption. In this study, we propose WaveSleepNet, an interpretable neural network for sleep staging that reasons in a similar way to sleep clinical experts. In this network, we utilize the latent space representations generated during training to identify characteristic wave prototypes corresponding to different sleep stages. The feature representation of an input signal is segmented into patches within the latent space, each of which is compared against the learned wave prototypes. The proximity between these patches and the wave prototypes is quantified through scores, indicating the prototypes' presence and relative proportion within the signal. These scores serve as the decision-making criteria for final sleep staging. During training, an ensemble of loss functions is employed to promote the prototypes' diversity and robustness. Furthermore, the learned wave prototypes are visualized by analyzing occlusion sensitivity. The efficacy of WaveSleepNet is validated across three public datasets, achieving sleep staging performance on par with that of state-of-the-art models. A detailed case study examining the decision-making process of WaveSleepNet demonstrates that it aligns closely with American Academy of Sleep Medicine (AASM) manual guidelines. Another case study systematically explains the reasons behind misidentifications of each sleep stage. WaveSleepNet's transparent process provides specialists with direct access to the physiological significance of the model's criteria, allowing for future validation, adoption and further enrichment by sleep clinical experts.
17. Xiong H, Yan Y, Chen Y, Liu J. Graph convolution network-based EEG signal analysis: a review. Med Biol Eng Comput 2025. [PMID: 39883372] [DOI: 10.1007/s11517-025-03295-0]
Abstract
With the advancement of artificial intelligence technology, increasingly effective methods are being used to identify and classify electroencephalography (EEG) signals to address challenges in the healthcare and brain-computer interface fields. This paper reviews the applications and major achievements of Graph Convolution Network (GCN) techniques in EEG signal analysis. Through an exhaustive search of the published literature, a module-by-module discussion is carried out for the first time to address the current research status of GCN. An exhaustive classification of methods and a systematic analysis of key modules, such as brain map construction, node feature extraction, and GCN architecture design, are presented. In addition, we pay special attention to several key research issues related to GCN. This review enhances the understanding of the future potential of GCN in the field of EEG signal analysis. At the same time, several valuable development directions are identified for researchers in related fields, such as analysing the applicability of different GCN layers, building task-oriented GCN models, and improving adaptation to limited data.
Affiliation(s)
- Hui Xiong
- School of Control Science and Engineering, Tiangong University, Tianjin, 300387, China
- Key Laboratory of Intelligent Control of Electrical Equipment, Tiangong University, Tianjin, 300387, China
- Yan Yan
- Key Laboratory of Intelligent Control of Electrical Equipment, Tiangong University, Tianjin, 300387, China
- School of Artificial Intelligence, Tiangong University, Tianjin, 300387, China
- Yimei Chen
- School of Control Science and Engineering, Tiangong University, Tianjin, 300387, China
- Jinzhen Liu
- School of Control Science and Engineering, Tiangong University, Tianjin, 300387, China
- Key Laboratory of Intelligent Control of Electrical Equipment, Tiangong University, Tianjin, 300387, China
18. Jiao Y, He X. Recognizing drivers' sleep onset by detecting slow eye movement using a parallel multimodal one-dimensional convolutional neural network. Comput Methods Biomech Biomed Engin 2025:1-15. [PMID: 39877998] [DOI: 10.1080/10255842.2025.2456996]
Abstract
Slow eye movements (SEMs) are a reliable physiological marker of drivers' sleep onset, often accompanied by EEG alpha wave attenuation. A parallel multimodal 1D convolutional neural network (PM-1D-CNN) model is proposed to classify SEMs. The model uses two parallel 1D-CNN blocks to extract features from EOG and EEG signals, which are then fused and fed into fully connected layers for classification. Results show that the PM-1D-CNN outperforms the SGL-1D-CNN and Bimodal-LSTM networks in both subject-to-subject and cross-subject evaluations, confirming its effectiveness in detecting sleep onset.
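A minimal sketch of the parallel two-branch design described above follows; channel counts and layer sizes are assumptions, not the paper's architecture.

```python
# Parallel two-branch 1-D CNN sketch: separate EOG and EEG feature extractors,
# fused by concatenation before fully connected classification layers.
import torch
import torch.nn as nn

def branch(in_ch):
    return nn.Sequential(
        nn.Conv1d(in_ch, 16, 7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
        nn.Conv1d(16, 32, 7, padding=3), nn.ReLU(), nn.AdaptiveAvgPool1d(1),
        nn.Flatten(),
    )

class PM1DCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.eog = branch(1)     # EOG branch
        self.eeg = branch(1)     # EEG branch
        self.head = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 2))

    def forward(self, eog, eeg):
        fused = torch.cat([self.eog(eog), self.eeg(eeg)], dim=1)
        return self.head(fused)  # SEM vs. non-SEM logits

model = PM1DCNN()
print(model(torch.randn(4, 1, 1000), torch.randn(4, 1, 1000)).shape)  # (4, 2)
```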
Affiliation(s)
- Yingying Jiao
- School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, Changzhou University, Changzhou, P.R. China
- Xiujin He
- School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, Changzhou University, Changzhou, P.R. China
19. Lee H, Choi YR, Lee HK, Jeong J, Hong J, Shin HW, Kim HS. Explainable vision transformer for automatic visual sleep staging on multimodal PSG signals. NPJ Digit Med 2025;8:55. [PMID: 39863774] [PMCID: PMC11762271] [DOI: 10.1038/s41746-024-01378-0]
Abstract
Polysomnography (PSG) is crucial for diagnosing sleep disorders, but manual scoring of PSG is time-consuming and subjective, leading to high variability. While machine-learning models have improved PSG scoring, their clinical use is hindered by their 'black-box' nature. In this study, we present SleepXViT, an automatic sleep staging system using a Vision Transformer (ViT) that provides intuitive, consistent explanations by mimicking human 'visual scoring'. Tested on KISS, a PSG image dataset from 7745 patients across four hospitals, SleepXViT achieved a Macro F1 score of 81.94%, outperforming baseline models and showing robust performance on the public datasets SHHS1 and SHHS2. Furthermore, SleepXViT offers well-calibrated confidence scores, enabling expert review for low-confidence predictions, alongside high-resolution heatmaps highlighting essential features and relevance scores for adjacent epochs' influence on sleep stage predictions. Together, these explanations reinforce the scoring consistency of SleepXViT, making it both reliable and interpretable and thereby facilitating synergy between the AI model and human scorers in clinical settings.
Affiliation(s)
- Hyojin Lee
- Graduate School of Data Science, Seoul National University, Seoul, Republic of Korea
- You Rim Choi
- Graduate School of Data Science, Seoul National University, Seoul, Republic of Korea
- Hyun Kyung Lee
- Obstructive Upper Airway Research (OUaR) Laboratory, Department of Pharmacology, Seoul National University College of Medicine, Seoul, Republic of Korea
- Department of Biomedical Sciences, Seoul National University Graduate School, Seoul, Republic of Korea
- Jaemin Jeong
- Department of Computer Engineering, School of Software, Hallym University, Chuncheon, Republic of Korea
- Joopyo Hong
- Graduate School of Data Science, Seoul National University, Seoul, Republic of Korea
- Hyun-Woo Shin
- Obstructive Upper Airway Research (OUaR) Laboratory, Department of Pharmacology, Seoul National University College of Medicine, Seoul, Republic of Korea
- OUaR LaB, Inc, Seoul, Republic of Korea
- Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University Hospital, Seoul, Republic of Korea
- Sensory Organ Research Institute, Seoul National University Medical Research Center, Seoul, Republic of Korea
- Department of Biomedical Sciences, Seoul National University Graduate School, Seoul, Republic of Korea
- Hyung-Sin Kim
- Graduate School of Data Science, Seoul National University, Seoul, Republic of Korea
20. Feng X, Guo Z, Kwong S. ID3RSNet: cross-subject driver drowsiness detection from raw single-channel EEG with an interpretable residual shrinkage network. Front Neurosci 2025;18:1508747. [PMID: 39844854] [PMCID: PMC11751225] [DOI: 10.3389/fnins.2024.1508747]
Abstract
Accurate monitoring of drowsy driving through electroencephalography (EEG) can effectively reduce traffic accidents. Developing a calibration-free drowsiness detection system with single-channel EEG alone is very challenging due to the non-stationarity of EEG signals, the heterogeneity among individuals, and the relatively sparse information available compared with multi-channel EEG. Although deep learning-based approaches can effectively decode EEG signals, most deep learning models lack interpretability due to their black-box nature. To address these issues, we propose a novel interpretable residual shrinkage network, namely ID3RSNet, for cross-subject driver drowsiness detection using single-channel EEG signals. First, a base feature extractor is employed to extract the essential features of EEG frequencies; to enhance discriminative feature learning, a residual shrinkage building unit with an attention mechanism is adopted to perform adaptive feature recalibration, and soft-threshold denoising inside the residual network is further applied to achieve automatic feature extraction. In addition, a fully connected layer with weight freezing is utilized to effectively suppress the negative influence of neurons on the model's classification. With the global average pooling (GAP) layer incorporated in the residual shrinkage network structure, we introduce an EEG-based Class Activation Map (ECAM) interpretability method to enable visualization analysis of sample-wise learned patterns and effectively explain the model's decisions. Extensive experimental results demonstrate that the proposed method achieves superior classification performance and finds neurophysiologically reliable evidence for its classifications.
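The soft-threshold denoising at the core of residual shrinkage units is a one-line operation, shown here in isolation; in the network the threshold is learned per channel rather than fixed as below.

```python
# Soft thresholding as used in residual shrinkage units (standalone sketch).
import numpy as np

def soft_threshold(x, tau):
    """Shrink values toward zero: sign(x) * max(|x| - tau, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

x = np.array([-1.2, -0.3, 0.1, 0.8, 2.0])
print(soft_threshold(x, 0.5))   # [-0.7 -0.   0.   0.3  1.5]
```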
Affiliation(s)
- Xiao Feng
- School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing, China
- Henan High-speed Railway Operation and Maintenance Engineering Research Center, Zhengzhou, Henan, China
- Zhongyuan Guo
- College of Electronic and Information Engineering, Southwest University, Chongqing, China
- Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
- Sam Kwong
- School of Data Science, Lingnan University, Hong Kong SAR, China
21. Zhang Y, Zhou L, Zhu S, Zhou Y, Wang Z, Ma L, Yuan Y, Xie Y, Niu X, Su Y, Liu H, Hei X, Shi Z, Ren X, Shi Y. Deep Learning for Obstructive Sleep Apnea Detection and Severity Assessment: A Multimodal Signals Fusion Multiscale Transformer Model. Nat Sci Sleep 2025;17:1-15. [PMID: 39801628] [PMCID: PMC11720996] [DOI: 10.2147/nss.s492806]
Abstract
Purpose To develop a deep learning (DL) model for obstructive sleep apnea (OSA) detection and severity assessment and provide a new approach for convenient, economical, and accurate disease detection. Methods Considering medical reliability and acquisition simplicity, we used electrocardiogram (ECG) and oxygen saturation (SpO2) signals to develop a multimodal signal fusion multiscale Transformer model for OSA detection and severity assessment. The proposed model comprises signal preprocessing, feature extraction, cross-modal interaction, and classification modules. A total of 510 patients who underwent polysomnography were included in the hospital dataset. The model was tested on hospital and public datasets. The hospital dataset was utilized to demonstrate the applicability and generalizability of the model. Two public datasets, the Apnea-ECG dataset (8 recordings) and the UCD dataset (21 recordings), were used to compare the results with those of previous studies. Results In the hospital dataset, the accuracy (Acc) values of per-segment and per-recording detection were 91.38% and 96.08%, respectively. The Acc values for mild, moderate, and severe OSA were 90.20%, 88.24%, and 92.16%, respectively. The Bland-Altman plots revealed agreement between the true apnea-hypopnea index (AHI) and the predicted AHI. In the public datasets, the per-segment detection Acc values of the Apnea-ECG and UCD datasets were 95.04% and 90.56%, respectively. Conclusion The experiments on hospital and public datasets demonstrate that the proposed model is more advanced, accurate, and applicable in OSA detection and severity assessment than previous models.
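For reference, a Bland-Altman analysis like the one mentioned reduces to the bias and 95% limits of agreement of the differences between predicted and true AHI; the sketch below uses synthetic numbers.

```python
# Bland-Altman agreement sketch for true vs. predicted AHI (synthetic data).
import numpy as np

rng = np.random.default_rng(0)
ahi_true = rng.uniform(5, 60, 50)
ahi_pred = ahi_true + rng.normal(0, 3, 50)   # toy model predictions

diff = ahi_pred - ahi_true
mean = (ahi_pred + ahi_true) / 2             # x-axis of the Bland-Altman plot
bias = diff.mean()
loa = 1.96 * diff.std(ddof=1)                # 95% limits of agreement
print(f"bias={bias:.2f}, limits=({bias - loa:.2f}, {bias + loa:.2f})")
```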
Affiliation(s)
- Yitong Zhang
- Department of Otorhinolaryngology Head and Neck Surgery, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi Province, People’s Republic of China
- Liang Zhou
- School of Computer Science and Engineering, Xi’an University of Technology, Xi’an, Shaanxi Province, People’s Republic of China
- Simin Zhu
- Department of Otorhinolaryngology Head and Neck Surgery, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi Province, People’s Republic of China
- Yanuo Zhou
- Department of Otorhinolaryngology Head and Neck Surgery, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi Province, People’s Republic of China
- Zitong Wang
- Department of Otorhinolaryngology Head and Neck Surgery, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi Province, People’s Republic of China
- Lina Ma
- Department of Otorhinolaryngology Head and Neck Surgery, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi Province, People’s Republic of China
- Yuqi Yuan
- Department of Otorhinolaryngology Head and Neck Surgery, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi Province, People’s Republic of China
- Yushan Xie
- Department of Otorhinolaryngology Head and Neck Surgery, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi Province, People’s Republic of China
- Xiaoxin Niu
- Department of Otorhinolaryngology Head and Neck Surgery, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi Province, People’s Republic of China
- Yonglong Su
- Department of Otorhinolaryngology Head and Neck Surgery, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi Province, People’s Republic of China
- Haiqin Liu
- Department of Otorhinolaryngology Head and Neck Surgery, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi Province, People’s Republic of China
- Xinhong Hei
- School of Computer Science and Engineering, Xi’an University of Technology, Xi’an, Shaanxi Province, People’s Republic of China
- Zhenghao Shi
- School of Computer Science and Engineering, Xi’an University of Technology, Xi’an, Shaanxi Province, People’s Republic of China
- Xiaoyong Ren
- Department of Otorhinolaryngology Head and Neck Surgery, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi Province, People’s Republic of China
- Yewen Shi
- Department of Otorhinolaryngology Head and Neck Surgery, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi Province, People’s Republic of China
Collapse
|
22
|
Hu S, Wang Y, Liu J, Cui Z, Yang C, Yao Z, Ge J. IPCT-Net: Parallel information bottleneck modality fusion network for obstructive sleep apnea diagnosis. Neural Netw 2025; 181:106836. [PMID: 39471579 DOI: 10.1016/j.neunet.2024.106836] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2024] [Revised: 09/14/2024] [Accepted: 10/19/2024] [Indexed: 11/01/2024]
Abstract
Obstructive sleep apnea (OSA) is a common sleep breathing disorder and timely diagnosis helps to avoid the serious medical expenses caused by related complications. Existing deep learning (DL)-based methods primarily focus on single-modal models, which cannot fully mine task-related representations. This paper develops a modality fusion representation enhancement (MFRE) framework adaptable to flexible modality fusion types with the objective of improving OSA diagnostic performance, and providing quantitative evidence for clinical diagnostic modality selection. The proposed parallel information bottleneck modality fusion network (IPCT-Net) can extract local-global multi-view representations and eliminate redundant information in modality fusion representations through branch sharing mechanisms. We utilize large-scale real-world home sleep apnea test (HSAT) multimodal data to comprehensively evaluate relevant modality fusion types. Extensive experiments demonstrate that the proposed method significantly outperforms existing methods in terms of participant numbers and OSA diagnostic performance. The proposed MFRE framework delves into modality fusion in OSA diagnosis and contributes to enhancing the screening performance of artificial intelligence (AI)-assisted diagnosis for OSA.
Collapse
Affiliation(s)
- Shuaicong Hu
- Department of Biomedical Engineering, School of Information Science and Technology, Fudan University, Shanghai, 200433, China
| | - Yanan Wang
- Department of Biomedical Engineering, School of Information Science and Technology, Fudan University, Shanghai, 200433, China
| | - Jian Liu
- Department of Biomedical Engineering, School of Information Science and Technology, Fudan University, Shanghai, 200433, China
| | - Zhaoqiang Cui
- Department of Cardiology, Zhongshan Hospital, Fudan University, Shanghai 200032, China; Shanghai Institute of Cardiovascular Diseases, Zhongshan Hospital, Fudan University, Shanghai 200032, China
| | - Cuiwei Yang
- Department of Biomedical Engineering, School of Information Science and Technology, Fudan University, Shanghai, 200433, China; Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, Shanghai 200093, China.
| | - Zhifeng Yao
- Department of Cardiology, Zhongshan Hospital, Fudan University, Shanghai 200032, China; Shanghai Institute of Cardiovascular Diseases, Zhongshan Hospital, Fudan University, Shanghai 200032, China.
| | - Junbo Ge
- Department of Cardiology, Zhongshan Hospital, Fudan University, Shanghai 200032, China; Shanghai Institute of Cardiovascular Diseases, Zhongshan Hospital, Fudan University, Shanghai 200032, China.
| |
Collapse
|
23
|
Zhou W, Zhu H, Chen W, Chen C, Xu J. Outlier Handling Strategy of Ensembled-Based Sequential Convolutional Neural Networks for Sleep Stage Classification. Bioengineering (Basel) 2024; 11:1226. [PMID: 39768044 PMCID: PMC11673830 DOI: 10.3390/bioengineering11121226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2024] [Revised: 11/22/2024] [Accepted: 11/30/2024] [Indexed: 01/11/2025] Open
Abstract
The pivotal role of sleep has led to extensive research endeavors aimed at automatic sleep stage classification. However, existing methods perform poorly when classifying small groups or individuals, and these results are often considered outliers in terms of overall performance. These outliers may introduce bias during model training, adversely affecting feature selection and diminishing model performance. To address the above issues, this paper proposes an ensemble-based sequential convolutional neural network (E-SCNN) that incorporates a clustering module and neural networks. E-SCNN effectively ensembles machine learning and deep learning techniques to minimize outliers, thereby enhancing model robustness at the individual level. Specifically, the clustering module categorizes individuals based on similarities in feature distribution and assigns personalized weights accordingly. Subsequently, by combining these tailored weights with the robust feature extraction capabilities of convolutional neural networks, the model generates more accurate sleep stage classifications. The proposed model was verified on two public datasets, and experimental results demonstrate that the proposed method obtains overall accuracies of 84.8% on the Sleep-EDF Expanded dataset and 85.5% on the MASS dataset. E-SCNN can alleviate the outlier problem, which is important for improving sleep quality monitoring for individuals.
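A minimal sketch of the clustering idea described above: subjects are grouped by the distribution of their features, and per-subject training weights are derived from cluster membership. The inverse-cluster-size weighting rule below is an assumed stand-in for E-SCNN's personalized weights, used only to make the mechanism concrete.

```python
# Hedged sketch: cluster subjects by feature distribution, derive per-cluster weights.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
subject_features = rng.normal(size=(50, 16))   # one feature vector per subject

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(subject_features)
labels = kmeans.labels_
cluster_sizes = np.bincount(labels, minlength=4)

# Smaller clusters (potential "outlier" subjects) receive larger training weights.
weights = 1.0 / cluster_sizes[labels]
weights = weights / weights.sum() * len(labels)  # normalize to mean 1
print(weights[:5])
```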
Collapse
Affiliation(s)
- Wei Zhou
- Jiangsu Key Laboratory of Intelligent Medical Image Computing, Nanjing 210044, China;
- School of Future Technology, Nanjing University of Information Science and Technology, Nanjing 210044, China
| | - Hangyu Zhu
- Center for Intelligent Medical Electronics (CIME), School of Information Science and Engineering, Fudan University, Shanghai 200433, China;
| | - Wei Chen
- School of Biomedical Engineering, The University of Sydney, Sydney, NSW 2006, Australia;
| | - Chen Chen
- Center for Medical Research and Innovation, Shanghai Pudong Hosptial, Fudan University Pudong Medical Center, Shanghai 201203, China
- Human Phenome Institute, Fudan University, Shanghai 200438, China
| | - Jun Xu
- Jiangsu Key Laboratory of Intelligent Medical Image Computing, Nanjing 210044, China;
- School of Future Technology, Nanjing University of Information Science and Technology, Nanjing 210044, China
| |
Collapse
|
24
|
Liu Z, Zhang Q, Luo S, Qin M. FPJA-Net: A Lightweight End-to-End Network for Sleep Stage Prediction Based on Feature Pyramid and Joint Attention. Interdiscip Sci 2024; 16:769-780. [PMID: 39155326 DOI: 10.1007/s12539-024-00636-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2023] [Revised: 05/07/2024] [Accepted: 05/13/2024] [Indexed: 08/20/2024]
Abstract
Sleep staging is the most crucial step before diagnosing and treating sleep disorders. Traditional manual sleep staging is time-consuming and depends on the skill of experts. Nowadays, automatic sleep staging based on deep learning attracts more and more researchers. The salient waves in sleep signals contain the most important information for automatic sleep staging; however, this key information is not fully utilized by existing deep learning methods, since most of them use only CNNs or RNNs, which cannot capture the multi-scale features in salient waves effectively. To tackle this limitation, we propose a lightweight end-to-end network for sleep stage prediction based on a feature pyramid and joint attention. The feature pyramid module is designed to effectively extract multi-scale features in salient waves, and these features are then fed to the joint attention module to closely attend to the channel and location information of the salient waves. The proposed network has far fewer parameters and delivers a significant performance improvement, surpassing state-of-the-art results. The overall accuracy and macro F1 score are 90.1% and 87.8% on the public dataset Sleep-EDF39, 87.4% and 84.4% on Sleep-EDF153, and 86.9% and 83.9% on SHHS, respectively. Ablation experiments confirm the effectiveness of each module.
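A compact illustration of a 1-D feature pyramid in the spirit of the module described above: parallel convolutions with small, medium, and large kernels capture salient waves at different time scales. The kernel sizes and channel counts below are assumptions for illustration, not FPJA-Net's published configuration.

```python
# Hedged sketch: multi-scale 1-D feature pyramid over a raw EEG epoch.
import torch
import torch.nn as nn

class FeaturePyramid1D(nn.Module):
    def __init__(self, in_ch: int = 1, out_ch: int = 32):
        super().__init__()
        # Small kernels catch fast transients (e.g., spindles); large ones catch slow waves.
        self.branches = nn.ModuleList([
            nn.Conv1d(in_ch, out_ch, kernel_size=k, padding=k // 2)
            for k in (7, 25, 101)
        ])

    def forward(self, x):  # x: (batch, 1, time)
        return torch.cat([torch.relu(b(x)) for b in self.branches], dim=1)

feats = FeaturePyramid1D()(torch.randn(4, 1, 3000))
print(feats.shape)  # torch.Size([4, 96, 3000])
```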
Collapse
Affiliation(s)
- Zhi Liu
- School of Artificial Intelligence, Chongqing University of Technology, Chongqing, 401135, China.
| | - Qinhan Zhang
- School of Artificial Intelligence, Chongqing University of Technology, Chongqing, 401135, China
| | - Sixin Luo
- School of Artificial Intelligence, Chongqing University of Technology, Chongqing, 401135, China
| | - Meiqiao Qin
- School of Artificial Intelligence, Chongqing University of Technology, Chongqing, 401135, China
| |
Collapse
|
25
|
Wang J, Zhao S, Jiang H, Zhou Y, Yu Z, Li T, Li S, Pan G. CareSleepNet: A Hybrid Deep Learning Network for Automatic Sleep Staging. IEEE J Biomed Health Inform 2024; 28:7392-7405. [PMID: 38990749 DOI: 10.1109/jbhi.2024.3426939] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/13/2024]
Abstract
Sleep staging is essential for sleep assessment and plays an important role in disease diagnosis; it refers to the classification of sleep epochs into different sleep stages. Polysomnography (PSG), consisting of many different physiological signals, e.g., electroencephalogram (EEG) and electrooculogram (EOG), is the gold standard for sleep staging. Although existing studies have achieved high performance on automatic sleep staging from PSG, there are still some limitations: 1) they focus on local features but ignore global features within each sleep epoch, and 2) they ignore the cross-modality context relationship between EEG and EOG. In this paper, we propose CareSleepNet, a novel hybrid deep learning network for automatic sleep staging from PSG recordings. Specifically, we first design a multi-scale Convolutional-Transformer Epoch Encoder to encode both local salient wave features and global features within each sleep epoch. Then, we devise a Cross-Modality Context Encoder based on a co-attention mechanism to model the cross-modality context relationship between different modalities. Next, we use a Transformer-based Sequence Encoder to capture the sequential relationship among sleep epochs. Finally, the learned feature representations are fed into an epoch-level classifier to determine the sleep stages. We collected a private sleep dataset, SSND, and used two public datasets, Sleep-EDF-153 and ISRUC, to evaluate the performance of CareSleepNet. The experimental results show that CareSleepNet achieves state-of-the-art performance on the three datasets. Moreover, we conduct ablation studies and attention visualizations to prove the effectiveness of each module and to analyze the influence of each modality.
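The cross-modality context encoder described above can be approximated with two multi-head attention blocks in which EEG embeddings query EOG embeddings and vice versa. The PyTorch sketch below shows this co-attention pattern under assumed dimensions; it is illustrative, not the authors' implementation.

```python
# Hedged sketch: symmetric co-attention between EEG and EOG epoch embeddings.
import torch
import torch.nn as nn

class CrossModalityContext(nn.Module):
    def __init__(self, dim: int = 128, heads: int = 4):
        super().__init__()
        self.eeg_to_eog = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.eog_to_eeg = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, eeg, eog):  # each: (batch, seq_len, dim)
        # EEG queries attend over EOG keys/values, and vice versa.
        eeg_ctx, _ = self.eeg_to_eog(eeg, eog, eog)
        eog_ctx, _ = self.eog_to_eeg(eog, eeg, eeg)
        return torch.cat([eeg + eeg_ctx, eog + eog_ctx], dim=-1)

fused = CrossModalityContext()(torch.randn(2, 20, 128), torch.randn(2, 20, 128))
print(fused.shape)  # torch.Size([2, 20, 256])
```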
Collapse
|
26
|
Wang X, Zhu Y. SleepGCN: A transition rule learning model based on Graph Convolutional Network for sleep staging. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 257:108405. [PMID: 39243591 DOI: 10.1016/j.cmpb.2024.108405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/07/2024] [Revised: 08/29/2024] [Accepted: 08/30/2024] [Indexed: 09/09/2024]
Abstract
BACKGROUND AND OBJECTIVE Automatic sleep staging is essential for assessing and diagnosing sleep disorders, serving millions of people who suffer from them. Numerous sleep staging models have been proposed recently, but most of them have not fully explored the sleep transition rules that are essential for sleep experts to identify sleep stages. Therefore, one objective of this paper is to develop an automatic sleep staging model to capture the transition rules between sleep stages. METHODS In this paper, we propose a novel sleep staging model named SleepGCN. It utilizes the deep features of electroencephalogram (EEG) and electrooculogram (EOG) signals extracted by the sleep representation learning (SRL) module, in conjunction with the transition rules learned by the sleep transition rule learning (STRL) module to identify sleep stages. Specifically, the SRL module utilizes the residual network (ResNet) and Long Short Term Memory (LSTM) structure to capture the deep time-invariant features and temporal information of each sleep stage from the two-channel EEG-EOG, and then applies a feature enhancement block to obtain the refined features. The STRL module employs a Graph Convolutional Network (GCN) and a transition rule matrix to capture transition rules between sleep stages based on the sequence labels of the input signals. RESULTS We evaluate SleepGCN on five public datasets: SleepEDF-20, SleepEDF-78, SHHS, DOD-H and DOD-O. Overall, SleepGCN achieves an accuracy of 89.70%, 87.70%, 86.16%, 82.07%, and 81.20%, alongside a macro-average F1-score of 85.20%, 82.70%, 77.69%, 72.44%, and 72.93% across these datasets, respectively. CONCLUSIONS The results achieved by our proposed model are much better than those of all other compared models. The ablation study validates the contributions of the SRL and STRL modules proposed in SleepGCN to the sleep staging tasks. Additionally, it shows that the sleep staging model using two-channel EEG-EOG outperforms the model using single-channel EEG or EOG. Overall, SleepGCN is an effective solution for sleep staging using two-channel EEG-EOG.
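The transition rules that the STRL module learns have a simple empirical counterpart: a row-normalized matrix of stage-to-stage transition counts estimated from hypnogram label sequences. The sketch below computes such a matrix; it is a baseline illustration of the prior, not SleepGCN's learned rule matrix.

```python
# Hedged sketch: empirical sleep-stage transition matrix from hypnogram sequences.
import numpy as np

STAGES = ["W", "N1", "N2", "N3", "REM"]

def transition_matrix(hypnograms):
    counts = np.zeros((len(STAGES), len(STAGES)))
    for seq in hypnograms:
        for a, b in zip(seq[:-1], seq[1:]):
            counts[a, b] += 1
    # Row-normalize to transition probabilities (epsilon avoids 0/0 on unseen stages).
    return counts / (counts.sum(axis=1, keepdims=True) + 1e-12)

demo = [[0, 0, 1, 2, 2, 2, 3, 2, 4, 4, 0]]  # indices into STAGES
print(np.round(transition_matrix(demo), 2))
```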
Collapse
Affiliation(s)
- Xuhui Wang
- School of Computer Science, Wuhan University, Wuhan, 430061, China
| | - Yuanyuan Zhu
- School of Computer Science, Wuhan University, Wuhan, 430061, China.
| |
Collapse
|
27
|
Zhao C, Wu W, Zhang H, Zhang R, Zheng X, Kong X. Sleep Stage Classification Via Multi-View Based Self-Supervised Contrastive Learning of EEG. IEEE J Biomed Health Inform 2024; 28:7068-7077. [PMID: 39190518 DOI: 10.1109/jbhi.2024.3432633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/29/2024]
Abstract
Self-supervised learning (SSL) is a challenging task in sleep stage classification (SSC) that is capable of mining valuable representations from unlabeled data. However, traditional SSL methods typically focus on single-view learning and do not fully exploit the interactions among information across multiple views. In this study, we focused on a multi-domain view of the same EEG signal and developed a self-supervised multi-view representation learning framework via time series and time-frequency contrasting (MV-TTFC). In the MV-TTFC framework, we built in a cross-domain view contrastive learning prediction task to establish connections between the temporal view and the time-frequency (TF) view, thereby enhancing the information exchange between multiple views. In addition, to improve the quality of the TF view inputs, we introduced an enhanced multisynchrosqueezing transform, which creates TF image views with high energy concentration to compensate for the inaccurate representations produced by traditional TF processing techniques. Finally, integrating temporal, TF, and fusion-space contrastive learning effectively captured the latent features in EEG signals. We evaluated MV-TTFC on two real-world SSC datasets (SleepEDF-78 and SHHS) and compared it with baseline methods in downstream tasks. Our method exhibited state-of-the-art performance, achieving accuracies of 78.64% and 81.45% with SleepEDF-78 and SHHS, respectively, and macro F1-scores of 70.39% with SleepEDF-78 and 70.47% with SHHS.
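Cross-domain contrastive learning of the kind described above is commonly implemented with an NT-Xent-style objective in which the temporal-view and time-frequency-view embeddings of the same epoch form positive pairs. The sketch below shows such a symmetric loss; the temperature and exact loss form are assumptions, since MV-TTFC's objective is not reproduced here.

```python
# Hedged sketch: NT-Xent-style loss between temporal and time-frequency views.
import torch
import torch.nn.functional as F

def cross_view_nt_xent(z_time, z_tf, temperature: float = 0.1):
    z_time = F.normalize(z_time, dim=1)
    z_tf = F.normalize(z_tf, dim=1)
    logits = z_time @ z_tf.T / temperature   # (batch, batch) cosine similarities
    targets = torch.arange(z_time.size(0))   # positive pairs sit on the diagonal
    # Symmetric loss: time -> tf and tf -> time directions.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.T, targets))

loss = cross_view_nt_xent(torch.randn(8, 64), torch.randn(8, 64))
print(float(loss))
```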
Collapse
|
28
|
Chen Y, Lv Y, Sun X, Poluektov M, Zhang Y, Penzel T. ESSN: An Efficient Sleep Sequence Network for Automatic Sleep Staging. IEEE J Biomed Health Inform 2024; 28:7447-7456. [PMID: 39141450 DOI: 10.1109/jbhi.2024.3443340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/16/2024]
Abstract
By modeling the temporal dependencies of sleep sequences, advanced automatic sleep staging algorithms have achieved satisfactory performance, approaching the level of medical technicians and laying the foundation for clinical assistance. However, existing algorithms cannot adapt well to computing scenarios with limited computing power, such as portable sleep detection and consumer-level sleep disorder screening. In addition, existing algorithms still suffer from N1 confusion. To address these issues, we propose an efficient sleep sequence network (ESSN) with an ingenious structure to achieve efficient automatic sleep staging at a low computational cost. A novel N1 structure loss is introduced based on prior knowledge of N1 transition probability to alleviate the N1 stage confusion problem. On the SHHS dataset containing 5,793 subjects, the overall accuracy, macro F1, and Cohen's kappa of ESSN are 88.0%, 81.2%, and 0.831, respectively. When the input length is 200, the parameters and floating-point operations of ESSN are 0.27M and 0.35G, respectively. While leading in accuracy, ESSN inference is twice as fast as L-SeqSleepNet on the same device. Therefore, our proposed model exhibits solid competitive advantages compared with other state-of-the-art automatic sleep staging methods.
Collapse
|
29
|
Zhang X, He G, Shang T, Fan F. Comparative Analysis of Single-Channel and Multi-Channel Classification of Sleep Stages Across Four Different Data Sets. Brain Sci 2024; 14:1201. [PMID: 39766400 PMCID: PMC11674470 DOI: 10.3390/brainsci14121201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2024] [Revised: 11/21/2024] [Accepted: 11/25/2024] [Indexed: 01/11/2025] Open
Abstract
Background: Manually labeling sleep stages is time-consuming and labor-intensive, making automatic sleep staging methods crucial for practical sleep monitoring. While both single- and multi-channel data are commonly used in automatic sleep staging, limited research has adequately investigated the differences in their effectiveness. Methods: In this study, four public data sets (Sleep-SC, APPLES, SHHS1, and MrOS1) are utilized, and an advanced hybrid attention neural network composed of a multi-branch convolutional neural network and the multi-head attention mechanism is employed for automatic sleep staging. Results: The experimental results show that, for sleep staging using 2-5 classes, a combination of single-channel electroencephalography (EEG) and dual-channel electrooculography (EOG) consistently outperforms single-channel EEG with single-channel EOG, which in turn outperforms single-channel EEG or single-channel EOG alone. For instance, for five-class sleep staging using the MrOS1 data set, the combination of single-channel EEG and dual-channel EOG resulted in an accuracy of 87.18%, whereas the combination of single-channel EEG and single-channel EOG yielded an accuracy of 85.77%. In comparison, single-channel EEG alone achieved an accuracy of 85.25% and single-channel EOG alone achieved an accuracy of 83.66%. Conclusions: This study highlights the significance of combining EEG and EOG signals in automatic sleep staging, while also providing valuable insights for the channel design of portable sleep monitoring devices.
Collapse
Affiliation(s)
- Xingjian Zhang
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072, China;
| | - Gewen He
- Department of Computer Science, Florida State University, Tallahassee, FL 32306, USA;
| | - Tingyu Shang
- School of Mathematics and Statistics, Shaanxi Normal University, Xi’an 710062, China;
| | - Fangfang Fan
- Department of Neurology, Beth Israel Deaconess Medical Center, Harvard Medical School, Harvard University, Cambridge, MA 02215, USA
| |
Collapse
|
30
|
Adey B, Habib A, Karmakar C. Exploration of an intrinsically explainable self-attention based model for prototype generation on single-channel EEG sleep stage classification. Sci Rep 2024; 14:27612. [PMID: 39528813 PMCID: PMC11555387 DOI: 10.1038/s41598-024-79139-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2024] [Accepted: 11/05/2024] [Indexed: 11/16/2024] Open
Abstract
Prototype-based methods in deep learning offer interpretable explanations for decisions by comparing inputs to typical representatives in the data. This study explores the adaptation of SESM, a self-attention-based prototype method successful in electrocardiogram (ECG) tasks, for electroencephalogram (EEG) signals. The architecture is evaluated on sleep stage classification, exploring its efficacy in predicting stages with single-channel EEG. The model achieves test accuracy comparable to EEGNet, a state-of-the-art black-box architecture for EEG classification. The generated prototypical components are examined qualitatively and via the area over the perturbation curve (AOPC); they indicate some alignment with expected biomarkers for different sleep stages, such as alpha spindles and slow waves in non-REM sleep, but the results are severely limited by the model's ability to extract and present information only in the time domain. Ablation studies are used to explore the impact of kernel size, number of heads, and diversity threshold on model performance and explainability. This study represents the first application of a self-attention-based prototype method to EEG data and provides a step forward in explainable AI for EEG data analysis.
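The area over the perturbation curve (AOPC) used above quantifies explanation faithfulness: the inputs ranked most relevant by the attribution map are occluded step by step, and the average drop in the model's class score is recorded. A minimal sketch follows; the occlusion value (zeros) and step size are assumptions for illustration.

```python
# Hedged sketch: area over the perturbation curve for a 1-D signal explanation.
import numpy as np

def aopc(model_score, x, relevance, steps: int = 20, chunk: int = 50):
    """model_score: callable mapping a 1-D signal to a scalar class score."""
    base = model_score(x)
    order = np.argsort(relevance)[::-1]        # most relevant samples first
    perturbed, drops = x.copy(), []
    for k in range(steps):
        idx = order[k * chunk:(k + 1) * chunk]
        perturbed[idx] = 0.0                   # occlude with zeros (an assumption)
        drops.append(base - model_score(perturbed))
    return float(np.mean(drops))

# Toy demo: score = signal energy, relevance = absolute amplitude.
sig = np.random.default_rng(0).normal(size=3000)
print(aopc(lambda s: float(np.mean(s**2)), sig, np.abs(sig)))
```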
Collapse
Affiliation(s)
- Brenton Adey
- School of Information Technology, Deakin University, Geelong, 3225, Australia
| | - Ahsan Habib
- School of Information Technology, Deakin University, Geelong, 3225, Australia.
| | - Chandan Karmakar
- School of Information Technology, Deakin University, Geelong, 3225, Australia
| |
Collapse
|
31
|
Bao J, Wang G, Wang T, Wu N, Hu S, Lee WH, Lo SL, Yan X, Zheng Y, Wang G. A Feature Fusion Model Based on Temporal Convolutional Network for Automatic Sleep Staging Using Single-Channel EEG. IEEE J Biomed Health Inform 2024; 28:6641-6652. [PMID: 39504300 DOI: 10.1109/jbhi.2024.3457969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2024]
Abstract
Sleep staging is a crucial task in sleep monitoring and diagnosis, but clinical sleep staging is both time-consuming and subjective. In this study, we proposed a novel deep learning algorithm named feature fusion temporal convolutional network (FFTCN) for automatic sleep staging using single-channel EEG data. This algorithm employed a one-dimensional convolutional neural network (1D-CNN) to extract temporal features from raw EEG, and a two-dimensional CNN (2D-CNN) to extract time-frequency features from spectrograms generated through continuous wavelet transform (CWT) at the epoch level. These features were subsequently fused and further fed into a temporal convolutional network (TCN) to classify sleep stages at the sequence level. Moreover, a two-step training strategy was used to enhance the model's performance on an imbalanced dataset. Our proposed method exhibits superior performance in the 5-class classification task for healthy subjects, as evaluated on the SHHS-1, Sleep-EDF-153, and ISRUC-S1 datasets. This work provided a straightforward and promising method for improving the accuracy of automatic sleep staging using only single-channel EEG, and the proposed method exhibited great potential for future applications in professional sleep monitoring, which could effectively alleviate the workload of sleep technicians.
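The time-frequency branch described above can be reproduced in a few lines with PyWavelets: a continuous wavelet transform turns each 30-second epoch into a scale-by-time image for the 2D-CNN. The wavelet choice ('morl') and scale range below are assumptions, not the FFTCN settings.

```python
# Hedged sketch: CWT spectrogram of one EEG epoch via PyWavelets.
import numpy as np
import pywt

fs = 100                                   # sampling rate (Hz), assumed
epoch = np.random.default_rng(0).normal(size=30 * fs)  # one 30-s EEG epoch

scales = np.arange(1, 65)                  # 64 scales -> 64 pseudo-frequency rows
coeffs, freqs = pywt.cwt(epoch, scales, "morl", sampling_period=1 / fs)
spectrogram = np.abs(coeffs)               # (64, 3000) image fed to a 2D-CNN
print(spectrogram.shape, freqs[:3])
```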
Collapse
|
32
|
Guo Y, Nowakowski M, Dai W. FlexSleepTransformer: a transformer-based sleep staging model with flexible input channel configurations. Sci Rep 2024; 14:26312. [PMID: 39487223 PMCID: PMC11530688 DOI: 10.1038/s41598-024-76197-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2024] [Accepted: 10/11/2024] [Indexed: 11/04/2024] Open
Abstract
Clinical sleep diagnosis traditionally relies on polysomnography (PSG) and expert manual classification of sleep stages. Recent advancements in deep learning have shown promise in automating sleep stage classification using a single PSG channel. However, variations in PSG acquisition devices and environments mean that the number of PSG channels can differ across sleep centers. To integrate a sleep staging method into clinical practice effectively, it must accommodate a flexible number of PSG channels. In this paper, we propose FlexSleepTransformer, a transformer-based model designed to handle varying numbers of input channels, making it adaptable to diverse sleep staging datasets. We evaluated FlexSleepTransformer using two distinct datasets: the public SleepEDF-78 dataset and the local SleepUHS dataset. Notably, FlexSleepTransformer is the first model capable of simultaneously training on datasets with differing numbers of PSG channels. Our experiments showed that FlexSleepTransformer trained on both datasets together achieved 98% of the accuracy of models trained on each dataset individually. Furthermore, it outperformed models trained exclusively on one dataset when tested on the other dataset. Additionally, FlexSleepTransformer surpassed state-of-the-art CNN- and RNN-based models on both datasets. Due to its adaptability to varying channel numbers, FlexSleepTransformer holds significant potential for clinical adoption, especially when trained with data from a wide range of sleep centers.
Collapse
Affiliation(s)
- Yanchen Guo
- School of Computing, State University of New York at Binghamton, Binghamton, NY, 13902, USA
| | - Maciej Nowakowski
- Sleep Medicine, United Health Services Hospitals, Inc, Binghamton, NY, 13902, USA
| | - Weiying Dai
- School of Computing, State University of New York at Binghamton, Binghamton, NY, 13902, USA.
| |
Collapse
|
33
|
Guo Y, Xia X, Shi Y, Ying Y, Men H. Olfactory EEG induced by odor: Used for food identification and pleasure analysis. Food Chem 2024; 455:139816. [PMID: 38816280 DOI: 10.1016/j.foodchem.2024.139816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/19/2024] [Revised: 05/13/2024] [Accepted: 05/22/2024] [Indexed: 06/01/2024]
Abstract
As the need for food authenticity verification increases, sensory evaluation of food odors has become widely recognized. This study presents an electroencephalography (EEG)-based theory for creating an Olfactory Perception Dimensional Space (EEG-OPDS), using feature engineering and ensemble learning to establish material and emotional spaces based on odor perception and pleasure. The study examines the intrinsic connection between these two spaces and explores the mechanisms of integration and differentiation in constructing the OPDS. This method effectively visualizes various types of food odors while identifying their perceptual intensity and pleasantness. The average classification accuracy for odor recognition in an eight-category experiment is 96.1%, while the average classification accuracy for sensory pleasantness recognition in a two-category experiment is 98.8%. The theoretical approach proposed in this study, which constructs an OPDS from olfactory EEG signals, captures the subtle perceptual differences and individualized pleasantness responses to food odors.
Collapse
Affiliation(s)
- Yuchen Guo
- School of Automation Engineering, Northeast Electric Power University, Jilin 132012, China; Bionic Sensing and Pattern Recognition Research Team, Northeast Electric Power University, Jilin 132012, China.
| | - Xiuxin Xia
- School of Automation Engineering, Northeast Electric Power University, Jilin 132012, China; Bionic Sensing and Pattern Recognition Research Team, Northeast Electric Power University, Jilin 132012, China.
| | - Yan Shi
- School of Automation Engineering, Northeast Electric Power University, Jilin 132012, China; Bionic Sensing and Pattern Recognition Research Team, Northeast Electric Power University, Jilin 132012, China
| | - Yuxiang Ying
- School of Automation Engineering, Northeast Electric Power University, Jilin 132012, China
| | - Hong Men
- School of Automation Engineering, Northeast Electric Power University, Jilin 132012, China.
| |
Collapse
|
34
|
Xia X, Shi Y, Li P, Liu X, Liu J, Men H. FBANet: An Effective Data Mining Method for Food Olfactory EEG Recognition. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:13550-13560. [PMID: 37220050 DOI: 10.1109/tnnls.2023.3269949] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/25/2023]
Abstract
At present, the sensory evaluation of food mostly depends on artificial sensory evaluation and machine perception, but artificial sensory evaluation is strongly affected by subjective factors, and machine perception struggles to reflect human feelings. In this article, a frequency band attention network (FBANet) for olfactory electroencephalogram (EEG) was proposed to distinguish differences in food odor. First, an olfactory EEG evoked experiment was designed to collect the olfactory EEG, and preprocessing of the olfactory EEG, such as frequency division, was completed. Second, the FBANet consisted of frequency band feature mining and frequency band feature self-attention, where frequency band feature mining effectively mines multiband features of the olfactory EEG at different scales, and frequency band feature self-attention integrates the extracted multiband features and performs classification. Finally, the performance of the FBANet was compared against other advanced models. The results show that the FBANet outperformed the state-of-the-art techniques. In conclusion, the FBANet effectively mined the olfactory EEG data and distinguished the differences between the eight food odors, suggesting a new approach to food sensory evaluation based on multiband olfactory EEG analysis.
Collapse
|
35
|
Tan X, Wang D, Xu M, Chen J, Wu S. Efficient Multi-View Graph Convolutional Network with Self-Attention for Multi-Class Motor Imagery Decoding. Bioengineering (Basel) 2024; 11:926. [PMID: 39329668 PMCID: PMC11428916 DOI: 10.3390/bioengineering11090926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2024] [Revised: 09/11/2024] [Accepted: 09/14/2024] [Indexed: 09/28/2024] Open
Abstract
Research on electroencephalogram-based motor imagery (MI-EEG) can identify the limbs of subjects that generate motor imagination by decoding EEG signals, which is an important issue in the field of brain-computer interfaces (BCI). Existing deep-learning-based classification methods have not been able to fully employ the topological information among brain regions, so the classification performance needs further improvement. In this paper, we propose a multi-view graph convolutional attention network (MGCANet) with a residual learning structure for multi-class MI decoding. Specifically, we design a multi-view graph convolution spatial feature extraction method based on the topological relationships of brain regions to achieve more comprehensive information aggregation. During modeling, we build an adaptive weight fusion (Awf) module to adaptively merge features from different brain views to improve classification accuracy. In addition, a self-attention mechanism is introduced for feature selection to expand the receptive field of EEG signals to global dependence and enhance the expression of important features. The proposed model is experimentally evaluated on two public MI datasets and achieves a mean accuracy of 78.26% (BCIC IV 2a dataset) and 73.68% (OpenBMI dataset), significantly outperforming representative comparative methods in classification accuracy. Comprehensive experimental results verify the effectiveness of our proposed method, which can provide novel perspectives for MI decoding.
Collapse
Affiliation(s)
| | - Dan Wang
- College of Computer Science, Beijing University of Technology, Beijing 100124, China; (X.T.)
| | | | | | | |
Collapse
|
36
|
Mostafaei SH, Tanha J, Sharafkhaneh A. A novel deep learning model based on transformer and cross modality attention for classification of sleep stages. J Biomed Inform 2024; 157:104689. [PMID: 39029770 DOI: 10.1016/j.jbi.2024.104689] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/29/2024] [Revised: 06/13/2024] [Accepted: 07/10/2024] [Indexed: 07/21/2024]
Abstract
The classification of sleep stages is crucial for gaining insights into an individual's sleep patterns and identifying potential health issues. Employing several important physiological channels in different views, each providing a distinct perspective on sleep patterns, can have a great impact on the efficiency of classification models. In the context of neural networks and deep learning, transformers are very effective, especially when dealing with time series data, and have shown remarkable compatibility with sequential data such as physiological signals. Cross-modality attention, in turn, integrates information from multiple views of the data, capturing relationships among modalities and allowing models to selectively focus on relevant information from each one. In this paper, we introduce a novel deep learning model based on a transformer encoder-decoder and cross-modal attention for sleep stage classification. The proposed model processes information from various physiological channels with different modalities using the Sleep Heart Health Study (SHHS) dataset and leverages transformer encoders for feature extraction and cross-modal attention for effective integration before feeding into the transformer decoder. The combination of these elements raised the model's accuracy to 91.33% in classifying five sleep stages. Empirical evaluations demonstrated the model's superior performance compared with standalone approaches and other state-of-the-art techniques, showcasing the potential of combining transformers and cross-modal attention for improved sleep stage classification.
Collapse
Affiliation(s)
| | - Jafar Tanha
- Faculty of Electrical and Computer Engineering, University of Tabriz, P.O. Box 51666-16471, Tabriz, Iran.
| | - Amir Sharafkhaneh
- Professor of Medicine, Section of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, Baylor College of Medicine, Houston, TX, USA.
| |
Collapse
|
37
|
Zaman A, Kumar S, Shatabda S, Dehzangi I, Sharma A. SleepBoost: a multi-level tree-based ensemble model for automatic sleep stage classification. Med Biol Eng Comput 2024; 62:2769-2783. [PMID: 38700613 DOI: 10.1007/s11517-024-03096-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2023] [Accepted: 04/14/2024] [Indexed: 05/16/2024]
Abstract
Neurodegenerative diseases often exhibit a strong link with sleep disruption, highlighting the importance of effective sleep stage monitoring. In this light, automatic sleep stage classification (ASSC) plays a pivotal role, now more streamlined than ever due to the advancements in deep learning (DL). However, the opaque nature of DL models can be a barrier in their clinical adoption, due to trust concerns among medical practitioners. To bridge this gap, we introduce SleepBoost, a transparent multi-level tree-based ensemble model specifically designed for ASSC. Our approach includes a crafted feature engineering block (FEB) that extracts 41 time and frequency domain features, out of which 23 are selected based on their high mutual information score (> 0.23). Uniquely, SleepBoost integrates three fundamental linear models into a cohesive multi-level tree structure, further enhanced by a novel reward-based adaptive weight allocation mechanism. Tested on the Sleep-EDF-20 dataset, SleepBoost demonstrates superior performance with an accuracy of 86.3%, F1-score of 80.9%, and Cohen kappa score of 0.807, outperforming leading DL models in ASSC. An ablation study underscores the critical role of our selective feature extraction in enhancing model accuracy and interpretability, crucial for clinical settings. This innovative approach not only offers a more transparent alternative to traditional DL models but also extends potential implications for monitoring and understanding sleep patterns in the context of neurodegenerative disorders. The open-source availability of SleepBoost's implementation at https://github.com/akibzaman/SleepBoost can further facilitate its accessibility and potential for widespread clinical adoption.
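The mutual-information filter described above has a direct scikit-learn analogue, sketched below with dummy features: scores are estimated per feature against the stage labels and thresholded at 0.23, as in the abstract. The feature matrix here is random, so the snippet only demonstrates the mechanics, not SleepBoost's actual feature set.

```python
# Hedged sketch: mutual-information feature selection with a 0.23 threshold.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 41))             # 41 time/frequency features per epoch (dummy)
y = rng.integers(0, 5, size=500)           # 5 sleep stages (dummy labels)

mi = mutual_info_classif(X, y, random_state=0)
selected = np.where(mi > 0.23)[0]          # threshold taken from the abstract
print(f"kept {selected.size} of {X.shape[1]} features")
```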
Collapse
Affiliation(s)
- Akib Zaman
- Computer Science and Artificial Intelligence Laboratory (CSAIL), Electrical Engineering and Computer Science Department, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Shiu Kumar
- School of Electrical & Electronics Engineering, Fiji National University, Suva, Fiji.
| | - Swakkhar Shatabda
- Centre for Artificial Intelligence and Robotics (CAIR), United International University, Dhaka, Bangladesh
| | - Iman Dehzangi
- Department of Computer Science, Rutgers University, Camden, NJ, USA
- Center for Computational and Integrative Biology, Rutgers University, Camden, USA
| | - Alok Sharma
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, 230-0045, Japan
- Institute for Integrated and Intelligent Systems, Griffith University, Nathan, Brisbane, QLD, Australia
| |
Collapse
|
38
|
Zhou W, Shen N, Zhou L, Liu M, Zhang Y, Fu C, Yu H, Shu F, Chen W, Chen C. PSEENet: A Pseudo-Siamese Neural Network Incorporating Electroencephalography and Electrooculography Characteristics for Heterogeneous Sleep Staging. IEEE J Biomed Health Inform 2024; 28:5189-5200. [PMID: 38771683 DOI: 10.1109/jbhi.2024.3403878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/23/2024]
Abstract
Sleep staging plays a critical role in evaluating the quality of sleep. Currently, most studies either suffer from dramatic performance drops when coping with varying input modalities or are unable to handle heterogeneous signals. To handle heterogeneous signals and guarantee favorable sleep staging performance when only a single modality is available, we propose PSEENet, a pseudo-siamese neural network (PSN) incorporating electroencephalography (EEG) and electrooculography (EOG) characteristics. PSEENet consists of two parts, spatial mapping modules (SMMs) and a weight-shared classifier. SMMs are used to extract high-dimensional features, while joint linkages among multi-modalities are provided by quantifying the similarity of features. Finally, with the cooperation of heterogeneous characteristics, associations within various sleep stages can be established by the classifier. The model is validated on two public datasets, namely the Montreal Archive of Sleep Studies (MASS) and SleepEDFX, and one clinical dataset from Huashan Hospital of Fudan University (HSFU). Experimental results show that the model can handle heterogeneous signals, provides superior results with multimodal signals, and shows good performance with a single modality. PSEENet obtains accuracies of 79.1% with EEG alone and 82.1% with EEG and EOG on Sleep-EDFX, and significantly improves the accuracy with EOG alone from 73.7% to 76% by introducing similarity information.
Collapse
|
39
|
Li C, Mu Y, Zhu P, Pan Y, Zhang S, Yang L, Xu P, Li F. Sleep stages classification by fusing the time-related synchronization analysis and brain activations. Brain Res Bull 2024; 215:111017. [PMID: 38914295 DOI: 10.1016/j.brainresbull.2024.111017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2024] [Revised: 05/28/2024] [Accepted: 06/15/2024] [Indexed: 06/26/2024]
Abstract
Sleep staging plays an important role in the diagnosis and treatment of clinical sleep disorders. The sleep staging standard defines every 30 seconds as a sleep period, which may mean that similar brain activity patterns exist within the same sleep period. Thus, in this work, we propose a novel time-related synchronization analysis framework named the time-related multimodal sleep scoring model (TRMSC) to explore the potential time-related patterns of sleep. In the proposed TRMSC, time-related synchronization analysis is first conducted on single-channel electrophysiological signals, i.e., the electroencephalogram (EEG) and electrooculogram (EOG), to explore time-related patterns, and spectral activation features are also extracted by spectrum analysis to obtain multimodal features. With the extracted multimodal features, a feature fusion and selection strategy is utilized to obtain the optimal feature set and achieve robust sleep staging. To verify the effectiveness of the proposed TRMSC, sleep staging experiments were conducted on the Sleep-EDF dataset, and the experimental results indicate that the proposed TRMSC achieves better performance than other existing strategies, which proves that time-related synchronization features can make up for the shortcomings of traditional spectrum-based strategies and achieve higher classification accuracy. The proposed TRMSC model may be helpful for portable sleep analyzers and provides a new analytical method for clinical sleep research.
Collapse
Affiliation(s)
- Cunbo Li
- Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation and School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Yufeng Mu
- Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation and School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Pengcheng Zhu
- Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation and School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Yue Pan
- Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation and School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Shuhan Zhang
- Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation and School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Lei Yang
- Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation and School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Peng Xu
- Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation and School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Fali Li
- Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation and School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China.
| |
Collapse
|
40
|
Pradeepkumar J, Anandakumar M, Kugathasan V, Suntharalingham D, Kappel SL, De Silva AC, Edussooriya CUS. Toward Interpretable Sleep Stage Classification Using Cross-Modal Transformers. IEEE Trans Neural Syst Rehabil Eng 2024; 32:2893-2904. [PMID: 39102323 DOI: 10.1109/tnsre.2024.3438610] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/07/2024]
Abstract
Accurate sleep stage classification is significant for sleep health assessment. In recent years, several machine-learning based sleep staging algorithms have been developed, and in particular, deep-learning based algorithms have achieved performance on par with human annotation. Despite improved performance, a limitation of most deep-learning based algorithms is their black-box behavior, which has limited their use in clinical settings. Here, we propose a cross-modal transformer, a transformer-based method for sleep stage classification. The proposed cross-modal transformer consists of a cross-modal transformer encoder architecture along with a multi-scale one-dimensional convolutional neural network for automatic representation learning. Our method performs on par with state-of-the-art methods and eliminates the black-box behavior of deep-learning models by utilizing the interpretability of the attention modules. Furthermore, our method provides considerable reductions in the number of parameters and training time compared with state-of-the-art methods. Our code is available at https://github.com/Jathurshan0330/Cross-Modal-Transformer. A demo of our work can be found at https://bit.ly/Cross_modal_transformer_demo.
Collapse
|
41
|
Fox B, Jiang J, Wickramaratne S, Kovatch P, Suarez-Farinas M, Shah NA, Parekh A, Nadkarni GN. A foundational transformer leveraging full night, multichannel sleep study data accurately classifies sleep stages. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.08.02.24311417. [PMID: 39148827 PMCID: PMC11326349 DOI: 10.1101/2024.08.02.24311417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 08/17/2024]
Abstract
Study Objectives To investigate whether a foundational transformer model using 8-hour, multichannel data from polysomnograms can outperform existing artificial intelligence (AI) methods for sleep stage classification. Methods We utilized the Sleep Heart Health Study (SHHS) visits 1 and 2 for training and validation and the Multi-Ethnic Study of Atherosclerosis (MESA) for testing of our model. We trained a self-supervised foundational transformer (called PFTSleep) that encodes 8-hour long sleep studies at 125 Hz with 7 signals including brain, movement, cardiac, oxygen, and respiratory channels. These encodings are used as input for training an additional model to classify sleep stages, without adjusting the weights of the foundational transformer. We compared our results to existing AI methods that did not utilize 8-hour data or the full set of signals but did report evaluation metrics for the SHHS dataset. Results We trained and validated a model with 8,444 sleep studies with 7 signals including brain, movement, cardiac, oxygen, and respiratory channels and tested on an additional 2,055 studies. In total, we trained and tested 587,944 hours of sleep study signal data. Area under the precision-recall curve (AUPRC) scores were 0.82, 0.40, 0.53, 0.75, and 0.82 and area under the receiver operating characteristic curve (AUROC) scores were 0.99, 0.95, 0.96, 0.98, and 0.99 for wake, N1, N2, N3, and REM, respectively, on the SHHS validation set. For MESA, the AUPRC scores were 0.56, 0.16, 0.40, 0.45, and 0.65 and AUROC scores were 0.94, 0.77, 0.87, 0.91, and 0.96, respectively. Our model was compared to the state-of-the-art model with the longest context window and showed increases in macro evaluation scores, notably sensitivity (3.7% increase) and multi-class REM (3.39% increase) and wake (0.97% increase) F1 scores. Conclusions Utilizing full-night, multi-channel PSG data encodings derived from a foundational transformer improves sleep stage classification over existing methods.
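The per-stage AUPRC and AUROC figures above are one-vs-rest metrics computed from class probabilities. The sketch below reproduces that evaluation pattern with scikit-learn on dummy predictions; it mirrors the reported metrics, not the PFTSleep codebase.

```python
# Hedged sketch: one-vs-rest AUROC and AUPRC per sleep stage.
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

STAGES = ["wake", "N1", "N2", "N3", "REM"]
rng = np.random.default_rng(0)
y_true = rng.integers(0, 5, size=1000)            # epoch labels (dummy)
y_prob = rng.dirichlet(np.ones(5), size=1000)     # model class probabilities (dummy)

for k, stage in enumerate(STAGES):
    binary = (y_true == k).astype(int)
    print(stage,
          round(roc_auc_score(binary, y_prob[:, k]), 3),              # AUROC
          round(average_precision_score(binary, y_prob[:, k]), 3))    # AUPRC
```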
Collapse
Affiliation(s)
- Benjamin Fox
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Joy Jiang
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Sajila Wickramaratne
- Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Patricia Kovatch
- Department of Scientific Computing, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Mayte Suarez-Farinas
- Center for Biostatistics, Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Neomi A Shah
- Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Ankit Parekh
- Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Girish N Nadkarni
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Division of Digital and Data Driven Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| |
Collapse
|
42
|
Moctezuma LA, Suzuki Y, Furuki J, Molinas M, Abe T. GRU-powered sleep stage classification with permutation-based EEG channel selection. Sci Rep 2024; 14:17952. [PMID: 39095608 PMCID: PMC11297028 DOI: 10.1038/s41598-024-68978-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2023] [Accepted: 07/30/2024] [Indexed: 08/04/2024] Open
Abstract
We present a new approach to sleep stage classification that incorporates a computationally inexpensive, permutation-based method for channel selection and takes advantage of deep learning, specifically the gated recurrent unit (GRU) model, along with other deep learning methods. By systematically permuting the electroencephalographic (EEG) channels, different combinations of EEG channels are evaluated to identify the most informative subset for 5-class sleep stage classification. For the analysis, we used an EEG dataset collected at the International Institute for Integrative Sleep Medicine (WPI-IIIS) at the University of Tsukuba in Japan. These explorations yield several new insights: (1) performance decreases drastically when fewer than 3 channels are used; (2) 3 random channels selected by permutation provide the same or better prediction than the 3 channels recommended by the American Academy of Sleep Medicine (AASM); (3) the N1 class suffers the largest loss in prediction accuracy as the channel count drops from 128 to 3, whether random or AASM; and (4) no single channel provides an acceptable level of accuracy in predicting the 5 classes. The results demonstrate the GRU's ability to retain essential temporal information from EEG data, effectively capturing the underlying patterns associated with each sleep stage. Using permutation-based channel selection, we match or exceed the model efficiency achieved with high-density EEG while incorporating only the most informative EEG channels.
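Permutation-based channel selection of the kind described above amounts to scoring candidate channel subsets and keeping the best one. The sketch below does this exhaustively for 3-channel subsets, with a simple classifier and crude per-channel features standing in for the GRU pipeline; with 128 channels one would sample random permutations rather than enumerate all combinations.

```python
# Hedged sketch: score 3-channel subsets and keep the most informative one.
import itertools
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_epochs, n_channels, n_samples = 200, 8, 300
eeg = rng.normal(size=(n_epochs, n_channels, n_samples))   # dummy recordings
labels = rng.integers(0, 5, size=n_epochs)                 # dummy stage labels

def score(subset):
    feats = eeg[:, list(subset), :].var(axis=2)   # crude per-channel feature
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, feats, labels, cv=3).mean()

best = max(itertools.combinations(range(n_channels), 3), key=score)
print("best 3-channel subset:", best)
```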
Collapse
Affiliation(s)
- Luis Alfredo Moctezuma
- International Institute for Integrative Sleep Medicine (WPI-IIIS), University of Tsukuba, Tsukuba, Ibaraki, Japan.
| | - Yoko Suzuki
- International Institute for Integrative Sleep Medicine (WPI-IIIS), University of Tsukuba, Tsukuba, Ibaraki, Japan
| | - Junya Furuki
- International Institute for Integrative Sleep Medicine (WPI-IIIS), University of Tsukuba, Tsukuba, Ibaraki, Japan
| | - Marta Molinas
- Department of Engineering Cybernetics, Norwegian University of Science and Technology, Trondheim, Norway
| | - Takashi Abe
- International Institute for Integrative Sleep Medicine (WPI-IIIS), University of Tsukuba, Tsukuba, Ibaraki, Japan
| |
Collapse
|
43
|
Li Y, Xu Z, Chen Z, Zhang Y, Zhang B. Insights from the 2nd China intelligent sleep staging competition. Sleep Breath 2024; 28:1661-1669. [PMID: 38730204 DOI: 10.1007/s11325-024-03055-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/02/2023] [Revised: 04/25/2024] [Accepted: 05/08/2024] [Indexed: 05/12/2024]
Abstract
STUDY OBJECTIVES Artificial intelligence (AI) is advancing quickly in the field of sleep medicine, which bodes well for its potential in actual clinical use. In this study, an analysis of the 2nd China Intelligent Sleep Staging Competition was conducted to gain insights into the general level and constraints of AI-assisted sleep staging in China. METHODS The outcomes of 10 teams from the children's track and 13 teams from the adult track were investigated in this study. The analysis covered overall performance, differences across the five sleep stages, variations across subjects, and performance during stage transitions. RESULTS The adult track's accuracy peaked at 80.46%, while the children's track's accuracy peaked at 88.96%. On average, accuracy rates stood at 71.43% for children and 68.40% for adults. All results were produced within a mere 5-minute timeframe. The N1 stage was prone to misclassification as the W, N2, and R stages. In the adult track, significant differences were apparent among subjects (p < 0.05), whereas in the children's track, such differences were not observed. Nonetheless, both tracks experienced a performance decline during stage transitions. CONCLUSIONS The computational speed of AI is remarkably fast, and it simultaneously holds the potential to surpass the accuracy of physicians. Improving the machine learning models' classification of the N1 stage and of transitional periods between stages, along with bolstering their robustness to individual subject variations, is imperative for maximizing their ability to assist clinical scoring.
Collapse
Affiliation(s)
- Yamei Li
- College of Electronic and Information Engineering, Southwest University, Chongqing, 400715, China
| | - Zhifei Xu
- Department of Respiratory Medicine, Beijing Children's Hospital, Capital Medical University, Beijing, 100045, China
| | - Zhiqiang Chen
- College of Electronic and Information Engineering, Southwest University, Chongqing, 400715, China
| | - Yuan Zhang
- College of Electronic and Information Engineering, Southwest University, Chongqing, 400715, China.
| | - Bin Zhang
- Department of Psychiatry, Nanfang Hospital, Southern Medical University, Guangzhou, 510515, China
| |
Collapse
|
44
|
McMahon M, Goldin J, Kealy ES, Wicks DJ, Zilberg E, Freeman W, Aliahmad B. Performance Investigation of Somfit Sleep Staging Algorithm. Nat Sci Sleep 2024; 16:1027-1043. [PMID: 39071546 PMCID: PMC11277903 DOI: 10.2147/nss.s463026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/09/2024] [Accepted: 07/01/2024] [Indexed: 07/30/2024] Open
Abstract
Purpose To investigate the accuracy of the sleep staging algorithm in a new miniaturized home sleep monitoring device, the Compumedics® Somfit. Somfit is attached to the patient's forehead and combines the channels specified for a pulse arterial tonometry (PAT)-based home sleep apnea testing (HSAT) device with neurological signals. The Somfit sleep staging deep learning algorithm is based on a convolutional neural network architecture. Patients and Methods One hundred and ten participants referred for sleep investigation with suspected or preexisting obstructive sleep apnea (OSA) in need of review were enrolled in the study, which involved simultaneous recording of full overnight polysomnography (PSG) and Somfit data. The recordings were conducted at three centers in Australia. The reported statistics include standard measures of agreement between the Somfit automatic hypnogram and the consensus PSG hypnogram. Results Overall percent agreement across five sleep stages (N1, N2, N3, REM, and wake) between the Somfit automatic and consensus PSG hypnograms was 76.14 (SE: 0.79). Percent agreement between different pairs of sleep technologists' PSG hypnograms varied from 74.36 (1.93) to 85.50 (0.64), with interscorer agreement being greater for scorers from the same sleep laboratory. The estimated kappa between Somfit and consensus PSG was 0.672 (0.002). Percent agreement for sleep/wake discrimination was 89.30 (0.37). The accuracy of the Somfit sleep staging algorithm declined with increasing OSA severity: percent agreement was 79.67 (1.87) for normal subjects, 77.38 (1.06) for mild OSA, 74.83 (1.79) for moderate OSA, and 72.93 (1.68) for severe OSA. Conclusion Agreement between Somfit and PSG hypnograms was non-inferior to PSG interscorer agreement for a number of scorers, confirming the acceptability of electrode placement at the center of the forehead. Directions for algorithm improvement include additional arousal detection, integration of motion and oximetry signals, and separate inference models for individual sleep stages.
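The agreement statistics reported here (per-record percent agreement summarized as a mean with standard error, plus Cohen's kappa) can be computed as in the following sketch; the record-level averaging is an assumption about how the summary statistics were formed:

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

def agreement_stats(hypnos_a, hypnos_b):
    """Per-record percent agreement between two scorers' hypnograms, summarized
    as mean and standard error, plus Cohen's kappa over the pooled epochs."""
    pct = np.array([(np.asarray(a) == np.asarray(b)).mean() * 100
                    for a, b in zip(hypnos_a, hypnos_b)])
    pooled_a = np.concatenate([np.asarray(a) for a in hypnos_a])
    pooled_b = np.concatenate([np.asarray(b) for b in hypnos_b])
    return {"percent_agreement": pct.mean(),
            "standard_error": pct.std(ddof=1) / np.sqrt(len(pct)),
            "kappa": cohen_kappa_score(pooled_a, pooled_b)}
```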
Collapse
Affiliation(s)
- Marcus McMahon
- Department of Respiratory and Sleep Medicine, Epworth Hospital, Richmond, Victoria, Australia and Department of Respiratory and Sleep Medicine, Austin Health, Heidelberg, Victoria, Australia
| | - Jeremy Goldin
- Department of Respiratory and Sleep Medicine, Royal Melbourne Hospital, Parkville, Victoria, Australia
| | | | | | - Eugene Zilberg
- Medical Innovations, Compumedics Limited, Abbotsford, Victoria, Australia
| | - Warwick Freeman
- Medical Innovations, Compumedics Limited, Abbotsford, Victoria, Australia
| | - Behzad Aliahmad
- Medical Innovations, Compumedics Limited, Abbotsford, Victoria, Australia
| |
Collapse
|
45
|
Lee M, Kang H, Yu SH, Cho H, Oh J, van der Lande G, Gosseries O, Jeong JH. Automatic Sleep Stage Classification Using Nasal Pressure Decoding Based on a Multi-Kernel Convolutional BiLSTM Network. IEEE Trans Neural Syst Rehabil Eng 2024; 32:2533-2544. [PMID: 38941194 DOI: 10.1109/tnsre.2024.3420715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/30/2024]
Abstract
Sleep quality is an essential parameter of a healthy human life, yet sleep disorders such as sleep apnea are common. The gold standard for investigating sleep and its dysfunction is polysomnography, which draws on an extensive range of variables for sleep stage classification. However, full polysomnography requires many sensors, making the setup heavy and sleep uncomfortable, and thus places a significant burden on patients. In this study, sleep stage classification was performed using nasal pressure alone, dramatically reducing the complexity of the process; in turn, this could increase much-needed clinical applicability. Specifically, we propose a deep learning architecture consisting of multi-kernel convolutional neural networks and a bidirectional long short-term memory network for sleep stage classification. Sleep stages of 25 healthy subjects were classified from nasal pressure into 3 classes (wake, rapid eye movement (REM), and non-REM) and 4 classes (wake, REM, light sleep, and deep sleep). Under leave-one-subject-out cross-validation, the 3-class overall metrics were an accuracy of 0.704, an F1-score of 0.490, and a kappa value of 0.283; the 4-class overall metrics were an accuracy of 0.604, an F1-score of 0.349, and a kappa value of 0.217. These results exceeded those of four comparison models, including in class-wise F1-score. This demonstrates the feasibility of a sleep stage classification model that uses only easily applied, highly practical nasal pressure recordings, which could also be paired with interventions to help treat sleep-related diseases.
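A minimal sketch of a multi-kernel convolutional front end feeding a BiLSTM, in the spirit of the architecture described above (the kernel sizes, channel counts, and pooling are illustrative assumptions, not the authors' configuration):

```python
import torch
import torch.nn as nn

class MultiKernelConvBiLSTM(nn.Module):
    """Parallel 1-D convolutions with different kernel sizes over a nasal-pressure
    epoch, followed by a bidirectional LSTM and a stage classifier."""
    def __init__(self, n_classes=4, kernels=(11, 51, 201), ch=16, hidden=64):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(nn.Conv1d(1, ch, k, padding=k // 2),  # odd k keeps length
                          nn.BatchNorm1d(ch), nn.ReLU(), nn.MaxPool1d(4))
            for k in kernels)
        self.lstm = nn.LSTM(ch * len(kernels), hidden, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                          # x: (batch, 1, samples)
        feats = torch.cat([b(x) for b in self.branches], dim=1)   # (batch, C, T')
        out, _ = self.lstm(feats.transpose(1, 2))  # (batch, T', 2*hidden)
        return self.head(out[:, -1])               # logits: (batch, n_classes)
```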
Collapse
|
46
|
Coon WG, Ogg M. Laying the Foundation: Modern Transformers for Gold-Standard Sleep Analysis and Beyond. Annu Int Conf IEEE Eng Med Biol Soc 2024; 2024:1-7. [PMID: 40039238 DOI: 10.1109/embc53108.2024.10782964] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/06/2025]
Abstract
Accurate sleep assessment is critical to the practice of sleep medicine and sleep research. The recent availability of large quantities of public sleep data, alongside breakthroughs in AI such as transformer architectures, presents novel opportunities for data-driven discovery. Transformers are flexible neural networks that not only excel at classification tasks but can also enable data-driven discovery through un- or self-supervised learning, which requires no human annotation of the input data. While transformers have been used extensively in supervised sleep stage classification, they have not been fully explored or optimized in forms designed from the ground up for un- or self-supervised learning in sleep. A necessary first step is to study these models on a canonical benchmark supervised learning task (5-class sleep stage classification). Hence, to lay the groundwork for future data-driven discovery efforts, we evaluated optimizations of a transformer-based architecture that has already demonstrated substantial success in self-supervised learning in another domain (audio speech recognition) and trained it on the canonical 5-class sleep stage classification task, establishing foundational baselines in the sleep domain. We found that small transformer models designed from the start for (later) self-supervised learning can match other state-of-the-art automated sleep scoring techniques, while also providing the basis for future data-driven discovery efforts using large sleep datasets.
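A small transformer encoder for 5-class staging of a single 30-s epoch might look like the following sketch (the patch size, depth, and width are assumptions; the paper's architecture is adapted from a speech model and will differ in detail):

```python
import torch
import torch.nn as nn

class SleepTransformer(nn.Module):
    """Small transformer encoder over patch tokens of one EEG epoch.
    Defaults assume 3000 samples (30 s at 100 Hz) split into 30 patches."""
    def __init__(self, n_patches=30, patch=100, d_model=128, n_heads=4,
                 n_layers=4, n_stages=5):
        super().__init__()
        self.embed = nn.Conv1d(1, d_model, kernel_size=patch, stride=patch)
        self.cls = nn.Parameter(torch.zeros(1, 1, d_model))        # [CLS] token
        self.pos = nn.Parameter(torch.zeros(1, n_patches + 1, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_stages)

    def forward(self, x):                       # x: (batch, 1, samples)
        tok = self.embed(x).transpose(1, 2)     # (batch, n_patches, d_model)
        tok = torch.cat([self.cls.expand(len(x), -1, -1), tok], dim=1)
        z = self.encoder(tok + self.pos)        # add learned positions
        return self.head(z[:, 0])               # classify from the [CLS] token
```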
Collapse
|
47
|
Oh S, Kweon YS, Shin GH, Lee SW. MEDi-SOL: Multi Ensemble Distribution Model for Estimating Sleep Onset Latency. IEEE J Biomed Health Inform 2024; 28:4249-4259. [PMID: 38598376 DOI: 10.1109/jbhi.2024.3386885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/12/2024]
Abstract
Sleep onset latency (SOL) is an important factor in a subject's sleep quality, so accurate prediction of SOL is useful for identifying individuals at risk of sleep disorders and for improving sleep quality. In this study, we estimate the SOL distribution and the falling-asleep function from the electroencephalogram (EEG), which records the electrical activity of the brain. We propose a Multi Ensemble Distribution model for estimating Sleep Onset Latency (MEDi-SOL), consisting of a temporal encoder and a time-distribution decoder. We evaluated the model on a public dataset from the Sleep Heart Health Study, considering four distributions (Normal, log-Normal, Weibull, and log-Logistic) and comparing against a survival model and a regression model. The temporal encoder with the ensemble of log-Logistic and log-Normal distributions showed the best and second-best scores in concordance index (C-index) and mean absolute error (MAE). MEDi-SOL, the multi-ensemble distribution combining the log-Logistic and log-Normal distributions, achieves the best C-index and MAE with a fast training time. Furthermore, our model can visualize the process of falling asleep for individual subjects. As a result, a distribution-based ensemble approach with appropriate distributions is more useful than point estimation.
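The core of a distribution-based estimator of this kind is a likelihood over SOL. The sketch below implements the negative log-likelihood of a weighted log-Normal/log-Logistic ensemble whose parameters would come from a temporal encoder; the parameterization and the way weights are mixed are assumptions for illustration, not the paper's exact formulation:

```python
import math
import torch

def lognormal_logpdf(t, mu, sigma):
    """log-density of a log-Normal distribution at SOL value t > 0."""
    z = (torch.log(t) - mu) / sigma
    return -torch.log(t * sigma * math.sqrt(2 * math.pi)) - 0.5 * z**2

def loglogistic_logpdf(t, alpha, beta):
    """log-density of a log-Logistic distribution (scale alpha, shape beta)."""
    z = (t / alpha) ** beta
    return torch.log(beta / alpha) + (beta - 1) * torch.log(t / alpha) \
           - 2 * torch.log1p(z)

def ensemble_nll(t, mu, sigma, alpha, beta, mix_logits):
    """Negative log-likelihood of observed SOLs under a weighted ensemble of
    the two distributions; all parameters come from the temporal encoder."""
    logp = torch.stack([lognormal_logpdf(t, mu, sigma),
                        loglogistic_logpdf(t, alpha, beta)], dim=-1)
    logw = torch.log_softmax(mix_logits, dim=-1)     # mixture weights
    return -torch.logsumexp(logp + logw, dim=-1).mean()
```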
Collapse
|
48
|
Li W, Liu T, Xu B, Song A. SleepFC: Feature Pyramid and Cross-Scale Context Learning for Sleep Staging. IEEE Trans Neural Syst Rehabil Eng 2024; 32:2198-2208. [PMID: 38805336 DOI: 10.1109/tnsre.2024.3406383] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/30/2024]
Abstract
Automated sleep staging is essential for assessing sleep quality and treating sleep disorders, so electroencephalography (EEG)-based sleep staging has attracted extensive research interest. However, the task presents several difficulties: 1) how to effectively learn the intrinsic features of salient waves from single-channel EEG signals; 2) how to learn and capture the useful information in sleep stage transition rules; and 3) how to address the class imbalance among sleep stages. To handle these problems, we propose a novel method named SleepFC. It comprises a convolutional feature pyramid network (CFPN), cross-scale temporal context learning (CSTCL), and a classification network with a class-adaptive fine-tuning loss function (CAFTLF). CFPN learns multi-scale features from the salient waves of EEG signals; CSTCL extracts informative multi-scale transition rules between sleep stages; and the CAFTLF-based classification network handles the class imbalance problem. Extensive experiments on three public benchmark datasets demonstrate the superiority of SleepFC over state-of-the-art approaches. In particular, SleepFC has a significant performance advantage in recognizing the N1 sleep stage, which is challenging to distinguish.
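Of the three components, the class-imbalance remedy is the easiest to illustrate. The sketch below uses inverse-frequency class weights in the cross-entropy as a simple stand-in for the paper's class-adaptive fine-tuning loss, whose exact form is not reproduced here:

```python
import torch
import torch.nn as nn

def class_adaptive_weights(labels, n_classes=5, smoothing=1.0):
    """Inverse-frequency class weights so rare stages (notably N1) contribute
    more to the loss; a simple stand-in for the paper's CAFTLF."""
    counts = torch.bincount(labels, minlength=n_classes).float() + smoothing
    w = counts.sum() / (n_classes * counts)      # >1 for rare, <1 for common
    return w / w.mean()

# Usage: weight the cross-entropy by the training-set stage distribution.
train_labels = torch.randint(0, 5, (10_000,))    # placeholder hypnogram labels
criterion = nn.CrossEntropyLoss(weight=class_adaptive_weights(train_labels))
```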
Collapse
|
49
|
Li X, Yang S, Fei N, Wang J, Huang W, Hu Y. A Convolutional Neural Network for SSVEP Identification by Using a Few-Channel EEG. Bioengineering (Basel) 2024; 11:613. [PMID: 38927850 PMCID: PMC11200714 DOI: 10.3390/bioengineering11060613] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2024] [Revised: 05/11/2024] [Accepted: 06/08/2024] [Indexed: 06/28/2024] Open
Abstract
The application of wearable electroencephalogram (EEG) devices in brain-computer interfaces (BCI) is growing owing to their good wearability and portability. Compared with conventional devices, wearable devices typically support fewer EEG channels. Few-channel EEG has been shown to be viable for steady-state visual evoked potential (SSVEP)-based BCI; however, using fewer channels can degrade BCI performance. To address this issue, we propose an attention-based complex spectrum convolutional neural network (atten-CCNN), which combines a CNN with a squeeze-and-excitation block and takes the spectrum of the EEG signal as input. The proposed model was assessed on a wearable 40-class dataset and a public 12-class dataset under subject-independent and subject-dependent conditions. The results show that, whether using a three-channel or single-channel EEG for SSVEP identification, atten-CCNN outperformed the baseline models, indicating that the new model can effectively enhance the performance of SSVEP-BCI with few-channel EEG. This SSVEP identification algorithm is therefore particularly suitable for use with wearable EEG devices.
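The squeeze-and-excitation block at the heart of atten-CCNN reweights feature channels by a learned gate; a minimal 1-D version for spectral inputs (the reduction ratio is an assumed hyperparameter, and `channels` must be at least `reduction`):

```python
import torch
import torch.nn as nn

class SqueezeExcite1d(nn.Module):
    """Channel attention (squeeze-and-excitation): global-average-pool the
    spectral feature maps, then rescale each channel by a learned gate."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                 # x: (batch, channels, freq_bins)
        s = x.mean(dim=-1)                # squeeze: (batch, channels)
        g = self.gate(s).unsqueeze(-1)    # excitation: (batch, channels, 1)
        return x * g                      # reweight feature channels
```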
Collapse
Affiliation(s)
- Xiaodong Li
- Orthopedics Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen 518053, China
- Department of Orthopaedics and Traumatology, The University of Hong Kong, Hong Kong SAR, China
| | - Shuoheng Yang
- Orthopedics Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen 518053, China
- Department of Orthopaedics and Traumatology, The University of Hong Kong, Hong Kong SAR, China
| | - Ningbo Fei
- Department of Orthopaedics and Traumatology, The University of Hong Kong, Hong Kong SAR, China
| | - Junlin Wang
- Orthopedics Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen 518053, China
- Department of Orthopaedics and Traumatology, The University of Hong Kong, Hong Kong SAR, China
| | - Wei Huang
- Department of Rehabilitation, The Second Affiliated Hospital of Guangdong Medical University, Zhanjiang 524003, China
| | - Yong Hu
- Orthopedics Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen 518053, China
- Department of Orthopaedics and Traumatology, The University of Hong Kong, Hong Kong SAR, China
- Department of Rehabilitation, The Second Affiliated Hospital of Guangdong Medical University, Zhanjiang 524003, China
| |
Collapse
|
50
|
Bressler S, Neely R, Yost RM, Wang D. A randomized controlled trial of alpha phase-locked auditory stimulation to treat symptoms of sleep onset insomnia. Sci Rep 2024; 14:13039. [PMID: 38844793 PMCID: PMC11156862 DOI: 10.1038/s41598-024-63385-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2023] [Accepted: 05/28/2024] [Indexed: 06/09/2024] Open
Abstract
Sleep onset insomnia is a pervasive problem that contributes significantly to the poor health outcomes associated with insufficient sleep. Auditory stimuli phase-locked to slow-wave sleep oscillations have been shown to augment deep sleep, but it is unknown whether a similar approach can be used to accelerate sleep onset. The present randomized controlled crossover trial enrolled adults with objectively verified sleep onset latencies (SOLs) greater than 30 min to test the effect of auditory stimuli delivered at specific phases of participants' alpha oscillations prior to sleep onset. During the intervention week, participants wore an electroencephalogram (EEG)-enabled headband that delivered acoustic pulses timed to arrive anti-phase with alpha for 30 min (Stimulation). During the Sham week, the headband silently recorded EEG. The primary outcome was SOL determined by blinded scoring of EEG records. For the 21 subjects included in the analyses, stimulation had a significant effect on SOL according to a linear mixed effects model (p = 0.0019), and weekly average SOL decreased by 10.5 ± 15.9 min (29.3 ± 44.4%). These data suggest that phase-locked acoustic stimulation can be a viable alternative to pharmaceuticals to accelerate sleep onset in individuals with prolonged sleep onset latencies. Trial Registration: This trial was first registered on clinicaltrials.gov on 24/02/2023 under the name Sounds Locked to ElectroEncephalogram Phase For the Acceleration of Sleep Onset Time (SLEEPFAST), and assigned registry number NCT05743114.
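Offline, the phase targeting underlying this intervention can be illustrated with a band-pass filter and the Hilbert transform, as in the sketch below; a real-time system such as the one trialed here would instead need a forward-predicting phase estimator, so this is illustrative only:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def alpha_antiphase_times(eeg, fs, band=(8.0, 12.0), target=np.pi):
    """Band-pass the EEG in the alpha band, estimate instantaneous phase with
    the Hilbert transform, and return times (s) where the phase crosses the
    anti-phase target, i.e., candidate trigger times for acoustic pulses."""
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    phase = np.angle(hilbert(filtfilt(b, a, eeg)))
    rel = np.angle(np.exp(1j * (phase - target)))    # wrap relative phase
    crossings = np.flatnonzero((rel[:-1] < 0) & (rel[1:] >= 0))
    return crossings / fs
```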
Collapse
Affiliation(s)
- Scott Bressler
- Elemind Technologies, Inc., Cambridge, MA, USA
- Science and Research, Elemind Technologies, Inc., Cambridge, MA, 02139, USA
| | - Ryan Neely
- Elemind Technologies, Inc., Cambridge, MA, USA.
- Science and Research, Elemind Technologies, Inc., Cambridge, MA, 02139, USA.
| | - Ryan M Yost
- Elemind Technologies, Inc., Cambridge, MA, USA
- Science and Research, Elemind Technologies, Inc., Cambridge, MA, 02139, USA
| | - David Wang
- Elemind Technologies, Inc., Cambridge, MA, USA
| |
Collapse
|