1
Xie J, Fonseca P, van Dijk JP, Overeem S, Long X. Multi-modal multi-task deep neural networks for sleep disordered breathing assessment using cardiac and audio signals. Int J Med Inform 2025;201:105932. [PMID: 40286704] [DOI: 10.1016/j.ijmedinf.2025.105932]
Abstract
BACKGROUND AND OBJECTIVE Sleep disordered breathing (SDB) is one of the most common sleep disorders; it has short-term consequences for daytime functioning and is a risk factor for several conditions, such as cardiovascular disease. Polysomnography, the current diagnostic gold standard, is expensive and has limited accessibility. Therefore, cost-effective and easily accessible methods for SDB detection are needed. Both cardiac and audio signals have received attention for SDB detection as they can be obtained with unobtrusive sensors suitable for home applications. METHODS This paper introduces a multi-modal multi-task deep learning approach for SDB assessment using a combination of cardiac and audio signals, under the assumption that they provide complementary information. We aimed to estimate the apnea-hypopnea index (AHI) and assess AHI-based SDB severity through the detection of SDB events, combined with total sleep time estimated from simultaneous sleep-wake classification. Inter-beat intervals and respiration derived from the electrocardiogram, together with Mel-scale frequency cepstral coefficients from concurrent audio recordings, were used as inputs. We compared the performance of several models trained with different combinations of these inputs. RESULTS Using cross-validation on a dataset comprising overnight recordings of 161 subjects, we achieved an F1 score of 0.588 for SDB event detection, a correlation coefficient of 0.825 for AHI estimation, and an accuracy of 57.8% for SDB severity classification (normal, mild, moderate, and severe). CONCLUSION The results show that combining cardiac and audio signals can enhance SDB detection performance and highlight the potential of multi-modal data fusion for further research in this domain.
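The AHI arithmetic underlying this kind of approach — detected SDB events divided by total sleep time from the sleep-wake classifier, then mapped to the four severity classes — can be sketched as follows. This is a generic illustration using the conventional clinical cutoffs (5, 15, and 30 events/h); the function names are not taken from the paper:

```python
def estimate_ahi(n_events: int, total_sleep_time_h: float) -> float:
    """Apnea-hypopnea index: respiratory events per hour of sleep."""
    if total_sleep_time_h <= 0:
        raise ValueError("total sleep time must be positive")
    return n_events / total_sleep_time_h

def severity_class(ahi: float) -> str:
    """Map an AHI value to the conventional four severity categories."""
    if ahi < 5:
        return "normal"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"
```

For example, 56 detected events over 7 h of estimated sleep give an AHI of 8 events/h, i.e. mild SDB.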
Affiliation(s)
- Jiali Xie
- Biomedical Diagnostics Lab, Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven 5612 AP, the Netherlands
- Pedro Fonseca
- Biomedical Diagnostics Lab, Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven 5612 AP, the Netherlands; Philips Research, High Tech Campus, Eindhoven 5656 AE, The Netherlands
- Johannes P van Dijk
- Biomedical Diagnostics Lab, Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven 5612 AP, the Netherlands; Kempenhaeghe Center for Sleep Medicine, Heeze 5591 VE, The Netherlands; Department of Orthodontics, Ulm University, Ulm 89081, Germany
- Sebastiaan Overeem
- Biomedical Diagnostics Lab, Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven 5612 AP, the Netherlands; Kempenhaeghe Center for Sleep Medicine, Heeze 5591 VE, The Netherlands
- Xi Long
- Biomedical Diagnostics Lab, Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven 5612 AP, the Netherlands.
2
Chao YP, Chuang HH, Lee ZH, Huang SY, Zhan WT, Shyu LY, Lo YL, Lee GS, Li HY, Lee LA. Distinguishing severe sleep apnea from habitual snoring using a neck-wearable piezoelectric sensor and deep learning: A pilot study. Comput Biol Med 2025;190:110070. [PMID: 40147187] [DOI: 10.1016/j.compbiomed.2025.110070]
Abstract
This study explores the development of a deep learning model using a neck-wearable piezoelectric sensor to accurately distinguish severe sleep apnea syndrome (SAS) from habitual snoring, addressing the underdiagnosis of SAS in adults. From 2018 to 2020, 60 adult habitual snorers underwent polysomnography while wearing a neck piezoelectric sensor that recorded snoring vibrations (70-250 Hz) and carotid artery pulsations (0.01-1.5 Hz). The initial dataset comprised 1167 silence, 1304 snoring, and 399 noise samples from 20 participants. Using a hybrid deep learning model comprising a one-dimensional convolutional neural network and a gated recurrent unit, snoring and apnea/hypopnea events were identified, with sleep phases detected via pulse wave variability criteria. The model's efficacy in predicting severe SAS was assessed in the remaining 40 participants, achieving snoring detection rates of 0.88, 0.86, and 0.92, with respective loss rates of 0.39, 0.90, and 0.23. Classification accuracy for severe SAS improved from 0.85 for total sleep time to 0.90 for partial sleep time excluding the first sleep phase, with a precision of 0.84, recall of 1.00, and an F1 score of 0.91. This innovative approach, combining a hybrid deep learning model with a neck-wearable piezoelectric sensor, suggests a promising route for early and precise differentiation of severe SAS from habitual snoring, helping to guide further standard diagnostic evaluations and timely patient management. Future studies should focus on expanding the sample size, diversifying the patient population, and performing external validation in real-world settings to enhance the robustness and applicability of the findings.
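Separating the two frequency bands reported for the sensor (snoring vibrations at 70-250 Hz, carotid pulsations at 0.01-1.5 Hz) is a standard band-pass filtering task. A minimal SciPy sketch on a synthetic mixture is shown below; the sampling rate and test signal are assumptions for illustration, not details from the study:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass(x, low_hz, high_hz, fs, order=4):
    """Zero-phase Butterworth band-pass filter (second-order sections)."""
    sos = butter(order, [low_hz, high_hz], btype="band", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

fs = 1000  # assumed sampling rate in Hz
t = np.arange(0, 10, 1 / fs)
# toy mixture: a 1 Hz "pulsation" plus a 120 Hz "snoring" component
x = np.sin(2 * np.pi * 1.0 * t) + 0.5 * np.sin(2 * np.pi * 120.0 * t)

snore = bandpass(x, 70.0, 250.0, fs)   # snoring-vibration band
pulse = bandpass(x, 0.01, 1.5, fs)     # carotid-pulsation band
```

Second-order sections (`output="sos"`) are used because transfer-function coefficients become numerically fragile at very low normalized cutoffs such as 0.01 Hz.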
Affiliation(s)
- Yi-Ping Chao
- Department of Computer Science and Information Engineering, Chang Gung University, 33302, Taoyuan, Taiwan; Department of Otorhinolaryngology, Head and Neck Surgery, Sleep Center, Linkou Medical Center, Chang Gung Memorial Hospital, Chang Gung University, 33305 Taoyuan, Taiwan
- Hai-Hua Chuang
- Department of Community Medicine, Cathay General Hospital, 10630 Taipei, Taiwan; School of Medicine, College of Life Science and Medicine, National Tsing Hua University, 300044, Hsinchu, Taiwan; Department of Industrial Engineering and Management, National Taipei University of Technology, 10608, Taipei, Taiwan
- Zong-Han Lee
- Department of Biomedical Engineering, Chung Yuan Christian University, 320314 Taoyuan, Taiwan
- Shu-Yi Huang
- Department of Biomedical Engineering, Chung Yuan Christian University, 320314 Taoyuan, Taiwan
- Wan-Ting Zhan
- Department of Biomedical Engineering, Chung Yuan Christian University, 320314 Taoyuan, Taiwan
- Liang-Yu Shyu
- Department of Biomedical Engineering, Chung Yuan Christian University, 320314 Taoyuan, Taiwan
- Yu-Lun Lo
- Department of Pulmonary and Critical Care Medicine, Linkou Main Branch, Chang Gung Memorial Hospital, Chang Gung University, 33305, Taoyuan, Taiwan
- Guo-She Lee
- Faculty of Medicine, National Yang Ming Chiao Tung University, 112304, Taipei, Taiwan; Department of Otolaryngology, Taipei City Hospital, Ren-Ai Branch, 106243, Taipei, Taiwan
- Hsueh-Yu Li
- Department of Otorhinolaryngology, Head and Neck Surgery, Sleep Center, Linkou Medical Center, Chang Gung Memorial Hospital, Chang Gung University, 33305 Taoyuan, Taiwan
- Li-Ang Lee
- Department of Otorhinolaryngology, Head and Neck Surgery, Sleep Center, Linkou Medical Center, Chang Gung Memorial Hospital, Chang Gung University, 33305 Taoyuan, Taiwan; School of Medicine, College of Life Science and Medicine, National Tsing Hua University, 300044, Hsinchu, Taiwan.
3
Werthen-Brabants L, Castillo-Escario Y, Groenendaal W, Jane R, Dhaene T, Deschrijver D. Deep Learning-Based Event Counting for Apnea-Hypopnea Index Estimation Using Recursive Spiking Neural Networks. IEEE Trans Biomed Eng 2025;72:1306-1315. [PMID: 40030371] [DOI: 10.1109/tbme.2024.3498097]
Abstract
OBJECTIVE To develop a novel method for improved screening of sleep apnea in home environments, focusing on reliable estimation of the Apnea-Hypopnea Index (AHI) without the need for highly precise event localization. METHODS RSN-Count is introduced, a technique leveraging Spiking Neural Networks to directly count apneic events in recorded signals. This approach aims to reduce dependence on the exact time-based pinpointing of events, a potential source of variability in conventional analysis. RESULTS RSN-Count quantifies apneic events with a lower AHI mean absolute error than established methods on a dataset of whole-night audio and SpO2 recordings (N = 33). This is particularly valuable for accurate AHI estimation, even in the absence of highly precise event localization. CONCLUSION RSN-Count offers a promising improvement in sleep apnea screening within home settings. Its focus on event quantification enhances AHI estimation accuracy. SIGNIFICANCE This method addresses limitations in current sleep apnea diagnostics, potentially increasing screening accuracy and accessibility while reducing dependence on costly and complex polysomnography.
4
Zhang E, Jia X, Wu Y, Liu J, Yu L. Cascaded redundant convolutional encoder-decoder network improved apnea detection performance using tracheal sounds in post anesthesia care unit patients. Biomed Phys Eng Express 2024;10:065051. [PMID: 39437805] [DOI: 10.1088/2057-1976/ad89c6]
Abstract
Objective. Methods of detecting apnea based on acoustic features are prone to misdiagnosis and missed diagnoses due to the influence of noise. The aim of this paper is to improve the performance of apnea detection algorithms in the Post Anesthesia Care Unit (PACU) using a denoising method that processes tracheal sounds without the need for separate background noise. Approach. Tracheal sound data from laboratory subjects were collected using a microphone. A segment of clinical background noise was recorded and combined with clean tracheal sound data to synthesize noisy tracheal sound data at a specified signal-to-noise ratio. Frequency-domain features of the tracheal sounds were extracted using the Short Time Fourier Transform (STFT) and fed into a Cascaded Redundant Convolutional Encoder-Decoder (CR-CED) network for training. Patients' tracheal sound data collected in the PACU were then fed into the CR-CED network as test data and inverse-transformed by STFT to obtain denoised tracheal sounds. The apnea detection algorithm was applied to the denoised tracheal sounds. Results. Apnea events were correctly detected 207 times and normal respiratory events 11,305 times using tracheal sounds denoised by the CR-CED network. The sensitivity and specificity of apnea detection were 88% and 98.6%, respectively. Significance. Apnea detection from tracheal sounds denoised by the CR-CED network in the PACU is accurate and reliable. Tracheal sound can be denoised with this approach without separately recorded background noise, which improves the applicability of tracheal sound denoising in medical environments while preserving correctness.
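The signal path described here — STFT, denoising network, inverse STFT — can be sketched with SciPy. In this sketch the CR-CED network is replaced by a trivial magnitude-threshold mask, so it illustrates only the transform/inverse-transform plumbing, not the authors' model:

```python
import numpy as np
from scipy.signal import stft, istft

def denoise_spectral(x, fs, nperseg=256, threshold=0.1):
    """STFT -> spectral mask (stand-in for the CR-CED network) -> inverse STFT."""
    f, t, Z = stft(x, fs=fs, nperseg=nperseg)
    mag = np.abs(Z)
    # stand-in "network": keep only bins above a fraction of the peak magnitude
    mask = mag > threshold * mag.max()
    _, x_hat = istft(Z * mask, fs=fs, nperseg=nperseg)
    return x_hat[: len(x)]
```

In the paper this masking stage is learned by the CR-CED network from synthesized noisy/clean tracheal sound pairs rather than thresholded.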
Affiliation(s)
- Erpeng Zhang
- Department of Biomedical Engineering, School of Intelligent Medicine, China Medical University, Shenyang, Liaoning, People's Republic of China
- Xiuzhu Jia
- Department of Biomedical Engineering, School of Intelligent Medicine, China Medical University, Shenyang, Liaoning, People's Republic of China
- Yanan Wu
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, Liaoning, People's Republic of China
- Jing Liu
- Department of Nuclear Medicine, Zhongnan Hospital of Wuhan University, Wuhan, Hubei, People's Republic of China
- Lu Yu
- Department of Biomedical Engineering, School of Intelligent Medicine, China Medical University, Shenyang, Liaoning, People's Republic of China
5
Moslemi C, Sækmose S, Larsen R, Brodersen T, Bay JT, Didriksen M, Nielsen KR, Bruun MT, Dowsett J, Dinh KM, Mikkelsen C, Hyvärinen K, Ritari J, Partanen J, Ullum H, Erikstrup C, Ostrowski SR, Olsson ML, Pedersen OB. A deep learning approach to prediction of blood group antigens from genomic data. Transfusion 2024;64:2179-2195. [PMID: 39268576] [DOI: 10.1111/trf.18013]
Abstract
BACKGROUND Deep learning methods are revolutionizing natural science. In this study, we aim to apply such techniques to develop blood type prediction models based on cheap-to-analyze and easily scalable screening array genotyping platforms. METHODS Combining existing blood types from blood banks and imputed screening array genotypes for ~111,000 Danish and 1168 Finnish blood donors, we used deep learning techniques to train and validate blood type prediction models for 36 antigens in 15 blood group systems. To account for missing genotypes, an initial denoising autoencoder step was used, followed by a convolutional neural network blood type classifier. RESULTS Two-thirds of the trained blood type prediction models demonstrated an F1-accuracy above 99%. Models for antigens with low or high frequencies (e.g., Cw), small training cohorts (e.g., Cob), or very complicated genetic underpinnings (e.g., RhD) proved more challenging for high-accuracy (>99%) DL modeling. However, in the Danish cohort only 4 out of 36 models (Cob, Cw, D-weak, Kpa) failed to achieve a prediction F1-accuracy above 97%. This high predictive performance was replicated in the Finnish cohort. DISCUSSION High accuracy across a variety of blood groups demonstrates the viability of deep learning-based blood type prediction using array chip genotypes, even in blood groups with nontrivial genetic underpinnings. These techniques are suitable for aiding in the identification of blood donors with rare blood types by greatly narrowing down the pool of candidate donors before clinical-grade confirmation.
Affiliation(s)
- Camous Moslemi
- Department of Clinical Immunology, Zealand University Hospital, Køge, Denmark
- Institute of Science and Environment, Roskilde University, Roskilde, Denmark
- Susanne Sækmose
- Department of Clinical Immunology, Zealand University Hospital, Køge, Denmark
- Rune Larsen
- Department of Clinical Immunology, Zealand University Hospital, Køge, Denmark
- Thorsten Brodersen
- Department of Clinical Immunology, Zealand University Hospital, Køge, Denmark
- Jakob T Bay
- Department of Clinical Immunology, Zealand University Hospital, Køge, Denmark
- Maria Didriksen
- Department of Clinical Immunology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Kaspar R Nielsen
- Department of Clinical Immunology, Aalborg University Hospital, Aalborg, Denmark
- Mie T Bruun
- Department of Clinical Immunology, Odense University Hospital, Odense, Denmark
- Joseph Dowsett
- Department of Clinical Immunology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Khoa M Dinh
- Department of Clinical Immunology, Aarhus University Hospital, Aarhus, Denmark
- Christina Mikkelsen
- Department of Clinical Immunology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Jarmo Ritari
- Finnish Red Cross Blood Service, Helsinki, Finland
- Christian Erikstrup
- Department of Clinical Immunology, Aarhus University Hospital, Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Sisse R Ostrowski
- Department of Clinical Immunology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Martin L Olsson
- Department of Laboratory Medicine, Lund University, Lund, Sweden
- Department of Clinical Immunology and Transfusion, Office for Medical Services, Region Skåne, Sweden
- Ole B Pedersen
- Department of Clinical Immunology, Zealand University Hospital, Køge, Denmark
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
6
Li T, Wang Y, Zhu Y, Zhao Q. Study on early intervention strategies of children's snoring: a meta-analysis based on network. Minerva Pediatr (Torino) 2024;76:457-459. [PMID: 38445962] [DOI: 10.23736/s2724-5276.24.07533-5]
Affiliation(s)
- Tingting Li
- Early Childhood Development Center I, Gansu Provincial Maternity and Child-Care Hospital, Lanzhou, Gansu, China
- Yongjun Wang
- Pediatric Respiratory Department II, Gansu Provincial Maternity and Child-Care Hospital, Lanzhou, Gansu, China
- Ying Zhu
- Early Childhood Development Center I, Gansu Provincial Maternity and Child-Care Hospital, Lanzhou, Gansu, China
- Qijun Zhao
- Pediatric Respiratory Department II, Gansu Provincial Maternity and Child-Care Hospital, Lanzhou, Gansu, China
7
Dong H, Wu H, Yang G, Zhang J, Wan K. A multi-branch convolutional neural network for snoring detection based on audio. Comput Methods Biomech Biomed Engin 2024:1-12. [PMID: 38372231] [DOI: 10.1080/10255842.2024.2317438]
Abstract
Obstructive sleep apnea (OSA) is associated with various health complications, and snoring is a prominent characteristic of this disorder. The exploration of a concise and effective method for detecting snoring has therefore consistently been a crucial aspect of sleep medicine. Because audio data are easily accessible, identifying snoring through sound analysis offers a convenient and straightforward approach. The objective of this study was to develop a convolutional neural network (CNN) for classifying snoring and non-snoring events based on audio. Mel-frequency cepstral coefficients (MFCCs) were used to extract features during preprocessing of the raw data. To extract multi-scale features from the frequency domain of the sound sources, this study proposes a multi-branch convolutional neural network (MBCNN) for classification. The network uses asymmetric convolutional kernels to acquire additional information, and one-hot encoded labels were adopted to mitigate the impact of labeling. The network's performance was tested on a publicly available dataset consisting of 1,000 sound samples. The test results indicate that the MBCNN achieved a snoring detection accuracy of 99.5%. The integration of multi-scale features and the MBCNN, based on audio data, demonstrates a substantial improvement in snoring classification performance.
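The MFCC preprocessing step can be sketched from first principles with NumPy/SciPy: framing, Hann windowing, power spectrum, triangular mel filterbank, log, and DCT. Frame size, hop, and filterbank settings below are common illustrative defaults, not the paper's configuration:

```python
import numpy as np
from scipy.fftpack import dct

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, fs, n_fft=512, hop=256, n_mels=26, n_ceps=13):
    """Minimal MFCC: frame -> window -> power spectrum -> mel filterbank -> log -> DCT."""
    # frame the signal into overlapping windows
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # triangular mel filterbank spanning 0 .. fs/2
    mels = np.linspace(hz_to_mel(0), hz_to_mel(fs / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / fs).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fbank[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    logmel = np.log(power @ fbank.T + 1e-10)
    return dct(logmel, type=2, axis=1, norm="ortho")[:, :n_ceps]
```

The resulting coefficient matrix (frames x cepstral coefficients) is the kind of 2D feature map a CNN branch can consume.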
Affiliation(s)
- Hao Dong
- School of Computer Science, Zhongyuan University of Technology, Henan, China
- School of Computing and Artificial Intelligence, Huanghuai University, Henan, China
- Haitao Wu
- School of Computing and Artificial Intelligence, Huanghuai University, Henan, China
- Henan Key Laboratory of Smart Lighting, Henan, China
- Guan Yang
- School of Computer Science, Zhongyuan University of Technology, Henan, China
- Junming Zhang
- School of Computing and Artificial Intelligence, Huanghuai University, Henan, China
- Henan Key Laboratory of Smart Lighting, Henan, China
- Henan Joint International Research Laboratory of Behavior Optimization Control for Smart Robots, Henan, China
- Zhumadian Artificial Intelligence and Medical Engineering Technical Research Centre, Henan, China
- Keqin Wan
- School of Computing and Artificial Intelligence, Huanghuai University, Henan, China
8
Khante P, Thomaz E, de Barbaro K. Auditory chaos classification in real-world environments. Front Digit Health 2023;5:1261057. [PMID: 38178925] [PMCID: PMC10764466] [DOI: 10.3389/fdgth.2023.1261057]
Abstract
Background & motivation Household chaos is an established risk factor for child development. However, current methods for measuring household chaos rely on parent surveys, meaning existing research efforts cannot disentangle potentially dynamic bidirectional relations between high chaos environments and child behavior problems. Proposed approach We train and make publicly available a classifier to provide objective, high-resolution predictions of household chaos from real-world child-worn audio recordings. To do so, we collect and annotate a novel dataset of ground-truth auditory chaos labels compiled from over 411 h of daylong recordings collected via audio recorders worn by N = 22 infants in their homes. We leverage an existing sound event classifier to identify candidate high chaos segments, increasing annotation efficiency 8.32× relative to random sampling. Result Our best-performing model successfully classifies four levels of real-world household auditory chaos with a macro F1 score of 0.701 (Precision: 0.705, Recall: 0.702) and a weighted F1 score of 0.679 (Precision: 0.685, Recall: 0.680). Significance In future work, high-resolution objective chaos predictions from our model can be leveraged for basic science and intervention, including testing theorized mechanisms by which chaos affects children's cognition and behavior. Additionally, to facilitate further model development we make publicly available the first and largest balanced annotated audio dataset of real-world household chaos.
Affiliation(s)
- Priyanka Khante
- Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, United States
- Edison Thomaz
- Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, United States
- Kaya de Barbaro
- Department of Psychology, The University of Texas at Austin, Austin, TX, United States
9
Li R, Li W, Yue K, Zhang R, Li Y. Automatic snoring detection using a hybrid 1D-2D convolutional neural network. Sci Rep 2023;13:14009. [PMID: 37640790] [PMCID: PMC10462688] [DOI: 10.1038/s41598-023-41170-w]
Abstract
Snoring, as a prevalent symptom, seriously interferes with the quality of life of patients with sleep disordered breathing only (simple snorers), patients with obstructive sleep apnea (OSA), and their bed partners. Research has shown that snoring can be used for screening and diagnosis of OSA. Accurate detection of snoring sounds in nocturnal sleep respiratory audio is therefore an essential step. Given that snoring is dangerously overlooked around the world, an automatic and high-precision snoring detection algorithm is required. In this work, we designed non-contact data acquisition equipment to record nocturnal sleep respiratory audio of subjects in their private bedrooms, and proposed a hybrid convolutional neural network (CNN) model for automatic snore detection. This model consists of a one-dimensional (1D) CNN processing the original signal and a two-dimensional (2D) CNN processing images obtained by mapping the signal with the visibility graph method. In our experiments, the algorithm achieves an average classification accuracy of 89.3%, an average sensitivity of 89.7%, an average specificity of 88.5%, and an average AUC of 0.947, surpassing some state-of-the-art models trained on our data. In conclusion, our results indicate that the proposed method could be effective for large-scale screening of OSA patients in daily life, and our work provides an alternative framework for time series analysis.
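The visibility graph mapping that feeds the 2D branch can be illustrated with the standard natural-visibility criterion: two samples are linked when the straight line between them passes above every intermediate sample. This is a generic O(n^2) sketch; the paper's exact mapping and image rendering may differ:

```python
import numpy as np

def natural_visibility_graph(y):
    """Adjacency matrix of the natural visibility graph of a 1-D series.
    Samples i and j are connected if every point between them lies
    strictly below the line joining (i, y[i]) and (j, y[j])."""
    n = len(y)
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            ks = np.arange(i + 1, j)
            # height of the i-j sight line at each intermediate index k
            line = y[j] + (y[i] - y[j]) * (j - ks) / (j - i)
            if np.all(y[ks] < line):
                A[i, j] = A[j, i] = 1
    return A
```

The binary adjacency matrix can then be rendered as an image and passed to a 2D CNN.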
Affiliation(s)
- Ruixue Li
- Key Laboratory of RF Circuits and Systems, Hangzhou Dianzi University, Hangzhou, Zhejiang, China
- Wenjun Li
- Key Laboratory of RF Circuits and Systems, Hangzhou Dianzi University, Hangzhou, Zhejiang, China.
- Keqiang Yue
- Key Laboratory of RF Circuits and Systems, Hangzhou Dianzi University, Hangzhou, Zhejiang, China
- Rulin Zhang
- Key Laboratory of RF Circuits and Systems, Hangzhou Dianzi University, Hangzhou, Zhejiang, China
- Yilin Li
- Key Laboratory of RF Circuits and Systems, Hangzhou Dianzi University, Hangzhou, Zhejiang, China
10
Ahn JH, Lee JH, Lim CY, Joo EY, Youn J, Chung MJ, Cho JW, Kim K. Automatic stridor detection using small training set via patch-wise few-shot learning for diagnosis of multiple system atrophy. Sci Rep 2023;13:10899. [PMID: 37407621] [DOI: 10.1038/s41598-023-37620-0]
Abstract
Stridor is a rare but important non-motor symptom that can support the diagnosis and predict a worse prognosis in multiple system atrophy. Recording sounds generated during sleep by video-polysomnography is recommended for detecting stridor, but the analysis is labor-intensive and time-consuming. A method for automatic stridor detection should therefore be developed using technologies such as artificial intelligence (AI) or machine learning. However, the rarity of stridor hinders the collection of sufficient data from diverse patients, so an AI method with high diagnostic performance is needed to address this limitation. We propose an AI method for detecting patients with stridor by combining audio splitting and reintegration with few-shot learning for diagnosis. We used video-polysomnography data from patients with stridor (19 patients with multiple system atrophy) and without stridor (28 patients with parkinsonism and 18 patients with sleep disorders). To the best of our knowledge, this is the first study to propose a method for stridor detection and to validate few-shot learning for processing medical audio signals. Even with a small training set, a substantial improvement was achieved for stridor detection, confirming the clinical utility of our method compared with similar developments. The proposed method achieved a detection accuracy above 96% using data from only eight patients with stridor for training. Performance improvements of 4%-13% were achieved compared with a state-of-the-art AI baseline. Moreover, our method determined whether a patient had stridor and performed real-time localization of the corresponding audio patches, providing physicians with support for interpreting and efficiently employing its results.
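The audio splitting-and-reintegration idea — score fixed-length patches, then reintegrate patch scores into a recording-level decision, with the top-scoring patch giving localization — can be sketched as follows. Patch length, hop, aggregation rule, and threshold are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def split_into_patches(x, patch_len, hop):
    """Split a 1-D audio signal into overlapping fixed-length patches."""
    starts = range(0, max(len(x) - patch_len, 0) + 1, hop)
    return np.stack([x[s:s + patch_len] for s in starts])

def recording_level_decision(patch_probs, threshold=0.5):
    """Flag the recording if any patch-level probability exceeds the
    threshold; argmax localizes the most suspicious patch."""
    patch_probs = np.asarray(patch_probs)
    return bool(patch_probs.max() > threshold), int(patch_probs.argmax())
```

A patch classifier (e.g. the few-shot model) would supply `patch_probs`; the argmax index maps back to a time offset via the hop size.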
Grants
- SMX1210791 Future Medicine 20*30 Project of Samsung Medical Center
- 202011B08-02, KMDF_PR_20200901_0014-2021-02 Korea Medical Device Development Fund grant funded by the Korean government (Ministry of Science and ICT, Ministry of Trade, Industry and Energy, Ministry of Health & Welfare, Ministry of Food and Drug Safety)
- 20014111 Technology Innovation Program funded by the Ministry of Trade, Industry & Energy (MOTIE, Korea)
- 2021R1F1A106153511 National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT)
Affiliation(s)
- Jong Hyeon Ahn
- Department of Neurology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Neuroscience Center, Samsung Medical Center, Seoul, Republic of Korea
- Ju Hwan Lee
- Department of Health Sciences and Technology, SAIHST, Sungkyunkwan University, Seoul, Republic of Korea
- Medical AI Research Center, Research Institute for Future Medicine, Samsung Medical Center, Seoul, Republic of Korea
- Chae Yeon Lim
- Department of Medical Device Management and Research, SAIHST, Sungkyunkwan University, Seoul, Republic of Korea
- Medical AI Research Center, Research Institute for Future Medicine, Samsung Medical Center, Seoul, Republic of Korea
- Eun Yeon Joo
- Department of Neurology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Neuroscience Center, Samsung Medical Center, Seoul, Republic of Korea
- Jinyoung Youn
- Department of Neurology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Neuroscience Center, Samsung Medical Center, Seoul, Republic of Korea
- Myung Jin Chung
- Medical AI Research Center, Research Institute for Future Medicine, Samsung Medical Center, Seoul, Republic of Korea
- Department of Data Convergence and Future Medicine, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Jin Whan Cho
- Department of Neurology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea.
- Neuroscience Center, Samsung Medical Center, Seoul, Republic of Korea.
- Kyungsu Kim
- Medical AI Research Center, Research Institute for Future Medicine, Samsung Medical Center, Seoul, Republic of Korea.
- Department of Data Convergence and Future Medicine, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea.
11
Bazoukis G, Bollepalli SC, Chung CT, Li X, Tse G, Bartley BL, Batool-Anwar S, Quan SF, Armoundas AA. Application of artificial intelligence in the diagnosis of sleep apnea. J Clin Sleep Med 2023;19:1337-1363. [PMID: 36856067] [PMCID: PMC10315608] [DOI: 10.5664/jcsm.10532]
Abstract
STUDY OBJECTIVES Machine learning (ML) models have been employed in the setting of sleep disorders. This review aims to summarize the existing data on the role of ML techniques in the diagnosis, classification, and treatment of sleep-related breathing disorders. METHODS A systematic search of the Medline, EMBASE, and Cochrane databases through January 2022 was performed. RESULTS Our search strategy revealed 132 studies that were included in the systematic review. Existing data show that ML models have been used successfully for diagnostic purposes. Specifically, ML models showed good performance in diagnosing sleep apnea using easily obtained features from electrocardiogram, pulse oximetry, and sound signals. Similarly, ML showed good performance in classifying sleep apnea into obstructive and central categories, as well as in predicting apnea severity. Existing data show promising results for ML-guided treatment of sleep apnea; specifically, ML models can guide the prediction of outcomes following surgical treatment and the optimization of continuous positive airway pressure therapy. CONCLUSIONS The adoption and implementation of ML in the field of sleep-related breathing disorders is promising. Advances in wearable sensor technology and ML models can help clinicians predict, diagnose, and classify sleep apnea more accurately and efficiently.
Affiliation(s)
- George Bazoukis
- Department of Cardiology, Larnaca General Hospital, Larnaca, Cyprus
- Department of Basic and Clinical Sciences, University of Nicosia Medical School, Nicosia, Cyprus
- Cheuk To Chung
- Cardiac Electrophysiology Unit, Cardiovascular Analytics Group, China-UK Collaboration, Hong Kong
- Xinmu Li
- Tianjin Key Laboratory of Ionic-Molecular Function of Cardiovascular disease, Department of Cardiology, Tianjin Institute of Cardiology, the Second Hospital of Tianjin Medical University, Tianjin, China
- Gary Tse
- Cardiac Electrophysiology Unit, Cardiovascular Analytics Group, China-UK Collaboration, Hong Kong
- Kent and Medway Medical School, Canterbury, Kent, United Kingdom
- Bethany L. Bartley
- Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, Massachusetts
- Salma Batool-Anwar
- Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, Massachusetts
- Stuart F. Quan
- Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, Massachusetts
- Asthma and Airway Disease Research Center, University of Arizona College of Medicine, Tucson, Arizona
- Antonis A. Armoundas
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, Massachusetts
- Broad Institute, Massachusetts Institute of Technology, Cambridge, Massachusetts
12. Tran NT, Tran HN, Mai AT. A wearable device for at-home obstructive sleep apnea assessment: State-of-the-art and research challenges. Front Neurol 2023; 14:1123227. [PMID: 36824418] [PMCID: PMC9941521] [DOI: 10.3389/fneur.2023.1123227]
Abstract
In the last 3 years, almost all medical resources were reserved for screening and treating patients with coronavirus disease (COVID-19). Owing to shortages of medical staff and equipment, diagnosing sleep disorders such as obstructive sleep apnea (OSA) has become more difficult than ever. Beyond polysomnography-based diagnosis at a hospital, alternative at-home OSA detection solutions are receiving growing attention. This study reviews state-of-the-art assessment techniques for out-of-center detection of the main characteristics of OSA (sleep, cardiovascular function, oxygen balance and consumption, sleep position, breathing effort, respiratory function, and audio), as well as recent progress in data acquisition, processing, and machine learning techniques that support early detection of severe OSA levels.
Affiliation(s)
- Ngoc Thai Tran
- Faculty of Electronics and Telecommunication, VNU University of Engineering and Technology, Hanoi, Vietnam
- Huu Nam Tran
- Faculty of Electronics and Telecommunication, VNU University of Engineering and Technology, Hanoi, Vietnam
13. Khanjani Z, Watson G, Janeja VP. Audio deepfakes: A survey. Front Big Data 2023; 5:1001063. [PMID: 36700137] [PMCID: PMC9869423] [DOI: 10.3389/fdata.2022.1001063]
Abstract
A deepfake is content that is synthetically generated or manipulated using artificial intelligence (AI) methods, to be passed off as real, and can include audio, video, image, and text synthesis. The key difference between manual editing and deepfakes is that deepfakes are AI-generated or AI-manipulated and closely resemble authentic artifacts. In some cases, deepfakes can be fabricated entirely from AI-generated content. Deepfakes have started to have a major impact on society, with more generation mechanisms emerging every day. This article contributes to understanding the landscape of deepfakes and their detection and generation methods. We evaluate various categories of deepfakes, especially in audio. The purpose of this survey is to give readers a deeper understanding of (1) the different deepfake categories; (2) how they can be created and detected; and (3) in more detail, how audio deepfakes are created and detected, which is the main focus of this paper. We found that generative adversarial networks (GANs), convolutional neural networks (CNNs), and deep neural networks (DNNs) are common ways of creating and detecting deepfakes. In our evaluation of over 150 methods, we found that the majority of the focus is on video deepfakes and, in particular, on the generation of video deepfakes. For text deepfakes, there are more generation methods but very few robust detection methods, including fake news detection, which has become a controversial area of research because of its potential heavy overlap with human-generated fake content. Our study reveals a clear need for research on audio deepfakes, and particularly on their detection. This survey was conducted from a different perspective than existing surveys, which mostly focus on video and image deepfakes; it concentrates on audio deepfakes, which most existing surveys overlook. Its most important contribution is to critically analyze and provide a unique source of audio deepfake research, mostly ranging from 2016 to 2021. To the best of our knowledge, this is the first survey focusing on audio deepfake generation and detection in English.
Affiliation(s)
- Vandana P. Janeja
- Department of Information System, University of Maryland Baltimore County, Baltimore, MD, United States
14. Monitoring of Sleep Breathing States Based on Audio Sensor Utilizing Mel-Scale Features in Home Healthcare. J Healthc Eng 2023; 2023:6197564. [PMID: 36818388] [PMCID: PMC9935909] [DOI: 10.1155/2023/6197564]
Abstract
Sleep-related breathing disorders (SBDs) lead to poor sleep quality and increase the risk of cardiovascular and cerebrovascular diseases, which in serious cases may cause death. This paper aims to detect breathing states related to SBDs from breathing sound signals. A moment waveform analysis is applied to locate and segment the breathing cycles. As the core of the study, a set of useful features of the breathing signal is proposed based on Mel frequency cepstrum analysis. Finally, normal and abnormal sleep breathing states can be distinguished by the extracted Mel-scale indexes. Young healthy participants and patients suffering from obstructive sleep apnea were tested using the proposed method. The average accuracy for detecting abnormal breathing states reaches 93.1%. The method may help prevent SBDs and improve sleep quality in home healthcare.
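The paper's Mel frequency cepstrum analysis is not published as code; the sketch below is only a minimal NumPy illustration of how Mel-scale indexes can be derived from a single audio frame. The sampling rate, frame length, filter count, and the synthetic test tone are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def hz_to_mel(f):
    """Map frequency in Hz to the Mel scale."""
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular Mel filterbank, shape (n_filters, n_fft // 2 + 1)."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):                      # rising edge of the triangle
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):                      # falling edge
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def mel_features(frame, sr, n_filters=26, n_coeffs=13):
    """Cepstral coefficients (DCT-II of log-Mel energies) for one frame."""
    n_fft = len(frame)
    power = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2
    log_mel = np.log(mel_filterbank(n_filters, n_fft, sr) @ power + 1e-10)
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_coeffs), 2 * n + 1) / (2 * n_filters))
    return dct @ log_mel

# Illustrative use: a 512-sample synthetic frame at 8 kHz stands in for breathing sound.
sr = 8000
t = np.arange(512) / sr
coeffs = mel_features(np.sin(2 * np.pi * 200 * t), sr)
print(coeffs.shape)  # (13,)
```

In a real pipeline, such per-frame coefficients (or the log-Mel energies they summarize) would be computed over each segmented breathing cycle and then fed to a normal-versus-abnormal decision rule.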
15. Asif M, Usaid M, Rashid M, Rajab T, Hussain S, Wasi S. Large-scale audio dataset for emergency vehicle sirens and road noises. Sci Data 2022; 9:599. [PMID: 36195730] [PMCID: PMC9532391] [DOI: 10.1038/s41597-022-01727-2]
Abstract
Traffic congestion, accidents, and pollution are growing challenges for researchers. It is essential to develop new ideas to solve these problems, either by improving the infrastructure or by applying the latest technology to use the existing infrastructure better. This paper presents a high-resolution dataset that will help the research community apply AI techniques to distinguish emergency vehicles from traffic and road noises. Demand for such datasets is high, as they can be used to control traffic flow, reduce congestion, and improve emergency response times, especially for fire and health events. This work collected audio data using different methods and pre-processed it to develop a high-quality, clean dataset. The dataset is divided into two labelled classes: one for emergency vehicle sirens and one for traffic noises. It offers a wide range of high-quality, real-world traffic sounds and emergency vehicle sirens. The technical validity of the dataset is also established.
Affiliation(s)
- Muhammad Asif
- Data Acquisition, Processing, and Predictive Analytics, NCBC, Ziauddin University, Karachi, Pakistan
- Muhammad Usaid
- Data Acquisition, Processing, and Predictive Analytics, NCBC, Ziauddin University, Karachi, Pakistan
- Munaf Rashid
- Data Acquisition, Processing, and Predictive Analytics, NCBC, Ziauddin University, Karachi, Pakistan
- Tabarka Rajab
- Data Acquisition, Processing, and Predictive Analytics, NCBC, Ziauddin University, Karachi, Pakistan
- Samreen Hussain
- Aror University of Art, Architecture, Design and Heritage, Sukkur, Pakistan
- Sarwar Wasi
- Data Acquisition, Processing, and Predictive Analytics, NCBC, Ziauddin University, Karachi, Pakistan
16. Kok XH, Imtiaz SA, Rodriguez-Villegas E. Automatic Identification of Snoring and Groaning Segments in Acoustic Recordings. Annu Int Conf IEEE Eng Med Biol Soc 2022; 2022:1993-1996. [PMID: 36086260] [DOI: 10.1109/embc48229.2022.9871863]
Abstract
Sleep-related breathing disorders have a severe impact on the quality of life of those suffering from them. These disorders present with a variety of symptoms, of which snoring and groaning are very common. This paper presents an algorithm to identify and classify segments of acoustic respiratory sound recordings that contain groaning and snoring events. The recordings were obtained from a database of 20 subjects, from which features based on Mel-frequency cepstral coefficients (MFCC) were extracted. In the first stage of the algorithm, segments consisting of either snoring or groaning episodes were identified, without classifying them. In the second stage, these segments were further differentiated into individual groaning or snoring events. The first stage achieved a sensitivity and specificity of 90.5% ± 2.9% and 90.0% ± 1.6%, respectively, using a RUSBoost model. In the second stage, a random forest classifier was used, and the accuracies for groan and snore events were 78.1% ± 4.7% and 78.4% ± 4.7%, respectively.
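The two-stage design (first detect event segments, then differentiate snore from groan) can be sketched as a toy pipeline. In this sketch the paper's RUSBoost and random forest models are replaced by a minimal nearest-centroid classifier, and the 13-dimensional Gaussian "MFCC-like" features are synthetic stand-ins, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 13-D MFCC-like feature vectors (illustrative, not real data):
# the three classes are separated by construction so the toy pipeline works.
background = rng.normal(0.0, 1.0, (100, 13))
snore      = rng.normal(3.0, 1.0, (100, 13))
groan      = rng.normal(6.0, 1.0, (100, 13))

def centroid_classifier(classes):
    """Return a nearest-centroid predict function for {label: samples}."""
    centroids = {label: x.mean(axis=0) for label, x in classes.items()}
    def predict(x):
        return min(centroids, key=lambda k: np.linalg.norm(x - centroids[k]))
    return predict

# Stage 1: event (snore OR groan) vs. background, without classifying the event.
stage1 = centroid_classifier({"event": np.vstack([snore, groan]),
                              "background": background})
# Stage 2: differentiate the detected events into snore vs. groan.
stage2 = centroid_classifier({"snore": snore, "groan": groan})

def classify(x):
    return stage2(x) if stage1(x) == "event" else "background"

print(classify(snore[0]))       # snore
print(classify(background[0]))  # background
```

Swapping the centroid rule for boosted or bagged tree ensembles, as in the paper, changes only the per-stage model; the cascade structure stays the same.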