1. TaghiBeyglou B, Čuljak I, Bagheri F, Suntharalingam H, Yadollahi A. Estimating the severity of obstructive sleep apnea during wakefulness using speech: A review. Comput Biol Med 2024; 181:109020. PMID: 39173487. DOI: 10.1016/j.compbiomed.2024.109020.
Abstract
Obstructive sleep apnea (OSA) is a chronic breathing disorder during sleep that affects 10-30% of adults in North America. The gold standard for diagnosing OSA is polysomnography (PSG). However, PSG has several drawbacks: it is a cumbersome and expensive procedure that can be quite inconvenient for patients, who often endure long waitlists before they can undergo it. As a result, other alternatives for screening OSA have gained attention. Speech, as an accessible modality, is generated by variations in the pharyngeal airway, vocal tract, and soft tissues in the pharynx, which share anatomical structures with those that contribute to OSA. Consequently, in this study, we aim to provide a comprehensive review of the existing research on the use of speech for estimating the severity of OSA. A total of 851 papers were initially identified from the PubMed database using a specified set of keywords defined by population, intervention, comparison and outcome (PICO) criteria, along with a concatenated graph of the 5 most cited papers in the field extracted from the ConnectedPapers platform. Following a rigorous filtering process based on the preferred reporting items for systematic reviews and meta-analyses (PRISMA) approach, 32 papers were ultimately included in this review. Of these, 28 papers primarily focused on developing methodology, while the remaining 4 examined the clinical perspective of the association between OSA and speech. We then investigate the physiological similarities between OSA and speech, and highlight the features extracted from speech, the employed feature selection techniques, and the details of the models developed to predict OSA severity. By thoroughly discussing the current findings and limitations of studies in the field, we provide valuable insights into the gaps that need to be addressed in future research directions.
Affiliation(s)
- Behrad TaghiBeyglou: Institute of Biomedical Engineering, University of Toronto, Toronto, ON, Canada; KITE Research Institute, Toronto Rehabilitation Institute - University Health Network, Toronto, ON, Canada
- Ivana Čuljak: KITE Research Institute, Toronto Rehabilitation Institute - University Health Network, Toronto, ON, Canada
- Fatemeh Bagheri: Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada; North York General Hospital, Toronto, ON, Canada
- Haarini Suntharalingam: KITE Research Institute, Toronto Rehabilitation Institute - University Health Network, Toronto, ON, Canada
- Azadeh Yadollahi: Institute of Biomedical Engineering, University of Toronto, Toronto, ON, Canada; KITE Research Institute, Toronto Rehabilitation Institute - University Health Network, Toronto, ON, Canada
2. Prabhakar SK, Rajaguru H, Won DO. Coherent Feature Extraction with Swarm Intelligence Based Hybrid Adaboost Weighted ELM Classification for Snoring Sound Classification. Diagnostics (Basel) 2024; 14:1857. PMID: 39272642. PMCID: PMC11393855. DOI: 10.3390/diagnostics14171857.
Abstract
For patients suffering from obstructive sleep apnea and sleep-related breathing disorders, snoring is quite common, and it greatly interferes with their quality of life and that of the people around them. Because snoring is used as a screening parameter for diagnosing obstructive sleep apnea, exact detection and classification of snoring sounds is important, and automated, high-precision snoring analysis and classification algorithms are required. In this work, features are first extracted from six different domains: the time domain, frequency domain, Discrete Wavelet Transform (DWT) domain, sparse domain, eigenvalue domain, and cepstral domain. The extracted features are then selected using three efficient feature selection techniques: Golden Eagle Optimization (GEO), the Salp Swarm Algorithm (SSA), and a Refined SSA. The selected features are finally classified with eight traditional machine learning classifiers and two proposed classifiers: the Firefly Algorithm-Weighted Extreme Learning Machine hybrid with Adaboost model (FA-WELM-Adaboost) and the Capuchin Search Algorithm-Weighted Extreme Learning Machine hybrid with Adaboost model (CSA-WELM-Adaboost). The analysis is performed on the MPSSC Interspeech dataset, and the best results are obtained when the DWT features, the Refined SSA feature selection technique, and the FA-WELM-Adaboost hybrid classifier are utilized, yielding an Unweighted Average Recall (UAR) of 74.23%. The second-best results are obtained when the DWT features are selected with the GEO technique and classified with the CSA-WELM-Adaboost hybrid, yielding a UAR of 73.86%.
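The Unweighted Average Recall (UAR) reported above is the mean of the per-class recalls, which keeps minority snore classes from being drowned out by majority ones. A minimal sketch of how it can be computed with scikit-learn; the labels below are made up for illustration:

```python
# UAR = macro-averaged recall: average the recall of each class without
# weighting by class frequency. Toy labels, not MPSSC data.
from sklearn.metrics import recall_score

y_true = [0, 0, 1, 1, 1, 2, 2, 3]   # reference snore classes
y_pred = [0, 1, 1, 1, 0, 2, 2, 3]   # classifier output

uar = recall_score(y_true, y_pred, average="macro")
print(f"UAR = {uar:.2%}")           # mean of the per-class recalls
```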
Affiliation(s)
- Sunil Kumar Prabhakar, Dong-Ok Won: Department of Artificial Intelligence Convergence, Chuncheon 24252, Republic of Korea
- Harikumar Rajaguru: Department of ECE, Bannari Amman Institute of Technology, Sathyamangalam 638401, India
3. Triantafyllopoulos A, Kathan A, Baird A, Christ L, Gebhard A, Gerczuk M, Karas V, Hübner T, Jing X, Liu S, Mallol-Ragolta A, Milling M, Ottl S, Semertzidou A, Rajamani ST, Yan T, Yang Z, Dineley J, Amiriparian S, Bartl-Pokorny KD, Batliner A, Pokorny FB, Schuller BW. HEAR4Health: a blueprint for making computer audition a staple of modern healthcare. Front Digit Health 2023; 5:1196079. PMID: 37767523. PMCID: PMC10520966. DOI: 10.3389/fdgth.2023.1196079.
Abstract
Recent years have seen a rapid increase in digital medicine research in an attempt to transform traditional healthcare systems into their modern, intelligent, and versatile equivalents that are adequately equipped to tackle contemporary challenges. This has led to a wave of applications that utilise AI technologies, first and foremost in the field of medical imaging, but also in the use of wearables and other intelligent sensors. In comparison, computer audition can be seen to be lagging behind, at least in terms of commercial interest. Yet, audition has long been a staple assistant for medical practitioners, with the stethoscope being the quintessential sign of doctors around the world. Transforming this traditional technology with the use of AI entails a set of unique challenges. We categorise the advances needed in four key pillars: Hear, corresponding to the cornerstone technologies needed to analyse auditory signals in real-life conditions; Earlier, for the advances needed in computational and data efficiency; Attentively, for accounting for individual differences and handling the longitudinal nature of medical data; and, finally, Responsibly, for ensuring compliance with the ethical standards accorded to the field of medicine. Thus, we provide an overview and perspective of HEAR4Health: the sketch of a modern, ubiquitous sensing system that can bring computer audition on par with other AI technologies in the drive for improved healthcare systems.
Affiliation(s)
- Andreas Triantafyllopoulos, Alexander Kathan, Alice Baird, Lukas Christ, Alexander Gebhard, Maurice Gerczuk, Vincent Karas, Tobias Hübner, Xin Jing, Shuo Liu, Manuel Milling, Sandra Ottl, Anastasia Semertzidou, Tianhao Yan, Zijiang Yang, Judith Dineley, Shahin Amiriparian, Anton Batliner: EIHW - Chair of Embedded Intelligence for Healthcare and Wellbeing, University of Augsburg, Augsburg, Germany
- Adria Mallol-Ragolta: EIHW, University of Augsburg, Augsburg, Germany; Centre for Interdisciplinary Health Research, University of Augsburg, Augsburg, Germany
- Katrin D. Bartl-Pokorny: EIHW, University of Augsburg, Augsburg, Germany; Division of Phoniatrics, Medical University of Graz, Graz, Austria
- Florian B. Pokorny: EIHW, University of Augsburg, Augsburg, Germany; Division of Phoniatrics, Medical University of Graz, Graz, Austria; Centre for Interdisciplinary Health Research, University of Augsburg, Augsburg, Germany
- Björn W. Schuller: EIHW, University of Augsburg, Augsburg, Germany; Centre for Interdisciplinary Health Research, University of Augsburg, Augsburg, Germany; GLAM - Group on Language, Audio, & Music, Imperial College London, London, United Kingdom
4. Li R, Li W, Yue K, Zhang R, Li Y. Automatic snoring detection using a hybrid 1D-2D convolutional neural network. Sci Rep 2023; 13:14009. PMID: 37640790. PMCID: PMC10462688. DOI: 10.1038/s41598-023-41170-w.
Abstract
Snoring, as a prevalent symptom, seriously interferes with the quality of life of patients with sleep-disordered breathing only (simple snorers), patients with obstructive sleep apnea (OSA), and their bed partners. Research has shown that snoring can be used for screening and diagnosis of OSA, so accurate detection of snoring sounds from nocturnal sleep respiratory audio is an essential first step. Given that snoring is often dangerously overlooked around the world, an automatic and high-precision snoring detection algorithm is required. In this work, we designed non-contact data acquisition equipment to record the nocturnal sleep respiratory audio of subjects in their private bedrooms, and we propose a hybrid convolutional neural network (CNN) model for automatic snore detection. The model consists of a one-dimensional (1D) CNN processing the original signal and a two-dimensional (2D) CNN processing images produced by the visibility graph method. In our experiments, the algorithm achieves an average classification accuracy of 89.3%, an average sensitivity of 89.7%, an average specificity of 88.5%, and an average AUC of 0.947, surpassing several state-of-the-art models trained on our data. In conclusion, our results indicate that the proposed method could be effective and significant for large-scale screening of OSA patients in daily life, and our work provides an alternative framework for time series analysis.
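The visibility graph mapping mentioned above converts a 1D signal into a graph whose adjacency matrix can be treated as an image by the 2D-CNN branch. Below is a minimal sketch of the natural visibility graph construction; names and the random stand-in frame are illustrative, not the authors' code:

```python
# Natural visibility graph (NVG): samples i and j are connected if every sample k
# between them lies strictly below the straight line joining (i, x[i]) and (j, x[j]).
import numpy as np

def natural_visibility_graph(x: np.ndarray) -> np.ndarray:
    """Return the NVG adjacency matrix of a 1D signal."""
    n = len(x)
    adj = np.zeros((n, n), dtype=np.uint8)
    for i in range(n):
        for j in range(i + 1, n):
            ks = np.arange(i + 1, j)
            # Height of the line from (i, x[i]) to (j, x[j]) at positions ks.
            line = x[i] + (x[j] - x[i]) * (ks - i) / (j - i)
            if ks.size == 0 or np.all(x[ks] < line):
                adj[i, j] = adj[j, i] = 1
    return adj

# Example: map a short audio frame to an image-like matrix for a 2D CNN.
frame = np.random.randn(128)          # stand-in for one snore audio frame
image = natural_visibility_graph(frame)
```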
Affiliation(s)
- Ruixue Li, Wenjun Li, Keqiang Yue, Rulin Zhang, Yilin Li: Key Laboratory of RF Circuits and Systems, Hangzhou Dianzi University, Hangzhou, Zhejiang, China
5. Li H, Lin X, Lu Y, Wang M, Cheng H. Pilot study of contactless sleep apnea detection based on snore signals with hardware implementation. Physiol Meas 2023; 44:085003. PMID: 37506712. DOI: 10.1088/1361-6579/acebb5.
Abstract
Objective. Sleep apnea has a high incidence and is a potentially dangerous disease, and its early detection and diagnosis are challenging. Polysomnography (PSG) is considered the best approach for sleep apnea detection, but it requires cumbersome and complicated operations and thus cannot satisfy home healthcare needs. Approach. To facilitate the initial detection of sleep apnea in the home environment, we developed a sleep apnea classification model based on snoring and a hybrid neural network, and implemented the trained model on an embedded hardware platform. We used snore signals from 32 patients at Shenzhen People's Hospital. Mel-Fbank features were extracted from the snore signals to build a classification model based on a Bi-LSTM with an attention mechanism. Main results. The proposed model classified snore signals into four types (hypopnea, normal condition, obstructive sleep apnea, and central sleep apnea) with 83.52% and 62.31% accuracy under subject-dependent and subject-independent validation, respectively. After pruning and model quantization, at the cost of 0.81% and 0.95% accuracy loss for the subject-dependent and subject-independent classification, respectively, the number of model parameters and the model storage space were reduced by 32.12% and 60.37%, respectively; the compressed model exhibited accuracies of 82.71% and 61.36% under the two validation schemes. When the trained model was ported to and run on an STM32 ARM-embedded platform, its accuracy was 58.85% for the four-class task under leave-one-subject-out validation. Significance. The proposed sleep apnea detection model can be used in home healthcare for the initial detection of sleep apnea.
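For reference, log-Mel filterbank (Fbank) features of the kind fed to the Bi-LSTM can be computed in a few lines. The sketch below uses librosa with assumed parameter values (sampling rate, frame sizes, n_mels), not the paper's exact configuration:

```python
# Log-Mel filterbank (Fbank) extraction sketch; the random signal stands in for
# a recorded snore segment.
import numpy as np
import librosa

sr = 16000
y = np.random.randn(sr * 5).astype(np.float32)   # stand-in for 5 s of snore audio

mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=512, hop_length=160, n_mels=40)
fbank = librosa.power_to_db(mel)                 # log-Mel (Fbank), shape (40, T)
print(fbank.shape)
```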
Affiliation(s)
- Heng Li, Xu Lin, Mingjiang Wang: Shenzhen Key Laboratory of IoT Key Technology, Harbin Institute of Technology, Shenzhen 518055, People's Republic of China
- Yun Lu: Shenzhen Key Laboratory of IoT Key Technology, Harbin Institute of Technology, Shenzhen 518055, People's Republic of China; School of Computer Science and Engineering, Huizhou University, Huizhou, Guangdong 516007, People's Republic of China
- Hanrong Cheng: Department of Sleep Medicine, Shenzhen People's Hospital, The Second Clinical Medical College of Jinan University, The First Affiliated Hospital of Southern University of Science and Technology, Shenzhen, Guangdong, People's Republic of China
6. Wang L, Jiang Z. Tidal Volume Level Estimation Using Respiratory Sounds. J Healthc Eng 2023; 2023:4994668. PMID: 36844947. PMCID: PMC9949945. DOI: 10.1155/2023/4994668.
Abstract
Respiratory sounds have been used as a noninvasive and convenient way to estimate respiratory flow and tidal volume. However, current methods need calibration, making them difficult to use in a home environment. Here, a respiratory sound analysis method is proposed to qualitatively estimate tidal volume levels during sleep. Respiratory sounds are filtered and segmented into one-minute clips, and all clips are clustered into three categories (normal breathing, snoring, and uncertain) with agglomerative hierarchical clustering (AHC). Formant parameters are extracted to classify snoring clips into simple snoring and obstructive snoring with the K-means algorithm. For simple snoring clips, the tidal volume level is calculated from the snoring duration; for obstructive snoring clips, it is calculated from the maximum breathing pause interval. The performance of the proposed method is evaluated on an open dataset, PSG-Audio, in which full-night polysomnography (PSG) and tracheal sound were recorded simultaneously. The calculated tidal volume levels are compared with the corresponding lowest nocturnal oxygen saturation (LoO2) data. Experiments show that the proposed method calculates tidal volume levels with high accuracy and robustness.
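The clustering step can be sketched as follows: each one-minute clip is summarised by a feature vector and grouped into three clusters with agglomerative hierarchical clustering. The feature dimensions, linkage choice, and data below are illustrative assumptions:

```python
# AHC grouping of one-minute clips into three (unordered) categories.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
clip_features = rng.normal(size=(480, 12))   # e.g. 480 one-minute clips x 12 features

ahc = AgglomerativeClustering(n_clusters=3, linkage="ward")
labels = ahc.fit_predict(clip_features)      # cluster ids for normal/snoring/uncertain
```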
Affiliation(s)
- Lurui Wang, Zhongwei Jiang: Graduate School of Science and Engineering, Yamaguchi University, Yamaguchi, Japan
7. Current developments in sleep research and sleep medicine: an assessment by the working group "Surgical Treatment Procedures" [Aktuelle Entwicklungen in der Schlafforschung und Schlafmedizin; article in German]. Somnologie 2022. DOI: 10.1007/s11818-022-00367-3.
8. Lin X, Cheng H, Lu Y, Luo H, Li H, Qian Y, Zhou L, Zhang L, Wang M. Contactless sleep apnea detection in snoring signals using hybrid deep neural networks targeted for embedded hardware platform with real-time applications. Biomed Signal Process Control 2022. DOI: 10.1016/j.bspc.2022.103765.
9. Amiriparian S, Hübner T, Karas V, Gerczuk M, Ottl S, Schuller BW. DeepSpectrumLite: A Power-Efficient Transfer Learning Framework for Embedded Speech and Audio Processing From Decentralized Data. Front Artif Intell 2022; 5:856232. PMID: 35372830. PMCID: PMC8969434. DOI: 10.3389/frai.2022.856232.
Abstract
Deep neural speech and audio processing systems have a large number of trainable parameters and a relatively complex architecture, and they require a vast amount of training data and computational power. These constraints make it challenging to integrate such systems into embedded devices and utilise them for real-time, real-world applications. We tackle these limitations by introducing DeepSpectrumLite, an open-source, lightweight transfer learning framework for on-device speech and audio recognition using pre-trained image Convolutional Neural Networks (CNNs). The framework creates and augments Mel spectrogram plots on the fly from raw audio signals, which are then used to fine-tune specific pre-trained CNNs for the target classification task. Subsequently, the whole pipeline can be run in real time with a mean inference lag of 242.0 ms when a DenseNet121 model is used on a consumer-grade Motorola moto e7 plus smartphone. DeepSpectrumLite operates in a decentralised manner, eliminating the need for data upload for further processing. We demonstrate the suitability of the proposed transfer learning approach for embedded audio signal processing by obtaining state-of-the-art results on a set of paralinguistic and general audio tasks, including speech and music emotion recognition, social signal processing, COVID-19 cough and COVID-19 speech analysis, and snore sound classification. We provide an extensive command-line interface for users and developers, comprehensively documented and publicly available at https://github.com/DeepSpectrum/DeepSpectrumLite.
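The general recipe the framework implements, fine-tuning a pre-trained image CNN on spectrogram images, can be sketched with plain torchvision. This is a stand-in under stated assumptions, not DeepSpectrumLite's own API:

```python
# Transfer learning sketch: ImageNet-pretrained DenseNet121 with a new classifier
# head, fed spectrogram plots rendered as 3-channel images.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 4  # e.g. snore sound classes (assumed)

model = models.densenet121(weights=models.DenseNet121_Weights.IMAGENET1K_V1)
model.classifier = nn.Linear(model.classifier.in_features, num_classes)

spectrogram_batch = torch.randn(8, 3, 224, 224)        # placeholder image batch
logits = model(spectrogram_batch)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, num_classes, (8,)))
loss.backward()  # an optimizer step would follow during fine-tuning
```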
Affiliation(s)
- Shahin Amiriparian (corresponding author), Tobias Hübner, Vincent Karas, Maurice Gerczuk, Sandra Ottl: Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Augsburg, Germany
- Björn W. Schuller: Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Augsburg, Germany; Group on Language, Audio, and Music (GLAM), Imperial College London, London, United Kingdom
10. Borsky M, Serwatko M, Arnardottir ES, Mallett J. Towards Sleep Study Automation: Detection Evaluation of Respiratory-Related Events. IEEE J Biomed Health Inform 2022; 26:3418-3426. PMID: 35294367. DOI: 10.1109/jbhi.2022.3159727.
Abstract
The diagnosis of sleep-disordered breathing depends on the detection of several respiratory-related events: apneas, hypopneas, snores, or respiratory event-related arousals from sleep studies. While a number of automatic detection methods have been proposed, their reproducibility has been an issue, in part due to the absence of a generally accepted protocol for evaluating their results. With sleep measurements, detection is usually treated as a classification problem, and the accompanying issue of localization is not treated as similarly critical. To address these problems, we present a detection evaluation protocol that can qualitatively assess the match between two annotations of respiratory-related events. The protocol relies on measuring the relative temporal overlap between two annotations in order to find an alignment that maximizes their F1-score at the sequence level, and it can be used in applications that require a precise estimate of the number of events, the total event duration, or a joint estimate of event number and duration. We assess its application using a dataset that contains over 10,000 manually annotated snore events from 9 subjects, and show that, when following the American Academy of Sleep Medicine Manual standard, two sleep technologists can achieve an F1-score of 0.88 when identifying the presence of snore events. In addition, we drafted rules for marking snore boundaries and showed that a single sleep technologist can achieve an F1-score of 0.94 on the same task. Finally, we compared our protocol against the protocol used to evaluate sleep spindle detection and highlighted the differences.
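The core idea, matching two event annotations by relative temporal overlap and scoring the alignment with an event-level F1-score, can be illustrated with a simplified greedy matcher. The IoU threshold and the toy events are assumptions for illustration, not the paper's exact protocol:

```python
# Overlap-based event matching and event-level F1 between two annotations,
# each a list of (start, end) times in seconds.
def overlap(a, b):
    """Intersection-over-union of two (start, end) intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def event_f1(reference, hypothesis, min_iou=0.3):
    matched, tp = set(), 0
    for ref in reference:
        # Best still-unmatched hypothesis event by IoU.
        cands = [(overlap(ref, h), j) for j, h in enumerate(hypothesis) if j not in matched]
        if cands:
            score, j = max(cands)
            if score >= min_iou:
                matched.add(j)
                tp += 1
    fn = len(reference) - tp
    fp = len(hypothesis) - tp
    return 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0

ref = [(10.0, 12.5), (30.0, 31.2)]
hyp = [(10.2, 12.4), (55.0, 56.0)]
print(event_f1(ref, hyp))  # 1 TP, 1 FP, 1 FN -> F1 = 0.5
```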
11. Qian K, Schmitt M, Zheng H, Koike T, Han J, Liu J, Ji W, Duan J, Song M, Yang Z, Ren Z, Liu S, Zhang Z, Yamamoto Y, Schuller BW. Computer Audition for Fighting the SARS-CoV-2 Corona Crisis: Introducing the Multitask Speech Corpus for COVID-19. IEEE Internet Things J 2021; 8:16035-16046. PMID: 35782182. PMCID: PMC8768988. DOI: 10.1109/jiot.2021.3067605.
Abstract
Computer audition (CA) has experienced fast development in the past decades by leveraging advanced signal processing and machine learning techniques. In particular, owing to its naturally noninvasive and ubiquitous character, CA-based applications in healthcare have increasingly attracted attention in recent years. During the tough time of the global crisis caused by the coronavirus disease 2019 (COVID-19), scientists and engineers in data science have collaborated on novel approaches to the prevention, diagnosis, treatment, tracking, and management of this global pandemic. On the one hand, we have witnessed the power of 5G, the Internet of Things, big data, computer vision, and artificial intelligence in applications of epidemiology modeling, drug and/or vaccine discovery and design, fast CT screening, and quarantine management. On the other hand, relevant studies exploring the capacity of CA are extremely scarce and underestimated. To this end, we propose a novel multitask speech corpus for COVID-19 research. We collected in-the-wild speech data from 51 confirmed COVID-19 patients in Wuhan, China, and define three main tasks on this corpus: three-category classification tasks for evaluating the physical and/or mental status of patients in terms of sleep quality, fatigue, and anxiety. Benchmarks are given using both classic machine learning methods and state-of-the-art deep learning techniques. We believe this study and corpus can not only facilitate ongoing research on using data science to fight COVID-19, but also the monitoring of contagious diseases for general purposes.
Affiliation(s)
- Kun Qian, Tomoya Koike, Yoshiharu Yamamoto: Educational Physiology Laboratory, Graduate School of Education, The University of Tokyo, Tokyo 113-0033, Japan
- Maximilian Schmitt, Meishu Song, Zijiang Yang, Zhao Ren, Shuo Liu: Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, 86159 Augsburg, Germany
- Huaiyuan Zheng: Department of Hand Surgery, Wuhan Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430074, China
- Jing Han: Mobile Systems Group, University of Cambridge, Cambridge CB2 1TN, U.K.
- Juan Liu, Junjun Duan: Department of Plastic Surgery, Central Hospital of Wuhan, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430074, China
- Wei Ji: Department of Plastic Surgery, Wuhan Third Hospital and Tongren Hospital of Wuhan University, Wuhan 430072, China
- Zixing Zhang: GLAM - Group on Language, Audio, and Music, Imperial College London, London SW7 2BU, U.K.
- Björn W. Schuller: Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, 86159 Augsburg, Germany; GLAM - Group on Language, Audio, and Music, Imperial College London, London SW7 2BU, U.K.
12. Huang Z, Aarab G, Ravesloot MJL, Zhou N, Bosschieter PFN, van Selms MKA, den Haan C, de Vries N, Lobbezoo F, Hilgevoord AAJ. Prediction of the obstruction sites in the upper airway in sleep-disordered breathing based on snoring sound parameters: a systematic review. Sleep Med 2021; 88:116-133. PMID: 34749271. DOI: 10.1016/j.sleep.2021.10.015.
Abstract
BACKGROUND Identification of the obstruction site in the upper airway may help in treatment selection for patients with sleep-disordered breathing. Because of the limitations of existing techniques, there is a continuous search for more feasible methods. Snoring sound parameters were hypothesized to be potential predictors of the obstruction site. Therefore, this review aims to i) investigate the association between snoring sound parameters and the obstruction sites; and ii) analyze the methodology of reported prediction models of the obstruction sites. METHODS The literature search was conducted in PubMed, Embase.com, CENTRAL, Web of Science, and Scopus in collaboration with a medical librarian. Studies were eligible if they investigated the associations between snoring sound parameters and the obstruction sites, and/or reported prediction models of the obstruction sites based on snoring sound. RESULTS Of the 1016 retrieved references, 28 eligible studies were included. It was found that the characteristic frequency components generated by lower-level obstructions of the upper airway were higher than those generated by upper-level obstructions. Prediction models were built mainly on snoring sound parameters in the frequency domain, with reported accuracies ranging from 60.4% to 92.2%. CONCLUSIONS Available evidence points toward associations between snoring sound parameters in the frequency domain and the obstruction sites in the upper airway. It is promising to build a prediction model of the obstruction sites based on snoring sound parameters and participant characteristics, but so far snoring sound analysis does not seem to be a viable diagnostic modality for treatment selection.
Affiliation(s)
- Zhengfei Huang: Department of Orofacial Pain and Dysfunction, Academic Center for Dentistry Amsterdam (ACTA), University of Amsterdam and Vrije Universiteit Amsterdam, Amsterdam, the Netherlands; Department of Clinical Neurophysiology, OLVG, Amsterdam, the Netherlands
- Ghizlane Aarab, Maurits K. A. van Selms, Frank Lobbezoo: Department of Orofacial Pain and Dysfunction, ACTA, University of Amsterdam and Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
- Madeline J. L. Ravesloot, Pien F. N. Bosschieter: Department of Otorhinolaryngology - Head and Neck Surgery, OLVG, Amsterdam, the Netherlands
- Ning Zhou: Department of Orofacial Pain and Dysfunction, ACTA, University of Amsterdam and Vrije Universiteit Amsterdam, Amsterdam, the Netherlands; Department of Oral and Maxillofacial Surgery, Amsterdam UMC Location AMC and ACTA, University of Amsterdam, Amsterdam, the Netherlands
- Chantal den Haan: Medical Library, Department of Research and Education, OLVG, Amsterdam, the Netherlands
- Nico de Vries: Department of Orofacial Pain and Dysfunction, ACTA, University of Amsterdam and Vrije Universiteit Amsterdam, Amsterdam, the Netherlands; Department of Otorhinolaryngology - Head and Neck Surgery, OLVG, Amsterdam, the Netherlands; Department of Otorhinolaryngology - Head and Neck Surgery, Antwerp University Hospital (UZA), Antwerp, Belgium
13. Korompili G, Amfilochiou A, Kokkalas L, Mitilineos SA, Tatlas NA, Kouvaras M, Kastanakis E, Maniou C, Potirakis SM. PSG-Audio, a scored polysomnography dataset with simultaneous audio recordings for sleep apnea studies. Sci Data 2021; 8:197. PMID: 34344893. PMCID: PMC8333307. DOI: 10.1038/s41597-021-00977-w.
Abstract
The sleep apnea syndrome is a chronic condition that affects quality of life and increases the risk of severe health conditions such as cardiovascular diseases. However, the prevalence of the syndrome in the general population is considered to be heavily underestimated due to the restricted number of people seeking diagnosis, the leading cause being the inconvenience of the current reference standard for apnea diagnosis: polysomnography. To enhance patients' awareness of the syndrome, considerable effort has been devoted in the literature to developing home-based apnea detection systems that profit from information in a restricted set of polysomnography signals. In particular, breathing sound has proven highly effective in detecting apneic events during sleep. The development of accurate systems requires large datasets of audio recordings and polysomnograms. In this work, we provide the first open-access dataset comprising 212 polysomnograms along with synchronized high-quality tracheal and ambient microphone recordings. We envision this dataset being widely used for the development of home-based apnea detection techniques and frameworks.
Affiliation(s)
- Georgia Korompili, Lampros Kokkalas, Stelios A. Mitilineos, Marios Kouvaras, Stelios M. Potirakis: Department of Electrical and Electronic Engineering, University of West Attica, Attica, Greece
- Anastasia Amfilochiou, Emmanouil Kastanakis, Chrysoula Maniou: Sleep Study Unit, Sismanoglio - Amalia Fleming General Hospital of Athens, Athens, Greece
14. Sebastian A, Cistulli PA, Cohen G, de Chazal P. Association of Snoring Characteristics with Predominant Site of Collapse of Upper Airway in Obstructive Sleep Apnoea Patients. Sleep 2021; 44:zsab176. PMID: 34270768. DOI: 10.1093/sleep/zsab176.
Abstract
STUDY OBJECTIVES Acoustic analysis of isolated events and snoring by previous researchers suggests a correlation between individual acoustic features and individual site-of-collapse events. In this study, we hypothesised that multi-parameter evaluation of snore sounds during natural sleep would provide a robust prediction of the predominant site of airway collapse. METHODS The audio signals of 58 OSA patients were recorded simultaneously with full-night polysomnography. The site of collapse was determined by manual analysis of the shape of the airflow signal during hypopnoea events, and the corresponding audio signal segments containing snores were manually extracted and processed. Machine learning algorithms were developed to automatically annotate the site of collapse of each hypopnoea event into three classes (lateral wall, palate, and tongue base). The predominant site of collapse for a sleep period was determined from the individual hypopnoea annotations and compared to the manually determined annotations. This was a retrospective study that used cross-validation to estimate performance. RESULTS Cluster analysis showed that the data fit well into two clusters, with a mean silhouette coefficient of 0.79 and an accuracy of 68% for classifying tongue/non-tongue collapse. A classification model using linear discriminants achieved an overall accuracy of 81% for discriminating tongue/non-tongue predominant site of collapse and an accuracy of 64% across all site-of-collapse classes. CONCLUSIONS Our results reveal that the snore signal during hypopnoea can provide information regarding the predominant site of collapse in the upper airway. Therefore, the audio signal recorded during sleep could potentially be used as a new tool for identifying the predominant site of collapse, and consequently for improving treatment selection and outcome.
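The classification step, linear discriminants over per-event snore features with cross-validation, can be sketched as follows. The feature matrix and class labels are random placeholders rather than the study's data:

```python
# Linear discriminant analysis over per-event acoustic features, scored with
# 5-fold cross-validation.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 20))      # 300 hypopnoea-related snore events x 20 features
y = rng.integers(0, 3, size=300)    # 0: lateral wall, 1: palate, 2: tongue base

lda = LinearDiscriminantAnalysis()
print(cross_val_score(lda, X, y, cv=5).mean())
```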
Affiliation(s)
- Arun Sebastian: School of Biomedical Engineering, Faculty of Engineering, The University of Sydney, Sydney, Australia; Charles Perkins Centre, The University of Sydney, Sydney, Australia
- Peter A. Cistulli: Charles Perkins Centre, The University of Sydney, Sydney, Australia; Northern Clinical School, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia; Sleep Investigation Laboratory, Department of Respiratory and Sleep Medicine, Royal North Shore Hospital, Sydney, Australia
- Gary Cohen: Sleep Investigation Laboratory, Department of Respiratory and Sleep Medicine, Royal North Shore Hospital, Sydney, Australia
- Philip de Chazal: School of Biomedical Engineering, Faculty of Engineering, The University of Sydney, Sydney, Australia; Charles Perkins Centre, The University of Sydney, Sydney, Australia
15. Dogan S, Akbal E, Tuncer T, Acharya UR. Application of substitution box of present cipher for automated detection of snoring sounds. Artif Intell Med 2021; 117:102085. PMID: 34127246. DOI: 10.1016/j.artmed.2021.102085.
Abstract
BACKGROUND AND PURPOSE Snoring is a sleep disorder, and snoring sounds have been used to diagnose many sleep-related diseases. However, snoring sound classification is done manually, which is time-consuming and prone to human error. An automated snoring sound classification model is proposed to overcome these problems. MATERIAL AND METHOD This work proposes an automated snoring sound classification (SSC) method built on three new components: maximum absolute pooling (MAP), the nonlinear present pattern, and a two-layered feature selector combining neighborhood component analysis and iterative neighborhood component analysis (NCAINCA). The MAP decomposition model is applied to snoring sounds to extract both low- and high-level features. The developed present pattern (Present-Pat) uses a substitution box (SBox) and a statistical feature generator, producing both textural and statistical features. NCAINCA chooses the most informative features, which are then fed to a k-nearest neighbor (kNN) classifier with leave-one-out cross-validation (LOOCV). The Present-Pat based SSC system is developed using the Munich-Passau Snore Sound Corpus (MPSSC) dataset, which comprises four categories. RESULTS Our model reached an accuracy and unweighted average recall (UAR) of 97.10% and 97.60%, respectively, using LOOCV. Moreover, a nocturnal sound dataset is used to show the general applicability of the presented model, on which it attained an accuracy of 98.14%. CONCLUSIONS Our classification model is ready to be tested with more data and can be used by sleep specialists to diagnose sleep disorders based on snoring sounds.
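A loose sketch of the selection-plus-classification stage described above: neighborhood component analysis to rank features, keep a top-ranked subset, then a kNN classifier evaluated with leave-one-out cross-validation. Sizes, thresholds, and data are illustrative assumptions, not the paper's configuration:

```python
# NCA-based feature ranking followed by kNN with LOOCV.
import numpy as np
from sklearn.neighbors import NeighborhoodComponentsAnalysis, KNeighborsClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 64))          # candidate features per snore sample
y = rng.integers(0, 4, size=200)        # four snore classes

nca = NeighborhoodComponentsAnalysis(random_state=0).fit(X, y)
weights = np.abs(nca.components_).sum(axis=0)   # crude per-feature importance
top = np.argsort(weights)[::-1][:16]            # keep the 16 best-ranked features

knn = KNeighborsClassifier(n_neighbors=1)
acc = cross_val_score(knn, X[:, top], y, cv=LeaveOneOut()).mean()
print(acc)
```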
Affiliation(s)
- Sengul Dogan, Erhan Akbal, Turker Tuncer: Department of Digital Forensics Engineering, College of Technology, Firat University, Elazig, Turkey
- U Rajendra Acharya: Department of Electronics and Computer Engineering, Ngee Ann Polytechnic, Singapore 599489; Department of Biomedical Engineering, School of Science and Technology, SUSS University, Singapore; Department of Biomedical Informatics and Medical Engineering, Asia University, Taichung, Taiwan
16. Sun J, Hu X, Peng S, Peng CK, Ma Y. Automatic classification of excitation location of snoring sounds. J Clin Sleep Med 2021; 17:1031-1038. PMID: 33560203. DOI: 10.5664/jcsm.9094.
Abstract
STUDY OBJECTIVES For surgical treatment of patients with obstructive sleep apnea-hypopnea syndrome, it is crucial to accurately locate the obstructive sites in the upper airway; however, noninvasive methods for locating the obstructive sites have not been well explored. Snoring, as the cardinal symptom of obstructive sleep apnea-hypopnea syndrome, should contain information that reflects the state of the upper airway. Through the classification of snores produced at four different locations, this study aimed to test the hypothesis that snores generated at various obstructive sites differ. METHODS We trained and tested our model on a public dataset comprising 219 participants. For each snore episode, an acoustic and a physiological feature were extracted and concatenated, forming a 59-dimensional fusion feature. Principal component analysis and a support vector machine were used for dimensionality reduction and snore classification. The performance of the proposed model was evaluated using several metrics: sensitivity, precision, specificity, area under the receiver operating characteristic curve, and F1 score. RESULTS The unweighted average values of sensitivity, precision, specificity, area under the curve, and F1 were 86.36%, 89.09%, 96.4%, 87.9%, and 87.63%, respectively. The model achieved 98.04%, 80.56%, 72.73%, and 94.12% sensitivity for types V (velum), O (oropharyngeal), T (tongue), and E (epiglottis) snores, respectively. CONCLUSIONS The characteristics of snores are related to the state of the upper airway. The machine-learning-based model can be used to locate the vibration sites in the upper airway.
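The reported pipeline, a 59-dimensional fusion feature reduced by PCA and classified by an SVM, maps directly onto a scikit-learn pipeline. Everything below except the 59-dimensional input size (taken from the abstract) is a placeholder assumption:

```python
# PCA + SVM pipeline over fused acoustic/physiological snore features.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 59))          # one fused feature vector per snore episode
y = rng.integers(0, 4, size=500)        # V / O / T / E snore types

clf = Pipeline([
    ("pca", PCA(n_components=0.95)),    # keep 95% of variance (assumed setting)
    ("svm", SVC(kernel="rbf")),
])
print(cross_val_score(clf, X, y, cv=5).mean())
```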
Affiliation(s)
- Jingpeng Sun: Institute of Automation, Chinese Academy of Sciences, Beijing, People's Republic of China; University of Chinese Academy of Sciences, Beijing, People's Republic of China; Division of Interdisciplinary Medicine and Biotechnology, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts
- Xiyuan Hu: School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, People's Republic of China
- Silong Peng: Institute of Automation, Chinese Academy of Sciences, Beijing, People's Republic of China; University of Chinese Academy of Sciences, Beijing, People's Republic of China
- Chung-Kang Peng, Yan Ma: Division of Interdisciplinary Medicine and Biotechnology, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts
17. Qian K, Janott C, Schmitt M, Zhang Z, Heiser C, Hemmert W, Yamamoto Y, Schuller BW. Can Machine Learning Assist Locating the Excitation of Snore Sound? A Review. IEEE J Biomed Health Inform 2021; 25:1233-1246. PMID: 32750978. DOI: 10.1109/jbhi.2020.3012666.
Abstract
In the past three decades, snoring (which affects more than 30% of adults in the UK population) has been increasingly studied in the transdisciplinary research community spanning medicine and engineering. Early work demonstrated that the snore sound can carry important information about the status of the upper airway, which facilitates the development of non-invasive, acoustics-based approaches for diagnosing and screening obstructive sleep apnoea and other sleep disorders. Nonetheless, clinical practice increasingly demands methods that localise the snore sound's excitation rather than only detect sleep disorders. In order to further the relevant studies and attract more attention, we provide a comprehensive review of the state-of-the-art machine learning techniques for automatically classifying snore sounds. First, we introduce the background and definition of the problem. Second, we illustrate the current work in detail and explain potential applications. Finally, we discuss the limitations and challenges in the snore sound classification task. Overall, our review provides comprehensive guidance for researchers wishing to contribute to this area.
18. Schuller BW, Schuller DM, Qian K, Liu J, Zheng H, Li X. COVID-19 and Computer Audition: An Overview on What Speech & Sound Analysis Could Contribute in the SARS-CoV-2 Corona Crisis. Front Digit Health 2021; 3:564906. PMID: 34713079. PMCID: PMC8521916. DOI: 10.3389/fdgth.2021.564906.
Abstract
At the time of writing this article, more than 2 million registered COVID-19-induced deaths have occurred worldwide since the outbreak of the coronavirus, now officially known as SARS-CoV-2. Nevertheless, tremendous efforts have been made worldwide to counter-steer and control the epidemic, by now labelled a pandemic. In this contribution, we provide an overview of the potential for computer audition (CA), i.e., the usage of speech and sound analysis by artificial intelligence, to help in this scenario. We first survey which types of related or contextually significant phenomena can be automatically assessed from speech or sound. These include the automatic recognition and monitoring of COVID-19 directly or of its symptoms, such as breathing, dry and wet coughing or sneezing sounds, speech under cold, eating behaviour, sleepiness, or pain, to name but a few. We then consider potential use-cases for exploitation: risk assessment and diagnosis based on symptom histograms and their development over time, as well as monitoring of spread, social distancing and its effects, treatment and recovery, and patient well-being. We briefly guide through the challenges that need to be faced for real-life usage, and the limitations in comparison with non-audio solutions. We come to the conclusion that CA appears ready for the implementation of (pre-)diagnosis and monitoring tools, and, more generally, provides rich and so far untapped potential in the fight against the spread of COVID-19.
Affiliation(s)
- Björn W. Schuller: GLAM - Group on Language, Audio & Music, Imperial College London, London, United Kingdom; EIHW - Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Augsburg, Germany; audEERING GmbH, Gilching, Germany
- Kun Qian: Educational Physiology Laboratory, The University of Tokyo, Tokyo, Japan
- Juan Liu: Department of Plastic Surgery, The Central Hospital of Wuhan, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Huaiyuan Zheng: Department of Hand Surgery, Wuhan Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Xiao Li: Department of Neurology, Children's Hospital of Chongqing Medical University, Chongqing Medical University, Chongqing, China
19. Tuncer T, Akbal E, Dogan S. An automated snoring sound classification method based on local dual octal pattern and iterative hybrid feature selector. Biomed Signal Process Control 2021; 63:102173. PMID: 32922509. PMCID: PMC7476581. DOI: 10.1016/j.bspc.2020.102173.
Abstract
In this research, a novel snoring sound classification (SSC) method is presented, based on a new feature generation function designed to yield a high classification rate. The proposed feature extractor is named the Local Dual Octal Pattern (LDOP), and an LDOP-based SSC method is presented to address the low success rates reported on the Munich-Passau Snore Sound Corpus (MPSSC) dataset. Multilevel discrete wavelet transform (DWT) decomposition with LDOP-based feature generation, selection of informative features with ReliefF and iterative neighborhood component analysis (RFINCA), and classification using k-nearest neighbors (kNN) are the fundamental phases of the proposed method. A seven-level DWT and the LDOP are used together to generate low-, medium-, and high-level features; this feature generation network extracts 4096 features in total, of which RFINCA selects the 95 most discriminative and informative ones. In the classification phase, kNN with leave-one-out cross-validation (LOOCV) is used. A classification accuracy of 95.53% and an unweighted average recall (UAR) of 94.65% have been achieved, 22% better than the best of the other state-of-the-art machine learning and deep learning-based methods. These results clearly demonstrate the success of the proposed SSC method.
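The multilevel DWT front end can be sketched with PyWavelets: a seven-level decomposition yields eight subbands, each of which would feed the LDOP feature generator. The wavelet choice and the stand-in per-subband statistics below are assumptions, not the paper's configuration:

```python
# Seven-level DWT decomposition of a snore segment; placeholder per-subband
# statistics stand in for the LDOP texture features.
import numpy as np
import pywt

x = np.random.randn(16384)                        # stand-in snore sound segment
coeffs = pywt.wavedec(x, wavelet="db4", level=7)  # [cA7, cD7, cD6, ..., cD1]

features = np.array([[c.mean(), c.std(), np.abs(c).max()] for c in coeffs]).ravel()
print(features.shape)                             # 8 subbands x 3 statistics
```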
Affiliation(s)
- Turker Tuncer, Erhan Akbal, Sengul Dogan: Department of Digital Forensics Engineering, Technology Faculty, Firat University, Elazig, Turkey
20. Pandit V, Schmitt M, Cummins N, Schuller B. I see it in your eyes: Training the shallowest-possible CNN to recognise emotions and pain from muted web-assisted in-the-wild video-chats in real-time. Inf Process Manag 2020. DOI: 10.1016/j.ipm.2020.102347.
21. Sun J, Hu X, Chen C, Peng S, Ma Y. Amplitude spectrum trend-based feature for excitation location classification from snore sounds. Physiol Meas 2020; 41:085006. PMID: 32721937. DOI: 10.1088/1361-6579/abaa34.
Abstract
OBJECTIVE Successful surgical treatment of obstructive sleep apnea (OSA) depends on precisely locating the vibrating tissue. Snoring is the main symptom of OSA and can be utilized to detect the active location of tissues; however, existing approaches are limited by their inability to capture the characteristics of snoring produced in the upper airway. This paper proposes a new approach to better distinguish snoring sounds generated at four different excitation locations. APPROACH First, we propose a robust null space pursuit algorithm for extracting the trend from the amplitude spectrum of snoring. Second, a new feature derived from this extracted amplitude spectrum trend, which outperforms the Mel-frequency cepstral coefficient (MFCC) feature, is designed. The newly proposed feature, namely the trend-based MFCC (TCC), is then reduced in dimensionality using principal component analysis, and a support vector machine is employed for the classification task. MAIN RESULTS Using the TCC, the proposed approach achieves an unweighted average recall of 87.5% for the classification of four excitation locations on the public Munich-Passau Snore Sound Corpus dataset. SIGNIFICANCE The TCC is a promising feature for capturing the characteristics of snoring. The proposed method can effectively perform snore classification and assist in accurate OSA diagnosis.
Affiliation(s)
- Jingpeng Sun: Institute of Automation, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing 100190, People's Republic of China
22. Snoring patterns during home polysomnography: a proposal for a new classification. Am J Otolaryngol 2020; 41:102589. PMID: 32563786. DOI: 10.1016/j.amjoto.2020.102589.
Abstract
PURPOSE Snoring is a very common disorder, but at present there is no universally accepted classification for the condition. The main aim of this paper is to introduce a home sleep monitoring-based classification of common snoring patterns in simple snorers and in patients with obstructive sleep apnea-hypopnea syndrome (OSAHS). MATERIALS AND METHODS In total, 561 consecutive patients with a history of snoring, either simple or associated with apnea, were enrolled in this home sleep monitoring study. Analysis of the polysomnographic traces and the snoring sensor allowed the main patterns of snoring and their characteristics to be determined. RESULTS Four patterns of snoring were identified. In a spectrum of increasing severity (mild, moderate, or severe), snoring can be episodic, positional, continuous, or alternating, whereas in obstructive sleep apnea syndrome, snoring events occur only between successive obstructive respiratory events. In mild snoring, the episodic pattern is the most frequent, whereas in moderate and severe snoring, the continuous pattern occurs in most cases. CONCLUSIONS The proposed classification of snoring patterns would be beneficial for providing a realistic disturbance index and for selecting and evaluating the outcomes of surgical techniques.
23
Zhang Z, Han J, Qian K, Janott C, Guo Y, Schuller B. Snore-GANs: Improving Automatic Snore Sound Classification With Synthesized Data. IEEE J Biomed Health Inform 2020; 24:300-310. [DOI: 10.1109/jbhi.2019.2907286]
24
Janott C, Rohrmeier C, Schmitt M, Hemmert W, Schuller B. Snoring - An Acoustic Definition. Annu Int Conf IEEE Eng Med Biol Soc 2019; 2019:3653-3657. [PMID: 31946668] [DOI: 10.1109/embc.2019.8856615]
Abstract
OBJECTIVE: The distinction between snoring and loud breathing is often subjective and lies in the ear of the beholder. The aim of this study is to identify and assess acoustic features that are highly suitable for distinguishing these two classes of sound, in order to facilitate an objective, acoustically based definition of snoring.
METHODS: A corpus of snore and breath sounds from 23 subjects, classified by 25 human raters, was used. Using the openSMILE feature extractor, 6,373 acoustic features were evaluated for their selectivity by comparing SVM classification, logistic regression, and the recall of each single feature.
RESULTS: The most selective single features were several statistical functionals of the first and second mel-frequency-spectrum-generated perceptual linear predictive (PLP) cepstral coefficients, with an unweighted average recall (UAR) of up to 93.8%. The best-performing feature sets were low-level descriptors (LLDs), derivatives, and statistical functionals based on the fast Fourier transform (FFT), with a UAR of 93.0%, and on the summed mel-frequency-spectrum-generated PLP cepstral coefficients, with a UAR of 92.2%, using SVM classification. Logistic regression showed no considerable difference in classification performance compared with SVM classification.
CONCLUSION: Snoring and loud breathing can be distinguished by robust acoustic features. The findings might serve as guidance toward a consensus on an objective definition of snoring as distinct from loud breathing.
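A small sketch of the evaluation idea in this abstract: compare an SVM and logistic regression on a precomputed acoustic feature matrix and score both with unweighted average recall (UAR, i.e. macro-averaged recall). The openSMILE extraction step is assumed to have produced X already; variable names are placeholders, not the authors' code:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def uar(y_true, y_pred):
    # UAR = mean of per-class recalls, robust to class imbalance
    return recall_score(y_true, y_pred, average='macro')

def compare_classifiers(X, y):
    """X: n_samples x n_features acoustic features; y: snore/breath labels."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
    for name, clf in [('SVM', SVC(kernel='linear')),
                      ('LogReg', LogisticRegression(max_iter=1000))]:
        model = make_pipeline(StandardScaler(), clf).fit(X_tr, y_tr)
        print(name, 'UAR:', uar(y_te, model.predict(X_te)))
```

UAR is the standard score here because snore and breath classes are typically imbalanced; plain accuracy would reward always predicting the majority class.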
25
VOTE versus ACLTE: Comparison of two snoring noise classifications using machine learning methods [Article in German]. HNO 2019; 67:670-678. [DOI: 10.1007/s00106-019-0696-5]
26
A Bag of Wavelet Features for Snore Sound Classification. Ann Biomed Eng 2019; 47:1000-1011. [DOI: 10.1007/s10439-019-02217-0]
27
Kim JW, Kim T, Shin J, Choe G, Lim HJ, Rhee CS, Lee K, Cho SW. Prediction of Obstructive Sleep Apnea Based on Respiratory Sounds Recorded Between Sleep Onset and Sleep Offset. Clin Exp Otorhinolaryngol 2018; 12:72-78. [PMID: 30189718] [PMCID: PMC6315207] [DOI: 10.21053/ceo.2018.00388]
Abstract
OBJECTIVES: To develop a simple algorithm for prescreening of obstructive sleep apnea (OSA) on the basis of respiratory sounds recorded across all sleep stages between sleep onset and offset during polysomnography.
METHODS: Patients who underwent attended, in-laboratory, full-night polysomnography were included. For all patients, audio recordings were made with an air-conduction microphone during polysomnography. Analyses included all sleep stages (i.e., N1, N2, N3, rapid eye movement, and waking). After noise-reduction preprocessing, data were segmented into 5-s windows and sound features were extracted. Prediction models were established and validated with 10-fold cross-validation using simple logistic regression. Binary classifications were conducted separately for three threshold criteria: apnea-hypopnea index (AHI) of 5, 15, or 30. Model characteristics, including accuracy, sensitivity, specificity, positive predictive value (precision), negative predictive value, and area under the receiver operating characteristic curve (AUC), were computed.
RESULTS: A total of 116 subjects were included; their mean age, body mass index, and AHI were 50.4 years, 25.5 kg/m2, and 23.0/hr, respectively. A total of 508 sound features were extracted from respiratory sounds recorded throughout sleep. Accuracies of the binary classifiers at AHI thresholds of 5, 15, and 30 were 82.7%, 84.4%, and 85.3%, respectively. At the same thresholds, the AUCs were 0.83, 0.901, and 0.91; sensitivities were 87.5%, 81.6%, and 60%; specificities were 67.8%, 87.5%, and 94.1%; and precisions were 89.5%, 87.5%, and 78.2%.
CONCLUSION: Our binary classifier predicted patients with AHI ≥15 with sensitivity and specificity above 80% using respiratory sounds recorded during sleep. Since the prediction model included data from all sleep stages, algorithms based on respiratory sounds may be valuable for prescreening OSA with mobile devices.
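A minimal sketch of the validation scheme this abstract describes: binarize AHI at a chosen threshold and evaluate a simple logistic-regression classifier with 10-fold cross-validation. The 508 sound features per subject are assumed to be precomputed; all names are placeholders, not the authors' code:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def evaluate_osa_screen(X, ahi, threshold=15):
    """X: n_subjects x n_features sound features; ahi: AHI per subject."""
    y = (ahi >= threshold).astype(int)               # e.g. AHI >= 15 -> positive
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    # out-of-fold probabilities so every subject is scored by a model
    # that never saw it during training
    proba = cross_val_predict(model, X, y, cv=cv, method='predict_proba')[:, 1]
    pred = (proba >= 0.5).astype(int)
    tn, fp, fn, tp = confusion_matrix(y, pred).ravel()
    return {'AUC': roc_auc_score(y, proba),
            'sensitivity': tp / (tp + fn),
            'specificity': tn / (tn + fp),
            'precision': tp / (tp + fp)}
```

Running this once each for thresholds 5, 15, and 30 would reproduce the structure of the paper's three binary classifiers, though the actual feature set and preprocessing would of course determine the numbers.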
Affiliation(s)
- Jeong-Whun Kim
- Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Korea
- Taehoon Kim
- Music and Audio Research Group, Graduate School of Convergence Science and Technology, Seoul National University, Seoul, Korea
- Jaeyoung Shin
- Music and Audio Research Group, Graduate School of Convergence Science and Technology, Seoul National University, Seoul, Korea
- Goun Choe
- Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Korea
- Hyun Jung Lim
- Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Korea
- Chae-Seo Rhee
- Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Korea
- Kyogu Lee
- Music and Audio Research Group, Graduate School of Convergence Science and Technology, Seoul National University, Seoul, Korea
- Sung-Woo Cho
- Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Korea