1. Partovi E, Babic A, Gharehbaghi A. A review on deep learning methods for heart sound signal analysis. Front Artif Intell 2024;7:1434022. [PMID: 39605951] [PMCID: PMC11599230] [DOI: 10.3389/frai.2024.1434022]
Abstract
Introduction: The application of Deep Learning (DL) methods is increasingly appreciated by researchers in the biomedical engineering domain, in which heart sound analysis is an important topic of study. Diversity in methodology, results, and complexity makes it difficult to obtain a realistic picture of methodological performance from the reported methods. Methods: This survey paper provides the results of a broad retrospective study on recent advances in heart sound analysis using DL methods. Results are organized according to both methodological and applicative taxonomies. The search covered a wide span of related keywords using well-known search engines. Implementations of the surveyed methods, along with their reported results, are comprehensively presented and compared. Results and discussion: Convolutional neural networks and recurrent neural networks are the most commonly used architectures for discriminating abnormal heart sounds and for heart sound localization, appearing in 67.97% and 33.33% of the related papers, respectively. The convolutional neural network and the autoencoder network show a perfect accuracy of 100% in case studies on classifying abnormal versus normal heart sounds. Nevertheless, this apparent superiority over methods with lower accuracy is not conclusive, owing to inconsistency in evaluation.
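Since the review identifies CNNs on heart sound inputs as the dominant approach, a minimal sketch may help fix ideas: a small 2D CNN classifying log-Mel spectrograms of PCG recordings as normal versus abnormal. This is the generic pattern, not any surveyed paper's architecture; all shapes and layer sizes are assumptions.

```python
# Generic CNN-on-spectrogram sketch for normal/abnormal heart sound
# classification; layer sizes and input shapes are illustrative only.
import torch
import torch.nn as nn

class HeartSoundCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),  # tolerates variable recording lengths
        )
        self.classifier = nn.Linear(32, 2)  # normal vs. abnormal logits

    def forward(self, spec):
        # spec: (batch, 1, mel_bins, time_frames) log-Mel spectrograms
        return self.classifier(self.features(spec).flatten(1))

# Dummy batch: 4 recordings, 64 Mel bins, 400 time frames.
logits = HeartSoundCNN()(torch.randn(4, 1, 64, 400))
print(logits.shape)  # (4, 2)
```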
Affiliation(s)
- Elaheh Partovi: Department of Electrical Engineering, Amirkabir University of Technology, Tehran, Iran
- Ankica Babic: Department of Biomedical Engineering, Linköping University, Linköping, Sweden; Department of Information Science and Media Studies, University of Bergen, Bergen, Norway
- Arash Gharehbaghi: Department of Biomedical Engineering, Linköping University, Linköping, Sweden
2. Zeng Y, Li M, He Z, Zhou L. Segmentation of Heart Sound Signal Based on Multi-Scale Feature Fusion and Multi-Classification of Congenital Heart Disease. Bioengineering (Basel) 2024;11:876. [PMID: 39329618] [PMCID: PMC11428210] [DOI: 10.3390/bioengineering11090876]
Abstract
Analyzing heart sound signals offers a novel approach to the early diagnosis of pediatric congenital heart disease. Existing segmentation algorithms have limitations in accurately distinguishing the first (S1) and second (S2) heart sounds, limiting the diagnostic utility of cardiac cycle data for pediatric pathology assessment. This study proposes a time bidirectional long short-term memory network (TBLSTM) based on multi-scale analysis to segment pediatric heart sound signals according to cardiac cycles. Mel frequency cepstral coefficients and dynamic characteristics of the heart sound fragments were extracted and fed into a random forest for multi-classification of congenital heart disease. The segmentation model achieved an overall F1 score of 94.15% on the verification set, with specific F1 scores of 90.25% for S1 and 86.04% for S2. When the number of cardiac cycles per heart sound fragment was set to six, the multi-classification results stabilized, with an accuracy of 94.43%, a sensitivity of 95.58%, and an F1 score of 94.51%. Furthermore, the segmentation model robustly segments pediatric heart sound signals across different heart rates and in the presence of noise. Notably, the number of cardiac cycles per fragment directly affects the multi-classification of these heart sound signals.
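As an illustration of the classification stage described above, here is a minimal sketch combining MFCCs with their delta ("dynamic") coefficients and a random forest. The feature settings, per-fragment summary statistics, and synthetic data are assumptions, not the paper's exact configuration.

```python
# Sketch: MFCC + delta features per heart sound fragment -> random forest.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

def fragment_features(audio, sr=2000):
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)
    delta = librosa.feature.delta(mfcc)            # dynamic characteristics
    stacked = np.vstack([mfcc, delta])             # (26, frames)
    # Summarize each coefficient over the fragment (assumed pooling choice).
    return np.hstack([stacked.mean(axis=1), stacked.std(axis=1)])  # (52,)

rng = np.random.default_rng(0)
# Toy stand-ins for segmented six-cycle fragments (4 s at 2 kHz) and labels.
X = np.array([fragment_features(rng.normal(size=8000)) for _ in range(40)])
y = rng.integers(0, 4, size=40)                    # four illustrative CHD classes

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
print(clf.predict(X[:3]))
```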
Affiliation(s)
- Yuan Zeng: Research Center of Fluid Machinery Engineering and Technology, Jiangsu University, Zhenjiang 212013, China
- Mingzhe Li: Research Center of Fluid Machinery Engineering and Technology, Jiangsu University, Zhenjiang 212013, China
- Zhaoming He: Research Center of Fluid Machinery Engineering and Technology, Jiangsu University, Zhenjiang 212013, China; Department of Mechanical Engineering, Texas Tech University, Lubbock, TX 79411, USA
- Ling Zhou: Research Center of Fluid Machinery Engineering and Technology, Jiangsu University, Zhenjiang 212013, China
3. Zhu B, Zhou Z, Yu S, Liang X, Xie Y, Sun Q. Review of Phonocardiogram Signal Analysis: Insights from the PhysioNet/CinC Challenge 2016 Database. Electronics 2024;13:3222. [DOI: 10.3390/electronics13163222]
Abstract
The phonocardiogram (PCG) is a crucial tool for the early detection, continuous monitoring, accurate diagnosis, and efficient management of cardiovascular diseases, with the potential to revolutionize cardiovascular care and improve patient outcomes. The PhysioNet/CinC Challenge 2016 database, a large and influential resource, has encouraged contributions to accurate heart sound state classification (normal versus abnormal), with promising benchmark performance (accuracy: 99.80%; sensitivity: 99.70%; specificity: 99.10%; score: 99.40%). This study reviews recent advances in analytical techniques applied to this database, drawing on 104 retrieved publications on PCG signal analysis. These techniques encompass heart sound preprocessing, signal segmentation, feature extraction, and heart sound state classification. Specifically, the study summarizes methods for signal filtering and denoising; heart sound segmentation using hidden Markov models and machine learning; feature extraction in the time, frequency, and time-frequency domains; and state-of-the-art heart sound state recognition. It also discusses electrocardiogram (ECG) feature extraction and joint PCG and ECG heart sound state recognition. Despite significant technical progress, challenges remain in large-scale high-quality data collection, model interpretability, and generalizability. Future directions include multi-modal signal fusion, standardization and validation, automated interpretation for decision support, real-time monitoring, and longitudinal data analysis. Continued exploration and innovation in heart sound signal analysis are essential for advancing cardiac care, improving patient outcomes, and enhancing user trust and acceptance.
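As a concrete instance of the preprocessing step the review covers, a minimal sketch of band-pass filtering and amplitude normalization for a PCG signal follows. The 25-400 Hz band, filter order, and 2 kHz sampling rate are common choices in this literature but are assumptions here, not prescriptions from the review.

```python
# Sketch: band-pass filtering + normalization for a raw PCG recording.
import numpy as np
from scipy.signal import butter, sosfiltfilt

def preprocess_pcg(signal, fs=2000, low=25.0, high=400.0, order=4):
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, signal)          # zero-phase filtering
    return filtered / np.max(np.abs(filtered))   # normalize to [-1, 1]

# Toy 2 s recording at 2 kHz standing in for a PhysioNet/CinC 2016 signal.
x = np.random.default_rng(0).normal(size=4000)
print(preprocess_pcg(x)[:5])
```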
Affiliation(s)
- Bing Zhu: School of Information and Communication Engineering, Communication University of China, Beijing 100024, China
- Zihong Zhou: School of Information and Communication Engineering, Communication University of China, Beijing 100024, China
- Shaode Yu: School of Information and Communication Engineering, Communication University of China, Beijing 100024, China
- Xiaokun Liang: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Yaoqin Xie: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Qiurui Sun: Center of Information & Network Technology, Beijing Normal University, Beijing 100875, China
4. Rohr M, Müller B, Dill S, Güney G, Hoog Antink C. Multiple instance learning framework can facilitate explainability in murmur detection. PLOS Digit Health 2024;3:e0000461. [PMID: 38502666] [PMCID: PMC10950224] [DOI: 10.1371/journal.pdig.0000461]
Abstract
OBJECTIVE: Cardiovascular diseases (CVDs) account for a high fatality rate worldwide. Heart murmurs can be detected from phonocardiograms (PCGs) and may indicate CVDs. Still, they are often overlooked, as their detection and correct clinical interpretation require expert skills. In this work, we aim to predict the presence of murmurs and clinical outcomes from multiple PCG recordings using an explainable multitask model. APPROACH: Our approach consists of a two-stage multitask model. In the first stage, we predict murmur presence in single PCGs using a multiple instance learning (MIL) framework. MIL also allows us to derive sample-wise classifications (i.e., murmur locations) while needing only one annotation per recording (a "weak label") during training. In the second stage, we fuse explainable hand-crafted features with features from a pooling-based artificial neural network (PANN) derived from the MIL framework. Finally, we predict the presence of murmurs and the clinical outcome for a single patient based on multiple recordings, using a simple feed-forward neural network. MAIN RESULTS: We show qualitatively and quantitatively that the MIL approach yields useful features and can detect murmurs at multiple time instances, and may thus guide a practitioner through PCGs. We analyze the second stage of the model in terms of murmur classification and clinical outcome. We achieved a weighted accuracy of 0.714 and an outcome cost of 13,612 when using the PANN model and demographic features on the CirCor dataset (hidden test set of the George B. Moody PhysioNet Challenge 2022, team "Heart2Beat", rank 12 of 40). SIGNIFICANCE: To the best of our knowledge, we are the first to demonstrate the usefulness of MIL in PCG classification. We also showcase how the explainability of the model can be analyzed quantitatively, thus avoiding the confirmation bias inherent to many post-hoc methods. Finally, our overall results demonstrate the merit of combining MIL with hand-crafted features, both for generating explainable features and for competitive classification performance.
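To make the MIL stage concrete, below is a minimal sketch of attention-based MIL pooling over per-segment embeddings, one standard pooling choice; the paper's PANN pooling may differ, and the embedding and attention dimensions here are assumptions. The per-segment weights are what provide the weak murmur localization described above.

```python
# Sketch: attention-based MIL pooling over PCG segment embeddings.
import torch
import torch.nn as nn

class AttentionMILPooling(nn.Module):
    def __init__(self, embed_dim=64, attn_dim=32):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Linear(embed_dim, attn_dim),
            nn.Tanh(),
            nn.Linear(attn_dim, 1),
        )
        self.classifier = nn.Linear(embed_dim, 1)  # bag-level murmur logit

    def forward(self, instances):
        # instances: (num_segments, embed_dim) for one recording ("bag")
        weights = torch.softmax(self.attention(instances), dim=0)
        bag_embedding = (weights * instances).sum(dim=0)  # weighted pooling
        return self.classifier(bag_embedding), weights.squeeze(-1)

bag = torch.randn(12, 64)  # 12 segment embeddings from one PCG recording
logit, seg_weights = AttentionMILPooling()(bag)
# seg_weights act as per-segment murmur relevance scores (explainability).
```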
Affiliation(s)
- Maurice Rohr: KIS*MED – AI Systems in Medicine, Technische Universität Darmstadt, Darmstadt, Germany
- Benedikt Müller: KIS*MED – AI Systems in Medicine, Technische Universität Darmstadt, Darmstadt, Germany
- Sebastian Dill: KIS*MED – AI Systems in Medicine, Technische Universität Darmstadt, Darmstadt, Germany
- Gökhan Güney: KIS*MED – AI Systems in Medicine, Technische Universität Darmstadt, Darmstadt, Germany
- Christoph Hoog Antink: KIS*MED – AI Systems in Medicine, Technische Universität Darmstadt, Darmstadt, Germany
5. PCG signal classification using a hybrid multi round transfer learning classifier. Biocybern Biomed Eng 2023. [DOI: 10.1016/j.bbe.2023.01.004]
6. Ismail S, Ismail B, Siddiqi I, Akram U. PCG classification through spectrogram using transfer learning. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104075]
7. Rezaee K, Khosravi MR, Jabari M, Hesari S, Anari MS, Aghaei F. Graph convolutional network-based deep feature learning for cardiovascular disease recognition from heart sound signals. Int J Intell Syst 2022. [DOI: 10.1002/int.23041]
Affiliation(s)
- Khosro Rezaee: Department of Biomedical Engineering, Meybod University, Meybod, Iran
- Mohammad R. Khosravi: Shandong Provincial University Laboratory for Protected Horticulture, Weifang University of Science and Technology, Weifang, Shandong, China; Department of Computer Engineering, Persian Gulf University, Bushehr, Iran
- Mohammad Jabari: Faculty of Mechanical Engineering, University of Tabriz, Tabriz, Iran
- Shabnam Hesari: Department of Electrical and Computer Engineering, Ferdows Branch, Islamic Azad University, Ferdows, Iran
- Maryam Saberi Anari: Department of Computer Engineering, Technical and Vocational University (TVU), Tehran, Iran
- Fahimeh Aghaei: Department of Electrical and Electronics Engineering, Ozyegin University, Istanbul, Turkey
8. Rahman T, Ibtehaz N, Khandakar A, Hossain MSA, Mekki YMS, Ezeddin M, Bhuiyan EH, Ayari MA, Tahir A, Qiblawey Y, Mahmud S, Zughaier SM, Abbas T, Al-Maadeed S, Chowdhury MEH. QUCoughScope: An Intelligent Application to Detect COVID-19 Patients Using Cough and Breath Sounds. Diagnostics (Basel) 2022;12:920. [PMID: 35453968] [PMCID: PMC9028864] [DOI: 10.3390/diagnostics12040920]
Abstract
Problem: Since the outbreak of the COVID-19 pandemic, mass testing has become essential to reduce the spread of the virus. Several recent studies suggest that a significant number of COVID-19 patients display no physical symptoms whatsoever. It is therefore unlikely that these patients will undergo COVID-19 testing, which increases their chances of unintentionally spreading the virus. Currently, the primary diagnostic tool for COVID-19 is the reverse-transcription polymerase chain reaction (RT-PCR) test on respiratory specimens, an invasive and resource-dependent technique. Recent research shows that asymptomatic COVID-19 patients cough and breathe differently from healthy people. Aim: This paper uses a novel machine learning approach to detect symptomatic and asymptomatic COVID-19 patients from the convenience of their homes through continuous self-monitoring, so that they neither overburden the healthcare system nor unknowingly spread the virus. Method: A Cambridge University research group shared a dataset of cough and breath sound samples from 582 healthy subjects and 141 COVID-19 patients; among the patients, 87 were asymptomatic and 54 symptomatic (with a dry or wet cough). In addition to this dataset, the proposed work deployed a real-time deep learning backend server with a web application to crowdsource cough and breath recordings and screen for COVID-19 infection from the comfort of the user's home. The collected dataset includes data from 245 healthy individuals, 78 asymptomatic COVID-19 patients, and 18 symptomatic COVID-19 patients. Users can use the application from any web browser without installation, enter their symptoms, record audio clips of their cough and breath sounds, and upload the data anonymously. Two screening pipelines were developed based on the symptoms reported by the users: asymptomatic and symptomatic. A novel stacking CNN model was developed using three base learners selected from eight state-of-the-art deep CNN architectures. The stacking model employs a logistic regression meta-learner and takes as input the spectrograms generated from the breath and cough sounds of symptomatic and asymptomatic patients in the combined (Cambridge and collected) dataset. Results: The stacking model outperformed the other eight CNN networks, with the best binary classification performance obtained using cough sound spectrogram images. The accuracy, sensitivity, and specificity for symptomatic and asymptomatic patients were 96.5%, 96.42%, and 95.47%, and 98.85%, 97.01%, and 99.6%, respectively. For breath sound spectrogram images, the corresponding metrics were 91.03%, 88.9%, and 91.5%, and 80.01%, 72.04%, and 82.67%, respectively. Conclusion: The web application QUCoughScope records coughing and breathing sounds, converts them to spectrograms, and applies the best-performing machine learning model to distinguish COVID-19 patients from healthy subjects. The result is then reported back to the user in the application interface. This system can therefore be used at home as a pre-screening method to aid COVID-19 diagnosis by prioritizing patients for RT-PCR testing, thereby reducing the risk of spreading the disease.
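A minimal sketch of the stacking idea described above: each base CNN contributes class probabilities that a logistic regression meta-learner combines. The base models below are toy stand-ins for the paper's three CNNs, and the data are synthetic; only the two-stage structure is the point.

```python
# Sketch: stacking ensemble with a logistic regression meta-learner.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy stand-ins for three base CNNs: callables mapping spectrogram feature
# vectors to class-1 probabilities. In the paper these are deep CNNs.
base_models = [
    lambda X, w=rng.normal(size=128): 1 / (1 + np.exp(-X @ w))
    for _ in range(3)
]

# Synthetic "spectrogram features" and COVID-19/healthy labels.
X_train, y_train = rng.normal(size=(200, 128)), rng.integers(0, 2, 200)

def stack(models, X):
    # Each base model contributes one probability column; the meta-learner
    # sees the concatenation as its feature vector.
    return np.column_stack([m(X) for m in models])

meta = LogisticRegression().fit(stack(base_models, X_train), y_train)
X_new = rng.normal(size=(5, 128))
print(meta.predict_proba(stack(base_models, X_new))[:, 1])
```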
Affiliation(s)
- Tawsifur Rahman: Electrical Engineering Department, College of Engineering, Qatar University, Doha 2713, Qatar
- Nabil Ibtehaz: Electrical Engineering Department, College of Engineering, Qatar University, Doha 2713, Qatar
- Amith Khandakar: Electrical Engineering Department, College of Engineering, Qatar University, Doha 2713, Qatar
- Md Sakib Abrar Hossain: Electrical Engineering Department, College of Engineering, Qatar University, Doha 2713, Qatar
- Maymouna Ezeddin: Electrical Engineering Department, College of Engineering, Qatar University, Doha 2713, Qatar
- Enamul Haque Bhuiyan: BioMedical Engineering and Imaging Institute (BMEII), Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Mohamed Arselene Ayari: Department of Civil Engineering, College of Engineering, Qatar University, Doha 2713, Qatar
- Anas Tahir: Electrical Engineering Department, College of Engineering, Qatar University, Doha 2713, Qatar
- Yazan Qiblawey: Electrical Engineering Department, College of Engineering, Qatar University, Doha 2713, Qatar
- Sakib Mahmud: Electrical Engineering Department, College of Engineering, Qatar University, Doha 2713, Qatar
- Susu M. Zughaier: College of Medicine, Qatar University, Doha 2713, Qatar
- Tariq Abbas: Urology Division, Surgery Department, Sidra Medicine, Doha 26999, Qatar
- Somaya Al-Maadeed: Department of Computer Science and Engineering, College of Engineering, Qatar University, Doha 2713, Qatar
- Muhammad E. H. Chowdhury: Electrical Engineering Department, College of Engineering, Qatar University, Doha 2713, Qatar
9. Podder KK, Chowdhury MEH, Tahir AM, Mahbub ZB, Khandakar A, Hossain MS, Kadir MA. Bangla Sign Language (BdSL) Alphabets and Numerals Classification Using a Deep Learning Model. Sensors (Basel) 2022;22:574. [PMID: 35062533] [PMCID: PMC8780505] [DOI: 10.3390/s22020574]
Abstract
A real-time Bangla Sign Language interpreter could help bring more than 200,000 hearing- and speech-impaired people into the mainstream workforce in Bangladesh. Bangla Sign Language (BdSL) recognition and detection is a challenging topic in computer vision and deep learning research because recognition accuracy may vary with skin tone, hand orientation, and background. This research used deep learning models for accurate and reliable recognition of BdSL alphabets and numerals, using two well-suited and robust datasets. The dataset prepared in this study comprises the largest image database for BdSL alphabets and numerals, designed to reduce inter-class similarity while covering diverse backgrounds and skin tones. The paper compared classification with and without background images to determine the best-performing model for BdSL alphabet and numeral interpretation. The CNN model trained on images with background was found to be more effective than the one trained without background. The hand detection step in the segmentation approach must become more accurate to boost overall sign recognition accuracy. ResNet18 performed best, with 99.99% accuracy, precision, F1 score, and sensitivity, and 100% specificity, outperforming previous work on BdSL alphabet and numeral recognition. The dataset is made publicly available to support and encourage further research on Bangla Sign Language interpretation, so that hearing- and speech-impaired individuals can benefit from this research.
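A minimal sketch of the transfer learning setup the paper reports as best: an ImageNet-pretrained ResNet18 with its head replaced for sign classification. The class count, optimizer settings, and dummy batch are assumptions, not the paper's configuration.

```python
# Sketch: ResNet18 transfer learning for sign-language image classification.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 49  # assumed placeholder; set to the real BdSL label count
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, num_classes)  # replace the head

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One fine-tuning step on a dummy batch of 224x224 RGB sign images.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, num_classes, (8,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(float(loss))
```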
Affiliation(s)
- Kanchon Kanti Podder: Department of Biomedical Physics & Technology, University of Dhaka, Dhaka 1000, Bangladesh
- Anas M. Tahir: Department of Electrical Engineering, Qatar University, Doha 2713, Qatar
- Zaid Bin Mahbub: Department of Mathematics and Physics, North South University, Dhaka 1229, Bangladesh
- Amith Khandakar: Department of Electrical Engineering, Qatar University, Doha 2713, Qatar
- Md Shafayet Hossain: Department of Electrical, Electronic and Systems Engineering, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Muhammad Abdul Kadir: Department of Biomedical Physics & Technology, University of Dhaka, Dhaka 1000, Bangladesh
10.
Abstract
The fatiguing work of air traffic controllers inevitably threatens air traffic safety. Determining whether the eyes are open or closed is currently the main method for detecting fatigue in air traffic controllers. Here, an eye state recognition model based on deep-fusion neural networks is proposed for determining the fatigue state of controllers. This method uses transfer learning strategies to pre-train deep neural networks and deep convolutional neural networks, and performs network fusion at the decision-making layer. The fused network demonstrated an improved ability to classify the target-domain dataset. First, a deep-cascaded neural network algorithm was used for face detection and eye positioning. Second, according to the eye selection mechanism, the eye images to be tested were cropped and passed into the deep-fusion neural network to determine the eye state. Finally, the PERCLOS indicator was used to detect the fatigue state of the controller. On the ZJU, CEW, and ATCE datasets, the accuracy, F1 score, and AUC values of different networks were compared, and on the ZJU and CEW datasets, the recognition accuracy and AUC values of different methods were evaluated in a comparative experiment. The experimental results show that the deep-fusion neural network model performed better than the other assessed network models. When applied to the controller eye dataset, the recognition accuracy was 98.44%, and the recognition accuracy on the test video was 97.30%.
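A minimal sketch of the PERCLOS criterion used in the final step: the proportion of frames within a sliding window whose predicted eye state is closed. The window length, frame rate, and fatigue threshold below are illustrative assumptions, not the paper's exact settings.

```python
# Sketch: sliding-window PERCLOS from per-frame eye state predictions.
from collections import deque

def perclos(eye_states, window=1800):
    """eye_states: iterable of 0 (open) / 1 (closed), one per video frame.
    window: frames per window, e.g. 60 s at an assumed 30 fps.
    Yields one PERCLOS value per frame once the window is full."""
    buf = deque(maxlen=window)
    for state in eye_states:
        buf.append(state)
        if len(buf) == window:
            yield sum(buf) / window  # fraction of closed-eye frames

# Toy sequence: 30 s of closed eyes followed by 30 s open (at 30 fps).
states = [1] * 900 + [0] * 900
for p in perclos(states):
    print(f"PERCLOS = {p:.2f} -> {'fatigued' if p > 0.4 else 'alert'}")
```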