1
Zhou W, Yu L, Zhang M, Xiao W. A low power respiratory sound diagnosis processing unit based on LSTM for wearable health monitoring. Biomed Eng Biomed Tech 2023; 68:469-480. [PMID: 37080905] [DOI: 10.1515/bmt-2022-0421]
Abstract
Early prevention and detection of respiratory disease have attracted extensive attention owing to the significant increase in people with respiratory issues, making it essential to restrain the spread of such diseases and relieve their symptoms. However, traditional auscultation demands a high level of medical skill, and computational respiratory sound analysis approaches are limited to constrained settings. A wearable auscultation device is therefore needed to monitor respiratory health in real time and with ease for the consumer. In this work, we developed a Respiratory Sound Diagnosis Processor Unit (RSDPU) based on Long Short-Term Memory (LSTM). We conducted experiments and analyses on feature extraction and the abnormality diagnosis algorithm for respiratory sound, and proposed Dynamic Normalization Mapping (DNM) to better utilize quantization bits and lessen overfitting. Furthermore, we developed the hardware implementation of the RSDPU, including a corrector to filter diagnosis noise, and present FPGA prototyping verification and a layout of the RSDPU for power and area evaluation. Experimental results demonstrate that the RSDPU achieves an abnormality diagnosis accuracy of 81.4 %, an area of 1.57 × 1.76 mm under the SMIC 130 nm process, and a power consumption of 381.8 μW, meeting the requirements of high accuracy, low power consumption, and small area.
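The abstract names Dynamic Normalization Mapping (DNM) as the trick for making better use of limited quantization bits. The paper's exact mapping is not reproduced here; the sketch below is only a generic per-frame dynamic normalization followed by signed fixed-point quantization, the general idea such a unit relies on.

```python
import numpy as np

def dynamic_normalize_quantize(frame, bits=8, eps=1e-8):
    """Normalize one feature frame by its own dynamic range into [-1, 1],
    then quantize to signed fixed-point with the given bit width.
    Generic illustration only; the paper's DNM may differ in detail."""
    peak = np.max(np.abs(frame)) + eps      # per-frame dynamic range
    normalized = frame / peak               # map into [-1, 1]
    scale = 2 ** (bits - 1) - 1             # e.g. 127 for 8-bit
    return np.round(normalized * scale).astype(np.int8)

# a hypothetical feature frame with a wide dynamic range
frame = np.array([0.02, -1.7, 0.9, 3.4])
q = dynamic_normalize_quantize(frame)
```

Because the scale is recomputed per frame, the full signed range is used even when the input amplitude varies between frames, which is what lets a fixed bit budget be spent efficiently.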
Affiliation(s)
- Weixin Zhou
- Chinese Academy of Sciences, Institute of Semiconductors, Beijing, China
- Lina Yu
- Chinese Academy of Sciences, Institute of Semiconductors, Beijing, China
- Ming Zhang
- Chinese Academy of Sciences, Institute of Semiconductors, Beijing, China
- Wan'ang Xiao
- Chinese Academy of Sciences, Institute of Semiconductors, Beijing, China
2
Garcia-Mendez JP, Lal A, Herasevich S, Tekin A, Pinevich Y, Lipatov K, Wang HY, Qamar S, Ayala IN, Khapov I, Gerberi DJ, Diedrich D, Pickering BW, Herasevich V. Machine Learning for Automated Classification of Abnormal Lung Sounds Obtained from Public Databases: A Systematic Review. Bioengineering (Basel) 2023; 10:1155. [PMID: 37892885] [PMCID: PMC10604310] [DOI: 10.3390/bioengineering10101155]
Abstract
Pulmonary auscultation is essential for detecting abnormal lung sounds during physical assessments, but its reliability depends on the operator. Machine learning (ML) models offer an alternative by classifying lung sounds automatically. Such models require substantial data, a limitation that public databases aim to address. This systematic review compares the characteristics, diagnostic accuracy, concerns, and data sources of existing models in the literature. Papers retrieved from five major databases and published between 1990 and 2022 were assessed, and quality was appraised with a modified QUADAS-2 tool. The review encompassed 62 studies that used ML models and public-access databases for lung sound classification. Artificial neural networks (ANN) and support vector machines (SVM) were the most frequently employed classifiers. Accuracy ranged from 49.43% to 100% for discriminating abnormal sound types and from 69.40% to 99.62% for disease class classification. Seventeen public databases were identified, the ICBHI 2017 database being the most used (66%). Most studies exhibited a high risk of bias and concerns related to patient selection and reference standards. In summary, ML models can effectively classify abnormal lung sounds using publicly available data sources; nevertheless, inconsistent reporting and methodologies limit progress in the field, so public databases should adhere to standardized recording and labeling procedures.
Affiliation(s)
- Juan P. Garcia-Mendez
- Department of Anesthesiology and Perioperative Medicine, Division of Critical Care, Mayo Clinic, Rochester, MN 55905, USA
- Amos Lal
- Department of Medicine, Division of Pulmonary and Critical Care Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Svetlana Herasevich
- Department of Anesthesiology and Perioperative Medicine, Division of Critical Care, Mayo Clinic, Rochester, MN 55905, USA
- Aysun Tekin
- Department of Anesthesiology and Perioperative Medicine, Division of Critical Care, Mayo Clinic, Rochester, MN 55905, USA
- Yuliya Pinevich
- Department of Anesthesiology and Perioperative Medicine, Division of Critical Care, Mayo Clinic, Rochester, MN 55905, USA
- Department of Cardiac Anesthesiology and Intensive Care, Republican Clinical Medical Center, 223052 Minsk, Belarus
- Kirill Lipatov
- Division of Pulmonary Medicine, Mayo Clinic Health Systems, Essentia Health, Duluth, MN 55805, USA
- Hsin-Yi Wang
- Department of Anesthesiology and Perioperative Medicine, Division of Critical Care, Mayo Clinic, Rochester, MN 55905, USA
- Department of Anesthesiology, Taipei Veterans General Hospital, National Yang Ming Chiao Tung University, Taipei 11217, Taiwan
- Department of Biomedical Sciences and Engineering, National Central University, Taoyuan 320317, Taiwan
- Shahraz Qamar
- Department of Anesthesiology and Perioperative Medicine, Division of Critical Care, Mayo Clinic, Rochester, MN 55905, USA
- Ivan N. Ayala
- Department of Anesthesiology and Perioperative Medicine, Division of Critical Care, Mayo Clinic, Rochester, MN 55905, USA
- Ivan Khapov
- Department of Anesthesiology and Perioperative Medicine, Division of Critical Care, Mayo Clinic, Rochester, MN 55905, USA
- Daniel Diedrich
- Department of Anesthesiology and Perioperative Medicine, Division of Critical Care, Mayo Clinic, Rochester, MN 55905, USA
- Brian W. Pickering
- Department of Anesthesiology and Perioperative Medicine, Division of Critical Care, Mayo Clinic, Rochester, MN 55905, USA
- Vitaly Herasevich
- Department of Anesthesiology and Perioperative Medicine, Division of Critical Care, Mayo Clinic, Rochester, MN 55905, USA
3
Kim Y, Hyon Y, Woo SD, Lee S, Lee SI, Ha T, Chung C. Evolution of the Stethoscope: Advances with the Adoption of Machine Learning and Development of Wearable Devices. Tuberc Respir Dis (Seoul) 2023; 86:251-263. [PMID: 37592751] [PMCID: PMC10555525] [DOI: 10.4046/trd.2023.0065]
Abstract
The stethoscope has long been used to examine patients, but the importance of auscultation has declined owing to its limitations and the development of other diagnostic tools. Auscultation is nevertheless still recognized as a primary diagnostic method because it is non-invasive and provides valuable information in real time. To overcome the limitations of existing stethoscopes, digital stethoscopes with machine learning (ML) algorithms have been developed: respiratory sounds can now be recorded and shared, and artificial intelligence (AI)-assisted auscultation using ML algorithms can distinguish the type of sound. Recently, demand has increased for remote care and for non-face-to-face treatment of diseases requiring isolation, such as coronavirus disease 2019 (COVID-19). To address these needs, wireless and wearable stethoscopes are being developed alongside advances in battery technology and integrated sensors. This review traces the history of the stethoscope and the classification of respiratory sounds, describes ML algorithms, and introduces new auscultation methods based on AI-assisted analysis and wireless or wearable stethoscopes.
Affiliation(s)
- Yoonjoo Kim
- Division of Pulmonology, Department of Internal Medicine, Chungnam National University College of Medicine, Daejeon, Republic of Korea
- YunKyong Hyon
- Division of Industrial Mathematics, National Institute for Mathematical Sciences, Daejeon, Republic of Korea
- Seong-Dae Woo
- Division of Pulmonology, Department of Internal Medicine, Chungnam National University College of Medicine, Daejeon, Republic of Korea
- Sunju Lee
- Division of Industrial Mathematics, National Institute for Mathematical Sciences, Daejeon, Republic of Korea
- Song-I Lee
- Division of Pulmonology, Department of Internal Medicine, Chungnam National University College of Medicine, Daejeon, Republic of Korea
- Taeyoung Ha
- Division of Industrial Mathematics, National Institute for Mathematical Sciences, Daejeon, Republic of Korea
- Chaeuk Chung
- Division of Pulmonology, Department of Internal Medicine, Chungnam National University College of Medicine, Daejeon, Republic of Korea
4
Prabhakar SK, Won DO. HISET: Hybrid interpretable strategies with ensemble techniques for respiratory sound classification. Heliyon 2023; 9:e18466. [PMID: 37554776] [PMCID: PMC10404967] [DOI: 10.1016/j.heliyon.2023.e18466]
Abstract
The human respiratory system can be affected by several diseases, each associated with distinctive sounds. Automated respiratory sound classification is among the most complex problems in advanced biomedical signal processing. In this research, five robust Hybrid Interpretable Strategies with Ensemble Techniques (HISET) are proposed for respiratory sound classification. The first approach, an Ensemble GSSR technique, utilizes L2 Granger analysis and the proposed Supportive Ensemble Empirical Mode Decomposition (SEEMD) technique; Support Vector Machine based Recursive Feature Elimination (SVM-RFE) is then used for feature selection, followed by classification with machine learning (ML) classifiers. The second approach implements a novel Realm Revamping Sparse Representation Classification (RR-SRC) technique, and the third applies a Distance Metric dependent Variational Mode Decomposition (DM-VMD) with Extreme Learning Machine (ELM) classification. The fourth approach uses Harris Hawks Optimization (HHO) with a Scaling Factor based Pliable Differential Evolution (SFPDE) algorithm, termed HHO-SFPDE, classified with ML classifiers. The fifth approach analyzes dimensionality reduction techniques with the proposed Gray Wolf Optimization based Support Vector Classification (GWO-SVC), and a parallel variant performs a similar analysis with a Grasshopper Optimization Algorithm (GOA) based Sparse Autoencoder.
The results are examined on the ICBHI dataset. The best results are obtained with Manhattan distance-based VMD-ELM for 2-class classification (accuracy 95.39%), Euclidean distance-based VMD-ELM for 3-class classification (90.61%), and Manhattan distance-based VMD-ELM for 4-class classification (89.27%).
Affiliation(s)
- Sunil Kumar Prabhakar
- Department of Artificial Intelligence Convergence, Hallym University, Chuncheon, Gangwon-do, South Korea
- Dong-Ok Won
- Department of Artificial Intelligence Convergence, Hallym University, Chuncheon, Gangwon-do, South Korea
5
Heitmann J, Glangetas A, Doenz J, Dervaux J, Shama DM, Garcia DH, Benissa MR, Cantais A, Perez A, Müller D, Chavdarova T, Ruchonnet-Metrailler I, Siebert JN, Lacroix L, Jaggi M, Gervaix A, Hartley MA. DeepBreath-automated detection of respiratory pathology from lung auscultation in 572 pediatric outpatients across 5 countries. NPJ Digit Med 2023; 6:104. [PMID: 37268730] [DOI: 10.1038/s41746-023-00838-3]
Abstract
The interpretation of lung auscultation is highly subjective and relies on non-specific nomenclature; computer-aided analysis has the potential to standardize and automate evaluation. We used 35.9 hours of auscultation audio from 572 pediatric outpatients to develop DeepBreath, a deep learning model that identifies the audible signatures of acute respiratory illness in children. It comprises a convolutional neural network followed by a logistic regression classifier, aggregating estimates on recordings from eight thoracic sites into a single patient-level prediction. Patients were either healthy controls (29%) or had one of three acute respiratory illnesses (71%): pneumonia, wheezing disorders (bronchitis/asthma), or bronchiolitis. To ensure objective estimates of model generalisability, DeepBreath was trained on patients from two countries (Switzerland, Brazil), and results are reported on an internal 5-fold cross-validation as well as external validation (extval) on three other countries (Senegal, Cameroon, Morocco). DeepBreath differentiated healthy and pathological breathing with an area under the receiver operating characteristic curve (AUROC) of 0.93 (standard deviation [SD] ± 0.01 on internal validation). Similarly promising results were obtained for pneumonia (AUROC 0.75 ± 0.10), wheezing disorders (AUROC 0.91 ± 0.03), and bronchiolitis (AUROC 0.94 ± 0.02); extval AUROCs were 0.89, 0.74, 0.74, and 0.87, respectively. All either matched or significantly improved on a clinical baseline model using age and respiratory rate. Temporal attention showed clear alignment between model predictions and independently annotated respiratory cycles, evidence that DeepBreath extracts physiologically meaningful representations. DeepBreath provides a framework for interpretable deep learning to identify the objective audio signatures of respiratory pathology.
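DeepBreath aggregates per-site estimates from eight thoracic recording sites into one patient-level prediction with a logistic regression. The sketch below shows the simplest variant of that idea, a logistic function of the mean site score; the weights here are illustrative placeholders, not the fitted coefficients from the paper.

```python
import math

def aggregate_patient(site_probs, w=4.0, b=-2.0):
    """Collapse per-site abnormality probabilities (one per thoracic
    site) into a single patient-level probability via a logistic
    function of their mean. w and b are hypothetical weights; the
    published model fits its own coefficients."""
    mean_score = sum(site_probs) / len(site_probs)
    return 1.0 / (1.0 + math.exp(-(w * mean_score + b)))

# eight-site score vectors for two hypothetical patients
healthy = [0.1, 0.2, 0.05, 0.1, 0.15, 0.1, 0.2, 0.1]
sick = [0.9, 0.8, 0.95, 0.7, 0.85, 0.9, 0.8, 0.9]
```

Pooling before thresholding is what makes the prediction robust to one noisy site: a single high score cannot flip an otherwise healthy patient.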
Affiliation(s)
- Julien Heitmann
- Intelligent Global Health Research Group, Machine Learning and Optimization Laboratory, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
- Alban Glangetas
- Division of Pediatric Emergency Medicine, Department of Women, Child and Adolescent, Geneva University Hospitals (HUG), University of Geneva, Geneva, Switzerland
- Jonathan Doenz
- Intelligent Global Health Research Group, Machine Learning and Optimization Laboratory, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
- Juliane Dervaux
- Intelligent Global Health Research Group, Machine Learning and Optimization Laboratory, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
- Deeksha M Shama
- Intelligent Global Health Research Group, Machine Learning and Optimization Laboratory, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
- Daniel Hinjos Garcia
- Intelligent Global Health Research Group, Machine Learning and Optimization Laboratory, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
- Mohamed Rida Benissa
- Division of Pediatric Emergency Medicine, Department of Women, Child and Adolescent, Geneva University Hospitals (HUG), University of Geneva, Geneva, Switzerland
- Aymeric Cantais
- Pediatric Emergency Department, Hospital University of Saint Etienne, Saint Etienne, France
- Alexandre Perez
- Division of Pediatric Emergency Medicine, Department of Women, Child and Adolescent, Geneva University Hospitals (HUG), University of Geneva, Geneva, Switzerland
- Daniel Müller
- Intelligent Global Health Research Group, Machine Learning and Optimization Laboratory, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
- Tatjana Chavdarova
- Intelligent Global Health Research Group, Machine Learning and Optimization Laboratory, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
- Isabelle Ruchonnet-Metrailler
- Division of Pediatric Emergency Medicine, Department of Women, Child and Adolescent, Geneva University Hospitals (HUG), University of Geneva, Geneva, Switzerland
- Johan N Siebert
- Division of Pediatric Emergency Medicine, Department of Women, Child and Adolescent, Geneva University Hospitals (HUG), University of Geneva, Geneva, Switzerland
- Laurence Lacroix
- Division of Pediatric Emergency Medicine, Department of Women, Child and Adolescent, Geneva University Hospitals (HUG), University of Geneva, Geneva, Switzerland
- Martin Jaggi
- Intelligent Global Health Research Group, Machine Learning and Optimization Laboratory, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
- Alain Gervaix
- Division of Pediatric Emergency Medicine, Department of Women, Child and Adolescent, Geneva University Hospitals (HUG), University of Geneva, Geneva, Switzerland
- Mary-Anne Hartley
- Intelligent Global Health Research Group, Machine Learning and Optimization Laboratory, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
- Center for Intelligent Systems (CIS), Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
- Division of Pediatric Emergency Medicine, Department of Pediatrics, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
6
Jiang H, Zong D, Song Q, Gao K, Shao H, Liu Z, Tian J. Coal-gangue recognition via multi-branch convolutional neural network based on MFCC in noisy environment. Sci Rep 2023; 13:6541. [PMID: 37085691] [PMCID: PMC10121578] [DOI: 10.1038/s41598-023-33351-4]
Abstract
Traditional coal-gangue recognition methods usually do not consider the impact of equipment noise, which severely limits their adaptability and recognition accuracy. This paper studies more accurate recognition of coal and gangue in a noisy site environment, with the shearer, conveyor, transfer machine, and other devices operating during top-coal caving. A Mel Frequency Cepstrum Coefficient (MFCC) smoothing method is introduced to express the intrinsic features of the sound pressure more clearly at the coal-gangue recognition site. A multi-branch convolutional neural network (MBCNN) model with three branches was then developed, and the smoothed MFCC features were fed into this model to recognize falling coal and gangue in the noisy environment. Sound pressure signal datasets covering the operation of different devices were constructed through extensive laboratory and site data acquisition. Comparative experiments on a noiseless dataset, a single-noise dataset, and a simulated site dataset show that our method provides higher recognition accuracy and better robustness. The proposed coal-gangue recognition approach based on MBCNN and MFCC smoothing can recognize not only the state of falling coal or gangue but also the operational state of site devices.
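The key preprocessing step above is smoothing the MFCC matrix along time so that frame-to-frame noise is suppressed before classification. The paper's exact smoothing scheme is not reproduced here; a centered moving average per coefficient, as sketched below, is the simplest common choice.

```python
import numpy as np

def smooth_mfcc(mfcc, win=5):
    """Smooth an MFCC matrix (n_frames x n_coeffs) along the time axis
    with a centered moving average. Edge padding keeps the output the
    same length as the input. Illustrative sketch only; the paper may
    use a different smoothing kernel."""
    pad = win // 2
    kernel = np.ones(win) / win
    # pad along time with edge values so the output keeps its length
    padded = np.pad(mfcc, ((pad, pad), (0, 0)), mode="edge")
    smoothed = np.empty_like(mfcc, dtype=float)
    for c in range(mfcc.shape[1]):  # moving average per coefficient
        smoothed[:, c] = np.convolve(padded[:, c], kernel, mode="valid")
    return smoothed

# sanity case: coefficients that are constant in time are unchanged
mfcc = np.tile(np.arange(3.0), (10, 1))
smoothed = smooth_mfcc(mfcc)
```

Smoothing each cepstral coefficient independently preserves the spectral-envelope information while averaging out transient device noise between frames.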
Affiliation(s)
- HaiYan Jiang
- Department of Intelligent Equipment, Shandong University of Science & Technology, Taian, 271000, China
- DaShuai Zong
- Department of Intelligent Equipment, Shandong University of Science & Technology, Taian, 271000, China
- QingJun Song
- Department of Intelligent Equipment, Shandong University of Science & Technology, Taian, 271000, China
- KuiDong Gao
- Shandong Province Key Laboratory of Mine Mechanical Engineering, Shandong University of Science & Technology, Qingdao, 266590, China
- HuiZhi Shao
- Hong Kong Baptist University, Hong Kong, China
- ZhiJiang Liu
- Department of Intelligent Equipment, Shandong University of Science & Technology, Taian, 271000, China
- Jing Tian
- Taihe Electric Power Co. Ltd, Taian, 271000, Shandong, China
7
Yang D, Peng Y, Zhou T, Wang T, Lu G. Percussion and PSO-SVM-Based Damage Detection for Refractory Materials. Micromachines (Basel) 2023; 14:135. [PMID: 36677196] [PMCID: PMC9861777] [DOI: 10.3390/mi14010135]
Abstract
Refractory materials are basic materials widely used in industrial furnaces and thermal equipment, and their microstructure is similar to that of many heterogeneous high-performance materials used in micro/nanodevices. Damage can reduce the mechanical properties and service life of refractory materials and even cause serious safety accidents. In this paper, a novel percussion and particle swarm optimization-support vector machine (PSO-SVM)-based method is proposed to detect damage in refractory materials. An impact is applied to the material and the resulting sound is recorded. The percussion-induced sound signals are fed into a mel filter bank to generate time-frequency representations in the form of mel spectrograms. Two image descriptors, the local binary pattern (LBP) and the histogram of oriented gradients (HOG), are then used to extract texture information from the mel spectrogram. Finally, the fused HOG and LBP features are input to the PSO-SVM algorithm to detect damage in refractory materials. The results demonstrate that the proposed method identifies five different degrees of damage with an accuracy rate greater than 97%. The percussion and PSO-SVM-based method therefore has high potential for field application in damage detection for refractory materials, and could be extended to damage detection for other materials used in micro/nanodevices.
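Of the two descriptors named above, the local binary pattern is the simpler to make concrete: each interior pixel is encoded by which of its eight neighbours are at least as bright as it. The sketch below implements that basic 8-neighbour LBP on a 2-D array; the mel-spectrogram input and the HOG fusion step are omitted for brevity.

```python
import numpy as np

def lbp_image(img):
    """Basic 8-neighbour local binary pattern: for each interior pixel,
    every neighbour >= centre sets one bit, yielding a code in 0..255."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # 8 neighbours in clockwise order; each contributes one bit
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    centre = img[1:-1, 1:-1]
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        out |= (neigh >= centre).astype(np.uint8) << bit
    return out

flat = lbp_image(np.ones((4, 4)))  # uniform patch: every bit set
peak = np.zeros((3, 3))
peak[1, 1] = 5.0
spot = lbp_image(peak)             # centre brighter than all neighbours
```

A histogram of these codes over the spectrogram is the texture feature that would then be concatenated with HOG before the SVM.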
Affiliation(s)
- Dan Yang
- Key Laboratory for Metallurgical Equipment and Control of Ministry of Education, Wuhan University of Science and Technology, Wuhan 430081, China
- Hubei Key Laboratory of Mechanical Transmission and Manufacturing Engineering, Wuhan University of Science and Technology, Wuhan 430081, China
- Yi Peng
- Key Laboratory for Metallurgical Equipment and Control of Ministry of Education, Wuhan University of Science and Technology, Wuhan 430081, China
- Precision Manufacturing Institute, Wuhan University of Science and Technology, Wuhan 430081, China
- Ti Zhou
- Wuhan Digital Engineering Institute, Wuhan 430074, China
- Tao Wang
- Key Laboratory for Metallurgical Equipment and Control of Ministry of Education, Wuhan University of Science and Technology, Wuhan 430081, China
- Guangtao Lu
- Key Laboratory for Metallurgical Equipment and Control of Ministry of Education, Wuhan University of Science and Technology, Wuhan 430081, China
- Hubei Key Laboratory of Mechanical Transmission and Manufacturing Engineering, Wuhan University of Science and Technology, Wuhan 430081, China
8
Kim BJ, Kim BS, Mun JH, Lim C, Kim K. An accurate deep learning model for wheezing in children using real world data. Sci Rep 2022; 12:22465. [PMID: 36577766] [PMCID: PMC9797543] [DOI: 10.1038/s41598-022-25953-1]
Abstract
Auscultation is an important diagnostic method for lung diseases; however, it is a subjective modality that requires a high degree of expertise. Artificial intelligence models are being developed to overcome this constraint, but existing models need performance improvements and do not reflect actual clinical situations. We aimed to develop an improved deep-learning model for detecting wheezing in children, based on data from real clinical practice. In this prospective study, pediatric pulmonologists recorded and verified respiratory sounds in 76 pediatric patients who visited a university hospital in South Korea; structured data such as sex, age, and auscultation location were also collected. Using this dataset, we derived an optimal architecture from a convolutional neural network base model. The final model uses a 34-layer residual network with a convolutional block attention module for the audio data and multilayer perceptron layers for the tabular data. The proposed model achieved an accuracy of 91.2%, an area under the curve of 89.1%, a precision of 94.4%, a recall of 81%, and an F1-score of 87.2%. This high-performance model for detecting wheeze sounds should help in the accurate diagnosis of respiratory diseases in actual clinical practice.
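The metrics quoted above are all derived from one confusion matrix, and the reported numbers can be cross-checked against each other: F1 is the harmonic mean of precision and recall, so the paper's 94.4% precision and 81% recall should reproduce its 87.2% F1-score.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard precision, recall, F1, and accuracy computed from
    confusion-matrix counts (true/false positives and negatives)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy

# consistency check on the published numbers: harmonic mean of the
# reported precision (0.944) and recall (0.81)
f1_paper = 2 * 0.944 * 0.81 / (0.944 + 0.81)
```

Such a check is a quick way to spot transcription errors in reported model results; here the numbers are internally consistent.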
Affiliation(s)
- Beom Joon Kim
- Department of Pediatrics, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
- Baek Seung Kim
- Department of Applied Statistics, Chung-Ang University, 84 Heukseok-Ro, Dongjak-Gu, Seoul, 06974 Republic of Korea
- Jeong Hyeon Mun
- Department of Applied Statistics, Chung-Ang University, 84 Heukseok-Ro, Dongjak-Gu, Seoul, 06974 Republic of Korea
- Changwon Lim
- Department of Applied Statistics, Chung-Ang University, 84 Heukseok-Ro, Dongjak-Gu, Seoul, 06974 Republic of Korea
- Kyunghoon Kim
- Department of Pediatrics, Seoul National University Bundang Hospital, Seongnam, 13620 Republic of Korea
- Department of Pediatrics, Seoul National University College of Medicine, Seoul, Republic of Korea
9
Shang W, Deng L, Liu J. A Novel Air-Door Opening and Closing Identification Algorithm Using a Single Wind-Velocity Sensor. Sensors (Basel) 2022; 22:6837. [PMID: 36146187] [PMCID: PMC9503651] [DOI: 10.3390/s22186837]
Abstract
The air-door is an important device for adjusting the air flow in a mine; it opens and closes within a short time owing to transportation and other activities. A switching sensor alone can identify air-door opening and closing, but it cannot relate these events to abnormal fluctuations in wind speed, and large fluctuations in wind-velocity sensor data during this time can trigger false alarms. To overcome this problem, we propose a method for identifying air-door opening and closing using a single wind-velocity sensor. A multi-scale sliding window (MSSW) is employed to divide the samples; global and fluctuation features of the data are then extracted using statistics and the discrete wavelet transform (DWT), and a machine learning model classifies each sample. The identification results are obtained by merging the classification results with the non-maximum suppression method. Finally, motivated by safety accidents caused by air-door operation in an actual production mine, a large number of experiments were carried out on a simulated tunnel model to verify the algorithm. The results show that the proposed algorithm performs best when a gradient boosting decision tree (GBDT) is selected as the classifier. On a dataset composed of air-door opening and closing experimental data, the accuracy, precision, and recall of identification are 91.89%, 93.07%, and 91.07%, respectively; on a dataset that also includes other mine production activities, they are 89.61%, 90.31%, and 88.39%, respectively.
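The MSSW sampling step above slides windows of several lengths over the wind-velocity series so that both short and long opening/closing events are captured at some scale. A minimal sketch, with illustrative window lengths and step rather than the paper's actual parameters:

```python
def multiscale_windows(signal, scales=(4, 8), step=2):
    """Divide a 1-D wind-velocity series into overlapping windows at
    several scales, tagging each window with its length. The scales
    and step here are placeholders, not the paper's values."""
    samples = []
    for win in scales:
        for start in range(0, len(signal) - win + 1, step):
            samples.append((win, signal[start:start + win]))
    return samples

sig = list(range(10))       # a toy 10-sample velocity series
wins = multiscale_windows(sig)
```

Each `(length, window)` sample would then be featurized (statistics plus DWT coefficients) and classified; overlapping detections at different scales are what the non-maximum suppression step later merges.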
Affiliation(s)
- Wentian Shang
- College of Safety Science and Engineering, Liaoning Technical University, Huludao 125105, China
- Key Laboratory of Mine Thermo-Motive Disaster & Prevention, Ministry of Education, Huludao 125105, China
- Lijun Deng
- College of Safety Science and Engineering, Liaoning Technical University, Huludao 125105, China
- Key Laboratory of Mine Thermo-Motive Disaster & Prevention, Ministry of Education, Huludao 125105, China
- Jian Liu
- College of Safety Science and Engineering, Liaoning Technical University, Huludao 125105, China
- Key Laboratory of Mine Thermo-Motive Disaster & Prevention, Ministry of Education, Huludao 125105, China
10
A Progressively Expanded Database for Automated Lung Sound Analysis: An Update. Applied Sciences 2022. [DOI: 10.3390/app12157623]
Abstract
We previously established an open-access lung sound database, HF_Lung_V1, and developed deep learning models for inhalation, exhalation, continuous adventitious sound (CAS), and discontinuous adventitious sound (DAS) detection. The amount of data used for training contributes to model accuracy. In this study, we collected larger quantities of data to further improve model performance and explored issues of noisy labels and overlapping sounds. HF_Lung_V1 was expanded to HF_Lung_V2 with a 1.43× increase in the number of audio files. Convolutional neural network–bidirectional gated recurrent unit network models were trained separately using the HF_Lung_V1 (V1_Train) and HF_Lung_V2 (V2_Train) training sets. These were tested using the HF_Lung_V1 (V1_Test) and HF_Lung_V2 (V2_Test) test sets, respectively. Segment and event detection performance was evaluated. Label quality was assessed. Overlap ratios were computed between inhalation, exhalation, CAS, and DAS labels. The model trained using V2_Train exhibited improved performance in inhalation, exhalation, CAS, and DAS detection on both V1_Test and V2_Test. Poor CAS detection was attributed to the quality of CAS labels. DAS detection was strongly influenced by the overlapping of DAS with inhalation and exhalation. In conclusion, collecting greater quantities of lung sound data is vital for developing more accurate lung sound analysis models.
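The study attributes poor DAS detection to overlap between DAS labels and inhalation/exhalation labels. The overlap statistic it computes can be sketched as a simple interval calculation; the function below is a minimal illustration of that idea, not the study's exact definition.

```python
def overlap_ratio(a, b):
    """Fraction of interval a = (start, end), in seconds, that is
    covered by interval b: e.g. how much of a DAS label overlaps an
    inhalation label. Minimal sketch of the overlap statistic."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0.0, end - start) / (a[1] - a[0])

# a hypothetical DAS label from 0.8 s to 2.0 s vs. an inhalation
# label from 1.0 s to 3.0 s
r = overlap_ratio((0.8, 2.0), (1.0, 3.0))
```

A high ratio means the adventitious sound almost always co-occurs with a breath phase, which is exactly the condition under which a frame-level detector struggles to separate the two classes.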
11
Voigt I, Boeckmann M, Bruder O, Wolf A, Schmitz T, Wieneke H. A deep neural network using audio files for detection of aortic stenosis. Clin Cardiol 2022; 45:657-663. [PMID: 35438211] [PMCID: PMC9175247] [DOI: 10.1002/clc.23826] [Received: 01/08/2022] [Revised: 03/10/2022] [Accepted: 03/13/2022]
Abstract
BACKGROUND Although aortic stenosis (AS) is the most common valvular heart disease in the western world, many affected patients remain undiagnosed. Auscultation is a readily available screening tool for AS. However, it requires a high level of professional expertise. HYPOTHESIS An AI algorithm can detect AS using audio files with the same accuracy as experienced cardiologists. METHODS A deep neural network (DNN) was trained by preprocessed audio files of 100 patients with AS and 100 controls. The DNN's performance was evaluated with a test data set of 40 patients. The primary outcome measures were sensitivity, specificity, and F1-score. Results of the DNN were compared with the performance of cardiologists, residents, and medical students. RESULTS Eighteen percent of patients without AS and 22% of patients with AS showed an additional moderate or severe mitral regurgitation. The DNN showed a sensitivity of 0.90 (0.81-0.99), a specificity of 1, and an F1-score of 0.95 (0.89-1.0) for the detection of AS. In comparison, we calculated an F1-score of 0.94 (0.86-1.0) for cardiologists, 0.88 (0.78-0.98) for residents, and 0.88 (0.78-0.98) for students. CONCLUSIONS The present study shows that deep learning-guided auscultation predicts significant AS with similar accuracy as cardiologists. The results of this pilot study suggest that AI-assisted auscultation may help general practitioners without special cardiology training in daily practice.
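The reported measures follow from standard confusion-matrix definitions. A minimal sketch, using illustrative counts for a balanced 40-patient test set (not the study's actual confusion matrix), that happens to reproduce the reported sensitivity, specificity, and F1:

```python
def binary_metrics(tp, fp, fn, tn):
    """Sensitivity (recall), specificity, and F1 from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, f1

# Illustrative: 20 AS patients (18 detected, 2 missed), 20 controls (0 false alarms)
sens, spec, f1 = binary_metrics(tp=18, fp=0, fn=2, tn=20)
print(round(sens, 2), round(spec, 2), round(f1, 2))  # 0.9 1.0 0.95
```

With a specificity of 1 there are no false positives, so precision is also 1 and F1 reduces to 2·sens/(1+sens).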
Affiliation(s)
Ingo Voigt, Marc Boeckmann, Oliver Bruder, Alexander Wolf, Thomas Schmitz, Heinrich Wieneke
- Department of Cardiology and Angiology, Contilia Heart and Vascular Center, Elisabeth-Krankenhaus Essen, Essen, Germany
12
Kim Y, Hyon Y, Lee S, Woo SD, Ha T, Chung C. The coming era of a new auscultation system for analyzing respiratory sounds. BMC Pulm Med 2022; 22:119. [PMID: 35361176] [PMCID: PMC8969404] [DOI: 10.1186/s12890-022-01896-1] [Received: 11/17/2021] [Accepted: 03/20/2022]
Abstract
Auscultation with a stethoscope has been an essential tool for diagnosing patients with respiratory disease. Although auscultation is non-invasive, rapid, and inexpensive, it has intrinsic limitations such as inter-listener variability and subjectivity, and the examination must be performed face-to-face. Conventional stethoscopes cannot record respiratory sounds, so the sounds cannot be shared. Recent innovative digital stethoscopes have overcome these limitations and enabled clinicians to store and share the sounds for education and discussion. In particular, recordable stethoscopes have made it possible to analyze breathing sounds using artificial intelligence, especially neural networks. Deep learning-based analysis with an automatic feature extractor and a convolutional neural network classifier has been applied for the accurate analysis of respiratory sounds. In addition, current advances in battery technology, low-power embedded processors, and integrated sensors make possible the development of wearable and wireless stethoscopes, which can help to examine patients living in areas with a shortage of doctors or those who require isolation. There are still challenges to overcome, such as the analysis of complex and mixed respiratory sounds and noise filtering, but continuous research and technological development will facilitate the transition to a new era of wearable and smart stethoscopes.
Affiliation(s)
Yoonjoo Kim
- Division of Pulmonology and Critical Care Medicine, Department of Internal Medicine, College of Medicine, Chungnam National University, Daejeon, 34134, Korea
YunKyong Hyon
- Division of Industrial Mathematics, National Institute for Mathematical Sciences, 70, Yuseong-daero 1689 beon-gil, Yuseong-gu, Daejeon, 34047, Republic of Korea
Sunju Lee
- Division of Industrial Mathematics, National Institute for Mathematical Sciences, 70, Yuseong-daero 1689 beon-gil, Yuseong-gu, Daejeon, 34047, Republic of Korea
Seong-Dae Woo
- Division of Pulmonology and Critical Care Medicine, Department of Internal Medicine, College of Medicine, Chungnam National University, Daejeon, 34134, Korea
Taeyoung Ha
- Division of Industrial Mathematics, National Institute for Mathematical Sciences, 70, Yuseong-daero 1689 beon-gil, Yuseong-gu, Daejeon, 34047, Republic of Korea
Chaeuk Chung
- Division of Pulmonology and Critical Care Medicine, Department of Internal Medicine, College of Medicine, Chungnam National University, Daejeon, 34134, Korea
- Infection Control Convergence Research Center, Chungnam National University School of Medicine, Daejeon, 35015, Republic of Korea
13
Percussion-Based Pipeline Ponding Detection Using a Convolutional Neural Network. Appl Sci (Basel) 2022. [DOI: 10.3390/app12042127]
Abstract
Pipeline transportation is the main method for long-distance gas transportation; however, ponding in the pipeline can reduce transportation efficiency and, in some cases, corrode the pipeline. This paper proposes a non-destructive method for detecting pipeline ponding using percussion acoustic signals and a convolutional neural network (CNN). During detection, a constant-energy spring impact hammer applies an impact to the pipeline, and the resulting percussive acoustic signals are collected. A Mel spectrogram is used to extract acoustic features of the percussive signals for different ponding volumes in the pipeline. The Mel spectrogram is fed to the input layer of the CNN, whose convolutional kernels learn to recognize the pipeline ponding volume. The recognition results show that the CNN can identify the amount of pipeline ponding from the percussive acoustic signals, using the Mel spectrogram as the acoustic feature. Compared with support vector machine (SVM) and decision tree models, the CNN model has better recognition performance. Therefore, the percussion-based pipeline ponding detection method using a convolutional neural network proposed in this paper has high application potential.
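A Mel spectrogram of the kind used as the CNN input can be computed from first principles. Below is a minimal NumPy sketch; the sampling rate, FFT size, hop length, and filter count are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters mapping an FFT magnitude spectrum to n_mels bands."""
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        for k in range(l, c):                 # rising slope of triangle i
            fb[i, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):                 # falling slope of triangle i
            fb[i, k] = (r - k) / max(r - c, 1)
    return fb

def log_mel_spectrogram(signal, sr=16000, n_fft=512, hop=256, n_mels=40):
    """Frame the signal, take FFT magnitudes, apply mel filters, log-compress."""
    window = np.hanning(n_fft)
    frames = [signal[s:s + n_fft] * window
              for s in range(0, len(signal) - n_fft + 1, hop)]
    mag = np.abs(np.fft.rfft(np.array(frames), axis=1))   # (n_frames, n_fft//2+1)
    mel = mag @ mel_filterbank(sr, n_fft, n_mels).T       # (n_frames, n_mels)
    return np.log(mel + 1e-10)

# Example: a 0.5 s synthetic "impact" signal (decaying 440 Hz tone)
t = np.arange(0, 0.5, 1 / 16000)
x = np.exp(-5 * t) * np.sin(2 * np.pi * 440 * t)
S = log_mel_spectrogram(x)
print(S.shape)  # (n_frames, n_mels)
```

Each row of `S` is then one time step of the 2-D feature map that a CNN classifier consumes; libraries such as librosa provide equivalent, better-optimized implementations.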
14
Xue C, Karjadi C, Paschalidis IC, Au R, Kolachalama VB. Detection of dementia on voice recordings using deep learning: a Framingham Heart Study. Alzheimers Res Ther 2021; 13:146. [PMID: 34465384] [PMCID: PMC8409004] [DOI: 10.1186/s13195-021-00888-3] [Received: 03/02/2021] [Accepted: 08/12/2021]
Abstract
BACKGROUND Identification of reliable, affordable, and easy-to-use strategies for the detection of dementia is sorely needed. Digital technologies, such as individual voice recordings, offer an attractive modality to assess cognition, but methods that can automatically analyze such data are not readily available. METHODS AND FINDINGS We used 1264 voice recordings of neuropsychological examinations administered to participants from the Framingham Heart Study (FHS), a community-based longitudinal observational study. The recordings were 73 min in duration, on average, and contained at least two speakers (participant and examiner). Of the total voice recordings, 483 were of participants with normal cognition (NC), 451 of participants with mild cognitive impairment (MCI), and 330 of participants with dementia (DE). We developed two deep learning models (a two-level long short-term memory (LSTM) network and a convolutional neural network (CNN)) that used the audio recordings to classify whether a recording included a participant with NC or with DE, and to differentiate recordings of participants with DE from those without DE (NDE, i.e., NC+MCI). Based on 5-fold cross-validation, the LSTM model achieved a mean (±std) area under the receiver operating characteristic curve (AUC) of 0.740 ± 0.017, mean balanced accuracy of 0.647 ± 0.027, and mean weighted F1 score of 0.596 ± 0.047 in classifying cases with DE from those with NC. The CNN model achieved a mean AUC of 0.805 ± 0.027, mean balanced accuracy of 0.743 ± 0.015, and mean weighted F1 score of 0.742 ± 0.033 on the same task. For classifying participants with DE versus NDE, the LSTM model achieved a mean AUC of 0.734 ± 0.014, mean balanced accuracy of 0.675 ± 0.013, and mean weighted F1 score of 0.671 ± 0.015; the CNN model achieved a mean AUC of 0.746 ± 0.021, mean balanced accuracy of 0.652 ± 0.020, and mean weighted F1 score of 0.635 ± 0.031. CONCLUSION This proof-of-concept study demonstrates that automated deep learning-driven processing of audio recordings of neuropsychological testing performed on individuals recruited within a community cohort setting can facilitate dementia screening.
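The AUC reported above can be computed without any ML library via the Mann-Whitney U statistic: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below uses made-up synthetic fold scores (the seed, score model, and fold sizes are assumptions, not FHS data) purely to illustrate "mean ± std over 5 folds" reporting.

```python
import numpy as np

def auc_score(y_true, scores):
    """AUC via the Mann-Whitney U statistic (rank-based, handles ties)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    order = scores.argsort()
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    for v in np.unique(scores):          # average ranks of tied scores
        mask = scores == v
        ranks[mask] = ranks[mask].mean()
    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    u = ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)

# Cross-validated reporting in the style of "mean ± std over 5 folds",
# with synthetic labels and scores rather than the study's data
rng = np.random.default_rng(0)
fold_aucs = []
for _ in range(5):
    y = rng.integers(0, 2, 200)
    s = y * 0.5 + rng.normal(0, 0.6, 200)  # noisy scores correlated with labels
    fold_aucs.append(auc_score(y, s))
print(f"AUC = {np.mean(fold_aucs):.3f} ± {np.std(fold_aucs):.3f}")
```

In practice one would use scikit-learn's `roc_auc_score`, which implements the same quantity; the rank formulation above just makes the definition explicit.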
Affiliation(s)
Chonghua Xue
- Section of Computational Biomedicine, Department of Medicine, Boston University School of Medicine, 72 E. Concord Street, Evans 636, Boston, MA, 02118, USA
Cody Karjadi
- The Framingham Heart Study, Boston University, Boston, MA, 02118, USA
- Departments of Anatomy & Neurobiology and Neurology, Boston University School of Medicine, Boston, MA, 02118, USA
Ioannis Ch Paschalidis
- Departments of Electrical & Computer Engineering, Systems Engineering, and Biomedical Engineering; Faculty of Computing & Data Sciences, Boston University, Boston, MA, 02118, USA
Rhoda Au
- The Framingham Heart Study, Boston University, Boston, MA, 02118, USA
- Departments of Anatomy & Neurobiology and Neurology, Boston University School of Medicine, Boston, MA, 02118, USA
- Boston University Alzheimer's Disease Center, Boston, MA, 02118, USA
- Department of Epidemiology, Boston University School of Public Health, Boston, MA, 02118, USA
Vijaya B Kolachalama
- Section of Computational Biomedicine, Department of Medicine, Boston University School of Medicine, 72 E. Concord Street, Evans 636, Boston, MA, 02118, USA
- Boston University Alzheimer's Disease Center, Boston, MA, 02118, USA
- Department of Computer Science and Faculty of Computing & Data Sciences, Boston University, Boston, MA, 02115, USA