1
|
Roy A, Gyanchandani B, Oza A, Singh A. TriSpectraKAN: a novel approach for COPD detection via lung sound analysis. Sci Rep 2025; 15:6296. [PMID: 39984500 PMCID: PMC11845766 DOI: 10.1038/s41598-024-82781-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2024] [Accepted: 12/09/2024] [Indexed: 02/23/2025] Open
Abstract
This study aims to create an automated, accessible, and cost-effective diagnostic tool for chronic obstructive pulmonary disease (COPD). Traditional diagnostic methods are expensive, time-consuming, and require specialized equipment. The proposed TriSpectraKAN model leverages audio-based lung sound features to improve early diagnosis. TriSpectraKAN is a hybrid model combining spectral features and the Kolmogorov-Arnold Network (KAN) to analyze lung sounds using Mel-frequency cepstral coefficients (MFCCs), chromagram, and Mel spectrograms. Each sub-model focuses on a different audio feature, capturing unique sonic signatures. These features are merged through a hybrid network for comprehensive analysis. The model, trained on a COPD dataset, was deployed on a Raspberry Pi for real-time use. TriSpectraKAN achieved 93% accuracy, an F1 score of 0.98, precision of 0.97, and recall of 0.98. This multimodal approach captured a broad range of lung sound features, improving diagnosis accuracy compared to traditional methods. The integration of multiple audio features in TriSpectraKAN enhances COPD diagnosis, demonstrating the potential of AI and machine learning to transform respiratory disease diagnosis through accessible tools.
|
2
|
Kumar S, Bhagat V, Sahu P, Chaube MK, Behera AK, Guizani M, Gravina R, Di Dio M, Fortino G, Curry E, Alsamhi SH. A novel multimodal framework for early diagnosis and classification of COPD based on CT scan images and multivariate pulmonary respiratory diseases. Computer Methods and Programs in Biomedicine 2024; 243:107911. [PMID: 37981453 DOI: 10.1016/j.cmpb.2023.107911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Revised: 10/23/2023] [Accepted: 11/01/2023] [Indexed: 11/21/2023]
Abstract
BACKGROUND AND OBJECTIVE: Chronic Obstructive Pulmonary Disease (COPD) is one of the world's worst diseases. Its early diagnosis using existing methods, such as statistical machine learning techniques, medical diagnostic tools, and conventional medical procedures, is challenging because these methods misclassify COPD and take a long time to produce accurate predictions. Due to the severe consequences of COPD, detection and accurate diagnosis at an early stage is essential. This paper aims to design and develop a multimodal framework for early diagnosis and accurate prediction of COPD based on prepared Computerized Tomography (CT) scan images and lung sound/cough (audio) samples using machine learning techniques. METHOD: The proposed multimodal framework extracts texture, histogram intensity, chroma, Mel-Frequency Cepstral Coefficients (MFCCs), and Gaussian scale space from the prepared CT images and lung sound/cough samples. Accurate data from the All India Institute of Medical Sciences (AIIMS), Raipur, India, and the open respiratory CT images and lung sound/cough (audio) sample dataset validate the proposed framework. The discriminatory features are selected from the extracted feature sets using unsupervised ML techniques, and customized ensemble learning techniques are applied to perform early classification and assess the severity levels of COPD patients. RESULTS: The proposed framework provided 97.50%, 98%, and 95.30% accuracy for early diagnosis of COPD patients based on the fusion technique, the CT diagnostic model, and the cough sample model, respectively. CONCLUSION: Finally, we compare the performance of the proposed framework with existing methods, current approaches, and conventional benchmark techniques for early diagnosis.
Affiliation(s)
- Santosh Kumar
  - Department of Computer Science and Engineering, IIIT-Naya Raipur, Chhattisgarh, India.
- Vijesh Bhagat
  - Department of Computer Science and Engineering, IIIT-Naya Raipur, Chhattisgarh, India.
- Prakash Sahu
  - Department of Computer Science and Engineering, IIIT-Naya Raipur, Chhattisgarh, India.
- Ajoy Kumar Behera
  - Department of Pulmonary Medicine & TB, All India Institute of Medical Sciences (AIIMS), Raipur, Chhattisgarh, India.
- Mohsen Guizani
  - Machine Learning Department, Mohamed Bin Zayed University of Artificial Intelligence (MBZUAI), Abu Dhabi, United Arab Emirates.
- Raffaele Gravina
  - Department of Informatics, Modeling, Electronic, and System Engineering, University of Calabria, 87036 Rende, Italy.
- Michele Di Dio
  - Department of Informatics, Modeling, Electronic, and System Engineering, University of Calabria, 87036 Rende, Italy; Annunziata Hospital Cosenza, Italy.
- Giancarlo Fortino
  - Department of Informatics, Modeling, Electronic, and System Engineering, University of Calabria, 87036 Rende, Italy.
- Edward Curry
  - Insight Centre for Data Analytics, University of Galway, Galway, Ireland.
- Saeed Hamood Alsamhi
  - Insight Centre for Data Analytics, University of Galway, Galway, Ireland; Faculty of Engineering, IBB University, Ibb, Yemen.
|
3
|
Okamoto N, Ikenouchi A, Chibaatar E, Watanabe K, Igata R, Seki I, Yoshimura R. Risk Factors in Japanese Drug Overdose Patients: Identifying Their Associations With Suicide Risk. Omega - Journal of Death and Dying 2023:302228231166970. [PMID: 36972707 DOI: 10.1177/00302228231166970] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
Abstract
Several suicide attempts presented at the emergency department are due to drug overdose associated with psychiatric disorders. We examined and identified the major risk factors among Japanese drug overdose patients and several close associations of suicide risk. We enrolled 101 patients who attempted suicide by drug overdose between January 2015 and April 2018, assessed their background using the SAD PERSONS scale, and performed association rule analysis to characterize the major risk factors and their associations. We identified three main nodes (depressive state, lack of social support, and no spouse) as considerable risk factors. Furthermore, we identified several close associations of suicide risk and their intensity; in cases with previous suicide attempts and ethanol abuse or substance use, a simultaneous lack of social support is likely. These findings align with previous studies that used conventional statistical analysis of suicide and suicide attempt risk and highlight its importance.
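The abstract above relies on association rule analysis. As a minimal sketch of the standard rule metrics (support, confidence, lift), the Python below scores one rule of the kind the authors describe. The records, item names, and function names are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
from typing import Dict, FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of records that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for record in transactions if items <= record)
    return hits / len(transactions)


def rule_metrics(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Dict[str, float]:
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    s_a = support(transactions, a)
    s_c = support(transactions, c)
    s_ac = support(transactions, a | c)
    confidence = s_ac / s_a if s_a else 0.0
    lift = confidence / s_c if s_c else 0.0
    return {"support": s_ac, "confidence": confidence, "lift": lift}


# Hypothetical toy records (NOT the study's data): one set of risk items per patient.
patients = [
    frozenset({"previous_attempt", "ethanol_abuse", "no_social_support"}),
    frozenset({"previous_attempt", "ethanol_abuse", "no_social_support", "depression"}),
    frozenset({"previous_attempt", "ethanol_abuse"}),
    frozenset({"depression", "no_social_support"}),
    frozenset({"depression"}),
    frozenset({"no_spouse"}),
    frozenset({"no_spouse", "depression"}),
    frozenset({"no_social_support", "no_spouse"}),
]

metrics = rule_metrics(
    patients, {"previous_attempt", "ethanol_abuse"}, {"no_social_support"}
)
print(metrics)
```

A lift above 1 means the consequent appears more often among records matching the antecedent than in the cohort overall; a lift of exactly 1 indicates no association.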
Affiliation(s)
- Naomichi Okamoto
  - Department of Psychiatry, University of Occupational and Environmental Health, Fukuoka, Japan
- Atsuko Ikenouchi
  - Department of Psychiatry, University of Occupational and Environmental Health, Fukuoka, Japan
  - Medical Center for Dementia, University Hospital, University of Occupational and Environmental Health, Fukuoka, Japan
- Enkhmurun Chibaatar
  - Department of Psychiatry, University of Occupational and Environmental Health, Fukuoka, Japan
- Keita Watanabe
  - Open Innovation Institute, Kyoto University, Kyoto, Japan
- Ryohei Igata
  - Department of Psychiatry, University of Occupational and Environmental Health, Fukuoka, Japan
- Issei Seki
  - Department of Psychiatry, University of Occupational and Environmental Health, Fukuoka, Japan
- Reiji Yoshimura
  - Department of Psychiatry, University of Occupational and Environmental Health, Fukuoka, Japan
|
4
|
The index lift in data mining has a close relationship with the association measure relative risk in epidemiological studies. BMC Med Inform Decis Mak 2019; 19:112. [PMID: 31208407 PMCID: PMC6580490 DOI: 10.1186/s12911-019-0838-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 09/17/2018] [Accepted: 06/11/2019] [Indexed: 12/02/2022] Open
Abstract
Background: Data mining tools have been increasingly used in health research, with the promise of accelerating discoveries. Lift is a standard association metric in the data mining community. However, health researchers struggle with the interpretation of lift. As a result, dissemination of data mining results can be met with hesitation. The relative risk and odds ratio are standard association measures in the health domain, due to their straightforward interpretation and comparability across populations. We aimed to investigate the lift-relative risk and the lift-odds ratio relationships, and provide tools to convert lift to the relative risk and odds ratio. Methods: We derived equations linking lift-relative risk and lift-odds ratio. We discussed how lift, relative risk, and odds ratio behave numerically with varying association strengths and exposure prevalence levels. The lift-relative risk relationship was further illustrated using a high-dimensional dataset which examines the association of exposure to airborne pollutants and adverse birth outcomes. We conducted spatial association rule mining using the Kingfisher algorithm, which identified association rules using its built-in lift metric. We directly estimated relative risks and odds ratios from 2 by 2 tables for each identified rule. These values were compared to the corresponding lift values, and relative risks and odds ratios were computed using the derived equations. Results: As the exposure-outcome association strengthens, the odds ratio and relative risk move away from 1 faster numerically than lift, i.e. |log(odds ratio)| ≥ |log(relative risk)| ≥ |log(lift)|. In addition, lift is bounded by the smaller of the inverse probability of outcome or exposure, i.e. lift ≤ min(1/P(O), 1/P(E)). Unlike the relative risk and odds ratio, lift depends on the exposure prevalence for fixed outcomes. For example, when an exposure A and a less prevalent exposure B have the same relative risk for an outcome, exposure A has a lower lift than B. Conclusions: Lift, relative risk, and odds ratio are positively correlated and share the same null value. However, lift depends on the exposure prevalence, and thus is not straightforward to interpret or to use to compare association strength. Tools are provided to obtain the relative risk and odds ratio from lift. Electronic supplementary material: The online version of this article (10.1186/s12911-019-0838-4) contains supplementary material, which is available to authorized users.
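The relationships stated in this abstract can be checked numerically. The sketch below computes lift, relative risk, and odds ratio from a 2 by 2 table and converts between lift and relative risk. The conversion formulas follow from the law of total probability, P(O) = P(E)·P(O|E) + (1 − P(E))·P(O|not E), and are not necessarily written in the paper's notation; the counts and function names are hypothetical.

```python
import math
from typing import Dict


def association_measures(a: int, b: int, c: int, d: int) -> Dict[str, float]:
    """2x2 table: a = exposed with outcome, b = exposed without outcome,
    c = unexposed with outcome, d = unexposed without outcome."""
    n = a + b + c + d
    p_exposure = (a + b) / n
    p_outcome = (a + c) / n
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return {
        "p_exposure": p_exposure,
        "p_outcome": p_outcome,
        "lift": risk_exposed / p_outcome,              # P(O|E) / P(O)
        "relative_risk": risk_exposed / risk_unexposed,  # P(O|E) / P(O|not E)
        "odds_ratio": (a * d) / (b * c),
    }


def lift_from_rr(rr: float, p_exposure: float) -> float:
    """Lift implied by a relative risk at a given exposure prevalence."""
    return rr / (1 + p_exposure * (rr - 1))


def rr_from_lift(lift: float, p_exposure: float) -> float:
    """Inverse of lift_from_rr; requires lift * p_exposure < 1."""
    return lift * (1 - p_exposure) / (1 - lift * p_exposure)


# Hypothetical counts, for illustration only.
m = association_measures(a=30, b=70, c=10, d=190)
print(m)

# Bound from the abstract: lift <= min(1/P(O), 1/P(E)).
print(m["lift"] <= min(1 / m["p_outcome"], 1 / m["p_exposure"]))

# Same relative risk, different exposure prevalence: the rarer exposure has the higher lift.
print(lift_from_rr(6.0, 1 / 3), lift_from_rr(6.0, 0.05))
```

In this toy table the relative risk is 6 while the lift is only 2.25, which illustrates why the abstract cautions against comparing association strength across exposures using lift alone.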
|
5
|
Abstract
Medical data is among the most rewarding and yet most complicated data to analyze. How can healthcare providers use modern data analytics tools and technologies to analyze and create value from complex data? Data analytics promises to efficiently discover valuable patterns by analyzing large amounts of unstructured, heterogeneous, non-standard, and incomplete healthcare data. It not only forecasts but also helps in decision making, and it is increasingly seen as a breakthrough in the ongoing effort to improve the quality of patient care and reduce healthcare costs. The aim of this study is to provide a comprehensive and structured overview of extensive research on the advancement of data analytics methods for disease prevention. This review first introduces disease prevention and its challenges, followed by traditional prevention methodologies. We summarize state-of-the-art data analytics algorithms used for classification of disease, clustering (unusually high incidence of a particular disease), anomaly detection (detection of disease), and association, as well as their respective advantages, drawbacks, and guidelines for selecting a specific model, followed by a discussion of recent developments and successful applications of disease prevention methods. The article concludes with open research challenges and recommendations.
|
6
|
Chin CY, Hsieh SY, Tseng VS. eDRAM: Effective early disease risk assessment with matrix factorization on a large-scale medical database: A case study on rheumatoid arthritis. PLoS One 2018; 13:e0207579. [PMID: 30475847 PMCID: PMC6261027 DOI: 10.1371/journal.pone.0207579] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 01/08/2018] [Accepted: 11/02/2018] [Indexed: 11/18/2022] Open
Abstract
Recently, a number of analytical approaches for probing medical databases have been developed to assist in disease risk assessment and to determine the association of a clinical condition with others, so that better and intelligent healthcare can be provided. The early assessment of disease risk is an emerging topic in medical informatics. If diseases are detected at an early stage, prognosis can be improved and medical resources can be used more efficiently. For example, if rheumatoid arthritis (RA) is detected at an early stage, appropriate medications can be used to prevent bone deterioration. In early disease risk assessment, finding important risk factors from large-scale medical databases and performing individual disease risk assessment have been challenging tasks. A number of recent studies have considered risk factor analysis approaches, such as association rule mining, sequential rule mining, regression, and expert advice. In this study, to improve disease risk assessment, machine learning and matrix factorization techniques were integrated to discover important and implicit risk factors. A novel framework is proposed that can effectively assess early disease risks, and RA is used as a case study. This framework comprises three main stages: data preprocessing, risk factor optimization, and early disease risk assessment. This is the first study integrating matrix factorization and machine learning for disease risk assessment that is applied to a nation-wide and longitudinal medical diagnostic database. In the experimental evaluations, a cohort established from a large-scale medical database was used that included 1007 RA-diagnosed patients and 921,192 control patients examined over a nine-year follow-up period (2000-2008). The evaluation results demonstrate that the proposed approach is more efficient and stable for disease risk assessment than state-of-the-art methods.
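The abstract does not say which matrix factorization variant the framework uses. As a generic illustration only, the sketch below factorizes a small patient-by-diagnosis matrix into low-rank factors with stochastic gradient descent. The matrix, rank, learning rate, and regularization values are all assumptions made for this example and do not come from the paper.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]


def predict(P: Matrix, Q: Matrix, i: int, j: int) -> float:
    """Reconstructed entry (i, j) as the dot product of two latent vectors."""
    return sum(p * q for p, q in zip(P[i], Q[j]))


def mse(matrix: Matrix, P: Matrix, Q: Matrix) -> float:
    """Mean squared reconstruction error over all entries."""
    total = 0.0
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            total += (value - predict(P, Q, i, j)) ** 2
    return total / (len(matrix) * len(matrix[0]))


def init_factors(n_rows: int, n_cols: int, rank: int, seed: int) -> Tuple[Matrix, Matrix]:
    """Small random positive starting factors (seeded for reproducibility)."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_cols)]
    return P, Q


def factorize(matrix: Matrix, rank: int = 2, epochs: int = 1000,
              lr: float = 0.05, reg: float = 0.01, seed: int = 0) -> Tuple[Matrix, Matrix]:
    """Approximate matrix ~ P @ Q^T with L2-regularized SGD."""
    P, Q = init_factors(len(matrix), len(matrix[0]), rank, seed)
    for _ in range(epochs):
        for i, row in enumerate(matrix):
            for j, value in enumerate(row):
                err = value - predict(P, Q, i, j)
                for k in range(rank):
                    p_ik, q_jk = P[i][k], Q[j][k]
                    P[i][k] += lr * (err * q_jk - reg * p_ik)
                    Q[j][k] += lr * (err * p_ik - reg * q_jk)
    return P, Q


# Hypothetical patient x diagnosis indicator matrix with two latent groups.
records = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]

P0, Q0 = init_factors(4, 4, rank=2, seed=0)
error_before = mse(records, P0, Q0)
P, Q = factorize(records, rank=2, seed=0)
error_after = mse(records, P, Q)
print(error_before, error_after)
```

The learned row factors in `P` act as compact patient representations; in a risk-assessment pipeline such latent features could feed a downstream classifier, though how eDRAM does this is not described in the abstract.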
Affiliation(s)
- Chu-Yu Chin
  - Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan
- Sun-Yuan Hsieh
  - Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan
- Vincent S. Tseng
  - Computer Science and Information Engineering, National Chiao Tung University, Hsinchu, Taiwan
|
7
|
Atsumi T, Ando Y, Matsuda S, Tomizawa S, Tanaka R, Takagi N, Nakasone A. Prodromal signs and symptoms of serious infections with tocilizumab treatment for rheumatoid arthritis: Text mining of the Japanese postmarketing adverse event-reporting database. Mod Rheumatol 2017; 28:435-443. [DOI: 10.1080/14397595.2017.1366007] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 10/18/2022]
Affiliation(s)
- Tatsuya Atsumi
  - Division of Rheumatology, Endocrinology and Nephrology, Hokkaido University Graduate School of Medicine, Sapporo, Japan
- Riwa Tanaka
  - Chugai Pharmaceutical Co., Ltd., Tokyo, Japan
|
8
|
Cheng YT, Lin YF, Chiang KH, Tseng VS. Mining Sequential Risk Patterns From Large-Scale Clinical Databases for Early Assessment of Chronic Diseases: A Case Study on Chronic Obstructive Pulmonary Disease. IEEE J Biomed Health Inform 2017; 21:303-311. [PMID: 28129195 DOI: 10.1109/jbhi.2017.2657802] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 11/08/2022]
Abstract
Chronic diseases have been among the major concerns in medical fields since they may cause a heavy burden on healthcare resources and disturb the quality of life. In this paper, we propose a novel framework for early assessment of chronic diseases by mining sequential risk patterns with time interval information from diagnostic clinical records using sequential rule mining and classification modeling techniques. With a complete workflow, the proposed framework consists of four phases, namely data preprocessing, risk pattern mining, classification modeling, and post analysis. For empirical evaluation, we demonstrate the effectiveness of our proposed framework with a case study on early assessment of COPD. Through experimental evaluation on a large-scale nationwide clinical database in Taiwan, our approach can not only derive rich sequential risk patterns but also extract novel patterns with valuable insights for further medical investigation such as discovering novel markers and better treatments. To the best of our knowledge, this is the first work addressing the issue of mining sequential risk patterns with time-intervals as well as classification models for early assessment of chronic diseases.
|
9
|
Chen R, Sun J, Dittus RS, Fabbri D, Kirby J, Laffer CL, McNaughton CD, Malin B. Patient Stratification Using Electronic Health Records from a Chronic Disease Management Program. IEEE J Biomed Health Inform 2016:10.1109/JBHI.2016.2514264. [PMID: 26742152 PMCID: PMC4931988 DOI: 10.1109/jbhi.2016.2514264] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 01/13/2023]
Abstract
OBJECTIVE: The goal of this study is to devise a machine learning framework to assist care coordination programs in prognostic stratification to design and deliver personalized care plans and to allocate financial and medical resources effectively. MATERIALS AND METHODS: This study is based on a de-identified cohort of 2,521 hypertension patients from a chronic care coordination program at the Vanderbilt University Medical Center. Patients were modeled as vectors of features derived from electronic health records (EHRs) over a six-year period. We applied a stepwise regression to identify risk factors associated with a decrease in mean arterial pressure of at least 2 mmHg after program enrollment. The resulting features were subsequently validated via a logistic regression classifier. Finally, risk factors were applied to group the patients through model-based clustering. RESULTS: We identified a set of predictive features that consisted of a mix of demographic, medication, and diagnostic concepts. Logistic regression over these features yielded an area under the ROC curve (AUC) of 0.71 (95% CI: [0.67, 0.76]). Based on these features, four clinically meaningful groups are identified through clustering, two of which represented patients with more severe disease profiles, while the remaining represented patients with mild disease profiles. DISCUSSION: Patients with hypertension can exhibit significant variation in their blood pressure control status and responsiveness to therapy. Yet this work shows that a clustering analysis can generate more homogeneous patient groups, which may aid clinicians in designing and implementing customized care programs. CONCLUSION: The study shows that predictive modeling and clustering using EHR data can be beneficial for providing a systematic, generalized approach for care providers to tailor their management approach based upon patient-level factors.
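For readers unfamiliar with the AUC reported above, the sketch below computes it as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels and scores are hypothetical and are not taken from the study; the function name is chosen for this example.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier outputs: 1 = blood pressure improved, 0 = did not.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2]

auc = roc_auc(labels, scores)
print(auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the study's 0.71 indicates moderate discrimination. The pairwise loop is quadratic in cohort size; a rank-based formulation would scale better for large cohorts.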
Affiliation(s)
- Robert Chen
  - School of Computational Science and Engineering at the Georgia Institute of Technology, Atlanta, GA 30332 USA
- Jimeng Sun
  - School of Computational Science and Engineering at the Georgia Institute of Technology, Atlanta, GA 30332 USA
- Robert S. Dittus
  - Institute for Medicine and Public Health, Vanderbilt University, Nashville, TN, the Geriatric Research, Education, and Clinical Center, VA Tennessee Valley Healthcare System, Nashville, TN, and the Department of Medicine, School of Medicine, Vanderbilt University, Nashville, TN
- Daniel Fabbri
  - Department of Biomedical Informatics, School of Medicine, Vanderbilt University, Nashville, TN, and the Department of Electrical Engineering and Computer Science, School of Engineering, Vanderbilt University, Nashville, TN
- Jacqueline Kirby
  - Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University, Nashville, TN
- Cheryl L. Laffer
  - Department of Medicine, School of Medicine, Vanderbilt University, Nashville, TN
- Candace D. McNaughton
  - Department of Emergency Medicine, School of Medicine, Vanderbilt University, Nashville, TN
- Bradley Malin
  - Department of Biomedical Informatics, School of Medicine, Vanderbilt University, Nashville, TN, and the Department of Electrical Engineering and Computer Science, School of Engineering, Vanderbilt University, Nashville, TN
|