1
Wimalarathna H, Youngblood PL, Parker C, Marx CG, Ankmnal-Veeranna S. Leveraging cluster analysis to compare click and chirp-evoked auditory brainstem responses. Comput Methods Programs Biomed 2025;265:108732. [PMID: 40184854] [DOI: 10.1016/j.cmpb.2025.108732]
Abstract
BACKGROUND AND OBJECTIVE The Auditory Brainstem Response (ABR) can be recorded by presenting short-duration click and chirp stimuli. The ABR test is commonly used for threshold estimation and to examine auditory brainstem integrity, with neural integrity evaluated at suprathreshold levels. This study compared click- and CE-Chirp®-evoked ABRs recorded at suprathreshold levels in normal-hearing infants and adults, using cluster analysis to identify patterns and distinctions between responses to the two stimuli. METHODS Click-evoked and CE-Chirp®-evoked ABRs were recorded from infants and adults with normal hearing at suprathreshold levels. Cluster analysis techniques were used to examine and categorize response patterns for each stimulus type across the time, frequency, and time-frequency domains. RESULTS Click-evoked ABRs showed noticeable homogeneity in both groups in the time domain, suggesting a consistent response to click stimuli. In contrast, CE-Chirp®-evoked ABRs exhibited variability in both groups, which may be attributable to the complex nature of the CE-Chirp® stimulus and its interaction with the auditory system. CONCLUSION These findings have significant implications for audiologists. It is crucial to account for the inherent variability of chirp-evoked ABRs when interpreting them, as they may reflect nuanced aspects of auditory system function that are less prominent in the more uniform click-evoked ABRs. These insights enhance our understanding of auditory brainstem processing and have the potential to refine clinical protocols for ABR testing.
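The abstract names cluster analysis over time-, frequency-, and time-frequency-domain representations without specifying an algorithm. Below is a minimal sketch of one plausible reading, k-means clustering of time-domain ABR epochs; the synthetic waveforms, preprocessing, and cluster count are illustrative assumptions, not the authors' pipeline.

```python
# Sketch: cluster ABR-like waveforms in the time domain with k-means.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Simulate 40 ABR-like epochs: a damped oscillation plus recording noise.
t = np.linspace(0, 10e-3, 256)                      # 10 ms epoch, 256 samples
clean = np.sin(2 * np.pi * 500 * t) * np.exp(-t / 3e-3)
epochs = clean + 0.2 * rng.standard_normal((40, t.size))

# Standardize each time point across epochs, then cluster whole waveforms.
X = StandardScaler().fit_transform(epochs)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print("cluster sizes:", np.bincount(labels))        # homogeneity vs. variability
```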
Affiliation(s)
- Hasitha Wimalarathna
- Department of Electrical and Computer Engineering, Western University, London, Ontario, Canada; National Centre for Audiology, Western University, London, Ontario, Canada
- Caroline Parker
- School of Speech and Hearing Sciences, The University of Southern Mississippi, Hattiesburg, MS, USA
- Charles G Marx
- School of Speech and Hearing Sciences, The University of Southern Mississippi, Hattiesburg, MS, USA
2
Erra A, Chen J, Miller CM, Chrysostomou E, Barret S, Kassim YM, Friedman RA, Lauer A, Ceriani F, Marcotti W, Carroll C, Manor U. An Open-Source Deep Learning-Based GUI Toolbox for Automated Auditory Brainstem Response Analyses (ABRA). bioRxiv [Preprint] 2025:2024.06.20.599815. [PMID: 38948763] [PMCID: PMC11213013] [DOI: 10.1101/2024.06.20.599815]
Abstract
Hearing loss is a pervasive global health challenge with profound impacts on communication, cognitive function, and quality of life. Recent studies have established age-related hearing loss as a significant risk factor for dementia, highlighting the importance of hearing loss research. Auditory brainstem responses (ABRs), electrophysiological recordings of synchronized neural activity from the auditory nerve and brainstem, serve as in vivo readouts of sensory hair cell function, synaptic integrity, hearing sensitivity, and other key features of auditory pathway function, making them highly valuable for both basic neuroscience research and clinical diagnostics. Despite their utility, traditional ABR analyses rely heavily on subjective manual interpretation, leading to considerable variability and limiting reproducibility across studies. Here, we introduce the Auditory Brainstem Response Analyzer (ABRA), a novel open-source graphical user interface powered by deep learning, which automates and standardizes ABR waveform analysis. ABRA employs convolutional neural networks trained on diverse datasets collected from multiple experimental settings, achieving rapid and unbiased extraction of key ABR metrics, including peak amplitude, latency, and auditory threshold estimates. We demonstrate that ABRA's deep learning models perform comparably to expert human annotators while dramatically reducing analysis time and enhancing reproducibility across datasets from different laboratories. By bridging hearing research, sensory neuroscience, and advanced computational techniques, ABRA facilitates broader interdisciplinary insights into auditory function. An online version of the tool is available at no cost at https://abra.ucsd.edu.
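The abstract describes convolutional neural networks that extract peak amplitude, latency, and thresholds from ABR waveforms, but not the architecture itself. The sketch below is an assumed minimal 1D-CNN regressor in PyTorch illustrating the general idea; ABRA's actual models, input sizes, and output heads may differ.

```python
# Sketch: a 1D CNN mapping an ABR waveform to two metrics (e.g., amplitude, latency).
import torch
import torch.nn as nn

class ABRRegressor(nn.Module):
    def __init__(self, n_samples=256, n_outputs=2):  # sizes are assumptions
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
        )
        # Two 2x poolings shrink the length by 4 before the linear head.
        self.head = nn.Linear(32 * (n_samples // 4), n_outputs)

    def forward(self, x):                  # x: (batch, 1, n_samples)
        z = self.features(x).flatten(1)
        return self.head(z)

model = ABRRegressor()
dummy = torch.randn(8, 1, 256)             # a batch of 8 synthetic waveforms
print(model(dummy).shape)                  # torch.Size([8, 2])
```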
Affiliation(s)
- Abhijeeth Erra
- Data Institute, University of San Francisco, San Francisco, CA
- Jeffrey Chen
- Data Institute, University of San Francisco, San Francisco, CA
- Cayla M. Miller
- Dept. of Cell & Developmental Biology, University of California San Diego, La Jolla, CA
- Elena Chrysostomou
- Dept. of Cell & Developmental Biology, University of California San Diego, La Jolla, CA
- Shannon Barret
- Dept. of Cell & Developmental Biology, University of California San Diego, La Jolla, CA
- Yasmin M. Kassim
- Dept. of Cell & Developmental Biology, University of California San Diego, La Jolla, CA
- Rick A. Friedman
- Dept. of Otolaryngology, University of California San Diego, La Jolla, CA
- Amanda Lauer
- Depts. of Otolaryngology-Head and Neck Surgery and Neuroscience and Center for Functional Anatomy and Evolution, Johns Hopkins University School of Medicine, Baltimore, MD
- Federico Ceriani
- School of Biosciences, University of Sheffield, Sheffield, S10 2TN, UK
- Neuroscience Institute, University of Sheffield, Sheffield, S10 2TN, UK
- Walter Marcotti
- School of Biosciences, University of Sheffield, Sheffield, S10 2TN, UK
- Neuroscience Institute, University of Sheffield, Sheffield, S10 2TN, UK
- Cody Carroll
- Data Institute, University of San Francisco, San Francisco, CA
- Dept. of Mathematics and Statistics, University of San Francisco, San Francisco, CA
- Uri Manor
- Dept. of Cell & Developmental Biology, University of California San Diego, La Jolla, CA
- Dept. of Otolaryngology, University of California San Diego, La Jolla, CA
- Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, CA
3
Chawla M, Panda SN, Khullar V. Deep learning and robotics enabled approach for audio based emotional pragmatics deficits identification in social communication disorders. Proc Inst Mech Eng H 2025;239:332-346. [PMID: 40079556] [DOI: 10.1177/09544119251325331]
Abstract
The aim of this study is to develop Deep Learning (DL)-enabled robotic systems that identify audio-based emotional pragmatics deficits in individuals with social pragmatic communication deficits. The novelty of the work stems from its integration of deep learning with a robotics platform for identifying emotional pragmatics deficits. The proposed methodology applies machine and DL-based classification techniques to a collection of open-source datasets to identify audio emotions. Pre-processing and converting the audio signals of different emotions with Mel-Frequency Cepstral Coefficients (MFCC) improved emotion classification. The data generated using MFCC were used to train machine or DL models, and the trained models were then tested on a randomly selected dataset. DL proved more effective for identifying emotions on the robotic platform. Because the data generated by MFCC are one-dimensional, one-dimensional DL algorithms, such as the 1D Convolutional Neural Network, Long Short-Term Memory, and Bidirectional Long Short-Term Memory, were utilized. The Bidirectional Long Short-Term Memory model achieved higher accuracy (96.24%), precision (92.87%), and recall (92.87%) and lower loss (0.2524) than the other machine and DL algorithms. The proposed model was then deployed on the robotic structure for real-time detection to improve social-emotional pragmatic responses in individuals with deficits. The approach can serve as a potential tool for individuals with pragmatic communication deficits.
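A minimal sketch of the MFCC-to-BiLSTM pattern the abstract describes, assuming librosa for feature extraction and Keras for the network; the synthetic audio, layer sizes, and class count are illustrative, not the paper's configuration.

```python
# Sketch: extract MFCC frames from audio and classify the sequence with a BiLSTM.
import numpy as np
import librosa
from tensorflow.keras import layers, models

sr = 22050
audio = np.random.randn(sr).astype(np.float32)          # 1 s of noise as stand-in speech
mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=40)  # (40, n_frames)
x = mfcc.T[np.newaxis, ...]                             # (1, n_frames, 40) sequence

model = models.Sequential([
    layers.Input(shape=(None, 40)),                     # variable-length MFCC sequence
    layers.Bidirectional(layers.LSTM(64)),
    layers.Dense(8, activation="softmax"),              # e.g., 8 emotion classes (assumed)
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
print(model.predict(x).shape)                           # (1, 8) class probabilities
```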
Affiliation(s)
- Muskan Chawla
- Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, Punjab, India
- Surya Narayan Panda
- Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, Punjab, India
- Vikas Khullar
- Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, Punjab, India
4
Frosolini A, Franz L, Caragli V, Genovese E, de Filippis C, Marioni G. Artificial Intelligence in Audiology: A Scoping Review of Current Applications and Future Directions. Sensors (Basel) 2024;24:7126. [PMID: 39598904] [PMCID: PMC11598364] [DOI: 10.3390/s24227126]
Abstract
The integration of artificial intelligence (AI) into medical disciplines is rapidly transforming healthcare delivery, and audiology is no exception. By synthesizing the existing literature, this review seeks to inform clinicians, researchers, and policymakers about the potential and challenges of integrating AI into audiological practice. The PubMed, Cochrane, and Google Scholar databases were searched for articles published in English from 1990 to 2024 with the following query: "(audiology) AND ("artificial intelligence" OR "machine learning" OR "deep learning")". The PRISMA extension for scoping reviews (PRISMA-ScR) was followed. The database search yielded 1359 results, and the selection process led to the inclusion of 104 manuscripts. The integration of AI in audiology has evolved significantly over the past three decades, with 87.5% of the manuscripts published in the last 4 years. Most types of AI were consistently used for specific purposes: logistic regression and other statistical machine learning tools (e.g., support vector machine, multilayer perceptron, random forest, deep belief network, decision tree, k-nearest neighbor, or LASSO) for automated audiometry and clinical predictions; convolutional neural networks for radiological image analysis; and large language models for the automatic generation of diagnostic reports. Despite the advances in AI technologies, several ethical and professional challenges remain, underscoring the need for larger, more diverse data collection and bioethics studies in the field of audiology.
Affiliation(s)
- Andrea Frosolini
- Maxillofacial Surgery Unit, Department of Medical Biotechnology, S. Maria alle Scotte University Hospital of Siena, 53100 Siena, Italy
- Leonardo Franz
- Phoniatrics and Audiology Unit, Department of Neuroscience DNS, University of Padova, 33100 Treviso, Italy
- Valeria Caragli
- Audiology Program, Otorhinolaryngology Unit, Department of Medical and Surgical Sciences for Children and Adults, University of Modena and Reggio Emilia, 41124 Modena, Italy
- Elisabetta Genovese
- Audiology Program, Otorhinolaryngology Unit, Department of Medical and Surgical Sciences for Children and Adults, University of Modena and Reggio Emilia, 41124 Modena, Italy
- Audiology Program, Department of Maternal, Child and Adult Medical and Surgical Sciences, University of Modena and Reggio Emilia, 41124 Modena, Italy
- Cosimo de Filippis
- Phoniatrics and Audiology Unit, Department of Neuroscience DNS, University of Padova, 33100 Treviso, Italy
- Gino Marioni
- Phoniatrics and Audiology Unit, Department of Neuroscience DNS, University of Padova, 33100 Treviso, Italy
5
Matikolaie FS, Tadj C. Machine Learning-Based Cry Diagnostic System for Identifying Septic Newborns. J Voice 2024;38:963.e1-963.e14. [PMID: 35193790] [DOI: 10.1016/j.jvoice.2021.12.021]
Abstract
BACKGROUND AND OBJECTIVE Processing a newborn's cry audio signal (CAS) provides valuable information about the newborn's condition, which can be used for diagnosis. This article analyzes the CASs of newborns under two months old using machine learning approaches to develop an automatic diagnostic system that distinguishes septic infants from healthy ones. Septic infants have not previously been studied in this context. METHODOLOGY The proposed features include Mel frequency cepstral coefficients and the prosodic features of tilt, rhythm, and intensity. The performance of each feature set was evaluated using a collection of classifiers, including the Support Vector Machine (SVM), decision tree, and discriminant analysis. We also examined majority voting for improving the classification results, together with feature manipulation and a multiple-classifier framework, which had not previously been reported in the literature on automatic diagnostic systems based on the infant's CAS. We tested our methodology on two datasets of expiration and inspiration episodes of newborns' CASs. RESULTS AND CONCLUSION Concatenating all feature sets and classifying with a quadratic SVM yielded the best F-score for the expiration dataset (86%), while the tilt feature set with a quadratic discriminant classifier yielded the best F-score for the inspiration dataset (83.90%). Through these experiments, we found that septic infants cry differently than healthy infants. Thus, our proposed method can be used as a noninvasive tool for identifying septic infants from healthy ones based only on their CAS.
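A hedged sketch of the reported framework: concatenated feature sets, a quadratic SVM (approximated here with a degree-2 polynomial kernel), and majority voting across classifiers. The random features standing in for the MFCC and prosodic sets, the labels, and all parameters are assumptions, not the paper's data.

```python
# Sketch: feature concatenation + majority voting over SVM, tree, and QDA.
import numpy as np
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
mfcc = rng.standard_normal((120, 13))      # stand-in MFCC features
prosody = rng.standard_normal((120, 6))    # stand-in tilt/rhythm/intensity features
X = np.hstack([mfcc, prosody])             # concatenation of all feature sets
y = rng.integers(0, 2, 120)                # 0 = healthy, 1 = septic (synthetic labels)

vote = VotingClassifier([
    ("qsvm", SVC(kernel="poly", degree=2)),          # "quadratic" SVM
    ("tree", DecisionTreeClassifier()),
    ("qda", QuadraticDiscriminantAnalysis()),
], voting="hard")                                     # hard = majority voting
print(cross_val_score(vote, X, y, cv=5, scoring="f1").mean())
```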
Affiliation(s)
- Chakib Tadj
- Department of Electrical Engineering, École De Technologie Supérieure, Montreal, QC, H3C 1K3, Canada
6
Ahmed MAO, Satar YA, Darwish EM, Zanaty EA. Synergistic integration of Multi-View Brain Networks and advanced machine learning techniques for auditory disorders diagnostics. Brain Inform 2024;11:3. [PMID: 38219249] [PMCID: PMC10788326] [DOI: 10.1186/s40708-023-00214-7]
Abstract
In the field of audiology, achieving accurate discrimination of auditory impairments remains a formidable challenge. Conditions such as deafness and tinnitus exert a substantial impact on patients' overall quality of life, emphasizing the urgent need for precise and efficient classification methods. This study introduces an innovative approach utilizing Multi-View Brain Network data acquired from three distinct cohorts: 51 deaf patients, 54 patients with tinnitus, and 42 normal controls. Electroencephalogram (EEG) recordings were meticulously collected from 70 electrodes grouped into 10 regions of interest (ROIs), and the resulting connectivity data were synergistically integrated with machine learning algorithms. To tackle the inherently high-dimensional nature of brain connectivity data, principal component analysis (PCA) is employed for feature reduction, enhancing interpretability. The proposed approach is evaluated using ensemble learning techniques, including Random Forest, Extra Trees, Gradient Boosting, and CatBoost. Model performance is scrutinized across a comprehensive set of metrics encompassing cross-validation accuracy (CVA), precision, recall, F1-score, Kappa, and the Matthews correlation coefficient (MCC). The proposed models demonstrate statistical significance and effectively diagnose auditory disorders, contributing to early detection and personalized treatment and thereby enhancing patient outcomes and quality of life. Notably, they exhibit reliability and robustness, characterized by high Kappa and MCC values. This research represents a significant advancement at the intersection of audiology, neuroimaging, and machine learning, with transformative implications for clinical practice and care.
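A minimal sketch of the pipeline the abstract outlines, PCA for feature reduction followed by an ensemble classifier; the synthetic connectivity dimensions, component count, and cohort labels below are illustrative assumptions, not the study's data.

```python
# Sketch: PCA-reduced connectivity features classified with a Random Forest.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.standard_normal((147, 70 * 70))   # 147 subjects, flattened 70x70 connectivity
y = rng.integers(0, 3, 147)               # 0 = control, 1 = deaf, 2 = tinnitus (synthetic)

clf = make_pipeline(
    PCA(n_components=30),                 # reduce the high-dimensional features
    RandomForestClassifier(n_estimators=300, random_state=0),
)
print(cross_val_score(clf, X, y, cv=5).mean())   # cross-validation accuracy (CVA)
```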
Affiliation(s)
- Muhammad Atta Othman Ahmed
- Department of Computer Science, Faculty of Computers and Information, Luxor University, 85951, Luxor, Egypt
- Yasser Abdel Satar
- Mathematics Department, Faculty of Science, Sohag University, 82511, Sohag, Egypt
- Eed M Darwish
- Physics Department, College of Science, Taibah University, Medina, 41411, Saudi Arabia
- Physics Department, Faculty of Science, Sohag University, 82524, Sohag, Egypt
- Elnomery A Zanaty
- Department of Computer Science, Faculty of Computers and Artificial Intelligence, Sohag University, 82511, Sohag, Egypt
7
Balan JR, Rodrigo H, Saxena U, Mishra SK. Explainable machine learning reveals the relationship between hearing thresholds and speech-in-noise recognition in listeners with normal audiograms. J Acoust Soc Am 2023;154:2278-2288. [PMID: 37823779] [DOI: 10.1121/10.0021303]
Abstract
Some individuals complain of listening-in-noise difficulty despite having a normal audiogram. In this study, machine learning is applied to examine the extent to which hearing thresholds can predict speech-in-noise recognition among normal-hearing individuals. The specific goals were to (1) compare the performance of one standard model (GAM, generalized additive model) and four machine learning models (ANN, artificial neural network; DNN, deep neural network; RF, random forest; XGBoost, eXtreme gradient boosting), and (2) examine the relative contribution of individual audiometric frequencies and demographic variables in predicting speech-in-noise recognition. Archival data included thresholds (0.25-16 kHz) and speech recognition thresholds (SRTs) from listeners with clinically normal audiograms (n = 764 participants or 1528 ears; age, 4-38 years old). Among the machine learning models, XGBoost performed significantly better than the other methods (mean absolute error, MAE = 1.62 dB). ANN and RF yielded similar performances (MAE = 1.68 and 1.67 dB, respectively), whereas, surprisingly, DNN showed relatively poorer performance (MAE = 1.94 dB). The MAE for GAM was 1.61 dB. SHapley Additive exPlanations revealed that age and thresholds at 16 kHz, 12.5 kHz, and so on, in order of importance, contributed to SRT. These results suggest the importance of hearing in the extended high frequencies for predicting speech-in-noise recognition in listeners with normal audiograms.
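A hedged sketch of the analysis pattern described, an XGBoost regressor over audiometric thresholds explained with SHAP; the synthetic thresholds and the simulated dependence of SRT on the extended high frequencies are assumptions, not the study's data.

```python
# Sketch: XGBoost regression of SRT on thresholds, explained with SHAP values.
import numpy as np
import shap
import xgboost as xgb

rng = np.random.default_rng(0)
freqs = [0.25, 0.5, 1, 2, 4, 8, 12.5, 16]          # kHz, matching the study's range
X = rng.normal(5, 5, (300, len(freqs)))            # synthetic thresholds in dB HL
y = 0.3 * X[:, -1] + 0.2 * X[:, -2] + rng.normal(0, 1, 300)  # SRT driven by EHF (assumed)

model = xgb.XGBRegressor(n_estimators=200, max_depth=3).fit(X, y)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)             # (n_samples, n_features)
print(np.abs(shap_values).mean(axis=0))            # mean |SHAP| per frequency: importance
```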
Affiliation(s)
- Jithin Raj Balan
- Department of Speech, Language and Hearing Sciences, The University of Texas at Austin, Austin, Texas 78712, USA
- Hansapani Rodrigo
- School of Mathematical and Statistical Sciences, The University of Texas Rio Grande Valley, Edinburg, Texas 78539, USA
- Udit Saxena
- Department of Audiology and Speech-Language Pathology, Gujarat Medical Education and Research Society, Medical College and Hospital, Ahmedabad, 380060, India
- Srikanta K Mishra
- Department of Speech, Language and Hearing Sciences, The University of Texas at Austin, Austin, Texas 78712, USA
8
Tarnovsky YC, Taiber S, Nissan Y, Boonman A, Assaf Y, Wilkinson GS, Avraham KB, Yovel Y. Bats experience age-related hearing loss (presbycusis). Life Sci Alliance 2023;6:e202201847. [PMID: 36997281] [PMCID: PMC10067528] [DOI: 10.26508/lsa.202201847]
Abstract
Hearing loss is a hallmark of aging, typically affecting the higher frequencies first. In echolocating bats, the ability to discern high frequencies is essential. However, nothing is known about age-related hearing loss in bats, and they are often assumed to be immune to it. We tested the hearing of 47 wild Egyptian fruit bats by recording their auditory brainstem response and cochlear microphonics, and we also assessed the cochlear histology in four of these bats. We used the bats' DNA methylation profile to estimate their age and found that bats exhibit age-related hearing loss, with more prominent deterioration at the higher frequencies. The rate of deterioration was ∼1 dB per year, comparable to the hearing loss observed in humans. Assessing the noise in the fruit bat roost revealed that these bats are exposed to continuous, immense noise, mostly social vocalizations, supporting the assumption that bats might be partially resistant to loud noise. Thus, in contrast to previous assumptions, our results suggest that bats constitute a model animal for the study of age-related hearing loss.
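The ∼1 dB-per-year figure is, in essence, the slope of a threshold-versus-age fit. A minimal sketch of that estimate on synthetic data constructed to match the reported rate; the ages, thresholds, and noise level below are all assumptions.

```python
# Sketch: estimate hearing-loss rate (dB/year) as the slope of a linear fit.
import numpy as np

rng = np.random.default_rng(0)
age_years = rng.uniform(1, 20, 47)                          # methylation-estimated ages (synthetic)
threshold_db = 20 + 1.0 * age_years + rng.normal(0, 3, 47)  # ABR thresholds at a high frequency

slope, intercept = np.polyfit(age_years, threshold_db, 1)
print(f"estimated loss: {slope:.2f} dB per year")           # ~1 dB/yr by construction
```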
Affiliation(s)
- Yifat Chaya Tarnovsky
- School of Neurobiology, Biochemistry, and Biophysics, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- School of Zoology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Shahar Taiber
- School of Neurobiology, Biochemistry, and Biophysics, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Department of Human Molecular Genetics and Biochemistry, Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Yomiran Nissan
- School of Zoology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Arjan Boonman
- School of Zoology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Yaniv Assaf
- School of Neurobiology, Biochemistry, and Biophysics, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Karen B Avraham
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Department of Human Molecular Genetics and Biochemistry, Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Yossi Yovel
- School of Zoology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- School of Mechanical Engineering, Faculty of Engineering, Tel Aviv University, Tel Aviv, Israel
9
Yang C. Prediction of hearing preservation after acoustic neuroma surgery based on SMOTE-XGBoost. Math Biosci Eng 2023;20:10757-10772. [PMID: 37322959] [DOI: 10.3934/mbe.2023477]
Abstract
Prior to the surgical removal of an acoustic neuroma, most patients anticipate that their hearing will be preserved to the greatest possible extent following surgery. This paper proposes a postoperative hearing preservation prediction model, based on extreme gradient boosting (XGBoost), tailored to the class-imbalanced characteristics of real hospital data. To eliminate sample imbalance, the synthetic minority oversampling technique (SMOTE) is applied to increase the number of minority-class samples in the data. Multiple machine learning models are also used for the accurate prediction of surgical hearing preservation in acoustic neuroma patients. The experimental results show the proposed model to be superior to results reported in the existing literature. In summary, the proposed method can support personalized preoperative diagnosis and treatment planning by providing an effective prediction of hearing retention in patients with acoustic neuroma following surgery, simplifying a long treatment process and saving medical resources.
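A minimal sketch of the SMOTE-XGBoost combination named in the title, assuming imbalanced-learn for the oversampling step; class sizes, features, and hyperparameters are illustrative, not the paper's data.

```python
# Sketch: oversample the minority class with SMOTE, then fit XGBoost.
import numpy as np
from imblearn.over_sampling import SMOTE
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))                 # synthetic clinical features
y = np.array([1] * 30 + [0] * 170)                 # imbalanced: 30 preserved vs. 170 not

X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print("before:", np.bincount(y), "after:", np.bincount(y_res))  # classes now balanced

clf = XGBClassifier(n_estimators=200, max_depth=3).fit(X_res, y_res)
print("train accuracy:", clf.score(X_res, y_res))
```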
Affiliation(s)
- Cenyi Yang
- School of Mathematics and Statistics, Central South University, Changsha 410083, China