1. Shah R, Wilkins SG, Panth N, Tyagi S, Dunn H, Bell MD, Norgaard S, Guyer E, Schwartz N. Over-The-Counter Hearing Aids: Are They Safe and Effective? Otolaryngol Head Neck Surg 2024. PMID: 38769863. DOI: 10.1002/ohn.817.
Abstract
OBJECTIVE: In 2022, the Food and Drug Administration established a new regulatory category for over-the-counter (OTC) hearing aids for mild to moderate hearing loss. Herein, we aim to compare the safety and efficacy of these devices with those of prescription hearing aids.
STUDY DESIGN: Comparative-effectiveness model.
SETTING: Academic audiology center.
METHODS: The safety and efficacy of prescription and OTC hearing aids were compared using the AudioScan Verifit 2 Testbox software. Three types of hearing loss (downsloping, sharp downsloping, and reverse sloping) were analyzed. Efficacy was tested at 3 volume inputs and was measured as the average difference between test points (produced by the devices) and target points (estimated by the software). Safety was assessed as the average difference between test points and the maximally safe hearing level (produced by the software).
RESULTS: Prescription hearing aids had a better safety profile, lying further from the safety threshold than OTC devices at 8000 Hz for the 2 downsloping hearing loss patterns studied (48 vs 30.5 dB, P = .04; 51 vs 32.5 dB, P = .03). Prescription hearing aids also carried a statistically significant advantage at 3 test points. OTC hearing aids generally had a greater difference between test and target points.
CONCLUSION: OTC and prescription hearing aids are comparably safe, though OTC hearing aids are slightly less efficacious. Further evaluation of OTC hearing aid efficacy is warranted to ensure these devices provide the gain needed for different types of hearing loss.
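The efficacy and safety metrics described here reduce to simple averages across the verification frequencies. A minimal sketch of that arithmetic in Python, using invented illustrative numbers rather than values from the study:

```python
import numpy as np

# Hypothetical test-box verification data (dB SPL); values are illustrative only.
freqs_hz    = np.array([500, 1000, 2000, 4000, 8000])
target_db   = np.array([65, 68, 70, 66, 60])     # targets estimated by the fitting software
measured_db = np.array([63, 66, 65, 60, 52])     # output measured from the device
max_safe_db = np.array([102, 104, 100, 98, 92])  # maximally safe hearing level

efficacy_gap  = np.mean(np.abs(measured_db - target_db))  # smaller = closer to target
safety_margin = np.mean(max_safe_db - measured_db)        # larger = further from the safety threshold
print(f"Mean deviation from target: {efficacy_gap:.1f} dB")
print(f"Mean margin below the safety threshold: {safety_margin:.1f} dB")
```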
Affiliation(s)
- Rema Shah: Yale University School of Medicine, New Haven, Connecticut, USA
- Sarah G Wilkins: Yale University School of Medicine, New Haven, Connecticut, USA
- Neelima Panth: Yale University School of Medicine; Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, Yale University School of Medicine, New Haven, Connecticut, USA
- Sidharth Tyagi: Yale University School of Medicine, New Haven, Connecticut, USA
- Hannah Dunn: Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, Yale University School of Medicine, New Haven, Connecticut, USA
- Moira D Bell: Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, Yale University School of Medicine, New Haven, Connecticut, USA
- Sophie Norgaard: Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, Yale University School of Medicine, New Haven, Connecticut, USA
- Elizabeth Guyer: Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, Yale University School of Medicine, New Haven, Connecticut, USA
- Nofrat Schwartz: Yale University School of Medicine; Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, Yale University School of Medicine, New Haven, Connecticut, USA
2. Phillips I, Bieber RE, Dirks C, Grant KW, Brungart DS. Age Impacts Speech-in-Noise Recognition Differently for Nonnative and Native Listeners. J Speech Lang Hear Res 2024; 67:1602-1623. PMID: 38569080. DOI: 10.1044/2024_jslhr-23-00470.
Abstract
PURPOSE: The purpose of this study was to explore potential differences in suprathreshold auditory function among native and nonnative speakers of English as a function of age.
METHOD: Retrospective analyses were performed on three large data sets containing suprathreshold auditory tests completed by 5,572 participants, aged 18 to 65 years, who self-identified as native or nonnative speakers of English. The tests included a binaural tone detection test, a digit identification test, and a sentence recognition test.
RESULTS: The analyses show a significant interaction between increasing age and participant group on tests involving speech-based stimuli (digit strings, sentences) but not on the binaural tone detection test. For both speech tests, differences in speech recognition emerged between groups during early adulthood, and increasing age had a more negative impact on word recognition for nonnative than for native participants. Age-related declines in performance were 2.9 times faster for digit strings and 3.3 times faster for sentences for nonnative participants compared to native participants.
CONCLUSIONS: This set of analyses extends the existing literature by examining interactions between aging and self-identified native English speaker status in several auditory domains in a cohort of adults spanning young adulthood through middle age. The finding that older nonnative English speakers in this age cohort may have greater-than-expected deficits in speech-in-noise perception may have clinical implications for how these individuals should be diagnosed and treated for hearing difficulties.
Affiliation(s)
- Ian Phillips: Audiology & Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, MD; Henry M Jackson Foundation for the Advancement of Military Medicine, Inc., Bethesda, MD
- Rebecca E Bieber: Audiology & Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, MD; Henry M Jackson Foundation for the Advancement of Military Medicine, Inc., Bethesda, MD
- Coral Dirks: Audiology & Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, MD
- Ken W Grant: Audiology & Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, MD
- Douglas S Brungart: Audiology & Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, MD
3. Schirmer J, Wolpert S, Dapper K, Rühle M, Wertz J, Wouters M, Eldh T, Bader K, Singer W, Gaudrain E, Başkent D, Verhulst S, Braun C, Rüttiger L, Munk MHJ, Dalhoff E, Knipper M. Neural Adaptation at Stimulus Onset and Speed of Neural Processing as Critical Contributors to Speech Comprehension Independent of Hearing Threshold or Age. J Clin Med 2024; 13:2725. PMID: 38731254. PMCID: PMC11084258. DOI: 10.3390/jcm13092725.
Abstract
Background: It is assumed that speech comprehension deficits in background noise are caused by age-related or acquired hearing loss.
Methods: We examined young, middle-aged, and older individuals with and without hearing threshold loss using pure-tone (PT) audiometry, short-pulsed distortion-product otoacoustic emissions (pDPOAEs), auditory brainstem responses (ABRs), auditory steady-state responses (ASSRs), speech comprehension (OLSA), and syllable discrimination in quiet and noise.
Results: A noticeable decline of hearing sensitivity in extended high-frequency regions and its influence on low-frequency-induced ABRs was striking. When OLSA thresholds were normalized for PT thresholds (PTTs), marked differences in speech comprehension ability existed not only in noise but also in quiet, and they existed throughout the whole age range investigated. Listeners with poor speech comprehension in quiet exhibited relatively lower pDPOAEs, and thus cochlear amplifier performance, independent of PTT, smaller and delayed ABRs, and lower performance in vowel-phoneme discrimination below phase-locking limits (/o/-/u/). When OLSA was tested in noise, listeners with poor speech comprehension independent of PTT had larger pDPOAEs, and thus cochlear amplifier performance, larger ASSR amplitudes, and higher uncomfortable loudness levels, all linked with lower performance in vowel-phoneme discrimination above the phase-locking limit (/i/-/y/).
Conclusions: This study indicates that listening in noise in humans carries a sizable disadvantage in envelope coding when basilar-membrane compression is compromised. Clearly, and in contrast to previous assumptions, both good and poor speech comprehension can exist independently of differences in PTTs and age, a phenomenon that urgently requires improved techniques to diagnose sound processing at stimulus onset in the clinical routine.
Affiliation(s)
- Jakob Schirmer: Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
- Stephan Wolpert: Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
- Konrad Dapper: Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany; Department of Biology, Technical University Darmstadt, 64287 Darmstadt, Germany
- Moritz Rühle: Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
- Jakob Wertz: Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
- Marjoleen Wouters: Department of Information Technology, Ghent University, Technologiepark 126, 9052 Zwijnaarde, Belgium
- Therese Eldh: Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
- Katharina Bader: Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
- Wibke Singer: Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
- Etienne Gaudrain: Lyon Neuroscience Research Center, Centre National de la Recherche Scientifique UMR5292, Inserm U1028, Université Lyon 1, Centre Hospitalier Le Vinatier-Bâtiment 462–Neurocampus, 95 Boulevard Pinel, 69675 Bron CEDEX, France; Department of Otorhinolaryngology, University Medical Center Groningen (UMCG), Hanzeplein 1, BB21, 9700 RB Groningen, The Netherlands
- Deniz Başkent: Department of Otorhinolaryngology, University Medical Center Groningen (UMCG), Hanzeplein 1, BB21, 9700 RB Groningen, The Netherlands
- Sarah Verhulst: Department of Information Technology, Ghent University, Technologiepark 126, 9052 Zwijnaarde, Belgium
- Christoph Braun: Magnetoencephalography-Centre and Hertie Institute for Clinical Brain Research, University of Tübingen, Otfried-Müller-Straße 27, 72076 Tübingen, Germany; Center for Mind and Brain Research, University of Trento, Palazzo Fedrigotti-corso Bettini 31, 38068 Rovereto, Italy
- Lukas Rüttiger: Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
- Matthias H. J. Munk: Department of Biology, Technical University Darmstadt, 64287 Darmstadt, Germany; Department of Psychiatry & Psychotherapy, University of Tübingen, Calwerstraße 14, 72076 Tübingen, Germany
- Ernst Dalhoff: Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
- Marlies Knipper: Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
4. Saldías O'Hrens M, Castro C, Espinoza VM, Stoney J, Quezada C, Laukkanen AM. Spectral features related to the auditory perception of twang-like voices. Logoped Phoniatr Vocol 2024:1-18. PMID: 38656176. DOI: 10.1080/14015439.2024.2345373.
Abstract
BACKGROUND: To the best of our knowledge, studies on the relationship between spectral energy distribution and the degree of perceived twang in voices are still sparse. Through an auditory-perceptual test, we aimed to explore the spectral features that may relate to the auditory perception of twang-like voices.
METHODS: Ten judges who were blind to the test's tasks and stimuli rated the amount of twang perceived in seventy-six audio samples. The stimuli consisted of twenty voices recorded from eight CCM singers who sustained the vowel [a:] at different pitches, with and without a twang-like voice. Forty filtered and sixteen synthesized-manipulated stimuli were also included.
RESULTS AND CONCLUSIONS: Based on the intra-rater reliability scores, four judges were identified as suitable for inclusion in the analyses. Results showed that the frequencies of F1 and F2 correlated strongly with the auditory perception of twang-like voices (0.90 and 0.74, respectively), whereas F3 showed a moderate negative correlation (-0.52). The frequency difference between F1 and F3 showed a strong negative correlation (-0.82). The mean energy between 1-2 kHz and 2-3 kHz correlated moderately (0.51 and 0.42, respectively). The frequencies of F4 and F5 and the energy above 3 kHz showed weak correlations. Since spectral changes below 2 kHz have been associated with jaw, lip, and tongue adjustments (i.e., vowel articulation), and a higher vertical laryngeal position may affect the frequencies of all formants (including F1 and F2), our results suggest that vowel articulation and laryngeal height may be relevant when producing twang-like voices.
Affiliation(s)
- Christian Castro: Departamento de Fonoaudiología, Universidad de Chile, Santiago, Chile; Department Speech and Language Pathology, Universidad de Valparaíso, Valparaíso, Chile; PhD Program in Health Sciences and Engineering, Universidad de Valparaíso, Valparaíso, Chile
- Justin Stoney: New York Vocal Coaching Studio Inc, New York, NY, USA
- Camilo Quezada: Departamento de Fonoaudiología, Universidad de Chile, Santiago, Chile
- Anne-Maria Laukkanen: Speech and Voice Research Laboratory, Faculty of Social Sciences, Tampere University, Tampere, Finland
5. Buss E, Kane SG, Young KS, Gratzek CB, Bishop DM, Miller MK, Porter HL, Leibold LJ, Stecker GC, Monson BB. Effects of Stimulus Type on 16-kHz Detection Thresholds. Ear Hear 2024; 45:486-498. PMID: 38178308. PMCID: PMC10922353. DOI: 10.1097/aud.0000000000001446.
Abstract
OBJECTIVES: Audiometric testing typically does not include frequencies above 8 kHz. However, recent research suggests that extended high-frequency (EHF) sensitivity could affect hearing in natural communication environments. Clinical assessment of hearing often employs pure tones and frequency-modulated (FM) tones interchangeably, regardless of frequency. The present study was designed to evaluate how the stimulus chosen to measure EHF thresholds affects estimates of hearing sensitivity.
DESIGN: The first experiment used standard audiometric procedures to measure 8- and 16-kHz thresholds for 5- to 28-year-olds with normal hearing in the standard audiometric range (250 to 8000 Hz). Stimuli were steady tones, pulsed tones, and FM tones. The second experiment tested 18- to 28-year-olds with normal hearing in the standard audiometric range using psychophysical procedures to evaluate how changes in sensitivity as a function of frequency affect detection of stimuli that differ with respect to bandwidth, including bands of noise. Thresholds were measured using steady tones, pulsed tones, FM tones, narrow bands of noise, and one-third-octave bands of noise at a range of center frequencies in one ear.
RESULTS: In experiment 1, thresholds improved with increasing age at 8 kHz and worsened with increasing age at 16 kHz. Thresholds for individual participants were relatively similar for steady, pulsed, and FM tones at 8 kHz. At 16 kHz, mean thresholds were approximately 5 dB lower for FM tones than for steady or pulsed tones. This stimulus effect did not differ as a function of age. Experiment 2 replicated the greater stimulus effect at 16 kHz than at 8 kHz and showed that the slope of the audibility curve accounted for these effects.
CONCLUSIONS: Contrary to prior expectations, there was no evidence that the choice of stimulus type affected school-age children more than adults. For individual participants, audiometric thresholds at 16 kHz were as much as 20 dB lower for FM tones than for steady tones. Threshold differences across stimuli at 16 kHz were predicted by differences in audibility across frequency, which can vary markedly between listeners. These results highlight the importance of considering the spectral width of the stimulus used to evaluate EHF thresholds.
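To make the stimulus contrast concrete, the sketch below generates steady, pulsed, and frequency-modulated (warble) tones at 16 kHz in Python. The gating and modulation parameters are illustrative assumptions, not the values used in the study.

```python
import numpy as np

fs = 48_000                       # sampling rate must exceed twice the 16-kHz test frequency
dur = 1.0
t = np.arange(int(fs * dur)) / fs
fc = 16_000

steady = np.sin(2 * np.pi * fc * t)

# Pulsed tone: 200-ms on / 200-ms off gating (illustrative duty cycle)
gate = (np.floor(t / 0.2) % 2 == 0).astype(float)
pulsed = steady * gate

# FM (warble) tone: +/-5% frequency deviation at a 5-Hz modulation rate (illustrative)
dev_hz, fm_rate_hz = 0.05 * fc, 5.0
phase = 2 * np.pi * (fc * t + (dev_hz / fm_rate_hz) * np.sin(2 * np.pi * fm_rate_hz * t))
fm_tone = np.sin(phase)
```

Because the audibility curve can be steep near 16 kHz, the FM tone's excursion toward lower, more audible frequencies is one plausible account of the lower thresholds reported for FM stimuli, consistent with the slope-of-audibility explanation above.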
Affiliation(s)
- Emily Buss: Department of Otolaryngology/Head and Neck Surgery, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Stacey G. Kane: Department of Otolaryngology/Head and Neck Surgery, The University of North Carolina at Chapel Hill; Department of Health Sciences, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Kathryn S. Young: Department of Health Sciences, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Chloe B. Gratzek: Department of Health Sciences, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Danielle M. Bishop: Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska, USA
- Margaret K. Miller: Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska, USA
- Heather L. Porter: Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska, USA
- Lori J. Leibold: Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska, USA
- Brian B. Monson: Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign, USA
6. Kharlamov V, Brenner D, Tucker BV. Examining the effect of high-frequency information on the classification of conversationally produced English fricatives. J Acoust Soc Am 2023; 154:1896-1902. PMID: 37756577. DOI: 10.1121/10.0021067.
Abstract
This study examines the role of frequencies above 8 kHz in the classification of conversational speech fricatives [f, v, θ, ð, s, z, ʃ, ʒ, h] in random forest modeling. Prior research has mostly focused on spectral measures for fricative categorization using frequency information below 8 kHz. The contribution of higher frequencies has received only limited attention, especially for non-laboratory speech. In the present study, we use a corpus of sociolinguistic interview recordings from Western Canadian English sampled at 44.1 and 16 kHz. For both sampling rates, we analyze spectral measures obtained using Fourier analysis and the multitaper method, and we also compare models without and with amplitudinal measures. Results show that while frequency information above 8 kHz does not improve classification accuracy in random forest analyses, inclusion of such frequencies can affect the relative importance of specific measures. This includes a decreased contribution of center of gravity and an increased contribution of spectral standard deviation for the higher sampling rate. We also find no major differences in classification accuracy between Fourier and multitaper measures. The inclusion of power measures improves model accuracy but does not change the overall importance of spectral measures.
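Two of the spectral measures discussed, center of gravity and spectral standard deviation, are the first two power-weighted moments of the spectrum. A minimal sketch using plain Fourier analysis follows; the multitaper variant would substitute a multitaper spectral estimate, and the window choice here is an assumption for illustration.

```python
import numpy as np

def spectral_moments(frame, fs):
    """Spectral center of gravity (Hz) and spectral SD (Hz) of a windowed frame."""
    power = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1 / fs)
    p = power / power.sum()                        # normalize to a probability-like weighting
    cog = np.sum(freqs * p)                        # first moment: center of gravity
    sd = np.sqrt(np.sum(p * (freqs - cog) ** 2))   # second central moment: spectral SD
    return cog, sd

# Analyzing the same fricative frame at 16-kHz vs 44.1-kHz sampling changes which
# frequencies enter these moments, which is one way the upper band limit can shift
# the relative importance of the two measures in a classifier.
```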
Affiliation(s)
- Viktor Kharlamov: Department of Languages, Linguistics, and Comparative Literature, Florida Atlantic University, Boca Raton, Florida 33431, USA
- Benjamin V Tucker: Department of Communication Sciences and Disorders, Northern Arizona University, Flagstaff, Arizona 86011, USA
7. Shadle CH, Chen WR, Koenig LL, Preston JL. Refining and extending measures for fricative spectra, with special attention to the high-frequency range. J Acoust Soc Am 2023; 154:1932-1944. PMID: 37768114. PMCID: PMC10540850. DOI: 10.1121/10.0021075.
Abstract
Fricatives have noise sources that are filtered by the vocal tract and typically possess energy over a much broader range of frequencies than observed for vowels and sonorant consonants. This paper introduces and refines fricative measurements designed to reflect underlying articulatory and aerodynamic conditions. These show differences in the pattern of high-frequency energy for sibilants vs non-sibilants, voiced vs voiceless fricatives, and non-sibilants differing in place of articulation. The results confirm the utility of a spectral peak measure (FM) and a low-mid frequency amplitude difference (AmpD) for sibilants. Using a higher-frequency range for defining FM for female voices for alveolars is justified; a still higher range was considered and rejected. High-frequency maximum amplitude (Fh) and the amplitude difference between low- and higher-frequency regions (AmpRange) capture /f-θ/ differences in English and the dynamic amplitude range over the entire spectrum. For this dataset, with spectral information up to 15 kHz, a new measure, HighLevelD, was more effective than the previously used LevelD and Slope in showing changes over time within the frication. Finally, isolated words and connected speech differ. This work contributes improved measures of fricative spectra and demonstrates the necessity of including high-frequency energy in those measures.
Affiliation(s)
- Christine H Shadle: Yale Child Study Center, School of Medicine, Yale University, New Haven, Connecticut 06519, USA
- Wei-Rong Chen: Yale Child Study Center, School of Medicine, Yale University, New Haven, Connecticut 06519, USA
- Laura L Koenig: Yale Child Study Center, School of Medicine, Yale University, New Haven, Connecticut 06519, USA
- Jonathan L Preston: Department of Communication Sciences and Disorders, Syracuse University, Syracuse, New York 13244, USA
8. McKenna VS, Patel TH, Kendall CL, Howell RJ, Gustin RL. Voice Acoustics and Vocal Effort in Mask-Wearing Healthcare Professionals: A Comparison Pre- and Post-Workday. J Voice 2023; 37:802.e15-802.e23. PMID: 34112547. DOI: 10.1016/j.jvoice.2021.04.016.
Abstract
OBJECTIVE: We evaluated voice acoustics and self-perceptual ratings in healthcare workers required to wear face masks throughout their workday.
METHODS: Eighteen subjects (11 cisgender female, 7 cisgender male; M = 33.72 years, SD = 8.30) completed self-perceptual ratings and acoustic recordings before and after a typical workday. The chosen measures were specific to vocal effort, dysphonia, and laryngeal tension. Mixed-effects models were calculated to determine the impact of session, mask type, sex, and their interactions on the set of perceptual and acoustic measures.
RESULTS: The subjects self-reported a significant increase in vocal effort following the workday. These perceptual changes coincided with an increase in vocal intensity and harmonics-to-noise ratio, but a decrease in relative fundamental frequency offset 10. As expected, men and women differed in measures related to fundamental frequency and vocal tract length.
CONCLUSION: Healthcare professionals wearing masks reported greater vocal symptoms post-workday compared to pre-workday. These symptoms coincided with acoustic changes previously related to vocal effort; however, the degree of change was considered mild. Further research is needed to determine whether vocal hygiene strategies may reduce vocal symptoms in mask-wearing workers.
Affiliation(s)
- Victoria S McKenna: Department of Communication Sciences and Disorders, University of Cincinnati; Department of Biomedical Engineering, University of Cincinnati
- Tulsi H Patel: Department of Communication Sciences and Disorders, University of Cincinnati
- Courtney L Kendall: Department of Communication Sciences and Disorders, University of Cincinnati
- Rebecca J Howell: Department of Otolaryngology-Head & Neck Surgery, University of Cincinnati
- Renee L Gustin: Department of Otolaryngology-Head & Neck Surgery, University of Cincinnati
9. Narne VK, Jain S, Ravi SK, Almudhi A, Krishna Y, Moore BCJ. The effect of recreational noise exposure on amplitude-modulation detection, hearing sensitivity at frequencies above 8 kHz, and perception of speech in noise. J Acoust Soc Am 2023; 153:2562. PMID: 37129676. DOI: 10.1121/10.0017973.
Abstract
Psychoacoustic and speech perception measures were compared for a group who were exposed to noise regularly through listening to music via personal music players (PMP) and a control group without such exposure. Lifetime noise exposure, quantified using the NESI questionnaire, averaged ten times higher for the exposed group than for the control group. Audiometric thresholds were similar for the two groups over the conventional frequency range up to 8 kHz, but for higher frequencies, the exposed group had higher thresholds than the control group. Amplitude modulation detection (AMD) thresholds were measured using a 4000-Hz sinusoidal carrier presented in threshold-equalizing noise at 30, 60, and 90 dB sound pressure level (SPL) for modulation frequencies of 8, 16, 32, and 64 Hz. At 90 dB SPL but not at the lower levels, AMD thresholds were significantly higher (worse) for the exposed than for the control group, especially for low modulation frequencies. The exposed group required significantly higher signal-to-noise ratios than the control group to understand sentences in noise. Otoacoustic emissions did not differ for the two groups. It is concluded that listening to music via PMP can have subtle deleterious effects on speech perception, AM detection, and hearing sensitivity over the extended high-frequency range.
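The amplitude-modulation detection stimulus described here, a sinusoidally amplitude-modulated 4000-Hz carrier, can be sketched as below. The modulation depth and duration are placeholders, and the threshold-equalizing noise background is omitted.

```python
import numpy as np

fs = 44_100
t = np.arange(int(fs * 0.5)) / fs      # 500-ms interval (illustrative duration)
carrier_hz, mod_hz, m = 4000, 16, 0.3  # modulation depth m is a placeholder value

carrier = np.sin(2 * np.pi * carrier_hz * t)
sam = (1 + m * np.sin(2 * np.pi * mod_hz * t)) * carrier
sam /= np.sqrt(1 + m**2 / 2)           # level compensation so modulated and
                                       # unmodulated intervals have equal RMS
```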
Affiliation(s)
- Vijaya Kumar Narne: Department of Medical Rehabilitation Sciences, College of Applied Medical Sciences, King Khalid University, Abha 61481, Saudi Arabia
- Saransh Jain: All India Institute of Speech and Hearing, University of Mysore, Mysuru, India
- Sunil Kumar Ravi: Department of Medical Rehabilitation Sciences, College of Applied Medical Sciences, King Khalid University, Abha 61481, Saudi Arabia
- Abdulaziz Almudhi: Department of Medical Rehabilitation Sciences, College of Applied Medical Sciences, King Khalid University, Abha 61481, Saudi Arabia
- Yerraguntla Krishna: Department of Medical Rehabilitation Sciences, College of Applied Medical Sciences, King Khalid University, Abha 61481, Saudi Arabia; All India Institute of Speech and Hearing, University of Mysore, Mysuru, India
- Brian C J Moore: Cambridge Hearing Group, Department of Psychology, University of Cambridge, Cambridge, United Kingdom
10. Costantini G, Cesarini V, Di Leo P, Amato F, Suppa A, Asci F, Pisani A, Calculli A, Saggio G. Artificial Intelligence-Based Voice Assessment of Patients with Parkinson's Disease Off and On Treatment: Machine vs. Deep-Learning Comparison. Sensors (Basel) 2023; 23:2293. PMID: 36850893. PMCID: PMC9962335. DOI: 10.3390/s23042293.
Abstract
Parkinson's Disease (PD) is one of the most common incurable neurodegenerative diseases. Diagnosis is achieved clinically on the basis of different symptoms, with considerable delays from the onset of neurodegenerative processes in the central nervous system. In this study, we investigated early and full-blown PD patients based on the analysis of their voice characteristics with the aid of the most commonly employed machine learning (ML) techniques. A custom dataset was made with hi-fi quality recordings of vocal tasks gathered from Italian healthy control subjects and PD patients, divided into early-diagnosed, off-medication patients on the one hand, and mid-advanced patients treated with L-Dopa on the other. Following the current state of the art, several ML pipelines were compared using different feature selection and classification algorithms, and deep learning was also explored with a custom CNN architecture. Results show that feature-based ML and deep learning achieve comparable classification performance, with KNN, SVM, and naïve Bayes classifiers performing similarly and a slight edge for KNN. Much more evident is the predominance of CFS as the best feature selector. The selected features act as relevant vocal biomarkers capable of differentiating healthy subjects, early untreated PD patients, and mid-advanced L-Dopa-treated patients.
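The feature-based side of such a pipeline can be prototyped in a few lines of scikit-learn. CFS is not part of scikit-learn, so the sketch below substitutes a univariate selector as a stand-in; the feature count, classifier choice, and hyperparameters are assumptions, not those of the study.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

pipe = Pipeline([
    ("scale", StandardScaler()),                   # acoustic features vary widely in range
    ("select", SelectKBest(f_classif, k=20)),      # stand-in for CFS feature selection
    ("clf", KNeighborsClassifier(n_neighbors=5)),  # KNN; SVM or naive Bayes slot in the same way
])
# X: n_recordings x n_features matrix of vocal features; y: control / early-PD / treated-PD labels
# scores = cross_val_score(pipe, X, y, cv=5)
```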
Affiliation(s)
- Giovanni Costantini: Department of Electronic Engineering, University of Rome Tor Vergata, 00133 Rome, Italy
- Valerio Cesarini: Department of Electronic Engineering, University of Rome Tor Vergata, 00133 Rome, Italy
- Pietro Di Leo: Department of Electronic Engineering, University of Rome Tor Vergata, 00133 Rome, Italy
- Federica Amato: Department of Control and Computer Engineering, Polytechnic University of Turin, 10129 Turin, Italy
- Antonio Suppa: Department of Human Neurosciences, Sapienza University of Rome, 00185 Rome, Italy; IRCCS Neuromed Institute, 86077 Pozzilli, Italy
- Francesco Asci: Department of Human Neurosciences, Sapienza University of Rome, 00185 Rome, Italy; IRCCS Neuromed Institute, 86077 Pozzilli, Italy
- Antonio Pisani: Department of Brain and Behavioral Sciences, University of Pavia, 27100 Pavia, Italy; IRCCS Mondino Foundation, 27100 Pavia, Italy
- Alessandra Calculli: Department of Brain and Behavioral Sciences, University of Pavia, 27100 Pavia, Italy; IRCCS Mondino Foundation, 27100 Pavia, Italy
- Giovanni Saggio: Department of Electronic Engineering, University of Rome Tor Vergata, 00133 Rome, Italy
11. Monson BB, Trine A. Extending the High-Frequency Bandwidth and Predicting Speech-in-Noise Recognition: Building on the Work of Pat Stelmachowicz. Semin Hear 2023; 44:S64-S74. PMID: 36970650. PMCID: PMC10033195. DOI: 10.1055/s-0043-1764133.
Abstract
Recent work has demonstrated that high-frequency (>6 kHz) and extended high-frequency (EHF; >8 kHz) hearing is valuable for speech-in-noise recognition. Several studies also indicate that EHF pure-tone thresholds predict speech-in-noise performance. These findings contradict the broadly accepted "speech bandwidth" that has historically been limited to below 8 kHz. This growing body of work is a tribute to the work of Pat Stelmachowicz, whose research was instrumental in revealing the limitations of the prior speech bandwidth work, particularly for female talkers and child listeners. Here, we provide a historical review that demonstrates how the work of Stelmachowicz and her colleagues paved the way for subsequent research to measure effects of extended bandwidths and EHF hearing. We also present a reanalysis of previous data collected in our lab, the results of which suggest that 16-kHz pure-tone thresholds are consistent predictors of speech-in-noise performance, regardless of whether EHF cues are present in the speech signal. Based on the work of Stelmachowicz, her colleagues, and those who have come afterward, we argue that it is time to retire the notion of a limited speech bandwidth for speech perception for both children and adults.
Affiliation(s)
- Brian B. Monson: Department of Speech and Hearing Science, University of Illinois Urbana-Champaign, Champaign, Illinois; Department of Biomedical and Translational Sciences, Carle Illinois College of Medicine, Urbana, Illinois; Neuroscience Program, University of Illinois Urbana-Champaign, Champaign, Illinois
- Allison Trine: Department of Speech and Hearing Science, University of Illinois Urbana-Champaign, Champaign, Illinois
12. Worasawate D, Asawaponwiput W, Yoshimura N, Intarapanich A, Surangsrirat D. Classification of Parkinson's disease from smartphone recording data using time-frequency analysis and convolutional neural network. Technol Health Care 2023; 31:705-718. PMID: 36155539. DOI: 10.3233/thc-220386.
Abstract
BACKGROUND: Parkinson's disease (PD) is a long-term neurodegenerative disease of the central nervous system. The current diagnosis is dependent on clinical observation and the abilities and experience of a trained specialist. One of the symptoms that affects most patients is voice impairment.
OBJECTIVE: Voice samples are non-invasive data that can be collected remotely for diagnosis and disease progression monitoring. In this study, we analyzed voice recording data from a smartphone as a possible medical self-diagnosis tool, using only a one-second voice recording. Data from one of the largest mobile PD studies, the mPower study, were used.
METHODS: A total of 29,798 ten-second smartphone voice recordings from 4,051 participants were used for the analysis. The recordings consisted of sustained phonation, with participants saying /aa/ for ten seconds into an iPhone microphone. A dataset comprising 385,143 short one-second audio samples was generated from the original ten-second voice recordings. The samples were converted to spectrograms using a short-time Fourier transform, and CNN models were then applied to classify the samples.
RESULTS: Classification accuracies of the proposed method with LeNet-5, ResNet-50, and VGGNet-16 are 97.7 ± 0.1%, 98.6 ± 0.2%, and 99.3 ± 0.1%, respectively.
CONCLUSIONS: We achieve respectable classification performance using a generalized approach on a dataset with a large number of samples. The result emphasizes that an analysis based on a one-second clip recorded on a smartphone could be a promising non-invasive and remotely available PD biomarker.
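The core preprocessing step, turning a one-second clip into a short-time Fourier transform spectrogram image for a CNN, can be sketched as below. The window and overlap sizes are illustrative assumptions, and the random signal stands in for a real recording.

```python
import numpy as np
from scipy import signal

fs = 44_100
clip = np.random.randn(fs)                  # stand-in for a 1-s sustained /aa/ recording
freqs, frames, sxx = signal.spectrogram(clip, fs=fs, nperseg=1024, noverlap=512)
log_sxx = 10 * np.log10(sxx + 1e-10)        # log-power image: (freq bins) x (time frames)
# After normalization and resizing, log_sxx serves as the single-channel input image
# for a CNN such as LeNet-5, ResNet-50, or VGGNet-16.
```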
Affiliation(s)
- Denchai Worasawate: Department of Electrical Engineering, Faculty of Engineering, Kasetsart University, Bangkok, Thailand
- Warisara Asawaponwiput: Department of Electrical Engineering, Faculty of Engineering, Kasetsart University, Bangkok, Thailand
- Natsue Yoshimura: Institute of Innovative Research, Tokyo Institute of Technology, Yokohama, Japan
- Apichart Intarapanich: Educational Technology Team, National Electronics and Computer Technology Center, Pathum Thani, Thailand
- Decho Surangsrirat: Assistive Technology and Medical Devices Research Center, National Science and Technology Development Agency, Pathum Thani, Thailand
13. Gutz SE, Rowe HP, Tilton-Bolowsky VE, Green JR. Speaking with a KN95 face mask: a within-subjects study on speaker adaptation and strategies to improve intelligibility. Cogn Res Princ Implic 2022; 7:73. PMID: 35907167. PMCID: PMC9339031. DOI: 10.1186/s41235-022-00423-4.
Abstract
Mask-wearing during the COVID-19 pandemic has prompted a growing interest in the functional impact of masks on speech and communication. Prior work has shown that masks dampen sound, impede visual communication cues, and reduce intelligibility. However, more work is needed to understand how speakers change their speech while wearing a mask and to identify strategies to overcome the impact of wearing a mask. Data were collected from 19 healthy adults during a single in-person session. We investigated the effects of wearing a KN95 mask on speech intelligibility, as judged by two speech-language pathologists, examined speech kinematics and acoustics associated with mask-wearing, and explored KN95 acoustic filtering. We then considered the efficacy of three speaking strategies to improve speech intelligibility: Loud, Clear, and Slow speech. To inform speaker strategy recommendations, we related findings to self-reported speaker effort. Results indicated that healthy speakers could compensate for the presence of a mask and achieve normal speech intelligibility. Additionally, we showed that speaking loudly or clearly—and, to a lesser extent, slowly—improved speech intelligibility. However, using these strategies may require increased physical and cognitive effort and should be used only when necessary. These results can inform recommendations for speakers wearing masks, particularly those with communication disorders (e.g., dysarthria) who may struggle to adapt to a mask but can respond to explicit instructions. Such recommendations may further help non-native speakers and those communicating in a noisy environment or with listeners with hearing loss.
14. Meenderink SWF, Lin X, Park BH, Dong W. Sound Induced Vibrations Deform the Organ of Corti Complex in the Low-Frequency Apical Region of the Gerbil Cochlea for Normal Hearing. J Assoc Res Otolaryngol 2022; 23:579-591. PMID: 35798901. PMCID: PMC9613840. DOI: 10.1007/s10162-022-00856-0.
Abstract
Human speech primarily contains low frequencies. It is well established that such frequencies maximally excite the cochlea near its apex, but the micromechanics that precede and are involved in this transduction are not well understood. We measured vibrations from the low-frequency, second turn in intact gerbil cochleae using optical coherence tomography (OCT). The data were used to create spatial maps that detail the sound-evoked motions across the sensory organ of Corti complex (OCC). These maps were remarkably similar across animals and showed little variation with frequency or level. We identify four anatomically distinct response regions within the OCC: the basilar membrane (BM), the outer hair cells (OHC), the lateral compartment (lc), and the tectorial membrane (TM). The results provide evidence that active processes in the OHC play an important role in the mechanical interplay between different OCC structures, which increases the amplitude and tuning sharpness of the traveling wave. The angle between the OCT beam and the OCC meant that we captured radial motions, which are thought to be the effective stimulus to the mechano-sensitive hair bundles. We found that TM responses were relatively weak, arguing against a role in enhancing mechanical hair bundle deflection. Rather, BM responses were found to closely resemble the frequency selectivity and sensitivity found in auditory nerve fibers (ANF) that innervate the low-frequency cochlea.
Affiliation(s)
- Xiaohui Lin: VA Loma Linda Healthcare System, Loma Linda, CA, 92374, USA
- B Hyle Park: Department of Bioengineering, University of California, Riverside, Riverside, CA, 92521, USA
- Wei Dong: VA Loma Linda Healthcare System, Loma Linda, CA, 92374, USA; Loma Linda University Health, Loma Linda, CA, 92350, USA
15. Monson BB, Buss E. On the use of the TIMIT, QuickSIN, NU-6, and other widely used bandlimited speech materials for speech perception experiments. J Acoust Soc Am 2022; 152:1639. PMID: 36182310. PMCID: PMC9473723. DOI: 10.1121/10.0013993.
Abstract
The use of spectrally degraded speech signals deprives listeners of acoustic information that is useful for speech perception. Several popular speech corpora, recorded decades ago, have spectral degradations, including limited extended high-frequency (EHF) (>8 kHz) content. Although frequency content above 8 kHz is often assumed to play little or no role in speech perception, recent research suggests that EHF content in speech can have a significant beneficial impact on speech perception under a wide range of natural listening conditions. This paper provides an analysis of the spectral content of popular speech corpora used for speech perception research to highlight the potential shortcomings of using bandlimited speech materials. Two corpora analyzed here, the TIMIT and NU-6, have substantial low-frequency spectral degradation (<500 Hz) in addition to EHF degradation. We provide an overview of the phenomena potentially missed by using bandlimited speech signals, and the factors to consider when selecting stimuli that are sensitive to these effects.
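A quick way to check a corpus recording for the degradations described here is to inspect its long-term spectrum. A minimal sketch, assuming a hypothetical file name and the soundfile and scipy packages; the band edges are illustrative.

```python
import numpy as np
import soundfile as sf
from scipy import signal

x, fs = sf.read("corpus_sentence.wav")   # hypothetical corpus file
if x.ndim > 1:                           # average to mono if the file is multichannel
    x = x.mean(axis=1)

freqs, psd = signal.welch(x, fs=fs, nperseg=4096)

def band_level_db(lo, hi):
    """Mean power spectral density (dB) between lo and hi Hz."""
    return 10 * np.log10(psd[(freqs >= lo) & (freqs < hi)].mean())

print("Low-frequency (<500 Hz) level:", band_level_db(50, 500))
print("0.5-8 kHz level:", band_level_db(500, 8000))
print("EHF (8-16 kHz) level:", band_level_db(8000, min(16_000, fs / 2)))
```

A corpus whose EHF band sits far below the mid-frequency level (or whose sampling rate caps the analysis below 16 kHz) would show the kind of spectral degradation this paper warns about.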
Affiliation(s)
- Brian B Monson: Department of Speech and Hearing Science, University of Illinois Urbana-Champaign, Champaign, Illinois 61820, USA
- Emily Buss: Department of Otolaryngology/HNS, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27514, USA
16. Deep learning and machine learning-based voice analysis for the detection of COVID-19: A proposal and comparison of architectures. Knowl Based Syst 2022; 253:109539. PMID: 35915642. PMCID: PMC9328841. DOI: 10.1016/j.knosys.2022.109539.
Abstract
Alongside the currently used nasal swab testing, the COVID-19 pandemic situation would gain noticeable advantages from low-cost tests that are available at any time, anywhere, at large scale, and with real-time answers. A novel approach for COVID-19 assessment is adopted here, discriminating negative subjects versus positive or recovered subjects. The aim is to identify potential discriminating features, highlight mid- and short-term effects of COVID-19 on the voice, and compare two custom algorithms. A pool of 310 subjects took part in the study; recordings were collected in a low-noise, controlled setting employing three different vocal tasks. Binary classifications followed, using two different custom algorithms. The first was based on the coupling of boosting and bagging, with an AdaBoost classifier using Random Forest learners. A feature selection process was employed for training, identifying a subset of features acting as clinically relevant biomarkers. The other approach was centered on two custom CNN architectures applied to mel-spectrograms, with custom knowledge-based data augmentation. Performances, evaluated on an independent test set, were comparable: AdaBoost and CNN differentiated COVID-19-positive from negative subjects with accuracies of 100% and 95%, respectively, and recovered from negative individuals with accuracies of 86.1% and 75%, respectively. This study highlights the possibility of identifying COVID-19-positive subjects, foreseeing a tool for on-site screening, while also considering recovered subjects and the effects of COVID-19 on the voice. The two proposed novel architectures allow for the identification of biomarkers and demonstrate the ongoing relevance of traditional ML versus deep learning in speech analysis.
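The "coupling of boosting and bagging" described, AdaBoost over Random Forest base learners, maps directly onto scikit-learn estimators. A sketch under assumed hyperparameters, not the authors' configuration:

```python
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier

# Bagging inside boosting: each AdaBoost stage fits a small Random Forest.
base = RandomForestClassifier(n_estimators=10, max_depth=3, random_state=0)
model = AdaBoostClassifier(estimator=base, n_estimators=50, random_state=0)
# (scikit-learn < 1.2 names the argument base_estimator instead of estimator.)
# model.fit(X_train, y_train); y_pred = model.predict(X_test) on the selected acoustic features
```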
17. Jain S, Narne VK, Nataraja NP, Madhukesh S, Kumar K, Moore BCJ. The effect of age and hearing sensitivity at frequencies above 8 kHz on auditory stream segregation and speech perception. J Acoust Soc Am 2022; 152:716. PMID: 35931505. DOI: 10.1121/10.0012917.
Abstract
The effects of age and mild hearing loss over the extended high-frequency (EHF) range from 9000 to 16 000 Hz on speech perception and auditory stream segregation were assessed using four groups: (1) young with normal hearing threshold levels (HTLs) over both the conventional and EHF range; (2) older with audiograms matched to those for group 1; (3) young with normal HTLs over the conventional frequency range and elevated HTLs over the EHF range; (4) older with audiograms matched to those for group 3. For speech in quiet, speech recognition thresholds and speech identification scores did not differ significantly across groups. For monosyllables in noise, both greater age and hearing loss over the EHF range adversely affected performance, but the effect of age was much larger than the effect of hearing status. Stream segregation was assessed using a rapid sequence of vowel stimuli differing in fundamental frequency (F0). Larger differences in F0 were required for stream segregation for the two groups with impaired hearing in the EHF range, but there was no significant effect of age. It is argued that impaired hearing in the EHF range is associated with impaired auditory function at lower frequencies, despite normal audiometric thresholds at those frequencies.
Affiliation(s)
- Saransh Jain: All India Institute of Speech and Hearing, University of Mysore, Mysuru-570006 (Kar.), India
- Vijaya Kumar Narne: Department of Medical Rehabilitation Sciences, College of Applied Medical Sciences, King Khalid University, Abha 61481, Saudi Arabia
- N P Nataraja: JSS Institute of Speech and Hearing, University of Mysore, Mysuru-570004 (Kar.), India
- Sanjana Madhukesh: Department of Speech and Hearing, Manipal College of Health Professionals, Manipal-576104 (Kar.), India
- Kruthika Kumar: District Disabled Rehabilitation Centre, Chikmagalur-577126 (Kar.), India
- Brian C J Moore: Cambridge Hearing Group, Department of Psychology, University of Cambridge, Cambridge, CB2 3EB, United Kingdom
18. Babu A, Malik P, Das N, Mandal D. Surface Potential Tuned Single Active Material Comprised Triboelectric Nanogenerator for a High Performance Voice Recognition Sensor. Small 2022; 18:e2201331. PMID: 35499190. DOI: 10.1002/smll.202201331.
Abstract
To fabricate a high-performance and ultrasensitive triboelectric nanogenerator (TENG), choosing a suitable combination of materials from the triboelectric series is one of the prime challenges. An effective way to fabricate a TENG with a single material (abbreviated as S-TENG) is proposed, comprising electrospun nylon nanofibers. The surface potential of the nanofibers is tuned by changing the voltage polarity applied between the needle and collector in the electrospinning setup. The difference in surface potential leads to different work functions, which is the key to designing an S-TENG from a single material. Further, the S-TENG is demonstrated as an ultrahigh-sensitivity acoustic sensor with a mechanoacoustic sensitivity of ≈27,500 mV Pa-1. Owing to its high sensitivity to low-to-middle decibel (60-70 dB) sounds, the S-TENG is highly capable of recognizing different voice signals depending on the condition of the vocal cords. This effective voice recognition ability indicates high potential to open an alternative pathway for medical professionals to detect several diseases, such as neurological voice disorder, muscle tension dysphonia, vocal cord paralysis, and speech delay/disorder related to laryngeal complications.
Affiliation(s)
- Anand Babu: Quantum Materials and Devices Unit, Institute of Nano Science and Technology, Knowledge City, Sector 81, Mohali, 140306, India
- Pinki Malik: Quantum Materials and Devices Unit, Institute of Nano Science and Technology, Knowledge City, Sector 81, Mohali, 140306, India
- Nityananda Das: Department of Physics, Jagannath Kishore College, Purulia, West Bengal, 723101, India
- Dipankar Mandal: Quantum Materials and Devices Unit, Institute of Nano Science and Technology, Knowledge City, Sector 81, Mohali, 140306, India
19. Hinton AS, Yang-Hood A, Schrader AD, Loose C, Ohlemiller KK, McLean WJ. Approaches to Treat Sensorineural Hearing Loss by Hair-Cell Regeneration: The Current State of Therapeutic Developments and Their Potential Impact on Audiological Clinical Practice. J Am Acad Audiol 2022; 32:661-669. PMID: 35609593. PMCID: PMC9129918. DOI: 10.1055/s-0042-1750281.
Abstract
Sensorineural hearing loss (SNHL) is typically a permanent and often progressive condition that is commonly attributed to sensory cell loss. All vertebrates except mammals can regenerate lost sensory cells. Thus, SNHL is currently only treated with hearing aids or cochlear implants. There has been extensive research to understand how regeneration occurs in nonmammals, how hair cells form during development, and what limits regeneration in maturing mammals. These studies motivated efforts to identify therapeutic interventions to regenerate hair cells as a treatment for hearing loss, with a focus on targeting supporting cells to form new sensory hair cells. The approaches include gene therapy and small molecule delivery to the inner ear. At the time of this publication, early-stage clinical trials have been conducted to test targets that have shown evidence of regenerating sensory hair cells in preclinical models. As these potential treatments move closer to a clinical reality, it will be important to understand which therapeutic option is most appropriate for a given population. It is also important to consider which audiological tests should be administered to identify hearing improvement while considering the pharmacokinetics and mechanism of a given approach. Some impacts on audiological practice could include implementing less common audiological measures as standard procedure. As devices are not capable of repairing the damaged underlying biology, hair-cell regeneration treatments could allow patients to benefit more from their devices, move from a cochlear implant candidate to a hearing aid candidate, or move a subject to not needing an assistive device. Here, we describe the background, current state, and future implications of hair-cell regeneration research.
Affiliation(s)
- Aizhen Yang-Hood: Department of Otolaryngology, Central Institute for the Deaf, Fay and Carl Simons Center for Hearing and Deafness, Washington University School of Medicine, Saint Louis, Missouri
- Angela D Schrader: Department of Otolaryngology, Central Institute for the Deaf, Fay and Carl Simons Center for Hearing and Deafness, Washington University School of Medicine, Saint Louis, Missouri
- Kevin K Ohlemiller: Department of Otolaryngology, Central Institute for the Deaf, Fay and Carl Simons Center for Hearing and Deafness, Washington University School of Medicine, Saint Louis, Missouri
- Will J McLean: Frequency Therapeutics, Lexington, Massachusetts; Department of Surgery, University of Connecticut School of Medicine, Farmington, Connecticut
20. Ardeshirrouhanifard S, Fossa SD, Huddart R, Monahan PO, Fung C, Song Y, Dolan ME, Feldman DR, Hamilton RJ, Vaughn D, Martin NE, Kollmannsberger C, Dinh P, Einhorn L, Frisina RD, Travis LB. Ototoxicity After Cisplatin-Based Chemotherapy: Factors Associated With Discrepancies Between Patient-Reported Outcomes and Audiometric Assessments. Ear Hear 2022; 43:794-807. PMID: 35067571. PMCID: PMC9010341. DOI: 10.1097/aud.0000000000001172.
Abstract
OBJECTIVES: To provide new information on factors associated with discrepancies between patient-reported and audiometrically defined hearing loss (HL) in adult-onset cancer survivors after cisplatin-based chemotherapy (CBCT) and to comprehensively investigate risk factors associated with audiometrically defined HL.
DESIGN: A total of 1410 testicular cancer survivors (TCS) ≥6 months post-CBCT underwent comprehensive audiometric assessments (0.25 to 12 kHz) and completed questionnaires. HL severity was defined using American Speech-Language-Hearing Association criteria. Multivariable multinomial regression identified factors associated with discrepancies between patient-reported and audiometrically defined HL, and multivariable ordinal regression evaluated factors associated with the latter.
RESULTS: Overall, 34.8% of TCS self-reported HL. Among TCS without tinnitus, those with audiometrically defined HL at only extended high frequencies (EHFs) (10 to 12 kHz) (17.8%) or at both EHFs and standard frequencies (0.25 to 8 kHz) (23.4%) were significantly more likely to self-report HL than those with no audiometrically defined HL (8.1%) [odds ratio (OR) = 2.48; 95% confidence interval (CI), 1.31 to 4.68; and OR = 3.49; 95% CI, 1.89 to 6.44, respectively]. Older age (OR = 1.09; 95% CI, 1.07 to 1.11, p < 0.0001), absence of prior noise exposure (OR = 1.40; 95% CI, 1.06 to 1.84, p = 0.02), mixed/conductive HL (OR = 2.01; 95% CI, 1.34 to 3.02, p = 0.0007), no hearing aid use (OR = 5.64; 95% CI, 1.84 to 17.32, p = 0.003), and lower education (OR = 2.12; 95% CI, 1.23 to 3.67, p = 0.007 for high school or less education versus postgraduate education) were associated with greater underestimation of audiometrically defined HL severity, while tinnitus was associated with greater overestimation (OR = 4.65; 95% CI, 2.64 to 8.20 for a little tinnitus; OR = 5.87; 95% CI, 2.65 to 13.04 for quite a bit of tinnitus; and OR = 10.57; 95% CI, 4.91 to 22.79 for very much tinnitus; p < 0.0001). Older age (OR = 1.13; 95% CI, 1.12 to 1.15, p < 0.0001), cumulative cisplatin dose (>300 mg/m2, OR = 1.47; 95% CI, 1.21 to 1.80, p = 0.0001), and hypertension (OR = 1.80; 95% CI, 1.28 to 2.52, p = 0.0007) were associated with greater American Speech-Language-Hearing Association-defined HL severity, whereas postgraduate education (OR = 0.58; 95% CI, 0.40 to 0.85, p = 0.005) was associated with less severe HL.
CONCLUSIONS: Discrepancies between patient-reported and audiometrically defined HL after CBCT are due to several factors. For survivors who self-report HL but have normal audiometric findings at standard frequencies, referral to an audiologist for additional testing and inclusion of EHFs in audiometric assessments should be considered.
Affiliation(s)
- Chunkit Fung
- University of Rochester Medical Center, Rochester, NY
- Paul Dinh
- Indiana University, Indianapolis, IN
21
Lough M, Plack CJ. Extended high-frequency audiometry in research and clinical practice. J Acoust Soc Am 2022; 151:1944. [PMID: 35364938] [DOI: 10.1121/10.0009766]
Abstract
Audiometric testing in research and in clinical settings rarely considers frequencies above 8 kHz. However, the sensitivity of young healthy ears extends to 20 kHz, and there is increasing evidence that testing in the extended high-frequency (EHF) region, above 8 kHz, might provide valuable additional information. Basal (EHF) cochlear regions are especially sensitive to the effects of aging, disease, ototoxic drugs, and possibly noise exposure. Hence, EHF loss may be an early warning of damage, useful for diagnosis and for monitoring hearing health. In certain environments, speech perception may rely on EHF information, and there is evidence for an association between EHF loss and speech perception difficulties, although this may not be causal: EHF loss may instead be a marker for sub-clinical damage at lower frequencies. If there is a causal relation, then amplification in the EHF range may be beneficial if the technical difficulties can be overcome. EHF audiometry in the clinic presents with no particular difficulty, the biggest obstacle being lack of specialist equipment. Currently, EHF audiometry has limited but increasing clinical application. With the development of international guidelines and standards, it is likely that EHF testing will become widespread in future.
Affiliation(s)
- Melanie Lough
- Manchester Centre for Audiology and Deafness, The University of Manchester, Oxford Road, Manchester, M13 9PL, United Kingdom
- Christopher J Plack
- Manchester Centre for Audiology and Deafness, The University of Manchester, Oxford Road, Manchester, M13 9PL, United Kingdom
22
McKenna VS, Kendall CL, Patel TH, Howell RJ, Gustin RL. Impact of Face Masks on Speech Acoustics and Vocal Effort in Healthcare Professionals. Laryngoscope 2022; 132:391-397. [PMID: 34287933] [PMCID: PMC8742743] [DOI: 10.1002/lary.29763]
Abstract
OBJECTIVES/HYPOTHESIS We investigated speech acoustics and self-reported vocal symptoms in mask-wearing healthcare professionals. We hypothesized that there would be an attenuation of spectral energies and increase in vocal effort during masked speech compared to unmasked speech. STUDY DESIGN Within and between subject quasi-experimental design. METHODS We prospectively enrolled 21 healthcare providers (13 cisgender female, 8 cisgender male; M = 32.9 years; SD = 7.9 years) and assessed acoustics and perceptual measures with and without a face mask in place. Measurements included: 1) acoustic Vowel Articulation Index (VAI); 2) cepstral and spectral acoustic measures; 3) traditional vocal measures (e.g., fundamental frequency, intensity); 4) relative fundamental frequency (RFF); and 5) self-reported ratings of vocal effort and dyspnea. RESULTS During masked speech, there was a significant reduction in VAI, high-frequency information (>4 kHz), and RFF offset 10, as well as a significant increase in cepstral peak prominence and perceived vocal effort. Further analysis showed that high-frequency attenuation was more pronounced when wearing an N95 mask compared to a simple mask. CONCLUSIONS Face masks pose an additional barrier to effective communication that primarily impacts spectral characteristics, vowel space measures, and vocal effort. Future work should evaluate how long-term mask use impacts vocal health and may contribute to vocal problems. LEVEL OF EVIDENCE 3 Laryngoscope, 132:391-397, 2022.
Affiliation(s)
- Victoria S. McKenna
- Department of Communication Sciences and Disorders, University of Cincinnati
- Department of Biomedical Engineering, University of Cincinnati
- Corresponding Author: 3225 Eden Ave, Cincinnati, Ohio 45267; 513-558-8507
- Courtney L. Kendall
- Department of Communication Sciences and Disorders, University of Cincinnati
- Tulsi H. Patel
- Department of Communication Sciences and Disorders, University of Cincinnati
- Rebecca J. Howell
- Department of Otolaryngology-Head & Neck Surgery, University of Cincinnati
- Renee L. Gustin
- Department of Otolaryngology-Head & Neck Surgery, University of Cincinnati
23
Effect of Masker Head Orientation, Listener Age, and Extended High-Frequency Sensitivity on Speech Recognition in Spatially Separated Speech. Ear Hear 2022; 43:90-100. [PMID: 34260434] [PMCID: PMC8712343] [DOI: 10.1097/aud.0000000000001081]
Abstract
OBJECTIVES Masked speech recognition is typically assessed as though the target and background talkers are all directly facing the listener. However, background speech in natural environments is often produced by talkers facing other directions, and talker head orientation affects the spectral content of speech, particularly at the extended high frequencies (EHFs; >8 kHz). This study investigated the effect of masker head orientation and listeners' EHF sensitivity on speech-in-speech recognition and spatial release from masking in children and adults. DESIGN Participants were 5- to 7-year-olds (n = 15) and adults (n = 34), all with normal hearing up to 8 kHz and a range of EHF hearing thresholds. Speech reception thresholds (SRTs) were measured for target sentences recorded from a microphone directly in front of the talker's mouth and presented from a loudspeaker directly in front of the listener, simulating a target directly in front of and facing the listener. The maskers were two streams of concatenated words recorded from a microphone located at either 0° or 60° azimuth, simulating masker talkers facing the listener or facing away from the listener, respectively. Maskers were presented in one of three spatial conditions: co-located with the target, symmetrically separated on either side of the target (+54° and -54° on the horizontal plane), or asymmetrically separated to the right of the target (both +54° on the horizontal plane). RESULTS Performance was poorer for the facing than for the nonfacing masker head orientation. This benefit of the nonfacing masker head orientation, or head orientation release from masking (HORM), was largest under the co-located condition, but it was also observed for the symmetric and asymmetric masker spatial separation conditions. SRTs were positively correlated with the mean 16-kHz threshold across ears in adults for the nonfacing conditions but not for the facing masker conditions. In adults with normal EHF thresholds, the HORM was comparable in magnitude to the benefit of a symmetric spatial separation of the target and maskers. Although children benefited from the nonfacing masker head orientation, their HORM was reduced compared to adults with normal EHF thresholds. Spatial release from masking was comparable across age groups for symmetric masker placement, but it was larger in adults than children for the asymmetric masker. CONCLUSIONS Masker head orientation affects speech-in-speech recognition in children and adults, particularly those with normal EHF thresholds. This is important because masker talkers do not all face the listener under most natural listening conditions, and assuming a midline orientation would tend to overestimate the effect of spatial separation. The benefits associated with EHF audibility for speech-in-speech recognition may warrant clinical evaluation of thresholds above 8 kHz.
24
Qi S, Chen X, Yang J, Wang X, Tian X, Huang H, Rehmann J, Kuehnel V, Guan J, Xu L. Effects of Adaptive Non-linear Frequency Compression in Hearing Aids on Mandarin Speech and Sound-Quality Perception. Front Neurosci 2021; 15:722970. [PMID: 34483833] [PMCID: PMC8414550] [DOI: 10.3389/fnins.2021.722970]
Abstract
Objective This study aimed to examine the effects of an adaptive non-linear frequency compression algorithm implemented in hearing aids (i.e., SoundRecover2, or SR2) at different parameter settings, and of auditory acclimatization, on speech and sound-quality perception in native Mandarin-speaking adult listeners with sensorineural hearing loss. Design Data consisted of participants’ unaided and aided hearing thresholds, Mandarin consonant and vowel recognition in quiet, and sentence recognition in noise, as well as sound-quality ratings through five sessions in a 12-week period with three SR2 settings (i.e., SR2 off, SR2 default, and SR2 strong). Study Sample Twenty-nine native Mandarin-speaking adults aged 37–76 years with symmetric sloping moderate-to-profound sensorineural hearing loss were recruited. They were all fitted bilaterally with Phonak Naida V90-SP BTE hearing aids with hard ear-molds. Results The participants demonstrated a significant improvement of aided hearing in detecting high-frequency sounds at 8 kHz. For consonant recognition and overall sound-quality rating, the participants performed significantly better with the SR2 default setting than with the other two settings. No significant differences were found in vowel and sentence recognition among the three SR2 settings. Test session was a significant factor that contributed to the participants’ performance in all speech and sound-quality perception tests; specifically, the participants benefited from a longer duration of hearing aid use. Conclusion Findings from this study suggest possible perceptual benefit from the adaptive non-linear frequency compression algorithm for native Mandarin-speaking adults with moderate-to-profound hearing loss. A period of acclimatization should be allowed to reach better performance with novel hearing aid technologies.
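SoundRecover2 itself is proprietary, so the sketch below only illustrates the generic non-linear frequency compression idea the abstract relies on: input frequencies above a cutoff are warped downward by a compression ratio, while lower frequencies pass unchanged. The cutoff and ratio values are illustrative assumptions, not actual SR2 settings.

```python
# Hedged sketch of a generic NFC input-output frequency map (illustrative
# parameters; not the SR2 algorithm or its settings).
def nfc_map(f_hz: float, fc: float = 4000.0, cr: float = 2.0) -> float:
    """Return the compressed output frequency for an input frequency."""
    if f_hz <= fc:
        return f_hz                         # below the cutoff: identity
    return fc * (f_hz / fc) ** (1.0 / cr)   # log-frequency compression above fc

for f in (2000, 4000, 6000, 8000, 10000):
    print(f"{f:>5} Hz -> {nfc_map(f):7.0f} Hz")
```

With these toy values, 10 kHz maps to roughly 6.3 kHz, which is the sense in which otherwise inaudible high-frequency cues can be moved into an aided listener's audible range.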
Affiliation(s)
- Shuang Qi
- Beijing Tongren Hospital, Capital Medical University, Beijing, China; Key Laboratory of Otolaryngology-Head and Neck Surgery, Beijing Institute of Otolaryngology, Capital Medical University, Ministry of Education, Beijing, China
- Xueqing Chen
- Beijing Tongren Hospital, Capital Medical University, Beijing, China; Key Laboratory of Otolaryngology-Head and Neck Surgery, Beijing Institute of Otolaryngology, Capital Medical University, Ministry of Education, Beijing, China
- Jing Yang
- Department of Communication Sciences and Disorders, University of Wisconsin-Milwaukee, Milwaukee, WI, United States
- Xianhui Wang
- Division of Communication Sciences and Disorders, Ohio University, Athens, OH, United States
- Li Xu
- Division of Communication Sciences and Disorders, Ohio University, Athens, OH, United States
25
Thys TM, Treviño J, Nadkarni NM. Perceptual–Acoustic Comparisons of Natural Sonic Environments: Applications for Nature-Deprived Populations. Ecopsychology 2021. [DOI: 10.1089/eco.2021.0007]
Affiliation(s)
- Tierney M. Thys
- Research Department, California Academy of Sciences, San Francisco, California, USA
- Jeffrey Treviño
- Department of Recording and Music Technology, College of Arts, Humanities, and Social Sciences, California State University, Marina, California, USA
- Nalini M. Nadkarni
- School of Biological Science, University of Utah, Salt Lake City, Utah, USA
26
Minimal and Mild Hearing Loss in Children: Association with Auditory Perception, Cognition, and Communication Problems. Ear Hear 2021; 41:720-732. [PMID: 31633598] [DOI: 10.1097/aud.0000000000000802]
Abstract
OBJECTIVES "Minimal" and "mild" hearing loss are the most common but least understood forms of hearing loss in children. Children with better ear hearing level as low as 30 dB HL have a global language impairment and, according to the World Health Organization, a "disabling level of hearing loss." We examined in a population of 6- to 11-year-olds how hearing level ≤40.0 dB HL (1 and 4 kHz pure-tone average, PTA, threshold) is related to auditory perception, cognition, and communication. DESIGN School children (n = 1638) were recruited in 4 centers across the United Kingdom. They completed a battery of hearing (audiometry, filter width, temporal envelope, speech-in-noise) and cognitive (IQ, attention, verbal memory, receptive language, reading) tests. Caregivers assessed their children's communication and listening skills. Children included in this study (702 male; 752 female) had 4 reliable tone thresholds (1, 4 kHz each ear), and no caregiver reported medical or intellectual disorder. Normal-hearing children (n = 1124, 77.1%) had all 4 thresholds and PTA <15 dB HL. Children with ≥15 dB HL for at least 1 threshold and PTA <20 dB (n = 245, 16.8%) had minimal hearing loss. Children with 20 ≤ PTA < 40 dB HL (n = 88, 6.0%) had mild hearing loss. Interaural asymmetric hearing loss (|left PTA - right PTA| ≥10 dB) was found in 28.9% of those with minimal and 39.8% of those with mild hearing loss. RESULTS Speech perception in noise, indexed by vowel-consonant-vowel pseudoword repetition in speech-modulated noise, was impaired in children with minimal and mild hearing loss, relative to normal-hearing children. Effect size was largest (d = 0.63) in asymmetric mild hearing loss and smallest (d = 0.21) in symmetric minimal hearing loss. Spectral (filter width) and temporal (backward masking) perceptions were impaired in children with both forms of hearing loss, but suprathreshold perception generally related only weakly to PTA. Speech-in-noise (nonsense syllables) and language (pseudoword repetition) were also impaired in both forms of hearing loss and correlated more strongly with PTA. Children with mild hearing loss were additionally impaired in working memory (digit span) and reading, and generally performed more poorly than those with minimal loss. Asymmetric hearing loss produced as much impairment overall on both auditory and cognitive tasks as symmetric hearing loss. Nonverbal IQ, attention, and caregiver-rated listening and communication were not significantly impaired in children with hearing loss. Modeling suggested that 15 dB HL is objectively an appropriate lower audibility limit for diagnosis of hearing loss. CONCLUSIONS Hearing loss between 15 and 30 dB PTA is, at ~20%, much more prevalent in 6- to 11-year-old children than most current estimates. Key aspects of auditory and cognitive skills are impaired in both symmetric and asymmetric minimal and mild hearing loss. Hearing loss <30 dB HL is most closely related to speech perception in noise, and to cognitive abilities underpinning language and reading. The results suggest wider use of speech-in-noise measures to diagnose and assess management of hearing loss and reduction of the clinical hearing loss threshold for children to 15 dB HL.
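The grouping rules in this abstract are explicit enough to express as a short calculation. The sketch below encodes them as stated (PTA as the mean of the 1 and 4 kHz thresholds across both ears); reading the asymmetry criterion as an absolute interaural PTA difference is our interpretation.

```python
# Minimal sketch of the abstract's grouping rule; thresholds are in dB HL.
def classify_hearing(l1k, l4k, r1k, r4k):
    thresholds = [l1k, l4k, r1k, r4k]
    pta = sum(thresholds) / 4.0                    # 1 and 4 kHz, both ears
    left_pta = (l1k + l4k) / 2.0
    right_pta = (r1k + r4k) / 2.0
    if pta < 15 and all(t < 15 for t in thresholds):
        category = "normal"
    elif pta < 20:                                 # >=15 dB HL somewhere
        category = "minimal"
    elif pta < 40:
        category = "mild"
    else:
        category = "outside study range"
    asymmetric = abs(left_pta - right_pta) >= 10   # interpreted as |L - R|
    return category, asymmetric

print(classify_hearing(10, 25, 10, 15))            # -> ('minimal', False)
```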
27
Yi H, Pingsterhaus A, Song W. Effects of Wearing Face Masks While Using Different Speaking Styles in Noise on Speech Intelligibility During the COVID-19 Pandemic. Front Psychol 2021; 12:682677. [PMID: 34295288] [PMCID: PMC8292133] [DOI: 10.3389/fpsyg.2021.682677]
Abstract
The coronavirus pandemic has resulted in the recommended/required use of face masks in public. The use of a face mask compromises communication, especially in the presence of competing noise. It is crucial to measure the potential effects of wearing face masks on speech intelligibility in noisy environments where excessive background noise can create communication challenges. The effects of wearing transparent face masks and using clear speech to facilitate better verbal communication were evaluated in this study. We evaluated listener word identification scores in the following four conditions: (1) type of mask condition (i.e., no mask, transparent mask, and disposable face mask), (2) presentation mode (i.e., auditory only and audiovisual), (3) speaking style (i.e., conversational speech and clear speech), and (4) with two types of background noise (i.e., speech-shaped noise and four-talker babble at a -5 dB signal-to-noise ratio). Results indicate that in the presence of noise, listeners performed less well when the speaker wore a disposable face mask or a transparent mask compared to wearing no mask. Listeners correctly identified more words in the audiovisual presentation when listening to clear speech. Results indicate the combination of face masks and the presence of background noise negatively impact speech intelligibility for listeners. Transparent masks facilitate the ability to understand target sentences by providing visual information. Use of clear speech was shown to alleviate challenging communication situations including compensating for a lack of visual cues and reduced acoustic signals.
Affiliation(s)
- Hoyoung Yi
- Department of Speech, Language, and Hearing Sciences, Texas Tech University Health Sciences Center, Lubbock, TX, United States
- Ashly Pingsterhaus
- Department of Speech, Language, and Hearing Sciences, Texas Tech University Health Sciences Center, Lubbock, TX, United States
- Woonyoung Song
- Department of Educational Psychology and Leadership, Texas Tech University, Lubbock, TX, United States
28
Balamurali BT, Enyi T, Clarke CJ, Harn SY, Chen JM. Acoustic Effect of Face Mask Design and Material Choice. Acoustics Australia 2021; 49:505-512. [PMID: 34099950] [PMCID: PMC8172558] [DOI: 10.1007/s40857-021-00245-2]
Abstract
The widespread adoption of face masks is now a standard public health response to the 2020 pandemic. Although studies have shown that wearing a face mask interferes with speech and intelligibility, relating the acoustic response of the mask to design parameters such as fabric choice, number of layers and mask geometry is not well understood. Using a dummy head mounted with a loudspeaker at its mouth generating a broadband signal, we report the acoustic response associated with 10 different masks (different material/design) and the effect of material layers; a small number of masks were found to be almost acoustically transparent (minimal losses). While different mask material and design result in different frequency responses, we find that material selection has somewhat greater influence on transmission characteristics than mask design or geometry choices. SUPPLEMENTARY INFORMATION The online version contains supplementary material available at 10.1007/s40857-021-00245-2.
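A standard way to quantify a mask's acoustic effect with a dummy-head source like this is the insertion loss: the dB difference between the spectra recorded without and with the mask. The sketch below is a hedged illustration of that calculation on synthetic signals, not the authors' pipeline.

```python
# Hedged sketch: insertion loss as the Welch-spectrum ratio between a
# no-mask and a with-mask recording of the same broadband source.
import numpy as np
from scipy.signal import welch

def insertion_loss(no_mask, with_mask, fs, nperseg=4096):
    f, pxx = welch(no_mask, fs, nperseg=nperseg)
    _, pyy = welch(with_mask, fs, nperseg=nperseg)
    return f, 10 * np.log10(pxx / pyy)     # dB attenuation vs. frequency

fs = 48000
rng = np.random.default_rng(0)
no_mask = rng.standard_normal(10 * fs)     # stand-ins for the recordings
with_mask = 0.5 * no_mask                  # toy case: a flat ~6 dB loss
f, il_db = insertion_loss(no_mask, with_mask, fs)
```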
Affiliation(s)
- B. T. Balamurali
- Singapore University of Technology and Design, 8 Somapah Rd, Singapore, Singapore
- Tan Enyi
- Singapore University of Technology and Design, 8 Somapah Rd, Singapore, Singapore
- Sim Yuh Harn
- Singapore University of Technology and Design, 8 Somapah Rd, Singapore, Singapore
- Jer-Ming Chen
- Singapore University of Technology and Design, 8 Somapah Rd, Singapore, Singapore
29
Bröhl F, Kayser C. Delta/theta band EEG differentially tracks low and high frequency speech-derived envelopes. Neuroimage 2021; 233:117958. [PMID: 33744458] [PMCID: PMC8204264] [DOI: 10.1016/j.neuroimage.2021.117958]
Abstract
The representation of speech in the brain is often examined by measuring the alignment of rhythmic brain activity to the speech envelope. To conveniently quantify this alignment (termed 'speech tracking') many studies consider the broadband speech envelope, which combines acoustic fluctuations across the spectral range. Using EEG recordings, we show that using this broadband envelope can provide a distorted picture of speech encoding. We systematically investigated the encoding of spectrally limited speech-derived envelopes presented via individual and multiple noise carriers in the human brain. Tracking in the 1 to 6 Hz EEG bands differentially reflected low (0.2 - 0.83 kHz) and high (2.66 - 8 kHz) frequency speech-derived envelopes. This was independent of the specific carrier frequency but sensitive to attentional manipulations, and may reflect the context-dependent emphasis of information from distinct spectral ranges of the speech envelope in low frequency brain activity. As low and high frequency speech envelopes relate to distinct phonemic features, our results suggest that functionally distinct processes contribute to speech tracking in the same EEG bands, and are easily confounded when considering the broadband speech envelope.
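One conventional recipe for the spectrally limited speech-derived envelopes contrasted here is to band-pass the waveform, take the Hilbert-magnitude envelope, then low-pass and resample it to the EEG rate. The sketch below follows that recipe with the band edges from the abstract; the filter orders and rates are our assumptions, not the authors' exact pipeline.

```python
# Hedged sketch of band-limited envelope extraction (not the authors' code).
import numpy as np
from scipy.signal import butter, hilbert, resample, sosfiltfilt

def band_envelope(x, fs, lo, hi, eeg_fs=100):
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    env = np.abs(hilbert(sosfiltfilt(sos, x)))       # temporal envelope
    sos_lp = butter(4, 20, btype="lowpass", fs=fs, output="sos")
    env = sosfiltfilt(sos_lp, env)                   # smooth before resampling
    return resample(env, int(len(env) * eeg_fs / fs))

fs = 32000
speech = np.random.randn(10 * fs)                    # stand-in waveform
low_env = band_envelope(speech, fs, 200, 830)        # 0.2-0.83 kHz band
high_env = band_envelope(speech, fs, 2660, 8000)     # 2.66-8 kHz band
```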
Affiliation(s)
- Felix Bröhl
- Department for Cognitive Neuroscience, Faculty of Biology, Bielefeld University, Universitätsstr. 25, 33615 Bielefeld, Germany
- Christoph Kayser
- Department for Cognitive Neuroscience, Faculty of Biology, Bielefeld University, Universitätsstr. 25, 33615 Bielefeld, Germany
30
Mapping the human auditory cortex using spectrotemporal receptive fields generated with magnetoencephalography. Neuroimage 2021; 238:118222. [PMID: 34058330] [DOI: 10.1016/j.neuroimage.2021.118222]
Abstract
We present a novel method to map the functional organization of the human auditory cortex noninvasively using magnetoencephalography (MEG). More specifically, this method estimates via reverse correlation the spectrotemporal receptive fields (STRF) in response to a temporally dense pure tone stimulus, from which important spectrotemporal characteristics of neuronal processing can be extracted and mapped back onto the cortex surface. We show that several neuronal populations can be found examining the spectrotemporal characteristics of their STRFs, and demonstrate how these can be used to generate tonotopic gradient maps. In doing so, we show that the spatial resolution of MEG is sufficient to reliably extract important information about the spatial organization of the auditory cortex, while enabling the analysis of complex temporal dynamics of auditory processing such as best temporal modulation rate and response latency given its excellent temporal resolution. Furthermore, because spectrotemporally dense auditory stimuli can be used with MEG, the time required to acquire the necessary data to generate tonotopic maps is significantly less for MEG than for other neuroimaging tools that acquire BOLD-like signals.
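Reverse correlation, the estimation method named here, boils down to correlating the measured response with the stimulus spectrogram over a range of time lags. The toy sketch below shows the idea on synthetic data; it is a schematic of the technique only, not the MEG pipeline, and every array shape in it is an assumption.

```python
# Schematic reverse-correlation STRF estimate on synthetic data.
import numpy as np

def strf_reverse_correlation(spec, resp, n_lags=40):
    """spec: (n_freqs, n_times) stimulus spectrogram; resp: (n_times,)."""
    n_freqs, n_times = spec.shape
    strf = np.zeros((n_freqs, n_lags))
    for lag in range(n_lags):
        # correlate the response at time t with the stimulus at t - lag
        strf[:, lag] = spec[:, : n_times - lag] @ resp[lag:]
    return strf / n_times

rng = np.random.default_rng(0)
spec = rng.standard_normal((32, 5000))         # dense tone-pip stimulus proxy
w = np.exp(-np.arange(32) / 8.0)               # made-up spectral tuning curve
resp = w @ spec + 0.1 * rng.standard_normal(5000)
strf = strf_reverse_correlation(spec, resp)
best_frequency_bin = int(strf[:, 0].argmax())  # peak of the recovered STRF
```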
31
Extended high-frequency hearing and head orientation cues benefit children during speech-in-speech recognition. Hear Res 2021; 406:108230. [PMID: 33951577] [DOI: 10.1016/j.heares.2021.108230]
Abstract
While the audible frequency range for humans spans approximately 20 Hz to 20 kHz, children display enhanced sensitivity relative to adults when detecting extended high frequencies (frequencies above 8 kHz; EHFs), as indicated by better pure tone thresholds. The impact that this increased hearing sensitivity to EHFs may have on children's speech recognition has not been established. One context in which EHF hearing may be particularly important for children is when recognizing speech in the presence of competing talkers. In the present study, we examined the extent to which school-age children (ages 5-17 years) with normal hearing were able to benefit from EHF cues when recognizing sentences in a two-talker speech masker. Two filtering conditions were tested: all stimuli were either full band or were low-pass filtered at 8 kHz to remove EHFs. Given that EHF energy emission in speech is highly dependent on head orientation of the talker (i.e., radiation becomes more directional with increasing frequency), two masker head angle conditions were tested: both co-located maskers were facing 45°, or both were facing 60° relative to the listener. The results demonstrated that regardless of age, children performed better when EHFs were present. In addition, a small change in masker head orientation also impacted performance, with better recognition at 60° compared to 45°. These findings suggest that EHF energy in the speech signal above 8 kHz is beneficial for children in complex listening situations. The magnitude of benefit from EHF cues and talker head orientation cues did not differ between children and adults. Therefore, while EHFs were beneficial for children as young as 5 years of age, children's generally better EHF hearing relative to adults did not provide any additional benefit.
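The 8 kHz low-pass manipulation described here is a standard digital filtering step. A minimal sketch follows; the filter order is an assumption rather than the study's specification, and the noise signal stands in for the actual sentence recordings.

```python
# Minimal sketch of the EHF-removal condition: low-pass filter at 8 kHz.
import numpy as np
from scipy.signal import butter, sosfiltfilt

fs = 44100
stimulus = np.random.randn(3 * fs)        # stand-in for a full-band sentence
# 8th-order Butterworth low-pass (order is an assumption, not the study's)
sos = butter(8, 8000.0, btype="lowpass", fs=fs, output="sos")
low_passed = sosfiltfilt(sos, stimulus)   # zero-phase EHF-removed condition
```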
32
Dabbaghchian S, Arnela M, Engwall O, Guasch O. Simulation of vowel-vowel utterances using a 3D biomechanical-acoustic model. Int J Numer Method Biomed Eng 2021; 37:e3407. [PMID: 33070445] [DOI: 10.1002/cnm.3407]
Abstract
A link is established between biomechanical and acoustic 3D models for the numerical simulation of vowel-vowel utterances. The former rely on the activation and contraction of relevant muscles for voice production, which displace and distort speech organs. However, biomechanical models do not provide a closed computational domain of the 3D vocal tract airway in which to simulate sound wave propagation. An algorithm is thus proposed to extract the vocal tract boundary from the surrounding anatomical structures at each time step of the transition between vowels. The resulting 3D geometries are fed into a 3D finite element acoustic model that solves the mixed wave equation for the acoustic pressure and particle velocity. An arbitrary Lagrangian-Eulerian framework is considered to account for the evolving vocal tract. Examples include six static vowels and three dynamic vowel-vowel utterances. Plausible muscle activation patterns are first determined for the static vowel sounds following an inverse method. Dynamic utterances are then generated by linearly interpolating the muscle activation of the static vowels. Results exhibit a nonlinear trajectory of the vocal tract geometry, similar to that observed in electromagnetic midsagittal articulography. Clear differences are appreciated when comparing the generated sound with that obtained from direct linear interpolation of the vocal tract geometry, that is, interpolation between the starting and ending vocal tract geometries of an utterance without resorting to any biomechanical model.
Affiliation(s)
- Saeed Dabbaghchian
- Department of Speech, Music, and Hearing, KTH Royal Institute of Technology, Stockholm, Sweden
- Marc Arnela
- GTM Grup de recerca en Tecnologies Mèdia, La Salle-Universitat Ramon Llull, Barcelona, Spain
- Olov Engwall
- Department of Speech, Music, and Hearing, KTH Royal Institute of Technology, Stockholm, Sweden
- Oriol Guasch
- GTM Grup de recerca en Tecnologies Mèdia, La Salle-Universitat Ramon Llull, Barcelona, Spain
33
Trine A, Monson BB. Extended High Frequencies Provide Both Spectral and Temporal Information to Improve Speech-in-Speech Recognition. Trends Hear 2020; 24:2331216520980299. [PMID: 33345755] [PMCID: PMC7756042] [DOI: 10.1177/2331216520980299]
Abstract
Several studies have demonstrated that extended high frequencies (EHFs; >8 kHz) in speech are not only audible but also have some utility for speech recognition, including for speech-in-speech recognition when maskers are facing away from the listener. However, the contribution of EHF spectral versus temporal information to speech recognition is unknown. Here, we show that access to EHF temporal information improved speech-in-speech recognition relative to speech bandlimited at 8 kHz but that additional access to EHF spectral detail provided an additional small but significant benefit. Results suggest that both EHF spectral structure and the temporal envelope contribute to the observed EHF benefit. Speech recognition performance was quite sensitive to masker head orientation, with a rotation of only 15° providing a highly significant benefit. An exploratory analysis indicated that pure-tone thresholds at EHFs are better predictors of speech recognition performance than low-frequency pure-tone thresholds.
Affiliation(s)
- Allison Trine
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign, United States
- Brian B Monson
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign, United States; Neuroscience Program, University of Illinois at Urbana-Champaign, Champaign, United States
34
Abstract
OBJECTIVES Hearing in the extended high frequencies (EHFs; >8 kHz) is perceptually and clinically relevant. Recent work suggests the possible role of EHF audibility in natural listening environments (e.g., spatial hearing) and hidden hearing loss. In this article, we examine the development of frequency discrimination (FD) in the EHFs. Specifically, the objectives of the present study were to determine whether the developmental timeline for FD is different for EHFs, and whether the discontinuity of FD thresholds across frequency (representing the hypothetical shift from a temporal to a place code) occurs for children at about the same frequency as for adults. DESIGN Thirty-one normal-hearing children (5 to 12 years) and 15 young adults participated in this study. FD thresholds were measured for standard frequencies (1, 2, 4, 6, and 8 kHz) and EHFs (10 and 12.5 kHz) using a three-alternative (odd-ball) forced-choice paradigm. Statistical analysis focused on examining the change of FD thresholds as a function of age and estimating the breakpoints in the discrimination threshold-frequency functions. RESULTS FD performance in younger children for EHFs was nearly six times poorer relative to older children and adults; however, there was no effect of test frequency on the child-adult difference. Change-point detection on group data revealed a higher knot frequency (representing the putative transition from temporal to place mechanisms) for adults (9.8 kHz) than for children (~6 kHz). Individual spline functions suggest that the knot frequency varied from 2 to 10 kHz across participants. CONCLUSIONS The present study provides evidence for a similar rate of maturation of FD for EHFs and standard frequencies. FD at EHFs matures by 10 to 12 years of age. Adult listeners may not all use temporal cues up to 10 kHz. Young children are relatively inefficient in using temporal fine-structure cues for FD at frequencies above 6 kHz.
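The "knot frequency" reported here comes from change-point fitting of the threshold-versus-frequency function. A hedged sketch of one standard approach, scoring a two-segment linear spline over candidate knots, is below; the threshold values in it are invented for illustration.

```python
# Hedged sketch of knot (change-point) estimation with a two-segment spline.
import numpy as np

def fit_knot(log_f, thresholds, candidates):
    best_knot, best_sse = None, np.inf
    for k in candidates:
        # hinge basis max(log_f - k, 0) joins two line segments at the knot
        X = np.column_stack([np.ones_like(log_f), log_f,
                             np.maximum(log_f - k, 0.0)])
        _, sse, *_ = np.linalg.lstsq(X, thresholds, rcond=None)
        sse = float(sse[0]) if sse.size else np.inf
        if sse < best_sse:
            best_knot, best_sse = k, sse
    return best_knot

freqs_khz = np.array([1, 2, 4, 6, 8, 10, 12.5])
fd = np.array([5.0, 6, 8, 10, 20, 45, 70])     # invented threshold values
log_f = np.log2(freqs_khz)
knot = fit_knot(log_f, fd, np.linspace(log_f[1], log_f[-2], 50))
print(f"estimated knot ~ {2 ** knot:.1f} kHz")
```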
35
Helfer KS, Jesse A. Hearing and speech processing in midlife. Hear Res 2020; 402:108097. [PMID: 33706999] [DOI: 10.1016/j.heares.2020.108097]
Abstract
Middle-aged adults often report a decline in their ability to understand speech in adverse listening situations. However, there has been relatively little research devoted to identifying how early aging affects speech processing, as the majority of investigations into senescent changes in speech understanding compare performance in groups of young and older adults. This paper provides an overview of research on hearing and speech perception in middle-aged adults. Topics covered include both objective and subjective (self-perceived) hearing and speech understanding, listening effort, and audiovisual speech perception. This review ends with justification for future research needed to define the nature, consequences, and remediation of hearing problems in middle-aged adults.
Affiliation(s)
- Karen S Helfer
- Department of Communication Disorders, University of Massachusetts Amherst, 358 N. Pleasant St., Amherst, MA 01003, USA
- Alexandra Jesse
- Department of Psychological and Brain Sciences, University of Massachusetts Amherst, 135 Hicks Way, Amherst, MA 01003, USA
36
Talkington WJ, Donai J, Kadner AS, Layne ML, Forino A, Wen S, Gao S, Gray MM, Ashraf AJ, Valencia GN, Smith BD, Khoo SK, Gray SJ, Lass N, Brefczynski-Lewis JA, Engdahl S, Graham D, Frum CA, Lewis JW. Electrophysiological Evidence of Early Cortical Sensitivity to Human Conspecific Mimic Voice as a Distinct Category of Natural Sound. J Speech Lang Hear Res 2020; 63:3539-3559. [PMID: 32936717] [PMCID: PMC8060013] [DOI: 10.1044/2020_jslhr-20-00063]
Abstract
Purpose From an anthropological perspective of hominin communication, the human auditory system likely evolved to enable special sensitivity to sounds produced by the vocal tracts of human conspecifics, whether attended or passively heard. While numerous electrophysiological studies have used stereotypical human-produced verbal (speech voice and singing voice) and nonverbal vocalizations to identify human voice-sensitive responses, controversy remains as to when (and where) processing of acoustic signal attributes characteristic of "human voiceness" per se initiates in the brain. Method To explore this, we used animal vocalizations and human-mimicked versions of those calls ("mimic voice") to examine late auditory evoked potential responses in humans. Results Here, we revealed an N1b component (96-120 ms poststimulus) during a nonattending listening condition showing significantly greater magnitude in response to mimics, beginning as early as primary auditory cortices, preceding the time window reported in previous studies that revealed species-specific vocalization processing initiating in the range of 147-219 ms. During a sound discrimination task, a P600 (500-700 ms poststimulus) component showed specificity for accurate discrimination of human mimic voice. Distinct acoustic signal attributes and features of the stimuli were used in a classifier model, which could distinguish most human from animal voice comparably to behavioral data, though none of these single features could adequately distinguish human voiceness. Conclusions These results provide novel ideas for algorithms used in neuromimetic hearing aids, as well as direct electrophysiological support for a neurocognitive model of natural sound processing that informs both neurodevelopmental and anthropological models regarding the establishment of auditory communication systems in humans. Supplemental Material https://doi.org/10.23641/asha.12903839.
Affiliation(s)
- William J. Talkington
- Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Jeremy Donai
- Department of Communication Sciences and Disorders, College of Education and Human Services, West Virginia University, Morgantown
- Alexandra S. Kadner
- Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Molly L. Layne
- Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Andrew Forino
- Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Sijin Wen
- Department of Biostatistics, West Virginia University, Morgantown
- Si Gao
- Department of Biostatistics, West Virginia University, Morgantown
- Margeaux M. Gray
- Department of Biology, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Alexandria J. Ashraf
- Department of Biology, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Gabriela N. Valencia
- Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Brandon D. Smith
- Department of Biology, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Stephanie K. Khoo
- Department of Biology, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Stephen J. Gray
- Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Norman Lass
- Department of Communication Sciences and Disorders, College of Education and Human Services, West Virginia University, Morgantown
- Susannah Engdahl
- Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- David Graham
- Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown
- Chris A. Frum
- Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- James W. Lewis
- Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
37
Erbele ID, Fink MR, Mankekar G, Son LS, Mehta R, Arriaga MA. Over-under cartilage tympanoplasty: technique, results and a call for improved reporting. J Laryngol Otol 2020; 134:1-7. [PMID: 33019948] [DOI: 10.1017/s0022215120001978]
Abstract
OBJECTIVE This study aimed to describe the microscopic over-under cartilage tympanoplasty technique, provide hearing results and detail clinically significant complications. METHOD This was a retrospective case series chart review study of over-under cartilage tympanoplasty procedures performed by the senior author between January 2015 and January 2019 at three tertiary care centres. Cases were excluded for previous or intra-operative cholesteatoma, if a mastoidectomy was performed during the procedure or if ossiculoplasty was performed. Hearing results and complications were obtained. RESULTS Sixty-eight tympanoplasty procedures met the inclusion criteria. The median age was 13 years (range, 3-71 years). The mean improvement in pure tone average was 6 dB (95 per cent confidence interval 4-9 dB; p < 0.0001). The overall perforation closure rate was 97 per cent (n = 66). Revision surgery was recommended for a total of 6 cases (9 per cent) including 2 post-operative perforations, 1 case of middle-ear cholesteatoma and 3 cases of external auditory canal scarring. CONCLUSION Over-under cartilage tympanoplasty is effective at improving clinically meaningful hearing with a low rate of post-operative complications.
Affiliation(s)
- I D Erbele
- Department of Otolaryngology, Division of Neurotology, Louisiana State University Health Sciences Center, New Orleans, USA
- Our Lady of the Lake Hearing and Balance Center, Baton Rouge, USA
- M R Fink
- Medical School, Louisiana State University Health Sciences Center, New Orleans, USA
- G Mankekar
- Department of Otolaryngology, Division of Neurotology, Louisiana State University Health Sciences Center, New Orleans, USA
- Department of Otolaryngology, Louisiana State University Health Sciences Center, Shreveport, USA
- L S Son
- Department of Otolaryngology, Division of Neurotology, Louisiana State University Health Sciences Center, New Orleans, USA
- Our Lady of the Lake Hearing and Balance Center, Baton Rouge, USA
- R Mehta
- Department of Otolaryngology, Division of Neurotology, Louisiana State University Health Sciences Center, New Orleans, USA
- Our Lady of the Lake Hearing and Balance Center, Baton Rouge, USA
- M A Arriaga
- Department of Otolaryngology, Division of Neurotology, Louisiana State University Health Sciences Center, New Orleans, USA
- Our Lady of the Lake Hearing and Balance Center, Baton Rouge, USA
- Culicchia Neurological Clinic, New Orleans, USA
38
Speights Atkins M, Bailey DJ, Boyce SE. Speech exemplar and evaluation database (SEED) for clinical training in articulatory phonetics and speech science. Clin Linguist Phon 2020; 34:878-886. [PMID: 32200647] [DOI: 10.1080/02699206.2020.1743761]
Abstract
One challenge faced by teachers of phonetics, speech science, and clinical speech disorders courses is providing meaningful instruction that closes the theory to practice gap. One barrier to providing this type of deep learning experience is the lack of publicly available examples of speech recordings that illustrate comparisons between typical and disordered speech production across a broad range of disorder populations. Data of this type exist, but are typically collected for specific research projects under narrowly written IRB protocols that do not allow for release of even de-identified speech recordings to other investigators or teachers. As a partial corrective to this problem, we have developed an approved publicly available database of speech recordings that provides illustrative examples of adult and child speech production from individuals with and without speech disorders. The recorded speech materials were designed to illustrate important clinical concepts, and the recordings were collected under controlled conditions using high-quality equipment. The ultimate goal of creating this corpus is to improve practitioners' and scientists' understanding of the scientific bases of knowledge in our profession and improve our ability to develop clinical scientists and young researchers in the field.
Affiliation(s)
- Dallin J Bailey
- Communication Disorders, Auburn University, Auburn, AL, USA
- Suzanne E Boyce
- Communication Sciences and Disorders, University of Cincinnati, Cincinnati, OH, USA
39
Norel R, Agurto C, Heisig S, Rice JJ, Zhang H, Ostrand R, Wacnik PW, Ho BK, Ramos VL, Cecchi GA. Speech-based characterization of dopamine replacement therapy in people with Parkinson's disease. NPJ Parkinsons Dis 2020; 6:12. [PMID: 32566741] [PMCID: PMC7293295] [DOI: 10.1038/s41531-020-0113-5]
Abstract
People with Parkinson's disease (PWP) are under constant tension with respect to their dopamine replacement therapy (DRT) regimen. Waiting too long between doses results in more prominent symptoms, loss of motor function, and greater risk of falling per step. Shortened pill cycles can lead to accelerated habituation and faster development of disabling dyskinesias. The Movement Disorder Society-Unified Parkinson's Disease Rating Scale (MDS-UPDRS) is the gold standard for monitoring Parkinson's disease progression but requires a neurologist to administer and therefore is not an ideal instrument to continuously evaluate short-term disease fluctuations. We investigated the feasibility of using speech to detect changes in medication states, based on expectations of subtle changes in voice and content related to dopaminergic levels. We calculated acoustic and prosodic features for three speech tasks (picture description, reverse counting, and diadochokinetic rate) for 25 PWP, each evaluated "ON" and "OFF" DRT. Additionally, we generated semantic features for the picture description task. Classification of ON/OFF medication states using features generated from the picture description, reverse counting, and diadochokinetic rate tasks resulted in cross-validated accuracy rates of 0.89, 0.84, and 0.60, respectively. The most discriminating task was picture description, which provided evidence that participants are more likely to use action words in the ON than in the OFF state. We also found that speech tempo was modified by DRT. Our results suggest that automatic speech assessment can capture changes associated with the DRT cycle. Given the ease of acquiring speech data, this method shows promise to remotely monitor DRT effects.
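Cross-validated accuracies like the 0.89 reported here are typically produced by training and testing a classifier over per-recording feature vectors. The sketch below shows that general workflow with scikit-learn on stand-in features; it is not the authors' model, and in a real analysis the folds should be grouped by participant (e.g., GroupKFold) to avoid leakage.

```python
# Hedged sketch of cross-validated ON/OFF classification on stand-in data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 20))   # 25 PWP x 2 states, 20 speech features
y = np.tile([0, 1], 25)             # 0 = OFF medication, 1 = ON medication

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
acc = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print(f"cross-validated accuracy: {acc.mean():.2f}")  # chance-level here
```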
Affiliation(s)
- R Norel
- IBM T.J. Watson Research Center, Yorktown Heights, NY 10598 USA
- C Agurto
- IBM T.J. Watson Research Center, Yorktown Heights, NY 10598 USA
- S Heisig
- IBM T.J. Watson Research Center, Yorktown Heights, NY 10598 USA
- J J Rice
- IBM T.J. Watson Research Center, Yorktown Heights, NY 10598 USA
- H Zhang
- Pfizer Digital Medicine & Translational Imaging: Early Clinical Development, Cambridge, MA 02139 USA
- R Ostrand
- IBM T.J. Watson Research Center, Yorktown Heights, NY 10598 USA
- P W Wacnik
- Pfizer Digital Medicine & Translational Imaging: Early Clinical Development, Cambridge, MA 02139 USA
- B K Ho
- Department of Neurology, Tufts University School of Medicine and Tufts Medical Center, 800 Washington St, Boston, MA 02111 USA
- V L Ramos
- Pfizer Digital Medicine & Translational Imaging: Early Clinical Development, Cambridge, MA 02139 USA
- G A Cecchi
- IBM T.J. Watson Research Center, Yorktown Heights, NY 10598 USA
40
Nakeva von Mentzer C. Phonemic discrimination and reproduction in 4-5-year-old children: Relations to hearing. Int J Pediatr Otorhinolaryngol 2020; 133:109981. [PMID: 32247932] [DOI: 10.1016/j.ijporl.2020.109981]
Abstract
OBJECTIVE The long-term objective of this research is to highlight the importance of speech perception assessment in children with developmental language disorder (DLD), and to investigate how hearing contributes to speech and language skills. As a first step in fulfilling this aim, the present study explored relations between phonemic discrimination and reproduction, and sensitive measures of hearing, in young healthy children. METHODS The American Listen-Say test was developed and served as the speech perception tool. This test reports speech discrimination of phonemic contrasts quantitatively in both quiet and noise conditions, along with reproduction scores, all measured within one session. Speech tokens were perceptually homogenized in noise. Forty-one 4-5-year-old American children participated. Phonemic discrimination (in quiet and in speech-shaped noise) and phonemic reproduction, audiometric thresholds in the conventional (1-8 kHz) and extended high frequency (EHF; 10-16 kHz) range, and distortion product otoacoustic emissions (DPOAEs) were examined. RESULTS All children had normal hearing thresholds within the conventional range (mean PTA bilaterally 8.6 dB HL). Ten (24.3%) of the children had elevated EHF thresholds (>20 dB HL) for one or more frequencies or ears, and six (14.6%) had DPOAE signal-to-noise ratios (SNR) <6 dB. EHF thresholds and DPOAE SNRs were significantly associated. Children's phonemic discrimination was impaired in noise relative to quiet. There was a moderate, significant correlation between overall phonemic discrimination in noise and EHF audiometric thresholds. CONCLUSIONS Overall, the present study showed that sensitive hearing measures enabled the detection of subtle hearing difficulties in young healthy children. In particular, phonemic discrimination in noise showed associations with hearing. Implications of including sensitive hearing measures in children with DLD are discussed.
41
Frühholz S, Trost W, Grandjean D, Belin P. Neural oscillations in human auditory cortex revealed by fast fMRI during auditory perception. Neuroimage 2020; 207:116401. [DOI: 10.1016/j.neuroimage.2019.116401]
42
Monson BB, Caravello J. The maximum audible low-pass cutoff frequency for speech. J Acoust Soc Am 2019; 146:EL496. [PMID: 31893732] [DOI: 10.1121/1.5140032]
Abstract
Speech energy beyond 8 kHz is often audible for listeners with normal hearing. Limits to audibility in this frequency range are not well described. This study assessed the maximum audible low-pass cutoff frequency for speech, relative to full-bandwidth speech. The mean audible cutoff frequency was approximately 13 kHz, with a small but significant effect of talker sex. Better pure tone thresholds at extended high frequencies correlated with higher audible cutoff frequency. These findings demonstrate that bandlimiting speech even at 13 kHz results in a detectable loss for the average normal-hearing listener, suggesting there is information regarding the speech signal beyond 13 kHz.
Affiliation(s)
- Brian B Monson
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, 901 South Sixth Street, Champaign, Illinois 61820
- Jacob Caravello
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, 901 South Sixth Street, Champaign, Illinois 61820
43
Radziwon KE, Sheppard A, Salvi RJ. Psychophysical changes in temporal processing in chinchillas with noise-induced hearing loss: A literature review. J Acoust Soc Am 2019; 146:3733. [PMID: 31795701] [DOI: 10.1121/1.5132292]
Abstract
It is well-established that excessive noise exposure can systematically shift audiometric thresholds (i.e., noise-induced hearing loss, NIHL) making sounds at the lower end of the dynamic range difficult to detect. An often overlooked symptom of NIHL is the degraded ability to resolve temporal fluctuations in supra-threshold signals. Given that the temporal properties of speech are highly dynamic, it is not surprising that NIHL greatly reduces one's ability to clearly decipher spoken language. However, systematic characterization of noise-induced impairments on supra-threshold signals in humans is difficult given the variability in noise exposure among individuals. Fortunately, the chinchilla is audiometrically similar to humans, making it an ideal animal model to investigate noise-induced supra-threshold deficits. Through a series of studies using the chinchilla, the authors have elucidated several noise-induced deficits in temporal processing that occur at supra-threshold levels. These experiments highlight the importance of the chinchilla model in developing an understanding of noise-induced deficits in temporal processing.
Affiliation(s)
- Kelly E Radziwon
- Center for Hearing & Deafness, Department of Communicative Disorders and Sciences, State University of New York at Buffalo, 137 Cary Hall, Buffalo, New York 14214, USA
- Adam Sheppard
- Department of Communicative Disorders and Sciences, State University of New York at Buffalo, 137 Cary Hall, Buffalo, New York 14214, USA
- Richard J Salvi
- Center for Hearing & Deafness, Department of Communicative Disorders and Sciences, State University of New York at Buffalo, 137 Cary Hall, Buffalo, New York 14214, USA
44
Best V, Roverud E, Baltzell L, Rennies J, Lavandier M. The importance of a broad bandwidth for understanding "glimpsed" speech. J Acoust Soc Am 2019; 146:3215. [PMID: 31795657] [PMCID: PMC6847933] [DOI: 10.1121/1.5131651]
Abstract
When a target talker speaks in the presence of competing talkers, the listener must not only segregate the voices but also understand the target message based on a limited set of spectrotemporal regions ("glimpses") in which the target voice dominates the acoustic mixture. Here, the hypothesis that a broad audible bandwidth is more critical for these sparse representations of speech than it is for intact speech is tested. Listeners with normal hearing were presented with sentences that were either intact, or progressively "glimpsed" according to a competing two-talker masker presented at various levels. This was achieved by using an ideal binary mask to exclude time-frequency units in the target that would be dominated by the masker in the natural mixture. In each glimpsed condition, speech intelligibility was measured for a range of low-pass conditions (cutoff frequencies from 500 to 8000 Hz). Intelligibility was poorer for sparser speech, and the bandwidth required for optimal intelligibility increased with the sparseness of the speech. The combined effects of glimpsing and bandwidth reduction were well captured by a simple metric based on the proportion of audible target glimpses retained. The findings may be relevant for understanding the impact of high-frequency hearing loss on everyday speech communication.
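The "glimpsed" stimuli are constructed with an ideal binary mask: the target's time-frequency units are kept only where the target would dominate the target-plus-masker mixture. The sketch below illustrates that construction with an STFT; the 0 dB local criterion and the window length are assumptions, and the noise signals stand in for the recorded talkers.

```python
# Hedged sketch of ideal-binary-mask "glimpsing" (illustrative parameters).
import numpy as np
from scipy.signal import istft, stft

def glimpsed_target(target, masker, fs, criterion_db=0.0, nperseg=512):
    _, _, T = stft(target, fs, nperseg=nperseg)
    _, _, M = stft(masker, fs, nperseg=nperseg)
    local_snr_db = 20 * np.log10(np.abs(T) / (np.abs(M) + 1e-12))
    ibm = local_snr_db > criterion_db       # keep target-dominated units only
    _, y = istft(T * ibm, fs, nperseg=nperseg)
    return y

fs = 16000
target = np.random.randn(3 * fs)            # stand-in for the target sentence
masker = np.random.randn(3 * fs)            # stand-in for the 2-talker masker
sparse_speech = glimpsed_target(target, masker, fs)
```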
Affiliation(s)
- Virginia Best
- Department of Speech, Language and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Elin Roverud
- Department of Speech, Language and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Lucas Baltzell
- Department of Speech, Language and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Jan Rennies
- Department of Speech, Language and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Mathieu Lavandier
- Department of Speech, Language and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
Collapse
|
45
|
Glottal Source Contribution to Higher Order Modes in the Finite Element Synthesis of Vowels. APPLIED SCIENCES-BASEL 2019. [DOI: 10.3390/app9214535] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Articulatory speech synthesis has long been based on one-dimensional (1D) approaches. They assume plane wave propagation within the vocal tract and disregard higher order modes that typically appear above 5 kHz. However, such modes may be relevant in obtaining a more natural voice, especially for phonation types with significant high frequency energy (HFE) content. This work studies the contribution of the glottal source at high frequencies in the 3D numerical synthesis of vowels. The spoken vocal range is explored using an LF (Liljencrants–Fant) model enhanced with aspiration noise and controlled by the Rd glottal shape parameter. The vowels [ɑ], [i], and [u] are generated with a finite element method (FEM) using realistic 3D vocal tract geometries obtained from magnetic resonance imaging (MRI), as well as simplified straight vocal tracts of a circular cross-sectional area. The symmetry of the latter prevents the onset of higher order modes. Thus, the comparison between realistic and simplified geometries enables us to analyse the influence of such modes. The simulations indicate that higher order modes may be perceptually relevant, particularly for tense phonations (lower Rd values) and/or high fundamental frequency (F0) values. Conversely, vowels with a lax phonation and/or low F0s may result in inaudible HFE levels, especially if aspiration noise is not considered in the glottal source model.
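The full LF model requires solving implicit equations for its return-phase constants, so the sketch below uses a simpler Rosenberg-type glottal flow pulse as a stand-in for the glottal source, with an open quotient loosely playing the role the Rd shape parameter plays in the abstract (tenser phonation, shorter open phase, more high-frequency energy). All names and parameter values are illustrative assumptions; this is not the source model used in the cited study.

```python
# Simplified stand-in for an LF-style glottal source: a Rosenberg-type pulse train.
import numpy as np

def rosenberg_pulse_train(f0, fs, dur, open_quotient=0.6, speed_quotient=3.0):
    t0 = 1.0 / f0                                   # fundamental period (s)
    n_period = int(round(fs * t0))                  # samples per period
    n_open = int(n_period * open_quotient)          # glottal open phase
    n_rise = int(n_open * speed_quotient / (speed_quotient + 1.0))
    n_fall = n_open - n_rise
    pulse = np.zeros(n_period)
    n1 = np.arange(n_rise)
    pulse[:n_rise] = 0.5 * (1.0 - np.cos(np.pi * n1 / max(n_rise, 1)))   # opening
    n2 = np.arange(n_fall)
    pulse[n_rise:n_open] = np.cos(0.5 * np.pi * n2 / max(n_fall, 1))     # closing
    n_periods = int(dur * f0)
    return np.tile(pulse, n_periods)

# Example: 0.5 s of a 120 Hz source sampled at 44.1 kHz
source = rosenberg_pulse_train(f0=120.0, fs=44100, dur=0.5)
```

Feeding such a source into a vocal tract filter (1D or 3D) makes it easy to compare how much high-frequency energy reaches the output as the pulse shape is tensed or relaxed.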
Collapse
|
46
|
Abstract
OBJECTIVE To determine the long-term hearing preservation rate for spontaneous vestibular schwannoma treated by primary radiotherapy. DATA SOURCES The MEDLINE/PubMed, Web of Science, Cochrane Reviews, and EMBASE databases were searched using a comprehensive Boolean keyword search developed in conjunction with a scientific librarian. English-language papers published from 2000 to 2016 were evaluated. STUDY SELECTION Inclusion criteria: full articles, pretreatment and posttreatment audiograms or an audiogram-based scoring system, vestibular schwannoma as the only tumor type, reported time to follow-up, published after 1999, and use of either Gamma Knife or linear accelerator radiotherapy. Exclusion criteria: case report or series with fewer than five cases, inadequate audiometric data, inadequate time to follow-up, neurofibromatosis type 2 exceeding 10% of the study population, previous treatment exceeding 10% of the study population, repeat datasets, use of proton beam therapy, and non-English language. DATA EXTRACTION Two reviewers independently analyzed papers for inclusion. Class A/B, 1/2 hearing was defined as either a pure tone average of 50 dB or less with a speech discrimination score of 50% or more, American Academy of Otolaryngology-Head & Neck Surgery (AAO-HNS) Hearing Class A or B, or Gardner-Robertson Grade I or II. Aggregate data were used when individual data were not specified. DATA SYNTHESIS Means were compared with Student's t test. CONCLUSIONS Forty-seven articles containing a total of 2,195 patients with preserved Class A/B, 1/2 hearing were identified for analysis. The aggregate crude hearing preservation rate was 58% at an average reporting time of 46.6 months after radiotherapy. Analysis of time-based reporting shows a clear trend of decreasing hearing preservation extending to 10-year follow-up. These data encourage a future long-term controlled trial.
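The "aggregate crude" preservation rate described above is simply the pooled count of preserved ears divided by the pooled count of followed ears across studies. A minimal sketch of that arithmetic is shown below; the per-study counts are placeholders, not data from the cited meta-analysis.

```python
# Sketch of a crude pooled hearing-preservation rate: preserved / followed across studies.
studies = [
    # (patients with preserved Class A/B, 1/2 hearing, patients with audiometric follow-up)
    (30, 52),   # placeholder study 1
    (18, 33),   # placeholder study 2
    (41, 70),   # placeholder study 3
]
preserved = sum(p for p, _ in studies)
followed = sum(n for _, n in studies)
print(f"Crude pooled preservation rate: {preserved / followed:.1%}")
```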
Collapse
|
47
|
Monson BB, Rock J, Schulz A, Hoffman E, Buss E. Ecological cocktail party listening reveals the utility of extended high-frequency hearing. Hear Res 2019; 381:107773. [PMID: 31404807 DOI: 10.1016/j.heares.2019.107773] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/22/2019] [Revised: 07/19/2019] [Accepted: 07/27/2019] [Indexed: 10/26/2022]
Abstract
A fundamental principle of neuroscience is that each species' and individual's sensory systems are tailored to meet the demands placed upon them by their environments and experiences. What has driven the upper limit of the human frequency range of hearing? The traditional view is that sensitivity to the highest frequencies (i.e., beyond 8 kHz) facilitates localization of sounds in the environment. However, this has yet to be demonstrated for naturally occurring non-speech sounds. An alternative view is that, for social species such as humans, the biological relevance of conspecific vocalizations has driven the development and retention of auditory system features. Here, we provide evidence for the latter theory. We evaluated the contribution of extended high-frequency (EHF) hearing to common ecological speech perception tasks. We found that restricting access to EHFs reduced listeners' discrimination of talker head orientation by approximately 34%. Furthermore, access to EHFs significantly improved speech recognition under listening conditions in which the target talker's head was facing the listener while co-located background talkers faced away from the listener. Our findings raise the possibility that sensitivity to the highest audio frequencies fosters communication and socialization of the human species. These findings suggest that loss of sensitivity to the highest frequencies may lead to deficits in speech perception. Such EHF hearing loss typically goes undiagnosed, but is widespread among the middle-aged population.
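Restricting access to extended high frequencies, as in the listening conditions described above, is commonly implemented by low-pass filtering speech around 8 kHz. The sketch below shows one way to do this; the filter type, order, and zero-phase filtering are illustrative choices, not the cited study's exact signal processing.

```python
# Sketch: remove extended high-frequency (EHF) content by low-pass filtering at 8 kHz.
import numpy as np
from scipy.signal import butter, sosfiltfilt

def remove_ehf(speech, fs, cutoff_hz=8000.0, order=8):
    """Return speech with energy above cutoff_hz strongly attenuated."""
    sos = butter(order, cutoff_hz, btype="lowpass", fs=fs, output="sos")
    return sosfiltfilt(sos, speech)

# Example: the sampling rate must comfortably exceed 16 kHz for an 8 kHz cutoff to matter.
fs = 44100
token = np.random.randn(fs)          # stand-in for a 1 s speech token
band_limited = remove_ehf(token, fs)
```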
Collapse
Affiliation(s)
- Brian B Monson
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, United States; Neuroscience Program, University of Illinois at Urbana-Champaign, United States.
| | - Jenna Rock
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, United States
| | - Anneliese Schulz
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, United States
| | - Elissa Hoffman
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, United States
| | - Emily Buss
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, United States
| |
Collapse
|
48
|
Agurto C, Ahmad O, Cecchi GA, Norel R, Pietrowicz M, Eyigoz EK, Mosmiller E, Baxi E, Rothstein JD, Roy P, Berry J, Maragakis NJ. Analyzing progression of motor and speech impairment in ALS. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2019; 2019:6097-6102. [PMID: 31947236 DOI: 10.1109/embc.2019.8857300] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Amyotrophic lateral sclerosis (ALS) is a degenerative disease that causes the death of the neurons controlling voluntary muscles. It is currently assessed with subjective clinical measurements, but it would benefit from alternative surrogate biomarkers that can better estimate disease progression. This work analyzes the speech and fine motor coordination of subjects recruited by the Answer ALS foundation, using data collected with a mobile app. In addition, clinical variables such as the speech, writing, and total ALSFRS-R scores were acquired, along with forced and slow vital capacity. Cross-sectional and longitudinal analyses were performed using the speech and fine motor features. Results show that both types of features are useful for inferring the clinical variables, especially for males (R2 = 0.79 for the ALSFRS-R total score), but their initial values are not helpful for predicting speech and motor decline. However, we found that the longitudinal progressions of bulbar-onset and spinal-onset ALS differ and can be identified with high accuracy from the extracted features.
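The cross-sectional part of such an analysis boils down to regressing a clinical score on extracted features and reporting R2 on held-out subjects. The sketch below shows that pattern on synthetic data; the regressor choice, feature columns, split, and all names are illustrative assumptions, not the pipeline used in the cited study.

```python
# Sketch: regress a clinical score (e.g., ALSFRS-R total) on speech/motor features, report R2.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_subjects = 120
X = rng.normal(size=(n_subjects, 12))                                   # placeholder features
y = 48 - 4 * X[:, 0] + 2 * X[:, 3] + rng.normal(scale=3, size=n_subjects)  # synthetic score

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
model = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
print(f"R2 on held-out subjects: {r2_score(y_te, model.predict(X_te)):.2f}")
```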
Collapse
|
49
|
Speech Perception in Noise and Listening Effort of Older Adults With Nonlinear Frequency Compression Hearing Aids. Ear Hear 2019; 39:215-225. [PMID: 28806193 DOI: 10.1097/aud.0000000000000481] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVES The purpose of this laboratory-based study was to compare the efficacy of two hearing aid fittings with and without nonlinear frequency compression, implemented within commercially available hearing aids. Previous research regarding the utility of nonlinear frequency compression has revealed conflicting results for speech recognition, marked by high individual variability. Individual differences in auditory function and cognitive abilities, specifically hearing loss slope and working memory, may contribute to aided performance. The first aim of the study was to determine the effect of nonlinear frequency compression on aided speech recognition in noise and listening effort using a dual-task test paradigm. The hypothesis, based on the Ease of Language Understanding model, was that nonlinear frequency compression would improve speech recognition in noise and decrease listening effort. The second aim of the study was to determine if listener variables of hearing loss slope, working memory capacity, and age would predict performance with nonlinear frequency compression. DESIGN A total of 17 adults (age, 57-85 years) with symmetrical sensorineural hearing loss were tested in the sound field using hearing aids fit to target (NAL-NL2). Participants were recruited with a range of hearing loss severities and slopes. A within-subjects, single-blinded design was used to compare performance with and without nonlinear frequency compression. Speech recognition in noise and listening effort were measured by adapting the Revised Speech in Noise Test into a dual-task paradigm. Participants were required trial-by-trial to repeat the last word of each sentence presented in speech babble and then recall the sentence-ending words after every block of six sentences. Half of the sentences were rich in context for the recognition of the final word of each sentence, and half were neutral in context. Extrinsic factors of sentence context and nonlinear frequency compression were manipulated, and intrinsic factors of hearing loss slope, working memory capacity, and age were measured to determine which participant factors were associated with benefit from nonlinear frequency compression. RESULTS On average, speech recognition in noise performance significantly improved with the use of nonlinear frequency compression. Individuals with steeply sloping hearing loss received more recognition benefit. Recall performance also significantly improved at the group level, with nonlinear frequency compression revealing reduced listening effort. The older participants within the study cohort received less recall benefit than the younger participants. The benefits of nonlinear frequency compression for speech recognition and listening effort did not correlate with each other, suggesting separable sources of benefit for these outcome measures. CONCLUSIONS Improvements of speech recognition in noise and reduced listening effort indicate that adult hearing aid users can receive benefit from nonlinear frequency compression in a noisy environment, with the amount of benefit varying across individuals and across outcome measures. Evidence supports individualized selection of nonlinear frequency compression, with results suggesting benefits in speech recognition for individuals with steeply sloping hearing losses and in listening effort for younger individuals. Future research is indicated with a larger data set on the dual-task paradigm as a potential cognitive outcome measure.
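For readers unfamiliar with nonlinear frequency compression, commercial implementations typically remap input frequencies above a start (cutoff) frequency to lower output frequencies, compressing them on a roughly logarithmic scale while leaving lower frequencies unchanged. The sketch below shows that mapping; the start frequency and compression ratio are illustrative values, not the fitted settings used in the study above.

```python
# Sketch of the nonlinear frequency compression (NFC) frequency mapping.
import numpy as np

def nfc_output_frequency(f_in_hz, start_hz=2000.0, compression_ratio=2.0):
    """Map input frequencies to output frequencies under log-domain compression above start_hz."""
    f_in_hz = np.asarray(f_in_hz, dtype=float)
    compressed = start_hz * (f_in_hz / start_hz) ** (1.0 / compression_ratio)
    return np.where(f_in_hz <= start_hz, f_in_hz, compressed)

# Example: with these settings a 6 kHz consonant cue is relocated to roughly 3.5 kHz,
# which is why NFC can help listeners with steeply sloping high-frequency losses.
print(nfc_output_frequency([1000.0, 4000.0, 6000.0]))
```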
Collapse
|
50
|
Niebuhr O, Nazaryan AN. Money Talks — But Less Well so over the Mobile Phone? The Persistence of the Telephone Voice in a 4G Technology Setting and the Resulting Implications for Business Communication and Mobile-Phone Innovation. INTERNATIONAL JOURNAL OF INNOVATION AND TECHNOLOGY MANAGEMENT 2019. [DOI: 10.1142/s0219877019500135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
Our study is a first step toward the innovative further development of mobile phones, with special emphasis on optimizing them for business communication. Traditional landline phones and mobile phones up to 3G technology are known to trigger the so-called “telephone voice”. The phonetic changes induced by the telephone voice (louder speech at a higher pitch level) can undermine a speaker's perceived competence, trustworthiness, and charisma and can thus negatively influence business dealings conducted over the mobile phone. In a speech production experiment with 20 speakers and a subsequent acoustic speech-signal analysis of almost 15,000 utterances, we tested, in comparison to a baseline face-to-face dialog condition, whether the telephone voice still exists in a VoLTE 4G mobile-phone setting. In fact, we found that the typical characteristics of the telephone voice persist even under the best current 4G standards and under quiet communication conditions. Moreover, we identified further acoustic-phonetic parameters of the telephone voice, some of which (such as a more monotonous intonation) further compound the problem of business communication over the mobile phone. Taken together, the extended parametric picture and the persistence of the “telephone voice” even under quiet 4G conditions suggest that a speech-in-noise-like (i.e., Lombard) adaptation is not the only, and perhaps not even the primary, cause of the telephone voice. Based on this, we propose a number of innovations and R&D activities for making mobile-phone technology more suitable for business communication.
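The acoustic-phonetic comparison behind claims like these can be approximated by extracting, per utterance, a pitch level, a pitch-variability measure (a proxy for monotony), and an RMS level, and then averaging within each condition. The sketch below shows one such profile; the library choice, pitch range, and function names are assumptions, not the measurement pipeline used in the cited study.

```python
# Sketch: per-utterance pitch level, pitch variability, and RMS level for condition comparisons.
import numpy as np
import librosa

def utterance_profile(path):
    y, sr = librosa.load(path, sr=None)
    f0, voiced, _ = librosa.pyin(y, fmin=65.0, fmax=400.0, sr=sr)
    f0_voiced = f0[voiced & ~np.isnan(f0)]
    rms_db = 20 * np.log10(np.mean(librosa.feature.rms(y=y)) + 1e-12)
    return {
        "mean_f0_hz": float(np.mean(f0_voiced)),
        "f0_sd_hz": float(np.std(f0_voiced)),   # lower SD ~ more monotonous intonation
        "rms_level_db": float(rms_db),
    }

# Averaging these profiles separately for face-to-face and 4G phone recordings
# gives the kind of condition contrast the abstract describes.
```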
Collapse
Affiliation(s)
- Oliver Niebuhr
- CIE — Centre for Industrial Electronics, Mads Clausen Institute, University of Southern Denmark, Sønderborg, Denmark
| | - Anush Norika Nazaryan
- Department of General Linguistics, Institute of Scandinavian Studies, Frisian, and General Linguistics, Kiel University, Germany
| |
Collapse
|