1
Regev J, Zaar J, Relaño-Iborra H, Dau T. Age-related reduction of amplitude modulation frequency selectivity. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2023; 153:2298. [PMID: 37092934 DOI: 10.1121/10.0017835] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/16/2022] [Accepted: 03/27/2023] [Indexed: 05/03/2023]
Abstract
The perception of amplitude modulations (AMs) has been characterized by a frequency-selective process in the temporal envelope domain and simulated in computational auditory processing and perception models using a modulation filterbank. Such AM frequency-selective processing has been argued to be critical for the perception of complex sounds, including speech. This study aimed at investigating the effects of age on behavioral AM frequency selectivity in young (n = 11, 22-29 years) versus older (n = 10, 57-77 years) listeners with normal hearing, using a simultaneous AM masking paradigm with a sinusoidal carrier (2.8 kHz), target modulation frequencies of 4, 16, 64, and 128 Hz, and narrowband-noise modulation maskers. A reduction of AM frequency selectivity by a factor of up to 2 was found in the older listeners. While the observed AM selectivity co-varied with the unmasked AM detection sensitivity, the age-related broadening of the masked threshold patterns remained stable even when AM sensitivity was similar across groups for an extended stimulus duration. The results from the present study might provide a valuable basis for further investigations exploring the effects of age and reduced AM frequency selectivity on complex sound perception as well as the interaction of age and hearing impairment on AM processing and perception.
Affiliation(s)
- Jonathan Regev
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, 2800, Denmark
- Johannes Zaar
- Eriksholm Research Centre, Snekkersten, 3070, Denmark
- Helia Relaño-Iborra
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, 2800, Denmark
- Torsten Dau
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, 2800, Denmark

2
Relaño-Iborra H, Dau T. Speech intelligibility prediction based on modulation frequency-selective processing. Hear Res 2022; 426:108610. [PMID: 36163219 DOI: 10.1016/j.heares.2022.108610] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/18/2022] [Revised: 08/22/2022] [Accepted: 09/12/2022] [Indexed: 11/17/2022]
Abstract
Speech intelligibility models can provide insights regarding the auditory processes involved in human speech perception and communication. One successful approach to modelling speech intelligibility has been based on the analysis of the amplitude modulations present in speech as well as competing interferers. This review covers speech intelligibility models that include a modulation-frequency selective processing stage, i.e., a modulation filterbank, as part of their front end. The speech-based envelope power spectrum model [sEPSM, Jørgensen and Dau (2011). J. Acoust. Soc. Am. 130(3), 1475-1487], several variants of the sEPSM including modifications with respect to temporal resolution, spectro-temporal processing and binaural processing, as well as the speech-based computational auditory signal processing and perception model [sCASP; Relaño-Iborra et al. J. Acoust. Soc. Am. 146(5), 3306-3317], which is based on an established auditory signal detection and masking model, are discussed. The key processing stages of these models for the prediction of speech intelligibility across a variety of acoustic conditions are addressed in relation to competing modeling approaches. The strengths and weaknesses of the modulation-based analysis are outlined and perspectives presented, particularly in connection with the challenge of predicting the consequences of individual hearing loss on speech intelligibility.
Affiliation(s)
- Helia Relaño-Iborra
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kgs. Lyngby 2800, Denmark; Cognitive Systems Section, Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kgs. Lyngby 2800, Denmark
- Torsten Dau
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kgs. Lyngby 2800, Denmark

3
Eurich B, Encke J, Ewert SD, Dietz M. Lower interaural coherence in off-signal bands impairs binaural detection. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2022; 151:3927. [PMID: 35778173 DOI: 10.1121/10.0011673] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/10/2021] [Accepted: 05/24/2022] [Indexed: 06/15/2023]
Abstract
Differences in interaural phase configuration between a target and a masker can lead to substantial binaural unmasking. This effect is decreased for masking noises with an interaural time difference (ITD). Adding a second noise with an opposing ITD in most cases further reduces binaural unmasking. Thus far, modeling of these detection thresholds required both a mechanism for internal ITD compensation and an increased filter bandwidth. An alternative explanation for the reduction is that unmasking is impaired by the lower interaural coherence in off-frequency regions caused by the second masker [Marquardt and McAlpine (2009). J. Acoust. Soc. Am. 126(6), EL177-EL182]. Based on this hypothesis, the current work proposes a quantitative multi-channel model using monaurally derived peripheral filter bandwidths and an across-channel incoherence interference mechanism. This mechanism differs from wider filters since it has no effect when the masker coherence is constant across frequency bands. Combined with a monaural energy discrimination pathway, the model predicts the differences between a single delayed noise and two opposingly delayed noises as well as four other data sets. It helps resolve the inconsistency that simulating some data requires wide filters while others require narrow filters.
Affiliation(s)
- Bernhard Eurich
- Department für Medizinische Physik und Akustik, Universität Oldenburg, 26111 Oldenburg, Germany
- Jörg Encke
- Department für Medizinische Physik und Akustik, Universität Oldenburg, 26111 Oldenburg, Germany
- Stephan D Ewert
- Department für Medizinische Physik und Akustik, Universität Oldenburg, 26111 Oldenburg, Germany
- Mathias Dietz
- Department für Medizinische Physik und Akustik, Universität Oldenburg, 26111 Oldenburg, Germany

4
Gottschalk M, Verhey JL. Modelling suppression and comodulation masking release using the dual-resonance nonlinear filter. JASA EXPRESS LETTERS 2022; 2:014401. [PMID: 36154223 DOI: 10.1121/10.0009130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Although comodulation masking release (CMR) is commonly associated with across-channel processes, it was often argued that part of the effect may be explained by processing within an auditory filter. One peripheral mechanism for such a within-channel process is cochlear suppression. Using the dual-resonance nonlinear filter model with different sets of model parameters, the present study shows that the simulated CMR is associated with the simulated two-tone suppression. A modification of the model parameters results in a more accurate prediction of suppression and is thus also more accurate in predicting the contribution of suppression to CMR.
Affiliation(s)
- Martin Gottschalk
- Department of Experimental Audiology, Otto von Guericke University Magdeburg, Leipziger Str. 44, 39120 Magdeburg, Germany
- Jesko L Verhey
- Department of Experimental Audiology, Otto von Guericke University Magdeburg, Leipziger Str. 44, 39120 Magdeburg, Germany

5
Ewert SD, Paraouty N, Lorenzi C. A two-path model of auditory modulation detection using temporal fine structure and envelope cues. Eur J Neurosci 2020; 51:1265-1278. [DOI: 10.1111/ejn.13846] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 06/09/2017] [Revised: 01/18/2018] [Accepted: 01/18/2018] [Indexed: 11/30/2022]
Affiliation(s)
- Stephan D. Ewert
- Medizinische Physik and Cluster of Excellence Hearing4All, Universität Oldenburg, 26111 Oldenburg, Germany
- Nihaad Paraouty
- Laboratoire des systèmes perceptifs, Département d’études cognitives, École normale supérieure, CNRS, PSL Research University, Paris, France
- Christian Lorenzi
- Laboratoire des systèmes perceptifs, Département d’études cognitives, École normale supérieure, CNRS, PSL Research University, Paris, France

6
Ihlefeld A, Chen YW, Sanes DH. Developmental Conductive Hearing Loss Reduces Modulation Masking Release. Trends Hear 2018; 20:2331216516676255. [PMID: 28215119 PMCID: PMC5318943 DOI: 10.1177/2331216516676255] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 11/29/2022] Open
Abstract
Hearing-impaired individuals experience difficulties in detecting or understanding speech, especially in background sounds within the same frequency range. However, normally hearing (NH) human listeners experience less difficulty detecting a target tone in background noise when the envelope of that noise is temporally gated (modulated) than when that envelope is flat across time (unmodulated). This perceptual benefit is called modulation masking release (MMR). When flanking masker energy is added well outside the frequency band of the target, and comodulated with the original modulated masker, detection thresholds improve further (MMR+). In contrast, if the flanking masker is antimodulated with the original masker, thresholds worsen (MMR−). These interactions across disparate frequency ranges are thought to require central nervous system (CNS) processing. Therefore, we explored the effect of developmental conductive hearing loss (CHL) in gerbils on MMR characteristics, as a test for putative CNS mechanisms. The detection thresholds of NH gerbils were lower in modulated noise, when compared with unmodulated noise. The addition of a comodulated flanker further improved performance, whereas an antimodulated flanker worsened performance. However, for CHL-reared gerbils, all three forms of masking release were reduced when compared with NH animals. These results suggest that developmental CHL impairs both within- and across-frequency processing and provide behavioral evidence that CNS mechanisms are affected by a peripheral hearing impairment.
Affiliation(s)
- Antje Ihlefeld
- Department of Biomedical Engineering, New Jersey Institute of Technology, Newark, NJ, USA
- Yi-Wen Chen
- Center for Neural Science, New York University, NY, USA
- Dan H Sanes
- Center for Neural Science, New York University, NY, USA; Department of Psychology, New York University, NY, USA; Department of Biology, New York University, NY, USA

7
Wright BA, Fitzgerald MB. Detection of tones of unexpected frequency in amplitude-modulated noise. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 142:2043. [PMID: 29092596 DOI: 10.1121/1.5007718] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/07/2023]
Abstract
Detection of a tonal signal in amplitude-modulated noise can improve with increases in noise bandwidth if the pattern of amplitude fluctuations is uniform across frequency, a phenomenon termed comodulation masking release (CMR). Most explanations for CMR rely on an assumption that listeners monitor frequency channels both at and remote from the signal frequency in conditions that yield the effect. To test this assumption, detectability was assessed for signals presented at expected and unexpected frequencies in wideband amplitude-modulated noise. Detection performance was high even for signals of unexpected frequency, suggesting that listeners were monitoring multiple frequency channels, as has been assumed.
Affiliation(s)
- Beverly A Wright
- Department of Communication Sciences and Disorders and Knowles Hearing Center, Northwestern University, 2240 Campus Drive, Evanston, Illinois 60208, USA
- Matthew B Fitzgerald
- Department of Otolaryngology/Head and Neck Surgery, Stanford University, Stanford Ear Institute, 2452 Watson Court, Palo Alto, California 94303, USA

8
Ewert SD, Schubotz W, Brand T, Kollmeier B. Binaural masking release in symmetric listening conditions with spectro-temporally modulated maskers. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 142:12. [PMID: 28764456 DOI: 10.1121/1.4990019] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 06/07/2023]
Abstract
Speech reception thresholds (SRTs) decrease as target and maskers are spatially separated (spatial release from masking, SRM). The current study systematically assessed how SRTs and SRM for a frontal target in a spatially symmetric masker configuration depend on spectro-temporal masker properties, the availability of short-time interaural level difference (ILD) and interaural time difference (ITD), and informational masking. Maskers ranged from stationary noise to single, interfering talkers and were modified by head-related transfer functions to provide: (i) different binaural cues (ILD, ITD, or both) and (ii) independent maskers in each ear ("infinite ILD"). Additionally, a condition was tested in which only information from short-time spectro-temporal segments of the ear with a favorable signal-to-noise ratio (better-ear glimpses) was presented. For noise-based maskers, ILD, ITD, and spectral changes related to masker location contributed similarly to SRM, while ILD cues played a larger role if temporal modulation was introduced. For speech maskers, glimpsing and perceived location contributed roughly equally and ITD contributed less. The "infinite ILD" condition might suggest better-ear glimpsing limitations resulting in a maximal SRM of 12 dB for maskers with low or absent informational masking. Comparison to binaural model predictions highlighted the importance of short-time processing and helped to clarify the contribution of the different binaural cues and mechanisms.
Affiliation(s)
- Stephan D Ewert
- Medizinische Physik and Cluster of Excellence Hearing4all, Universität Oldenburg, D-26111 Oldenburg, Germany
- Wiebke Schubotz
- Medizinische Physik and Cluster of Excellence Hearing4all, Universität Oldenburg, D-26111 Oldenburg, Germany
- Thomas Brand
- Medizinische Physik and Cluster of Excellence Hearing4all, Universität Oldenburg, D-26111 Oldenburg, Germany
- Birger Kollmeier
- Medizinische Physik and Cluster of Excellence Hearing4all, Universität Oldenburg, D-26111 Oldenburg, Germany

9
Diepenbrock JP, Jeschke M, Ohl FW, Verhey J. Comodulation masking release in the inferior colliculus by combined signal enhancement and masker reduction. J Neurophysiol 2016; 117:853-867. [PMID: 27784801 DOI: 10.1152/jn.00191.2016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/03/2016] [Revised: 10/03/2016] [Accepted: 10/21/2016] [Indexed: 11/22/2022] Open
Abstract
Auditory signals that contain coherent level fluctuations of a masker in different frequency regions enhance the detectability of an embedded sinusoidal target signal, an effect commonly known as comodulation masking release (CMR). Neural correlates have been proposed at different stages of the auditory system. While later stages seem to suppress the response to the masker, earlier stages are more likely to enhance their response to the signal when the masker is comodulated. Using a flanking band masking paradigm, the present study investigates how CMR is represented at the level of the inferior colliculus of the Mongolian gerbil. The responses to a target signal at various sound pressure levels in three different masking conditions were compared. In one condition the masker was a 10-Hz amplitude modulated sinusoid centered at the signal frequency while in the other two conditions six off-frequency carriers (flanking bands) were added. From 81 units 26 showed a change that enhanced the detectability of the signal if the temporal modulation of the added flanking bands was identical to that of the masker at the signal frequency compared to the other two masking conditions. This study shows that the response characteristics of these neurons represent an intermediate stage between the representation in the cochlear nucleus and the auditory cortex. This means that the response is increased during the signal intervals but is also decreased for the following masker portions.
10
Joosten ERM, Shamma SA, Lorenzi C, Neri P. Dynamic Reweighting of Auditory Modulation Filters. PLoS Comput Biol 2016; 12:e1005019. [PMID: 27398600 PMCID: PMC4939963 DOI: 10.1371/journal.pcbi.1005019] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 03/17/2015] [Accepted: 06/13/2016] [Indexed: 11/22/2022] Open
Abstract
Sound waveforms convey information largely via amplitude modulations (AM). A large body of experimental evidence has provided support for a modulation (bandpass) filterbank. Details of this model have varied over time partly reflecting different experimental conditions and diverse datasets from distinct task strategies, contributing uncertainty to the bandwidth measurements and leaving important issues unresolved. We adopt here a solely data-driven measurement approach in which we first demonstrate how different models can be subsumed within a common 'cascade' framework, and then proceed to characterize the cascade via system identification analysis using a single stimulus/task specification and hence stable task rules largely unconstrained by any model or parameters. Observers were required to detect a brief change in level superimposed onto random level changes that served as AM noise; the relationship between trial-by-trial noisy fluctuations and corresponding human responses enables targeted identification of distinct cascade elements. The resulting measurements exhibit a dynamic complex picture in which human perception of auditory modulations appears adaptive in nature, evolving from an initial lowpass to bandpass modes (with broad tuning, Q∼1) following repeated stimulus exposure.
Affiliation(s)
- Eva R. M. Joosten
- Laboratoire Psychologie de la Perception (CNRS UMR 8242) and Université Paris Descartes, Sorbonne Paris Cité, Paris, France
- Shihab A. Shamma
- Laboratoire des Systèmes Perceptifs (CNRS UMR 8248) and Département d’études cognitives, Ecole Normale Supérieure, PSL Research University, Paris, France
- Department of Electrical and Computer Engineering, Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America
- Christian Lorenzi
- Laboratoire des Systèmes Perceptifs (CNRS UMR 8248) and Département d’études cognitives, Ecole Normale Supérieure, PSL Research University, Paris, France
- Peter Neri
- Laboratoire des Systèmes Perceptifs (CNRS UMR 8248) and Département d’études cognitives, Ecole Normale Supérieure, PSL Research University, Paris, France

11
Schubotz W, Brand T, Kollmeier B, Ewert SD. Monaural speech intelligibility and detection in maskers with varying amounts of spectro-temporal speech features. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2016; 140:524. [PMID: 27475175 DOI: 10.1121/1.4955079] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 05/28/2023]
Abstract
Speech intelligibility is strongly affected by the presence of maskers. Depending on the spectro-temporal structure of the masker and its similarity to the target speech, different masking aspects can occur which are typically referred to as energetic, amplitude modulation, and informational masking. In this study speech intelligibility and speech detection were measured in maskers that vary systematically in the time-frequency domain from steady-state noise to a single interfering talker. Male and female target speech was used in combination with maskers based on speech for the same or different gender. Observed data were compared to predictions of the speech intelligibility index, extended speech intelligibility index, multi-resolution speech-based envelope-power-spectrum model, and the short-time objective intelligibility measure. The different models served as analysis tools to help distinguish between the different masking aspects. Comparison shows that overall masking can to a large extent be explained by short-term energetic masking. However, the other masking aspects (amplitude modulation and informational masking) influence speech intelligibility as well. Additionally, it was obvious that all models showed considerable deviations from the data. Therefore, the current study provides a benchmark for further evaluation of speech prediction models.
Affiliation(s)
- Wiebke Schubotz
- Medizinische Physik and Cluster of Excellence Hearing4all, Universität Oldenburg, D-26111 Oldenburg, Germany
- Thomas Brand
- Medizinische Physik and Cluster of Excellence Hearing4all, Universität Oldenburg, D-26111 Oldenburg, Germany
- Birger Kollmeier
- Medizinische Physik and Cluster of Excellence Hearing4all, Universität Oldenburg, D-26111 Oldenburg, Germany
- Stephan D Ewert
- Medizinische Physik and Cluster of Excellence Hearing4all, Universität Oldenburg, D-26111 Oldenburg, Germany

12
Lentz JJ, Valentine S. Across-frequency processing of modulation phase differences in hearing-impaired listeners. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2015; 138:EL205-EL211. [PMID: 26428814 PMCID: PMC4560714 DOI: 10.1121/1.4929624] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2015] [Revised: 06/19/2015] [Accepted: 08/12/2015] [Indexed: 06/05/2023]
Abstract
Two experiments tested the influence of hearing impairment (HI) on representing across-frequency temporal coherence. In one experiment, HI listeners demonstrated similar abilities to normal-hearing listeners in detecting across-frequency differences in modulation phase. In another, spectral-shape discrimination was detrimentally affected by modulation phase disparities imposed on spectral components. Spectral-shape discrimination by HI listeners was less influenced by the disparities, suggesting that hearing loss alters the representation of envelope phase. Results suggest that multiple approaches may be necessary to determine alterations associated with hearing loss—detection tasks may not be sufficient to elucidate distortions to temporal envelope associated with hearing loss.
Affiliation(s)
- Jennifer J Lentz
- Department of Speech and Hearing Sciences, Indiana University, Bloomington, Indiana 47405, USA
- Susie Valentine
- Department of Speech and Hearing Sciences, Indiana University, Bloomington, Indiana 47405, USA

13
Grzeschik R, Lübken B, Verhey JL. Comodulation masking release in an off-frequency masking paradigm. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2015; 138:1194-1205. [PMID: 26328732 DOI: 10.1121/1.4928134] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/05/2023]
Abstract
Detection threshold of a sinusoidal signal masked by a broadband masker is lower when on- and off-frequency masker components have a correlated envelope, compared to a condition in which these masker components have different envelopes. This effect is commonly referred to as comodulation masking release (CMR). The present study investigated if there is a CMR in the absence of a masker component at the signal frequency, i.e., in an off-frequency masking paradigm. Thresholds were measured for a 500-Hz signal in the presence of a broadband masker with a spectral notch at the signal frequency. Thresholds were significantly lower for a (co-)modulated than for an unmodulated masker for all notch widths up to 400 Hz. An additional experiment showed that the particularly large CMR for the no-notch condition was due to the way the modulated masker was generated. No CMR was measured when the notched-noise masker was replaced by a pair of narrowband noises. The addition of more remote masker bands resulted in a CMR of about 3-4 dB. The notched-noise data were predicted on the basis of a modulation-filterbank model. The predictions of the narrowband noise conditions indicated that all mechanisms underlying CMR might still not be fully understood.
Affiliation(s)
- Ramona Grzeschik
- Department of Experimental Audiology, Otto von Guericke University Magdeburg, Leipziger Straße 44, 39120 Magdeburg, Germany
- Björn Lübken
- Department of Experimental Audiology, Otto von Guericke University Magdeburg, Leipziger Straße 44, 39120 Magdeburg, Germany
- Jesko L Verhey
- Department of Experimental Audiology, Otto von Guericke University Magdeburg, Leipziger Straße 44, 39120 Magdeburg, Germany

14
Chabot-Leclerc A, Jørgensen S, Dau T. The role of auditory spectro-temporal modulation filtering and the decision metric for speech intelligibility prediction. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2014; 135:3502-12. [PMID: 24907813 DOI: 10.1121/1.4873517] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 05/28/2023]
Abstract
Speech intelligibility models typically consist of a preprocessing part that transforms stimuli into some internal (auditory) representation and a decision metric that relates the internal representation to speech intelligibility. The present study analyzed the role of modulation filtering in the preprocessing of different speech intelligibility models by comparing predictions from models that either assume a spectro-temporal (i.e., two-dimensional) or a temporal-only (i.e., one-dimensional) modulation filterbank. Furthermore, the role of the decision metric for speech intelligibility was investigated by comparing predictions from models based on the signal-to-noise envelope power ratio, SNRenv, and the modulation transfer function, MTF. The models were evaluated in conditions of noisy speech (1) subjected to reverberation, (2) distorted by phase jitter, or (3) processed by noise reduction via spectral subtraction. The results suggested that a decision metric based on the SNRenv may provide a more general basis for predicting speech intelligibility than a metric based on the MTF. Moreover, the one-dimensional modulation filtering process was found to be sufficient to account for the data when combined with a measure of across (audio) frequency variability at the output of the auditory preprocessing. A complex spectro-temporal modulation filterbank might therefore not be required for speech intelligibility prediction.
Affiliation(s)
- Alexandre Chabot-Leclerc
- Department of Electrical Engineering, Centre for Applied Hearing Research, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
- Søren Jørgensen
- Department of Electrical Engineering, Centre for Applied Hearing Research, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
- Torsten Dau
- Department of Electrical Engineering, Centre for Applied Hearing Research, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark

15
Christiansen SK, Jepsen ML, Dau T. Effects of tonotopicity, adaptation, modulation tuning, and temporal coherence in "primitive" auditory stream segregation. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2014; 135:323-333. [PMID: 24437772 DOI: 10.1121/1.4845675] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 06/03/2023]
Abstract
The perceptual organization of two-tone sequences into auditory streams was investigated using a modeling framework consisting of an auditory pre-processing front end [Dau et al., J. Acoust. Soc. Am. 102, 2892-2905 (1997)] combined with a temporal coherence-analysis back end [Elhilali et al., Neuron 61, 317-329 (2009)]. Two experimental paradigms were considered: (i) Stream segregation as a function of tone repetition time (TRT) and frequency separation (Δf) and (ii) grouping of distant spectral components based on onset/offset synchrony. The simulated and experimental results of the present study supported the hypothesis that forward masking enhances the ability to perceptually segregate spectrally close tone sequences. Furthermore, the modeling suggested that effects of neural adaptation and processing through modulation-frequency selective filters may enhance the sensitivity to onset asynchrony of spectral components, facilitating the listeners' ability to segregate temporally overlapping sounds into separate auditory objects. Overall, the modeling framework may be useful to study the contributions of bottom-up auditory features on "primitive" grouping, also in more complex acoustic scenarios than those considered here.
Affiliation(s)
- Torsten Dau
- Oticon Centre of Excellence for Hearing and Speech Sciences, Technical University of Denmark, DK-2800 Lyngby, Denmark