1
Fan Y, Gifford RH. Objective measure of binaural processing: Acoustic change complex in response to interaural phase differences. Hear Res 2024;448:109020. PMID: 38763034. DOI: 10.1016/j.heares.2024.109020.
Abstract
Combining cochlear implants (CIs) with binaural acoustic hearing via preserved hearing in the implanted ear(s) is commonly referred to as combined electric and acoustic stimulation (EAS). Compared with traditional bimodal fittings, which provide only contralateral, monaural acoustic hearing, EAS can yield significant benefit for speech recognition in complex noise, perceived listening difficulty, and horizontal-plane localization. However, EAS benefit varies across patients, and the degree of benefit is not reliably related to the underlying audiogram. Previous research has indicated that EAS benefit for speech recognition in complex listening scenarios and for localization is significantly correlated with patients' sensitivity to binaural cues, namely interaural time differences (ITDs). For pure tones, interaural phase differences (IPDs) and ITDs are two perspectives on the same phenomenon: a simple mathematical conversion transforms one into the other, illustrating their inherent interrelation for spatial hearing. However, assessing binaural cue sensitivity is not part of the clinical assessment battery, as psychophysical tasks are time consuming, require training to reach performance asymptote, and demand specialized programming and software, all of which render them clinically unfeasible. In this study, we investigated an objective measure of binaural cue sensitivity based on the acoustic change complex (ACC), elicited by imposing an IPD of varying magnitude at the stimulus midpoint. Ten adult listeners with normal hearing completed behavioral and objective measures of binaural cue sensitivity at carrier frequencies of 250 and 1000 Hz. Results suggest that 1) ACC amplitude increases with IPD; 2) ACC-based IPD sensitivity at 250 Hz is significantly correlated with behavioral ITD sensitivity; and 3) participants were more sensitive to IPDs at 250 Hz than at 1000 Hz. Thus, this objective measure of IPD sensitivity may hold clinical application for pre- and postoperative assessment of individuals meeting candidacy criteria for cochlear implantation with low-frequency acoustic hearing preservation, as this relatively quick, objective measure may help clinicians identify the patients most likely to benefit from EAS technology.
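The pure-tone IPD/ITD equivalence described in this abstract is a fixed conversion: an ITD of t seconds at carrier frequency f corresponds to an IPD of 360·f·t degrees. A minimal sketch of the relationship (function names are illustrative, not from the paper):

```python
def ipd_to_itd(ipd_deg: float, freq_hz: float) -> float:
    """Interaural phase difference (degrees) -> equivalent
    interaural time difference (seconds) for a pure tone."""
    return ipd_deg / (360.0 * freq_hz)

def itd_to_ipd(itd_s: float, freq_hz: float) -> float:
    """Interaural time difference (seconds) -> equivalent
    interaural phase difference (degrees) for a pure tone."""
    return 360.0 * freq_hz * itd_s

# A 90-degree IPD at a 250 Hz carrier corresponds to a 1 ms ITD.
itd = ipd_to_itd(90.0, 250.0)   # 0.001 s
```

Note that the same ITD maps to a four-times-larger IPD at 1000 Hz than at 250 Hz, which is one reason phase-based sensitivity is compared across carrier frequencies.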
Affiliation(s)
- Yibo Fan
- Department of Hearing and Speech Sciences, Vanderbilt University, School of Medicine, Nashville, TN 37232, USA
- René H Gifford
- Department of Hearing and Speech Sciences, Vanderbilt University, School of Medicine, Nashville, TN 37232, USA
2
Kanagokar V, Fathima H, Bhat JS, Muthu ANP. Effect of inter-aural temporal envelope differences on inter-aural time difference thresholds for amplitude modulated noise. Codas 2024;36:e20220261. PMID: 38324806. PMCID: PMC10903954. DOI: 10.1590/2317-1782/20232022261.
Abstract
PURPOSE The inter-aural time difference (ITD) and inter-aural level difference (ILD) are important acoustic cues for horizontal localization and spatial release from masking. These cues are encoded through inter-aural comparison of tonotopically matched binaural inputs, so binaural coherence, i.e., inter-aural spectro-temporal similarity, is a prerequisite for encoding ITDs and ILDs. The modulation depth of the envelope is an important characteristic for encoding envelope ITDs. However, an inter-aural difference in modulation depth can reduce binaural coherence and degrade the representation of binaural cues, as happens with reverberation, noise, and compression in cochlear implants and hearing aids. This study investigated the effect of inter-aural modulation depth difference on ITD thresholds for amplitude-modulated noise in young adults with normal hearing. METHODS Amplitude-modulated, high-pass filtered noise with varying modulation depth differences was presented sequentially through headphones. In one ear the modulation depth was held at 90%; in the other it varied from 90% to 50%. ITD thresholds for modulation frequencies of 8 Hz and 16 Hz were estimated as a function of the inter-aural modulation depth difference. RESULTS The Friedman test revealed a statistically significant increase in ITD threshold with increasing inter-aural modulation depth difference at both 8 Hz and 16 Hz. CONCLUSION The results indicate that inter-aural differences in modulation depth negatively impact ITD perception for amplitude-modulated, high-pass filtered noise.
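As a rough illustration of the stimuli described above, sinusoidally amplitude-modulated noise with an independently chosen modulation depth per ear can be generated as follows (a simplified sketch: the study used high-pass filtered noise and a formal adaptive threshold procedure, both omitted here):

```python
import numpy as np

def am_noise(dur_s: float, fs: int, mod_freq_hz: float,
             mod_depth: float, seed: int = 0) -> np.ndarray:
    """Gaussian noise with a sinusoidal amplitude envelope.
    mod_depth is the modulation index m in (1 + m*sin(2*pi*fm*t))."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(dur_s * fs)) / fs
    carrier = rng.standard_normal(t.size)
    envelope = 1.0 + mod_depth * np.sin(2.0 * np.pi * mod_freq_hz * t)
    return carrier * envelope

# 8 Hz modulation: 90% depth in one ear, 50% in the other
fs = 44100
left = am_noise(1.0, fs, 8.0, 0.9)
right = am_noise(1.0, fs, 8.0, 0.5)
```

An envelope ITD would then be imposed by delaying one channel by a few hundred microseconds, i.e., a handful of samples at this rate.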
Affiliation(s)
- Vibha Kanagokar
- Department of Audiology and Speech Language Pathology, Kasturba Medical College, Mangalore, Manipal Academy of Higher Education, Manipal, Karnataka, India
- Hasna Fathima
- Department of Audiology and Speech Language Pathology, Kasturba Medical College, Mangalore, Manipal Academy of Higher Education, Manipal, Karnataka, India
- Department of Audiology and Speech-Language Pathology, National Institute of Speech and Hearing, Trivandrum, Kerala, India
- Arivudai Nambi Pitchai Muthu
- Department of Audiology and Speech Language Pathology, Kasturba Medical College, Mangalore, Manipal Academy of Higher Education, Manipal, Karnataka, India
- Department of Audiology, All India Institute of Speech and Hearing, Mysore, Karnataka, India
3
González-Toledo D, Cuevas-Rodríguez M, Vicente T, Picinali L, Molina-Tanco L, Reyes-Lecuona A. Spatial release from masking in the median plane with non-native speakers using individual and mannequin head related transfer functions. J Acoust Soc Am 2024;155:284-293. PMID: 38227426. DOI: 10.1121/10.0024239.
Abstract
Spatial release from masking (SRM) in speech-on-speech tasks has been widely studied in the horizontal plane, where interaural cues play a fundamental role. Several studies have also observed SRM for sources located in the median plane, where (monaural) spectral cues are more important. However, a relatively unexplored question concerns the impact of head-related transfer function (HRTF) personalisation on SRM, for example, whether individually measured HRTFs yield better performance than mannequin HRTFs. This study compares SRM in the median plane in a virtual speech-on-speech task rendered using both individual and mannequin HRTFs. SRM is obtained using English sentences with non-native English speakers. Our participants show lower SRM than reported elsewhere for native English participants. Furthermore, SRM is significantly larger when the source is spatialised using the individual HRTF, and this effect is more marked for listeners with lower English proficiency. Further analyses using a spectral distortion metric and an estimate of the better-ear effect show that the observed SRM can only partially be explained by HRTF-specific factors, and that familiarity with individual spatial cues is likely the most significant element driving these results.
Affiliation(s)
- Daniel González-Toledo
- Telecommunication Research Institute (TELMA), Universidad de Málaga, ETSI Telecomunicación, 29010 Málaga, Spain
- María Cuevas-Rodríguez
- Telecommunication Research Institute (TELMA), Universidad de Málaga, ETSI Telecomunicación, 29010 Málaga, Spain
- Thibault Vicente
- Audio Experience Design, Dyson School of Design Engineering, Imperial College London, London SW7 2DB, United Kingdom
- Lorenzo Picinali
- Audio Experience Design, Dyson School of Design Engineering, Imperial College London, London SW7 2DB, United Kingdom
- Luis Molina-Tanco
- Telecommunication Research Institute (TELMA), Universidad de Málaga, ETSI Telecomunicación, 29010 Málaga, Spain
- Arcadio Reyes-Lecuona
- Telecommunication Research Institute (TELMA), Universidad de Málaga, ETSI Telecomunicación, 29010 Málaga, Spain
4
Wang J, Xie S, Stenfelt S, Zhou H, Wang X, Sang J. Spatial Release From Masking With Bilateral Bone Conduction Stimulation at Mastoid for Normal Hearing Subjects. Trends Hear 2024;28:23312165241234202. PMID: 38549451. PMCID: PMC10981249. DOI: 10.1177/23312165241234202.
Abstract
This study investigates the effect of spatial release from masking (SRM) in bilateral bone conduction (BC) stimulation at the mastoid. Nine adults with normal hearing were tested to determine SRM based on speech recognition thresholds (SRTs) in simulated spatial configurations ranging from 0 to 180 degrees. These configurations were based on nonindividualized head-related transfer functions. The participants were subjected to sound stimulation through either air conduction (AC) via headphones or BC. The results indicated that both the angular separation between the target and the masker, and the modality of sound stimulation, significantly influenced speech recognition performance. As the angular separation between the target and the masker increased up to 150°, both BC and AC SRTs decreased, indicating improved performance. However, performance slightly deteriorated when the angular separation exceeded 150°. For spatial separations less than 75°, BC stimulation provided greater spatial benefits than AC, although this difference was not statistically significant. For separations greater than 75°, AC stimulation offered significantly more spatial benefits than BC. When speech and noise originated from the same side of the head, the "better ear effect" did not significantly contribute to SRM. However, when speech and noise were located on opposite sides of the head, this effect became dominant in SRM.
Affiliation(s)
- Jie Wang
- School of Electronics and Communication Engineering, Guangzhou University, Guangzhou, China
- Sijia Xie
- School of Electronics and Communication Engineering, Guangzhou University, Guangzhou, China
- Stefan Stenfelt
- Department of Biomedical and Clinical Sciences, Linköping University, Linköping, Sweden
- Huali Zhou
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University, Shenzhen, China
- Xiaoya Wang
- Otolaryngology Department, Guangzhou Women and Children's Medical Center, Guangzhou, China
- Jinqiu Sang
- Shanghai Institute of AI for Education, East China Normal University, Shanghai, China
5
Minelli G, Puglisi GE, Astolfi A, Hauth C, Warzybok A. Objective Assessment of Binaural Benefit from Acoustical Treatment in Real Primary School Classrooms. Int J Environ Res Public Health 2023;20:5848. PMID: 37239574. DOI: 10.3390/ijerph20105848.
Abstract
Providing students with an adequate acoustic environment is crucial for ensuring speech intelligibility in primary school classrooms. The two main approaches to controlling classroom acoustics are reducing background noise and reducing late reverberation, and prediction models for speech intelligibility have been developed to evaluate their effects. In this study, two versions of the Binaural Speech Intelligibility Model (BSIM) were used to predict speech intelligibility in realistic spatial configurations of speakers and listeners, accounting for binaural aspects. Both versions shared the same binaural processing and speech intelligibility back-end but differed in the pre-processing of the speech signal. An Italian primary school classroom was characterized acoustically before (reverberation time T20 = 1.6 ± 0.1 s) and after (T20 = 0.6 ± 0.1 s) an acoustical treatment, so that BSIM predictions could be compared to well-established room acoustic measures. With the shorter reverberation time, speech clarity and definition improved, as did speech recognition thresholds (SRTs) (by up to ~6 dB), particularly when the noise source was close to the receiver and an energetic masker was present. Conversely, the longer reverberation time resulted in (i) poorer SRTs (by ~11 dB on average) and (ii) an almost non-existent spatial release from masking (SRM) at an angle.
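The speech clarity and definition measures referred to above (commonly C50 and D50) are early-to-late energy ratios of the room impulse response with a 50 ms boundary. A minimal sketch of how they might be computed from a measured impulse response (not the authors' code):

```python
import numpy as np

def clarity_c50(rir: np.ndarray, fs: int) -> float:
    """C50 in dB: energy in the first 50 ms after the direct sound
    relative to the remaining late energy."""
    onset = int(np.argmax(np.abs(rir)))      # direct-sound arrival
    split = onset + int(0.050 * fs)
    early = float(np.sum(rir[onset:split] ** 2))
    late = float(np.sum(rir[split:] ** 2))
    return 10.0 * np.log10(early / late)

def definition_d50(rir: np.ndarray, fs: int) -> float:
    """D50: early energy as a fraction of total energy (0..1)."""
    onset = int(np.argmax(np.abs(rir)))
    split = onset + int(0.050 * fs)
    total = float(np.sum(rir[onset:] ** 2))
    return float(np.sum(rir[onset:split] ** 2)) / total
```

Shortening the reverberation time shifts energy from the late window to the early window, so both C50 and D50 increase, consistent with the improvements reported after treatment.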
Affiliation(s)
- Greta Minelli
- Department of Energy, Politecnico di Torino, 10129 Torino, Italy
- Arianna Astolfi
- Department of Energy, Politecnico di Torino, 10129 Torino, Italy
- Christopher Hauth
- Medizinische Physik and Cluster of Excellence Hearing4All, Carl von Ossietzky University of Oldenburg, D-26111 Oldenburg, Germany
- Anna Warzybok
- Medizinische Physik and Cluster of Excellence Hearing4All, Carl von Ossietzky University of Oldenburg, D-26111 Oldenburg, Germany
6
Iva P, Martin R, Fielding J, Clough M, White O, Godic B, van der Walt A, Rajan R. Discriminating spatialised speech in complex environments in multiple sclerosis. Cortex 2023;159:217-232. PMID: 36640621. DOI: 10.1016/j.cortex.2022.11.014.
Abstract
People with multiple sclerosis (pwMS) frequently present with deficits in the binaural processing used for sound localization. This study examined spatial release from speech-on-speech masking in pwMS, which involves binaural processing as well as higher-level streaming mechanisms such as spatial attention. Twenty-six pwMS with mild severity (Expanded Disability Status Scale score <3) and 20 age-matched controls listened via headphones to pre-recorded sentences from a standard list presented simultaneously with eight-talker babble. Virtual acoustic techniques simulated sentences originating from 0°, 20°, or 50° on the interaural horizontal plane while the babble was presented continuously at 0° azimuth, and participants verbally repeated the target sentence. In a separate task, two simultaneous sentences, each containing a colour and a number, were presented, and participants reported the target colour and number; both competing sentences could originate from 0°, 20°, or 50° on the azimuthal plane. Participants also completed a series of neuropsychological assessments, an auditory questionnaire, and a three-alternative forced-choice task involving detection of interaural time differences (ITDs) in noise bursts. Spatial release from masking was observed in both pwMS and controls: response accuracy in the two speech discrimination tasks improved in the spatially separated conditions (20° and 50°) relative to the co-located condition. However, pwMS demonstrated significantly less spatial release (18%) than controls (28%) when discriminating colour/number coordinates, and at 50° separation pwMS discriminated significantly fewer coordinates (77%) than controls (89%). In contrast, pwMS performed similarly to controls when sentences were presented in babble and on the basic ITD discrimination task. Significant correlations between speech discrimination performance and standardized neuropsychological scores were observed across all spatial conditions. Our findings suggest that spatial hearing is likely to be affected in pwMS, impairing the perception of competing speech originating from various locations.
Affiliation(s)
- Pippa Iva
- Department of Physiology, Biomedicine Discovery Institute, Monash University, Melbourne, VIC, Australia
- Russell Martin
- Department of Physiology, Biomedicine Discovery Institute, Monash University, Melbourne, VIC, Australia
- Joanne Fielding
- Department of Neurosciences, Central Clinical School, Alfred Hospital, Monash University, Melbourne, VIC, Australia
- Meaghan Clough
- Department of Neurosciences, Central Clinical School, Alfred Hospital, Monash University, Melbourne, VIC, Australia
- Owen White
- Department of Neurosciences, Central Clinical School, Alfred Hospital, Monash University, Melbourne, VIC, Australia
- Branislava Godic
- Department of Physiology, Biomedicine Discovery Institute, Monash University, Melbourne, VIC, Australia
- Anneke van der Walt
- Department of Neurosciences, Central Clinical School, Alfred Hospital, Monash University, Melbourne, VIC, Australia
- Ramesh Rajan
- Department of Physiology, Biomedicine Discovery Institute, Monash University, Melbourne, VIC, Australia
7
Effect of interaural electrode insertion depth difference and independent band selection on sentence recognition in noise and spatial release from masking in simulated bilateral cochlear implant listening. Eur Arch Otorhinolaryngol 2023;280:3209-3217. PMID: 36695909. DOI: 10.1007/s00405-023-07845-w.
Abstract
PURPOSE Interaural electrode insertion depth difference (IEDD) in bilateral cochlear implants (BiCI) with continuous interleaved sampling (CIS) processing is known to reduce speech recognition in noise and spatial release from masking (SRM). However, the independent channel selection of the 'n-of-m' sound coding strategy might interact differently with IEDD than the CIS strategy does. This study investigated the effects of a bilateral 'n-of-m' processing strategy and of IEDD on speech recognition in noise and SRM under conditions simulating bilateral cochlear implant listening. METHODS Five young adults with normal hearing sensitivity participated. Target sentences were spatially filtered to originate from 0°, and the masker was spatially filtered to 0°, 15°, 37.5°, or 90°, using the Oldenburg head-related transfer function database for a behind-the-ear microphone. A 22-channel sine-wave vocoder based on 'n-of-m' processing was applied to the spatialized target-masker mixture in each ear. The perceptual experiment tested speech recognition in noise in one co-located condition (target and masker at 0°) and three spatially separated conditions (target at 0°; masker at 15°, 37.5°, or 90° toward the right ear). RESULTS The results were analyzed with a three-way repeated-measures analysis of variance (ANOVA). Neither interaural insertion depth difference (F(2,8) = 3.145, p = 0.098, η² = 0.007) nor the spatial separation between target and masker (F(3,12) = 1.239, p = 0.339, η² = 0.004) had a significant effect on speech recognition in noise. CONCLUSIONS Speech recognition in noise and SRM were not affected by IEDD ≤ 3 mm. Bilateral 'n-of-m' processing resulted in reduced speech recognition in noise and SRM.
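The 'n-of-m' idea behind the vocoder simulation above is to divide the spectrum into m analysis bands and, in each short frame, retain only the n bands with the most energy, resynthesized here as sines at the band centres. A toy single-frame sketch of the band selection (band edges, frame handling, and envelope smoothing are illustrative simplifications, not the study's implementation):

```python
import numpy as np

def n_of_m_frame(frame: np.ndarray, fs: int, m: int = 22, n: int = 8,
                 fmin: float = 100.0, fmax: float = 8000.0) -> np.ndarray:
    """One frame of a simplified 'n-of-m' sine vocoder: measure the
    energy in m log-spaced bands, keep the n strongest, and
    resynthesize them as sines at the band centres."""
    spec = np.fft.rfft(frame)
    freqs = np.fft.rfftfreq(frame.size, 1.0 / fs)
    edges = np.geomspace(fmin, fmax, m + 1)
    energy = np.array([
        np.sum(np.abs(spec[(freqs >= lo) & (freqs < hi)]) ** 2)
        for lo, hi in zip(edges[:-1], edges[1:])
    ])
    selected = np.argsort(energy)[-n:]           # n strongest of m bands
    centres = np.sqrt(edges[:-1] * edges[1:])    # geometric band centres
    t = np.arange(frame.size) / fs
    out = np.zeros_like(t)
    for k in selected:
        out += np.sqrt(energy[k]) * np.sin(2.0 * np.pi * centres[k] * t)
    return out
```

A full simulation would apply this frame by frame with overlap-add; the key contrast with CIS is that the selected subset of bands can differ between the two ears on every frame.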
8
Hládek Ľ, Seeber BU. Speech Intelligibility in Reverberation is Reduced During Self-Rotation. Trends Hear 2023;27:23312165231188619. PMID: 37475460. PMCID: PMC10363862. DOI: 10.1177/23312165231188619.
Abstract
Speech intelligibility in cocktail party situations has traditionally been studied with stationary sound sources and stationary participants. Here, speech intelligibility and behavior were investigated during active self-rotation of standing participants in a spatialized speech test. We investigated whether people would rotate to improve speech intelligibility and whether knowing the target location would provide further benefit. Target sentences appeared randomly at one of four possible locations (0°, ±90°, or 180° relative to the participant's initial orientation on each trial), while speech-shaped noise was presented from the front (0°). Participants responded naturally with self-rotating motion. Target sentences were presented either without (Audio-only) or with (Audio-Visual) a picture of an avatar. In a baseline (Static) condition, people stood still without visual location cues. Participants' self-orientations undershot the target location but were close to acoustically optimal. Participants oriented in an acoustically optimal way more often, and speech intelligibility was higher, in the Audio-Visual than in the Audio-only condition for the lateral targets. The intelligibility of individual words in Audio-Visual and Audio-only increased during self-rotation towards the rear target but was reduced for the lateral targets compared with Static, which could be mostly, but not fully, attributed to changes in spatial unmasking. A speech intelligibility prediction based on a model of static spatial unmasking accounting for self-rotations overestimated participant performance by 1.4 dB. The results suggest that speech intelligibility is reduced during self-rotation, and that visual location cues help listeners achieve more optimal self-rotations and better speech intelligibility.
Affiliation(s)
- Ľuboš Hládek
- Audio Information Processing, Technical University of Munich, Munich, Germany
- Bernhard U. Seeber
- Audio Information Processing, Technical University of Munich, Munich, Germany
9
Cochlear Implant Facilitates the Use of Talker Sex and Spatial Cues to Segregate Competing Speech in Unilaterally Deaf Listeners. Ear Hear 2023;44:77-91. PMID: 35733275. DOI: 10.1097/aud.0000000000001254.
Abstract
OBJECTIVES Talker sex and spatial cues can facilitate segregation of competing speech. However, the spectrotemporal degradation associated with cochlear implants (CIs) can limit the benefit of these cues. Acoustic hearing in the nonimplanted ear can improve access to talker sex cues in CI users, but it is unclear whether the CI can improve segregation of competing speech when maskers are symmetrically placed around the target (i.e., when spatial cues are available), compared with acoustic hearing alone. The aim of this study was to investigate whether a CI can improve segregation of competing speech by individuals with unilateral hearing loss. DESIGN Speech recognition thresholds (SRTs) for competing speech were measured in 16 normal-hearing (NH) adults and 16 unilaterally deaf CI users. All participants were native speakers of Mandarin Chinese. CI users were divided into two groups according to thresholds in the nonimplanted ear: (1) single-sided deaf (SSD), with pure-tone thresholds <25 dB HL at all audiometric frequencies, and (2) asymmetric hearing loss (AHL), with one or more thresholds >25 dB HL. SRTs were measured for target sentences produced by a male talker in the presence of two masker talkers (different male or female talkers). The target sentence was always presented via a loudspeaker directly in front of the listener (0°), and the maskers were either co-located with the target (0°) or spatially separated from the target at ±90°. Three segregation cue conditions were tested to measure masking release (MR) relative to the baseline condition: (1) Talker sex, (2) Spatial, and (3) Talker sex + Spatial. For CI users, SRTs were measured with the CI on or off. RESULTS Binaural MR was significantly better for the NH group than for the AHL or SSD groups (P < 0.001 in all cases). For the NH group, mean MR was largest with the Talker sex + Spatial cues (18.8 dB) and smallest with the Talker sex cues (10.7 dB). In contrast, mean MR for the SSD group was largest with the Talker sex + Spatial cues (14.7 dB) and smallest with the Spatial cues (4.8 dB). For the AHL group, mean MR was largest with the Talker sex + Spatial cues (7.8 dB) and smallest with the Talker sex (4.8 dB) and Spatial cues (4.8 dB). MR was significantly better with the CI on than off for both the AHL (P = 0.014) and SSD (P < 0.001) groups. Across all unilaterally deaf CI users, monaural (acoustic ear alone) and binaural MR were significantly correlated with unaided pure-tone average thresholds in the nonimplanted ear for the Talker sex and Talker sex + Spatial conditions (P < 0.001 in both cases) but not for the Spatial condition. CONCLUSION Although the CI benefited unilaterally deaf listeners' segregation of competing speech, MR was much poorer than in NH listeners. In contrast to previous findings with steady noise maskers, the CI benefit for segregating competing speech of a different talker sex was greater in the SSD group than in the AHL group.
10
Sheffield SW, Wheeler HJ, Brungart DS, Bernstein JGW. The Effect of Sound Localization on Auditory-Only and Audiovisual Speech Recognition in a Simulated Multitalker Environment. Trends Hear 2023;27:23312165231186040. PMID: 37415497. PMCID: PMC10331332. DOI: 10.1177/23312165231186040.
Abstract
Information regarding sound-source spatial location provides several speech-perception benefits, including auditory spatial cues for perceptual talker separation and localization cues to face the talker to obtain visual speech information. These benefits have typically been examined separately. A real-time processing algorithm for sound-localization degradation (LocDeg) was used to investigate how spatial-hearing benefits interact in a multitalker environment. Normal-hearing adults performed auditory-only and auditory-visual sentence recognition with target speech and maskers presented from loudspeakers at -90°, -36°, 36°, or 90° azimuths. For auditory-visual conditions, one target and three masking talker videos (always spatially separated) were rendered virtually in rectangular windows at these locations on a head-mounted display. Auditory-only conditions presented blank windows at these locations. Auditory target speech (always spatially aligned with the target video) was presented in co-located speech-shaped noise (experiment 1) or with three co-located or spatially separated auditory interfering talkers corresponding to the masker videos (experiment 2). In the co-located conditions, the LocDeg algorithm did not affect auditory-only performance but reduced target orientation accuracy, reducing auditory-visual benefit. In the multitalker environment, two spatial-hearing benefits were observed: perceptually separating competing speech based on auditory spatial differences and orienting to the target talker to obtain visual speech cues. These two benefits were additive, and both were diminished by the LocDeg algorithm. Although visual cues always improved performance when the target was accurately localized, there was no strong evidence that they provided additional assistance in perceptually separating co-located competing speech. These results highlight the importance of sound localization in everyday communication.
Affiliation(s)
- Sterling W. Sheffield
- Department of Speech, Language, and Hearing Sciences, University of Florida, Gainesville, FL, USA
- Harley J. Wheeler
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, MN, USA
- Douglas S. Brungart
- National Military Audiology and Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, MD, USA
- Joshua G. W. Bernstein
- National Military Audiology and Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, MD, USA
11
Domain-specific hearing-in-noise performance is associated with absolute pitch proficiency. Sci Rep 2022;12:16344. PMID: 36175508. PMCID: PMC9521875. DOI: 10.1038/s41598-022-20869-2.
Abstract
Recent evidence suggests that musicians may have an advantage over non-musicians in perceiving speech against noisy backgrounds. Previously, musicians have been compared as a homogeneous group, despite demonstrated heterogeneity, which may contribute to discrepancies between studies. Here, we investigated whether "quasi"-absolute pitch (AP) proficiency, viewed as a general trait that varies across a spectrum, accounts for the musician advantage in hearing-in-noise (HIN) performance, irrespective of whether the streams are speech or musical sounds. A cohort of 12 non-musicians and 42 trained musicians, stratified into high, medium, or low AP proficiency, identified speech or melody targets masked in noise (speech-shaped, multi-talker, and multi-music) at four signal-to-noise ratios (0, −3, −6, and −9 dB). Cognitive abilities associated with HIN benefits, including auditory working memory and use of visuo-spatial cues, were assessed. AP proficiency was verified against pitch adjustment and relative pitch tasks. We found a domain-specific effect on HIN perception: quasi-AP abilities were related to improved perception of melody, but not speech, targets in noise. The quasi-AP advantage extended to tonal working memory and the use of spatial cues, but only during melodic stream segregation. Overall, the results do not support the putative musician advantage in speech-in-noise perception, but suggest a quasi-AP advantage in perceiving music in noisy environments.
12
Ahrens A, Lund KD. Auditory spatial analysis in reverberant multi-talker environments with congruent and incongruent audio-visual room information. J Acoust Soc Am 2022;152:1586. PMID: 36182305. DOI: 10.1121/10.0013991.
Abstract
In a multi-talker situation, listeners face the challenge of identifying a target speech source in a mixture of interfering background sounds. The current study investigated how listeners analyze audio-visual scenes that vary in complexity in terms of the number of talkers and reverberation. The visual information of the room was either congruent or incongruent with the acoustic room. The listeners' task was to locate an ongoing speech source in a mixture of other speech sources. The three-dimensional audio-visual scenarios were presented using a loudspeaker array and virtual reality glasses. Room reverberation, as well as the number of talkers in a scene, influenced the ability to analyze an auditory scene in terms of accuracy and response time. Incongruent visual information about the room did not affect this ability. When few talkers were presented simultaneously, listeners were able to detect a target talker quickly and accurately even in adverse room acoustical conditions. Reverberation started to affect the response time when four or more talkers were presented, and the number of talkers became a significant factor for five or more simultaneous talkers.
Affiliation(s)
- Axel Ahrens
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kgs. Lyngby, Denmark
- Kasper Duemose Lund
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kgs. Lyngby, Denmark
13
Temporal and Directional Cue Effects on the Cocktail Party Problem for Patients With Listening Difficulties Without Clinical Hearing Loss. Ear Hear 2022; 43:1740-1751. [DOI: 10.1097/aud.0000000000001247]
14
Differing Bilateral Benefits for Spatial Release From Masking and Sound Localization Accuracy Using Bone Conduction Devices. Ear Hear 2022; 43:1708-1720. [PMID: 35588503 PMCID: PMC9592172 DOI: 10.1097/aud.0000000000001234]
Abstract
OBJECTIVES Normal binaural hearing facilitates spatial hearing and therefore many everyday listening tasks, such as understanding speech against a backdrop of competing sounds originating from various locations, and localization of sounds. For stimulation with bone conduction hearing devices (BCD), used to alleviate conductive hearing losses, limited transcranial attenuation results in cross-stimulation so that both cochleae are stimulated from the position of the bone conduction transducer. As such, interaural time and level differences, hallmarks of binaural hearing, are unpredictable at the level of the inner ears. The aim of this study was to compare spatial hearing by unilateral and bilateral BCD stimulation in normal-hearing listeners with simulated bilateral conductive hearing loss. DESIGN Bilateral conductive hearing loss was reversibly induced in 25 subjects (mean age = 28.5 years) with air conduction and bone conduction (BC) pure-tone averages across 0.5, 1, 2, and 4 kHz (PTA4) < 5 dB HL. The mean (SD) PTA4 for the simulated conductive hearing loss was 48.2 dB (3.8 dB). Subjects participated in a speech-in-speech task and a horizontal sound localization task in a within-subject repeated measures design (unilateral and bilateral bone conduction stimulation) using Baha 5 clinical sound processors on a softband. For the speech-in-speech task, the main outcome measure was the threshold for 40% correct speech recognition when masking and target speech were either colocated (0°) or spatially and symmetrically separated (target 0°, maskers ±30° and ±150°). Spatial release from masking was quantified as the difference between colocated and separated masking and target speech thresholds. For the localization task, the main outcome measure was the overall variance in localization accuracy quantified as an error index (0.0 = perfect performance; 1.0 = random performance). Four stimuli providing various spatial cues were used in the sound localization task.
RESULTS The bilateral BCD benefit for recognition thresholds of speech in competing speech was statistically significant but small, regardless of whether the masking speech signals were colocated with, or spatially and symmetrically separated from, the target speech. Spatial release from masking was identical for unilateral and bilateral conditions, and significantly different from zero. A distinct bilateral BCD sound localization benefit existed but varied in magnitude across stimuli. The smallest benefit occurred for a low-frequency stimulus (octave-filtered noise, CF = 0.5 kHz), and the largest benefit occurred for unmodulated broadband and narrowband (octave-filtered noise, CF = 4.0 kHz) stimuli. Sound localization by unilateral BCD was poor across stimuli. CONCLUSIONS Results suggest that the well-known transcranial transmission of BC sound affects bilateral BCD benefits for spatial processing of sound in differing ways. Results further suggest that patients with bilateral conductive hearing loss and BC thresholds within the normal range may benefit from a bilateral fitting of BCD, particularly for horizontal localization of sounds.
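The two outcome measures defined in this abstract (spatial release from masking as a threshold difference, and a 0.0-1.0 localization error index) reduce to simple arithmetic. A minimal sketch, with hypothetical values; the normalization used for the error index here is an assumption for illustration, not the study's exact formula:

```python
# Minimal numerical sketch of the two outcome measures described above.
# All numeric values are hypothetical placeholders, not data from the study.

def spatial_release_from_masking(srt_colocated_db, srt_separated_db):
    """SRM = colocated SRT minus separated SRT; positive values indicate a
    spatial benefit (lower threshold when maskers are separated)."""
    return srt_colocated_db - srt_separated_db

def error_index(response_variance, random_variance):
    """Localization error normalized so that 0.0 = perfect and 1.0 = random
    performance (normalization by chance-level variance is an assumption)."""
    return min(response_variance / random_variance, 1.0)

srm_db = spatial_release_from_masking(srt_colocated_db=-2.0, srt_separated_db=-6.5)
ei = error_index(response_variance=400.0, random_variance=2000.0)
# srm_db == 4.5 (dB of spatial release); ei == 0.2
```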
15
Can visual capture of sound separate auditory streams? Exp Brain Res 2022; 240:813-824. [PMID: 35048159 DOI: 10.1007/s00221-021-06281-8]
Abstract
In noisy contexts, sound discrimination improves when the auditory sources are separated in space. This phenomenon, named Spatial Release from Masking (SRM), arises from the interaction between the auditory information reaching the ear and spatial attention resources. To examine the relative contribution of these two factors, we exploited an audio-visual illusion in a hearing-in-noise task to create conditions in which the initial stimulation to the ears is held constant, while the perceived separation between speech and masker is changed illusorily (visual capture of sound). In two experiments, we asked participants to identify a string of five digits pronounced by a female voice, embedded in either energetic (Experiment 1) or informational (Experiment 2) noise, before reporting the perceived location of the heard digits. Critically, the distance between target digits and masking noise was manipulated both physically (from 22.5 to 75.0 degrees) and illusorily, by pairing target sounds with visual stimuli either at the same position (audio-visual congruent) or at different positions (15 degrees offset, leftward or rightward: audio-visual incongruent). The proportion of correctly reported digits increased with the physical separation between the target and masker, as expected from SRM. However, despite effective visual capture of sounds, performance was not modulated by illusory changes of target sound position. Our results are compatible with a limited role of central factors in the SRM phenomenon, at least in our experimental setting. Moreover, they add to the controversial literature on the limited effects of audio-visual capture in auditory stream separation.
16
Cortical Processing of Binaural Cues as Shown by EEG Responses to Random-Chord Stereograms. J Assoc Res Otolaryngol 2021; 23:75-94. [PMID: 34904205 PMCID: PMC8783002 DOI: 10.1007/s10162-021-00820-4]
Abstract
Spatial hearing facilitates the perceptual organization of complex soundscapes into accurate mental representations of sound sources in the environment. Yet, the role of binaural cues in auditory scene analysis (ASA) has received relatively little attention in recent neuroscientific studies employing novel, spectro-temporally complex stimuli. This may be because a stimulation paradigm that provides binaurally derived grouping cues of sufficient spectro-temporal complexity has not yet been established for neuroscientific ASA experiments. Random-chord stereograms (RCS) are a class of auditory stimuli that exploit spectro-temporal variations in the interaural envelope correlation of noise-like sounds with interaurally coherent fine structure; they evoke salient auditory percepts that emerge only under binaural listening. Here, our aim was to assess the usability of the RCS paradigm for indexing binaural processing in the human brain. To this end, we recorded EEG responses to RCS stimuli from 12 normal-hearing subjects. The stimuli consisted of an initial 3-s noise segment with interaurally uncorrelated envelopes, followed by another 3-s segment, where envelope correlation was modulated periodically according to the RCS paradigm. Modulations were applied either across the entire stimulus bandwidth (wideband stimuli) or in temporally shifting frequency bands (ripple stimulus). Event-related potentials and inter-trial phase coherence analyses of the EEG responses showed that the introduction of the 3- or 5-Hz wideband modulations produced a prominent change-onset complex and ongoing synchronized responses to the RCS modulations. In contrast, the ripple stimulus elicited a change-onset response but no response to ongoing RCS modulation. Frequency-domain analyses revealed increased spectral power at the fundamental frequency and the first harmonic of wideband RCS modulations. RCS stimulation yields robust EEG measures of binaurally driven auditory reorganization and has potential to provide a flexible stimulation paradigm suitable for isolating binaural effects in ASA experiments.
17
So W, Smith SB. Comparison of two cortical measures of binaural hearing acuity. Int J Audiol 2021; 60:875-884. [PMID: 33345686 PMCID: PMC8244817 DOI: 10.1080/14992027.2020.1860260]
Abstract
OBJECTIVE Multiple studies have demonstrated binaural hearing deficits in aging listeners and in those with hearing loss. Consequently, there is great interest in developing efficient clinical tests of binaural hearing acuity to improve diagnostic assessments and to assist clinicians when fitting binaural hearing aids and/or cochlear implants. DESIGN Two cortical measures of interaural phase difference sensitivity, the acoustic change complex (ACC) and the interaural phase modulation following response (IPM-FR), were compared on three metrics using five different stimulus interaural phase differences (IPDs; 0°, ±22.5°, ±45°, ±67.5°, and ±90°). These metrics were scalp topography, time-to-detect, and input-output characteristics. STUDY SAMPLE Ten young, normal-hearing listeners. RESULTS Scalp topography qualitatively differed between the ACC and the IPM-FR. The IPM-FR demonstrated better time-to-detect performance for smaller (±22.5° and ±45°) but not larger (±67.5° and ±90°) IPDs. Input-output characteristics of each response were similar. CONCLUSIONS The IPM-FR may be a faster and more efficient tool for assessing neural sensitivity to subtle IPD changes. However, the ACC may be useful for research or clinical questions concerned with the topographic representation of binaural cues.
Affiliation(s)
- Won So
- Department of Communication Sciences and Disorders, The University of Texas at Austin, Austin, TX, USA
- Spencer B Smith
- Department of Communication Sciences and Disorders, The University of Texas at Austin, Austin, TX, USA
18
Vicente T, Buchholz JM, Lavandier M. Modelling binaural unmasking and the intelligibility of speech in noise and reverberation for normal-hearing and hearing-impaired listeners. J Acoust Soc Am 2021; 150:3275. [PMID: 34852607 DOI: 10.1121/10.0006736]
Abstract
This study investigated the effect of hearing loss on binaural unmasking (BU) for the intelligibility of speech in noise. Speech reception thresholds (SRTs) were measured with normal-hearing (NH) listeners and older mildly hearing-impaired (HI) listeners while varying the presentation level of the stimuli, reverberation, modulation of the noise masker, and spatial separation of the speech and noise sources. On average across conditions, the NH listeners benefited more (by 0.6 dB) from BU than the HI listeners. The binaural intelligibility model developed by Vicente, Lavandier, and Buchholz [J. Acoust. Soc. Am. 148, 3305-3317 (2020)] was used to describe the data; accurate predictions were obtained for conditions with moderate noise levels [50 and 60 dB sound pressure level (SPL)]. The interaural jitters involved in the prediction of BU had to be revised to describe the data measured at a lower level (40 dB SPL). Across all tested conditions, the correlation between the measured and predicted SRTs was 0.92, and the mean prediction error was 0.9 dB.
Affiliation(s)
- Thibault Vicente
- Department of Linguistics-Audiology, Australian Hearing Hub, Macquarie University, New South Wales, 2109, Australia
- Jörg M Buchholz
- Department of Linguistics-Audiology, Australian Hearing Hub, Macquarie University, New South Wales, 2109, Australia
- Mathieu Lavandier
- Univ. Lyon, ENTPE, Laboratoire de Tribologie et Dynamique des Systèmes UMR 5513, Rue M. Audin, 69518 Vaulx-en-Velin Cedex, France
19
Lavandier M, Mason CR, Baltzell LS, Best V. Individual differences in speech intelligibility at a cocktail party: A modeling perspective. J Acoust Soc Am 2021; 150:1076. [PMID: 34470293 PMCID: PMC8561716 DOI: 10.1121/10.0005851]
Abstract
This study aimed at predicting individual differences in speech reception thresholds (SRTs) in the presence of symmetrically placed competing talkers for young listeners with sensorineural hearing loss. An existing binaural model incorporating the individual audiogram was revised to handle severe hearing losses by (a) taking as input the target speech level at SRT in a given condition and (b) introducing a floor in the model to limit extreme negative better-ear signal-to-noise ratios. The floor value was first set using SRTs measured with stationary and modulated noises. The model was then used to account for individual variations in SRTs found in two previously published data sets that used speech maskers. The model accounted well for the variation in SRTs across listeners with hearing loss, based solely on differences in audibility. When considering listeners with normal hearing, the model could predict the best SRTs, but not the poorer SRTs, suggesting that other factors limit performance when audibility (as measured with the audiogram) is not compromised.
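Revision (b) described above, a floor limiting extreme negative better-ear SNRs, can be sketched in a few lines. The function name and the floor value below are hypothetical placeholders, not taken from the paper:

```python
def better_ear_snr_db(snr_left_db, snr_right_db, floor_db=-20.0):
    """Better-ear SNR in one frequency band, limited below by a floor so
    that extremely negative SNRs cannot dominate the intelligibility
    prediction. The -20 dB floor value is a hypothetical placeholder."""
    return max(max(snr_left_db, snr_right_db), floor_db)

# A band that is effectively inaudible at both ears is capped at the floor
# rather than contributing its raw (extreme) SNR:
capped = better_ear_snr_db(-60.0, -55.0)    # -20.0
uncapped = better_ear_snr_db(-5.0, -12.0)   # -5.0 (left is the better ear)
```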
Affiliation(s)
- Mathieu Lavandier
- Univ. Lyon, ENTPE, Laboratoire de Tribologie et Dynamique des Systèmes UMR 5513, Rue Maurice Audin, F-69518 Vaulx-en-Velin Cedex, France
- Christine R Mason
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Lucas S Baltzell
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Virginia Best
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
20
Effects of Simulated and Profound Unilateral Sensorineural Hearing Loss on Recognition of Speech in Competing Speech. Ear Hear 2021; 41:411-419. [PMID: 31356386 DOI: 10.1097/aud.0000000000000764]
Abstract
OBJECTIVES Unilateral hearing loss (UHL) is a condition as common as bilateral hearing loss in adults. Because of the unilaterally reduced audibility associated with UHL, binaural processing of sounds may be disrupted. As a consequence, daily tasks such as listening to speech in a background of spatially distinct competing sounds may be challenging. A growing body of subjective and objective data suggests that spatial hearing is negatively affected by UHL. However, the type and degree of UHL vary considerably in previous studies. The aim here was to determine the effect of a profound sensorineural UHL, and of a simulated UHL, on recognition of speech in competing speech, and the binaural and monaural contributions to spatial release from masking, in a demanding multisource listening environment. DESIGN Nine subjects (25 to 61 years) with profound sensorineural UHL [mean pure-tone average (PTA) across 0.5, 1, 2, and 4 kHz = 105 dB HL] and normal contralateral hearing (mean PTA = 7.2 dB HL) were included based on the criterion that the target and competing speech were inaudible in the ear with hearing loss. Thirteen subjects with normal hearing (19 to 60 years; mean left PTA = 4.1 dB HL; mean right PTA = 5.5 dB HL) contributed data in normal and simulated "mild-to-moderate" UHL conditions (PTA = 38.6 dB HL). The main outcome measure was the threshold for 40% correct speech recognition in colocated (0°) and spatially and symmetrically separated (±30° and ±150°) competing speech conditions. Spatial release from masking was quantified as the threshold difference between colocated and separated conditions. RESULTS Thresholds for subjects with profound UHL were higher (worse) than for subjects with normal hearing in both separated and colocated conditions, and comparable to those in simulated UHL.
Monaural spatial release from masking, that is, the spatial release achieved by subjects with profound UHL, was significantly different from zero, amounting to 49% of the magnitude of the spatial release from masking achieved by subjects with normal hearing. Some subjects with profound UHL showed negative spatial release, whereas subjects with normal hearing consistently showed positive spatial release from masking in the normal condition. The simulated UHL had a larger effect on the speech recognition threshold for separated than for colocated conditions, resulting in decreased spatial release from masking. The difference in spatial release between normal-hearing and simulated UHL conditions increased with age. CONCLUSIONS The results demonstrate that while recognition of speech in colocated and separated competing speech is impaired in profound sensorineural UHL, spatial release from masking may be possible when competing speech is symmetrically distributed around the listener. A "mild-to-moderate" simulated UHL decreases spatial release from masking compared with normal-hearing conditions and interacts with age, indicating that small amounts of residual hearing in the UHL ear may be more beneficial for separated than for colocated interferer conditions for young listeners.
21
Goverts ST, Colburn HS. Binaural Recordings in Natural Acoustic Environments: Estimates of Speech-Likeness and Interaural Parameters. Trends Hear 2021; 24:2331216520972858. [PMID: 33331242 PMCID: PMC7750905 DOI: 10.1177/2331216520972858]
Abstract
Binaural acoustic recordings were made in multiple natural environments, which were chosen to be similar to those reported to be difficult for listeners with impaired hearing. These environments include natural conversations that take place in the presence of other sound sources as found in restaurants, walking or biking in the city, and so on. Sounds from these environments were recorded binaurally with in-the-ear microphones and were analyzed with respect to speech-likeness measures and interaural difference measures. The speech-likeness measures were based on amplitude–modulation patterns within frequency bands and were estimated for 1-s time-slices. The interaural difference measures included interaural coherence, interaural time difference, and interaural level difference, which were estimated for time-slices of 20-ms duration. These binaural measures were documented for one-fourth-octave frequency bands centered at 500 Hz and for the envelopes of one-fourth-octave bands centered at 2000 Hz. For comparison purposes, the same speech-likeness and interaural difference measures were computed for a set of virtual recordings that mimic typical clinical test configurations. These virtual recordings were created by filtering anechoic waveforms with available head-related transfer functions and combining them to create multiple source combinations. Overall, the speech-likeness results show large variability within and between environments, and they demonstrate the importance of having information from both ears available. Furthermore, the interaural parameter results show that the natural recordings contain a relatively small proportion of time-slices with high coherence compared with the virtual recordings; however, when present, binaural cues might be used for selecting intervals with good speech intelligibility for individual sources.
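The interaural parameters mentioned above (interaural coherence and interaural time difference per 20-ms time slice) are commonly estimated from the normalized cross-correlation of the two ear signals. The sketch below is a generic estimator under that assumption, not the authors' analysis code:

```python
import numpy as np

def coherence_and_itd(left, right, fs, max_lag_ms=1.0):
    """Interaural coherence (peak of the normalized cross-correlation) and
    ITD (lag of that peak, in seconds; positive = left ear lagging) for one
    short binaural time slice. Generic estimator, not the authors' code."""
    max_lag = int(fs * max_lag_ms / 1000)
    left = left - np.mean(left)
    right = right - np.mean(right)
    denom = np.sqrt(np.sum(left ** 2) * np.sum(right ** 2))
    corr = []
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:  # compare a delayed left against right
            a, b = left[lag:], right[:len(right) - lag]
        else:
            a, b = left[:lag], right[-lag:]
        corr.append(np.sum(a * b) / denom)
    peak = int(np.argmax(corr))
    return corr[peak], (peak - max_lag) / fs

# Demo: a 20-ms white-noise slice at 44.1 kHz in which the right ear signal
# lags the left by 10 samples, so the peak lag is -10 samples.
fs = 44100
rng = np.random.default_rng(0)
sig = rng.standard_normal(int(0.02 * fs) + 10)
left, right = sig[10:], sig[:-10]
coh, itd = coherence_and_itd(left, right, fs)
# coh is close to 1 for this coherent slice; itd * fs == -10
```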
Affiliation(s)
- S Theo Goverts
- Otolaryngology-Head and Neck Surgery, Ear & Hearing, Amsterdam Public Health, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
- H Steven Colburn
- Biomedical Engineering Department, Boston University, Boston, Massachusetts, United States
22
Marrufo-Pérez MI, Araquistain-Serrat L, Eustaquio-Martín A, Lopez-Poveda EA. On the importance of interaural noise coherence and the medial olivocochlear reflex for binaural unmasking in free-field listening. Hear Res 2021; 405:108246. [PMID: 33872834 DOI: 10.1016/j.heares.2021.108246]
Abstract
For speech in competition with a noise source in the free field, normal-hearing (NH) listeners recognize speech better when listening binaurally than when listening monaurally with the ear that has the better acoustic signal-to-noise ratio (SNR). This benefit from listening binaurally is known as binaural unmasking and indicates that the brain combines information from the two ears to improve intelligibility. Here, we address three questions pertaining to binaural unmasking for NH listeners. First, we investigate whether binaural unmasking results from combining the speech and/or the noise from the two ears. In a simulated acoustic free field with speech and noise sources at 0° and 270° azimuth, respectively, we found comparable unmasking regardless of whether the speech was present or absent in the ear with the worse SNR. This indicates that binaural unmasking probably involves combining only the noise at the two ears. Second, we investigate whether binaurally coherent location cues for the noise signal are sufficient for binaural unmasking to occur. We found no unmasking when location cues were coherent but the noise signals were generated to be interaurally incoherent, or when they were processed unilaterally through a hearing aid with linear, minimal amplification. This indicates that binaural unmasking requires interaurally coherent noise signals, source location cues, and processing. Third, we investigate whether the hypothesized antimasking benefits of the medial olivocochlear reflex (MOCR) contribute to binaural unmasking. We found comparable unmasking regardless of whether speech tokens (words) were sufficiently delayed from the noise onset to fully activate the MOCR or not. Moreover, unmasking was absent when the noise was binaurally incoherent, whereas the physiological antimasking effects of the MOCR are similar for coherent and incoherent noises. This indicates that the MOCR is unlikely to be involved in binaural unmasking.
Affiliation(s)
- Miriam I Marrufo-Pérez
- Instituto de Neurociencias de Castilla y León, Universidad de Salamanca, Calle Pintor Fernando Gallego 1, Salamanca 37007, Spain; Instituto de Investigación Biomédica de Salamanca, Universidad de Salamanca, Salamanca 37007, Spain
- Leire Araquistain-Serrat
- Instituto de Neurociencias de Castilla y León, Universidad de Salamanca, Calle Pintor Fernando Gallego 1, Salamanca 37007, Spain
- Almudena Eustaquio-Martín
- Instituto de Neurociencias de Castilla y León, Universidad de Salamanca, Calle Pintor Fernando Gallego 1, Salamanca 37007, Spain; Instituto de Investigación Biomédica de Salamanca, Universidad de Salamanca, Salamanca 37007, Spain
- Enrique A Lopez-Poveda
- Instituto de Neurociencias de Castilla y León, Universidad de Salamanca, Calle Pintor Fernando Gallego 1, Salamanca 37007, Spain; Instituto de Investigación Biomédica de Salamanca, Universidad de Salamanca, Salamanca 37007, Spain; Departamento de Cirugía, Facultad de Medicina, Universidad de Salamanca, Salamanca 37007, Spain.
23
Wang X, Xu L. Speech perception in noise: Masking and unmasking. J Otol 2021; 16:109-119. [PMID: 33777124 PMCID: PMC7985001 DOI: 10.1016/j.joto.2020.12.001]
Abstract
Speech perception is essential for daily communication, but background noise or concurrent talkers can make it challenging for listeners to track the target speech (the cocktail party problem). The present study reviews and compares existing findings on speech perception and unmasking in cocktail party listening environments in English and Mandarin Chinese. The review starts with an introduction, followed by related concepts of auditory masking. The next two sections review factors that release speech perception from masking in English and Mandarin Chinese, respectively. The last section presents an overall summary of the findings, with comparisons between the two languages. Future research directions arising from differences between the English and Mandarin literatures on this topic are also discussed.
Affiliation(s)
- Xianhui Wang
- Communication Sciences and Disorders, Ohio University, Athens, OH, 45701, USA
- Li Xu
- Communication Sciences and Disorders, Ohio University, Athens, OH, 45701, USA
24
Cuevas-Rodriguez M, Gonzalez-Toledo D, Reyes-Lecuona A, Picinali L. Impact of non-individualised head related transfer functions on speech-in-noise performances within a synthesised virtual environment. J Acoust Soc Am 2021; 149:2573. [PMID: 33940900 DOI: 10.1121/10.0004220]
Abstract
When performing binaural spatialisation, it is widely accepted that the choice of head-related transfer functions (HRTFs), and in particular the use of individually measured ones, can have an impact on localisation accuracy, externalisation, and overall realism. Yet the impact of HRTF choice on speech-in-noise performance in cocktail party-like scenarios has not been investigated in depth. This paper introduces a study in which 22 participants were presented with a frontal speech target and two lateral maskers, spatialised using a set of non-individual HRTFs. The speech reception threshold (SRT) was measured for each HRTF. Furthermore, using the SRT predicted by an existing speech perception model, the measured values were compensated in an attempt to remove overall HRTF-specific benefits. Results show significant overall differences among the SRTs measured using different HRTFs, consistent with the results predicted by the model. Individual differences between participants in their SRT performance with different HRTFs were also found, but their significance was reduced after the compensation. The implications of these findings are relevant to several research areas related to spatial hearing and speech perception, suggesting that when testing speech-in-noise performance within binaurally rendered virtual environments, the choice of HRTF for each individual should be carefully considered.
Affiliation(s)
- Maria Cuevas-Rodriguez
- Departamento de Tecnología Electrónica, Universidad de Málaga, ETSI Telecomunicación, 29010 Málaga, Spain
- Daniel Gonzalez-Toledo
- Departamento de Tecnología Electrónica, Universidad de Málaga, ETSI Telecomunicación, 29010 Málaga, Spain
- Arcadio Reyes-Lecuona
- Departamento de Tecnología Electrónica, Universidad de Málaga, ETSI Telecomunicación, 29010 Málaga, Spain
- Lorenzo Picinali
- Dyson School of Design Engineering, Imperial College London, London SW7 2DB, United Kingdom
25
Atılgan A, Çiprut A. Effects of spatial separation with better-ear listening on N1-P2 complex. Auris Nasus Larynx 2021; 48:1067-1073. [PMID: 33745789 DOI: 10.1016/j.anl.2021.03.005]
Abstract
OBJECTIVE The purpose of this study was to determine the effect of better-ear listening on spatial separation using the N1-P2 complex. METHODS Twenty individuals with normal hearing participated in this study. The speech stimulus /ba/ was presented from in front of the participant (0°). Continuous speech noise (5 dB signal-to-noise ratio) was presented either from the front (0°), the left (-90°), or the right (+90°). The N1-P2 complex was recorded in quiet and in the three noise conditions. RESULTS Noise direction had a marked effect on N1 and P2 latencies: when the noise was spatially separated from the stimulus, N1 and P2 latencies increased relative to when the noise was co-located with the stimulus. There was no statistically significant difference in N1-P2 amplitude between the stimulus-only and co-located conditions. N1-P2 amplitude increased when the noise came from the sides, relative to the stimulus-only and co-located conditions. CONCLUSION These findings suggest that latency shifts in the N1-P2 complex reflect cortical mechanisms of spatial separation in better-ear listening.
Affiliation(s)
- Atılım Atılgan
- Marmara University, School of Medicine, Audiology Department, İstanbul, Turkey; İstanbul Medeniyet University, Faculty of Health Sciences, Audiology Department, İstanbul, Turkey.
| | - Ayça Çiprut
- Marmara University, School of Medicine, Audiology Department, İstanbul, Turkey
| |
Collapse
|
26
Ahrens A, Cuevas-Rodriguez M, Brimijoin WO. Speech intelligibility with various head-related transfer functions: A computational modelling approach. JASA Express Lett 2021; 1:034401. [PMID: 36154562 DOI: 10.1121/10.0003618] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Speech intelligibility (SI) is known to be affected by the relative spatial position of target and interferers. The benefit of spatial separation is, along with other factors, related to the head-related transfer function (HRTF). HRTFs differ across individuals, and thus the cues that affect SI might also differ. In the current study, an auditory model was employed to predict SI with various HRTFs and at different angles on the horizontal plane. The predicted SI threshold differed substantially across HRTFs. Thus, individual listeners might have different access to SI cues, depending on their HRTF.
Affiliation(s)
- Axel Ahrens
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, 2800 Kongens Lyngby, Denmark
27
Iva P, Fielding J, Clough M, White O, Godic B, Martin R, Rajan R. Speech Discrimination Tasks: A Sensitive Sensory and Cognitive Measure in Early and Mild Multiple Sclerosis. Front Neurosci 2021; 14:604991. [PMID: 33424540 PMCID: PMC7786116 DOI: 10.3389/fnins.2020.604991] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/11/2020] [Accepted: 11/30/2020] [Indexed: 11/13/2022] Open
Abstract
There is a need for reliable and objective measures of early and mild symptomology in multiple sclerosis (MS), as deficits can be subtle and difficult to quantify objectively in patients without overt physical deficits. We hypothesized that a speech-in-noise (SiN) task would be sensitive to demyelinating effects on precise neural timing and diffuse higher-level networks required for speech intelligibility, and therefore be a useful tool for monitoring sensory and cognitive changes in early MS. The objective of this study was to develop a SiN task for clinical use that sensitively monitors disease activity in early (<5 years) and late (>10 years) stages of MS subjects with mild severity [Expanded Disability Status Scale (EDSS) score < 3]. Pre-recorded Bamford-Kowal-Bench sentences and isolated keywords were presented at five signal-to-noise ratios (SNR) in one of two background noises: speech-weighted noise and eight-talker babble. All speech and noise were presented via headphones to controls (n = 38), early MS (n = 23), and late MS (n = 12) who were required to verbally repeat the target speech. MS subjects also completed extensive neuropsychological testing which included: Paced Auditory Serial Addition Test, Digit Span Test, and California Verbal Learning Test. Despite normal hearing thresholds, subjects with early and late mild MS displayed speech discrimination deficits when sentences and words were presented in babble - but not speech-weighted noise. Significant correlations between SiN performance and standardized neuropsychological assessments indicated that MS subjects with lower functional scores also had poorer speech discrimination. Furthermore, a quick 5-min task with words and keywords presented in multi-talker babble at an SNR of -1 dB was 82% accurate in discriminating mildly impaired MS individuals (median EDSS = 0) from healthy controls. 
Quantifying functional deficits in mild MS will help clinicians to maximize the opportunities to preserve neurological reserve in patients with appropriate therapeutic management, particularly in the earliest stages. Given that physical assessments are not informative in this fully ambulatory cohort, a quick 5-min task with words and keywords presented in multi-talker babble at a single SNR could serve as a complementary test for clinical use due to its ease of use and speed.
Affiliation(s)
- Pippa Iva
- Department of Physiology, Biomedicine Discovery Institute, Monash University, Melbourne, VIC, Australia
- Joanne Fielding
- Department of Neuroscience, Central Clinical School, Monash University, Alfred Centre, Melbourne, VIC, Australia
- Meaghan Clough
- Department of Neuroscience, Central Clinical School, Monash University, Alfred Centre, Melbourne, VIC, Australia
- Owen White
- Department of Neuroscience, Central Clinical School, Monash University, Alfred Centre, Melbourne, VIC, Australia
- Branislava Godic
- Department of Physiology, Biomedicine Discovery Institute, Monash University, Melbourne, VIC, Australia
- Russell Martin
- Department of Physiology, Biomedicine Discovery Institute, Monash University, Melbourne, VIC, Australia
- Ramesh Rajan
- Department of Physiology, Biomedicine Discovery Institute, Monash University, Melbourne, VIC, Australia
28
Hausfeld L, Shiell M, Formisano E, Riecke L. Cortical processing of distracting speech in noisy auditory scenes depends on perceptual demand. Neuroimage 2020; 228:117670. [PMID: 33359352 DOI: 10.1016/j.neuroimage.2020.117670] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2020] [Revised: 12/13/2020] [Accepted: 12/14/2020] [Indexed: 11/15/2022] Open
Abstract
Selective attention is essential for the processing of multi-speaker auditory scenes because they require the perceptual segregation of the relevant speech ("target") from irrelevant speech ("distractors"). For simple sounds, it has been suggested that the processing of multiple distractor sounds depends on bottom-up factors affecting task performance. However, it remains unclear whether such dependency applies to naturalistic multi-speaker auditory scenes. In this study, we tested the hypothesis that increased perceptual demand (the processing requirement posed by the scene to separate the target speech) reduces the cortical processing of distractor speech, thus decreasing its perceptual segregation. Human participants were presented with auditory scenes including three speakers and asked to selectively attend to one speaker while their EEG was acquired. The perceptual demand of this selective listening task was varied by introducing an auditory cue (interaural time differences, ITDs) for segregating the target from the distractor speakers, while acoustic differences between the distractors were matched in ITD and loudness. We obtained a quantitative measure of the cortical segregation of distractor speakers by assessing the difference in how accurately speech-envelope following EEG responses could be predicted by models of averaged distractor speech versus models of individual distractor speech. In agreement with our hypothesis, results show that interaural segregation cues led to improved behavioral word-recognition performance and stronger cortical segregation of the distractor speakers. The neural effect was strongest in the δ-band and at early delays (0-200 ms). Our results indicate that during low perceptual demand, the human cortex represents individual distractor speech signals as more segregated. This suggests that, in addition to purely acoustical properties, the cortical processing of distractor speakers depends on factors like perceptual demand.
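The envelope-prediction approach described in this abstract can be illustrated with a lagged ridge regression: fit a temporal response function (TRF) that predicts a neural signal from a speech envelope, then compare prediction accuracy for an individual distractor envelope versus the averaged-distractor envelope. This is a toy sketch on synthetic data, not the authors' EEG pipeline; the function names, lag count, and ridge parameter are illustrative assumptions.

```python
import numpy as np

def lagged_design(env, n_lags):
    """Design matrix of time-lagged copies of a stimulus envelope."""
    n = len(env)
    X = np.zeros((n, n_lags))
    for k in range(n_lags):
        X[k:, k] = env[:n - k]
    return X

def trf_prediction_accuracy(env, eeg, n_lags=20, lam=1e-2):
    """Fit a ridge-regularized TRF and return the correlation between
    predicted and measured response (an 'envelope tracking' score)."""
    X = lagged_design(env, n_lags)
    w = np.linalg.solve(X.T @ X + lam * np.eye(n_lags), X.T @ eeg)
    return np.corrcoef(X @ w, eeg)[0, 1]

rng = np.random.default_rng(0)
env_a, env_b = rng.standard_normal(2000), rng.standard_normal(2000)
# Toy "EEG" that tracks envelope A through a short response kernel, plus noise
eeg = np.convolve(env_a, [0.5, 0.3, 0.1], mode="same") + 0.5 * rng.standard_normal(2000)

r_individual = trf_prediction_accuracy(env_a, eeg)
r_average = trf_prediction_accuracy((env_a + env_b) / 2, eeg)
# A segregated representation is predicted better by the individual envelope
print(r_individual, r_average)
```

In this construction the individual-envelope model outscores the averaged-envelope model, mirroring the direction of the paper's segregation measure.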
Affiliation(s)
- Lars Hausfeld
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, P.O. Box 616, 6200MD Maastricht, The Netherlands; Maastricht Brain Imaging Centre, 6200MD Maastricht, The Netherlands
- Martha Shiell
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, P.O. Box 616, 6200MD Maastricht, The Netherlands; Maastricht Brain Imaging Centre, 6200MD Maastricht, The Netherlands
- Elia Formisano
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, P.O. Box 616, 6200MD Maastricht, The Netherlands; Maastricht Brain Imaging Centre, 6200MD Maastricht, The Netherlands; Maastricht Centre for Systems Biology, 6200MD Maastricht, The Netherlands
- Lars Riecke
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, P.O. Box 616, 6200MD Maastricht, The Netherlands; Maastricht Brain Imaging Centre, 6200MD Maastricht, The Netherlands
29
Hauth CF, Berning SC, Kollmeier B, Brand T. Modeling Binaural Unmasking of Speech Using a Blind Binaural Processing Stage. Trends Hear 2020; 24:2331216520975630. [PMID: 33305690 PMCID: PMC7734536 DOI: 10.1177/2331216520975630] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open
Abstract
The equalization cancellation model is often used to predict the binaural masking level difference. Previously, its application to speech in noise required separate knowledge of the speech and noise signals to maximize the signal-to-noise ratio (SNR). Here, a novel, blind equalization cancellation model is introduced that can use the mixed signals. This approach does not require any assumptions about particular sound source directions. It uses different strategies for positive and negative SNRs, with the switching between the two steered by a blind decision stage utilizing modulation cues. The output of the model is a single-channel signal with enhanced SNR, which we analyzed using the speech intelligibility index to compare speech intelligibility predictions. In a first experiment, the model was tested on experimental data obtained in a scenario with spatially separated target and masker signals. Predicted speech recognition thresholds were in good agreement with measured speech recognition thresholds, with a root mean square error of less than 1 dB. A second experiment investigated signals at positive SNRs, which was achieved using time-compressed and low-pass filtered speech. The results demonstrated that binaural unmasking of speech occurs at positive SNRs and that the modulation-based switching strategy can predict the experimental results.
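The core equalization-cancellation operation can be sketched in a few lines of numpy: equalize one ear's signal by an internal interaural delay, subtract it from the other ear, and pick the delay that minimizes output power, which cancels a lateralized masker while sparing a diotic target. This is a minimal illustration of the principle only; it omits the paper's blind SNR-dependent strategy switching and modulation-based decision stage, and the stimulus parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
fs = 16000
t = np.arange(fs) / fs

target = np.sin(2 * np.pi * 500 * t)   # diotic target: identical at both ears
masker = rng.standard_normal(fs)
itd = 10                                # masker ITD in samples (~0.6 ms at 16 kHz)
left = target + masker
right = target + np.roll(masker, itd)

def ec_output(delay):
    """Equalization (apply an internal interaural delay) then cancellation."""
    return left - np.roll(right, delay)

# Search the internal delay that minimizes output power: this aligns and
# cancels the masker, leaving a comb-filtered version of the target.
best = min(range(-20, 21), key=lambda d: np.var(ec_output(d)))
residual_masker = masker - np.roll(np.roll(masker, itd), best)
print(best, np.var(residual_masker))    # masker residual is (numerically) zero
```

Because the minimum-power criterion needs only the ear signals themselves, the sketch is "blind" in the same broad sense as the model above: no separate access to speech and noise is required.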
Affiliation(s)
- Christopher F Hauth
- Medizinische Physik and Cluster of Excellence Hearing4All, Carl-von-Ossietzky Universität Oldenburg, Oldenburg, Germany
- Simon C Berning
- Medizinische Physik and Cluster of Excellence Hearing4All, Carl-von-Ossietzky Universität Oldenburg, Oldenburg, Germany
- Birger Kollmeier
- Medizinische Physik and Cluster of Excellence Hearing4All, Carl-von-Ossietzky Universität Oldenburg, Oldenburg, Germany
- Thomas Brand
- Medizinische Physik and Cluster of Excellence Hearing4All, Carl-von-Ossietzky Universität Oldenburg, Oldenburg, Germany
30
Gao X, Yan T, Huang T, Li X, Zhang YX. Speech in noise perception improved by training fine auditory discrimination: far and applicable transfer of perceptual learning. Sci Rep 2020; 10:19320. [PMID: 33168921 PMCID: PMC7653913 DOI: 10.1038/s41598-020-76295-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2019] [Accepted: 10/21/2020] [Indexed: 12/12/2022] Open
Abstract
A longstanding focus of perceptual learning research is learning specificity, the difficulty for learning to transfer to tasks and situations beyond the training setting. Previous studies have focused on promoting transfer across stimuli, such as from one sound frequency to another. Here we examined whether learning could transfer across tasks, particularly from fine discrimination of sound features to speech perception in noise, one of the most frequently encountered perceptual challenges in real life. Separate groups of normal-hearing listeners were trained on auditory interaural level difference (ILD) discrimination, interaural time difference (ITD) discrimination, and fundamental frequency (F0) discrimination with non-speech stimuli delivered through headphones. While ITD training led to no improvement, both ILD and F0 training produced learning as well as transfer to speech-in-noise perception when noise differed from speech in the trained feature. These training benefits did not require similarity of task or stimuli between training and application settings, constituting far and wide transfer. Thus, notwithstanding task specificity among basic perceptual skills such as discrimination of different sound features, auditory learning appears readily transferable between these skills and their "upstream" tasks utilizing them, providing an effective approach to improving performance in challenging situations or challenged populations.
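Headphone stimuli carrying ITD and ILD cues, like the non-speech training stimuli described above, are commonly constructed by delaying one channel by a whole number of samples and splitting a level difference across the ears. The sketch below shows that standard construction; the function name and parameter values are illustrative assumptions, not the study's actual stimuli.

```python
import numpy as np

def apply_itd_ild(signal, fs, itd_s=0.0, ild_db=0.0):
    """Return (left, right) carrying an ITD (seconds; positive delays the
    right ear) and an ILD (dB; positive favors the left ear, split evenly)."""
    delay = int(round(itd_s * fs))
    left = signal * 10 ** (ild_db / 40)
    right = np.roll(signal, delay) * 10 ** (-ild_db / 40)
    return left, right

fs = 44100
rng = np.random.default_rng(0)
noise = rng.standard_normal(fs // 10)           # 100 ms noise burst

# A 500 µs ITD (22 samples at 44.1 kHz) combined with a 6 dB ILD
left, right = apply_itd_ild(noise, fs, itd_s=500e-6, ild_db=6.0)
ild_db_measured = 20 * np.log10(np.std(left) / np.std(right))
print(ild_db_measured)                          # 6.0 dB (up to float rounding)
```

Splitting the ILD symmetrically (±ild/2 per ear) keeps the overall presentation level roughly constant as the cue size varies, a common design choice in discrimination experiments.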
Affiliation(s)
- Xiang Gao
- State Key Laboratory of Cognitive Neuroscience and Learning, IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875, China
- Tingting Yan
- State Key Laboratory of Cognitive Neuroscience and Learning, IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875, China
- Ting Huang
- State Key Laboratory of Cognitive Neuroscience and Learning, IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875, China
- Xiaoli Li
- State Key Laboratory of Cognitive Neuroscience and Learning, IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875, China
- Yu-Xuan Zhang
- State Key Laboratory of Cognitive Neuroscience and Learning, IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875, China
31
Vicente T, Lavandier M, Buchholz JM. A binaural model implementing an internal noise to predict the effect of hearing impairment on speech intelligibility in non-stationary noises. J Acoust Soc Am 2020; 148:3305. [PMID: 33261412 DOI: 10.1121/10.0002660] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/28/2020] [Accepted: 10/22/2020] [Indexed: 05/20/2023]
Abstract
A binaural model predicting speech intelligibility in envelope-modulated noise for normal-hearing (NH) and hearing-impaired listeners is proposed. The study shows the importance of considering an internal noise with two components that depend on the individual audiogram and the level of the external stimuli. The model was optimized and verified using speech reception thresholds previously measured in three experiments involving NH and hearing-impaired listeners and sharing common methods. The anechoic target, in front of the listener, was presented simultaneously through headphones with two anechoic noise-vocoded speech maskers (VSs) either co-located with the target or spatially separated from the target using an infinite broadband interaural level difference without crosstalk between ears. In experiment 1, two stationary noise maskers were also tested. In experiment 2, the VSs were presented at different sensation levels to vary audibility. In experiment 3, the effects of realistic interaural time and level differences were also tested. The model was applied to two datasets involving NH listeners to verify its backward compatibility. It was optimized to predict the data, leading to a correlation and mean absolute error between data and predictions above 0.93 and below 1.1 dB, respectively. The different internal noise approaches proposed in the literature to describe hearing impairment are discussed.
Affiliation(s)
- Thibault Vicente
- Université de Lyon, ENTPE, Laboratoire Génie Civil et Bâtiment, Rue Maurice Audin, 69518 Vaulx-en-Velin Cedex, France
- Mathieu Lavandier
- Université de Lyon, ENTPE, Laboratoire Génie Civil et Bâtiment, Rue Maurice Audin, 69518 Vaulx-en-Velin Cedex, France
- Jörg M Buchholz
- Department of Linguistics-Audiology, Australian Hearing Hub, Macquarie University, 2109 New South Wales, Australia
32
Zhang J, Wang X, Wang NY, Fu X, Gan T, Galvin JJ, Willis S, Xu K, Thomas M, Fu QJ. Tonal Language Speakers Are Better Able to Segregate Competing Speech According to Talker Sex Differences. J Speech Lang Hear Res 2020; 63:2801-2810. [PMID: 32692939 PMCID: PMC7872724 DOI: 10.1044/2020_jslhr-19-00421] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/29/2019] [Revised: 04/01/2020] [Accepted: 05/15/2020] [Indexed: 06/01/2023]
Abstract
Purpose The aim of this study was to compare release from masking (RM) between Mandarin-speaking and English-speaking listeners with normal hearing for competing speech when target-masker sex cues, spatial cues, or both were available. Method Speech recognition thresholds (SRTs) for competing speech were measured in 21 Mandarin-speaking and 15 English-speaking adults with normal hearing using a modified coordinate response measure task. SRTs were measured for target sentences produced by a male talker in the presence of two masker talkers (different male talkers or female talkers). The target sentence was always presented directly in front of the listener, and the maskers were either colocated with the target or were spatially separated from the target (+90°, -90°). Stimuli were presented via headphones and were virtually spatialized using head-related transfer functions. Three masker conditions were used to measure RM relative to the baseline condition: (a) talker sex cues, (b) spatial cues, or (c) combined talker sex and spatial cues. Results The results showed large amounts of RM according to talker sex and/or spatial cues. There was no significant difference in SRTs between Chinese and English listeners for the baseline condition, where no talker sex or spatial cues were available. Furthermore, there was no significant difference in RM between Chinese and English listeners when spatial cues were available. However, RM was significantly larger for Chinese listeners when talker sex cues or combined talker sex and spatial cues were available. Conclusion Listeners who speak a tonal language such as Mandarin Chinese may be able to take greater advantage of talker sex cues than listeners who do not speak a tonal language.
Affiliation(s)
- Juan Zhang
- Department of Otolaryngology, Head and Neck Surgery, Beijing Chaoyang Hospital, Capital Medical University, China
- Xing Wang
- Department of Otolaryngology, Head and Neck Surgery, Beijing Chaoyang Hospital, Capital Medical University, China
- Ning-yu Wang
- Department of Otolaryngology, Head and Neck Surgery, Beijing Chaoyang Hospital, Capital Medical University, China
- Xin Fu
- Department of Otolaryngology, Head and Neck Surgery, Beijing Chaoyang Hospital, Capital Medical University, China
- Tian Gan
- Department of Otolaryngology, Head and Neck Surgery, Beijing Chaoyang Hospital, Capital Medical University, China
- Shelby Willis
- Department of Head and Neck Surgery, David Geffen School of Medicine, University of California, Los Angeles
- Kevin Xu
- Department of Head and Neck Surgery, David Geffen School of Medicine, University of California, Los Angeles
- Mathew Thomas
- Department of Head and Neck Surgery, David Geffen School of Medicine, University of California, Los Angeles
- Qian-Jie Fu
- Department of Head and Neck Surgery, David Geffen School of Medicine, University of California, Los Angeles
33
Picou EM, Davis H, Lewis D, Tharpe AM. Contralateral Routing of Signal Systems Can Improve Speech Recognition and Comprehension in Dynamic Classrooms. J Speech Lang Hear Res 2020; 63:2468-2482. [PMID: 32574079 DOI: 10.1044/2020_jslhr-19-00411] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Objective The purpose of this study was to evaluate the effects of hearing aid-based rerouting systems (remote microphone [RM] and contralateral routing of signals [CROS]) on speech recognition and comprehension for children with limited usable hearing unilaterally. A secondary purpose was to evaluate students' perceptions of CROS benefits in classrooms. Method Twenty children aged 10-16 years with limited usable hearing in one ear completed tasks of sentence recognition and comprehension in a laboratory. For both tasks, speech was presented from one of four loudspeakers in an interleaved fashion. Speech loudspeakers were either midline, monaural direct, or monaural indirect, and noise loudspeakers surrounded the participant. Throughout testing, the RM was always near the midline loudspeaker. Six established users of CROS systems completed a newly developed questionnaire that queried experiences in diverse listening situations. Results There were no effects of RM or CROS use on performance for speech presented from front or monaural direct loudspeakers. However, for monaural indirect loudspeakers, CROS improved sentence recognition and RM impaired recognition. In the comprehension task, CROS improved comprehension by 11 rationalized arcsine units, but RM did not affect comprehension. Questionnaire results demonstrated that students report CROS benefits for talkers in the front and from the side, but not for situations requiring localization. Conclusions The results support CROS benefits without CROS disadvantages in a laboratory environment that reflects a dynamic classroom. Thus, CROS systems have the potential to improve hearing in contemporary classrooms for students, especially if there is only a single microphone.
Affiliation(s)
- Erin M Picou
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN
- Hilary Davis
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN
- Dawna Lewis
- Boys Town National Research Hospital, Omaha, NE
- Anne Marie Tharpe
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN
34
Schoenmaker E, van de Par S. The role of reliable interaural time difference cues in ambiguous binaural signals for the intelligibility of multitalker speech. J Acoust Soc Am 2020; 147:4041. [PMID: 32611159 DOI: 10.1121/10.0001382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/17/2019] [Accepted: 05/15/2020] [Indexed: 06/11/2023]
Abstract
When listening to speech in the presence of concurrent talkers, listeners can benefit from glimpses that occur as a result of spectro-temporal modulations in the speech signals. These glimpses are characterized by a high local signal-to-noise ratio and allow listeners to collect relatively undistorted and reliable information on target speech features. A series of experiments was designed to measure the spatial advantage for binaurally presented speech when useful interaural time difference (ITD) information was provided only in glimpses of speech signals with otherwise ambiguous ITDs. For interaurally coherent signals, ITD information provided by target glimpses contributed substantially to the spatial advantage, but consistent target ITDs overall appeared to be of minor importance to speech intelligibility. For interaurally incoherent signals, a similarly large contribution of coherent ITD information in glimpses to the spatial advantage was not observed. Rather, target speech intelligibility depended on the interaural coherence of the interfering speech signals. While the previous observation conforms with models of auditory object formation, and the latter is consistent with equalization-cancellation theory modeling the spatial advantage, the two seem to be at odds for the presented set of experiments. A conceptual framework employing different strategies to process the perceptual foreground and background may solve this issue.
Affiliation(s)
- Esther Schoenmaker
- Acoustics Group, Cluster of Excellence Hearing4all, Carl von Ossietzky University, Carl-von-Ossietzky-Strasse 9-11, 26129 Oldenburg, Germany
- Steven van de Par
- Acoustics Group, Cluster of Excellence Hearing4all, Carl von Ossietzky University, Carl-von-Ossietzky-Strasse 9-11, 26129 Oldenburg, Germany
35
Sutojo S, van de Par S, Schoenmaker E. Contribution of binaural masking release to improved speech intelligibility for different masker types. Eur J Neurosci 2020; 51:1339-1352. [DOI: 10.1111/ejn.13980] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2017] [Revised: 04/23/2018] [Accepted: 05/18/2018] [Indexed: 11/28/2022]
Affiliation(s)
- Sarinah Sutojo
- Acoustics Group, Cluster of Excellence Hearing4all, Carl von Ossietzky University, Oldenburg, Germany
- Steven van de Par
- Acoustics Group, Cluster of Excellence Hearing4all, Carl von Ossietzky University, Oldenburg, Germany
- Esther Schoenmaker
- Acoustics Group, Cluster of Excellence Hearing4all, Carl von Ossietzky University, Oldenburg, Germany
36
Kubiak AM, Rennies J, Ewert SD, Kollmeier B. Prediction of individual speech recognition performance in complex listening conditions. J Acoust Soc Am 2020; 147:1379. [PMID: 32237817 DOI: 10.1121/10.0000759] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/04/2019] [Accepted: 01/31/2020] [Indexed: 06/11/2023]
Abstract
This study examined how well individual speech recognition thresholds in complex listening scenarios could be predicted by a current binaural speech intelligibility model. Model predictions were compared with experimental data measured for seven normal-hearing and 23 hearing-impaired listeners who differed widely in their degree of hearing loss, age, as well as performance in clinical speech tests. The experimental conditions included two masker types (multi-talker or two-talker maskers), and two spatial conditions (maskers co-located with the frontal target or symmetrically separated from the target). The results showed that interindividual variability could not be well predicted by a model including only individual audiograms. Predictions improved when an additional individual "proficiency factor" was derived from one of the experimental conditions or a standard speech test. Overall, the current model can predict individual performance relatively well (except in conditions high in informational masking), but the inclusion of age-related factors may lead to even further improvements.
Affiliation(s)
- Aleksandra M Kubiak
- Fraunhofer IDMT, Project Group Hearing, Speech and Audio Technology, Cluster of Excellence "Hearing4all", Oldenburg, Germany
- Jan Rennies
- Fraunhofer IDMT, Project Group Hearing, Speech and Audio Technology, Cluster of Excellence "Hearing4all", Oldenburg, Germany
- Stephan D Ewert
- Medizinische Physik and Cluster of Excellence Hearing4all, Universität Oldenburg, 26111 Oldenburg, Germany
- Birger Kollmeier
- Fraunhofer IDMT, Project Group Hearing, Speech and Audio Technology, Cluster of Excellence "Hearing4all"; Medizinische Physik and Cluster of Excellence Hearing4all, Universität Oldenburg, 26111 Oldenburg, Germany
37
Ahrens A, Marschall M, Dau T. The effect of spatial energy spread on sound image size and speech intelligibility. J Acoust Soc Am 2020; 147:1368. [PMID: 32237851 DOI: 10.1121/10.0000747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/04/2019] [Accepted: 01/30/2020] [Indexed: 06/11/2023]
Abstract
This study explored the relationship between perceived sound image size and speech intelligibility for sound sources reproduced over loudspeakers. Sources with varying degrees of spatial energy spread were generated using ambisonics processing. Young normal-hearing listeners estimated sound image size and performed two spatial release from masking (SRM) tasks with two symmetrically arranged interfering talkers. Either the target-to-masker ratio or the separation angle was varied adaptively. Results showed that the sound image size did not change systematically with the energy spread. However, a larger energy spread did result in a decreased SRM. Furthermore, the listeners needed a greater separation angle between the target and the interfering sources for sources with a larger energy spread. Further analysis revealed that the method employed to vary the energy spread did not lead to systematic changes in the interaural cross correlations. Future experiments with competing talkers using ambisonics or similar methods may consider the resulting energy spread in relation to the minimum separation angle between sound sources in order to avoid degradations in speech intelligibility.
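The interaural cross-correlation analysis mentioned above is commonly summarized as the interaural cross-correlation coefficient (IACC): the maximum of the normalized cross-correlation between the ear signals over lags up to about ±1 ms. The sketch below shows one common definition; the exact analysis used in the study may differ, and the signals here are synthetic.

```python
import numpy as np

def iacc(left, right, fs, max_lag_s=1e-3):
    """Interaural cross-correlation coefficient: maximum of the normalized
    cross-correlation over interaural lags up to +/- max_lag_s seconds."""
    max_lag = int(max_lag_s * fs)
    norm = np.sqrt(np.sum(left**2) * np.sum(right**2))
    lags = range(-max_lag, max_lag + 1)
    return max(np.sum(left * np.roll(right, k)) for k in lags) / norm

fs = 48000
rng = np.random.default_rng(3)
diotic = rng.standard_normal(fs)        # identical at both ears
independent = rng.standard_normal(fs)   # unrelated second channel

iacc_coherent = iacc(diotic, diotic, fs)        # 1.0: fully coherent
iacc_incoherent = iacc(diotic, independent, fs) # near 0: decorrelated
print(iacc_coherent, iacc_incoherent)
```

Values near 1 indicate a compact, coherent sound image, while lower values typically accompany broader perceived sources, which is why IACC is a natural control measure when manipulating spatial energy spread.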
Affiliation(s)
- Axel Ahrens
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Building 352, Ørsteds Plads, 2800 Kongens Lyngby, Denmark
- Marton Marschall
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Building 352, Ørsteds Plads, 2800 Kongens Lyngby, Denmark
- Torsten Dau
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Building 352, Ørsteds Plads, 2800 Kongens Lyngby, Denmark
38
Baltzell LS, Swaminathan J, Cho AY, Lavandier M, Best V. Binaural sensitivity and release from speech-on-speech masking in listeners with and without hearing loss. J Acoust Soc Am 2020; 147:1546. [PMID: 32237845 PMCID: PMC7060089 DOI: 10.1121/10.0000812] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/30/2019] [Revised: 02/07/2020] [Accepted: 02/11/2020] [Indexed: 05/29/2023]
Abstract
Listeners with sensorineural hearing loss routinely experience less spatial release from masking (SRM) in speech mixtures than listeners with normal hearing. Hearing-impaired listeners have also been shown to have degraded temporal fine structure (TFS) sensitivity, a consequence of which is degraded access to interaural time differences (ITDs) contained in the TFS. Since these "binaural TFS" cues are critical for spatial hearing, it has been hypothesized that degraded binaural TFS sensitivity accounts for the limited SRM experienced by hearing-impaired listeners. In this study, speech stimuli were noise-vocoded using carriers that were systematically decorrelated across the left and right ears, thus simulating degraded binaural TFS sensitivity. Both (1) ITD sensitivity in quiet and (2) SRM in speech mixtures spatialized using ITDs (or binaural release from masking; BRM) were measured as a function of TFS interaural decorrelation in young normal-hearing and hearing-impaired listeners. This allowed for the examination of the relationship between ITD sensitivity and BRM over a wide range of ITD thresholds. This paper found that, for a given ITD sensitivity, hearing-impaired listeners experienced less BRM than normal-hearing listeners, suggesting that binaural TFS sensitivity can account for only a modest portion of the BRM deficit in hearing-impaired listeners. However, substantial individual variability was observed.
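Interaural decorrelation of the kind imposed on the vocoder carriers above is typically produced by mixing a common component with an independent one so that the expected interaural correlation equals a chosen value. The sketch below shows that standard common/independent construction; it is an illustration of the general technique, not the authors' exact carrier-generation procedure.

```python
import numpy as np

def correlated_noise_pair(n, coherence, rng):
    """Two noise channels whose expected normalized correlation equals
    `coherence` (0 = independent, 1 = identical), built by mixing a
    common component with an independent one."""
    common = rng.standard_normal(n)
    indep = rng.standard_normal(n)
    left = common
    right = coherence * common + np.sqrt(1 - coherence**2) * indep
    return left, right

rng = np.random.default_rng(7)
left, right = correlated_noise_pair(200_000, 0.8, rng)
rho = np.corrcoef(left, right)[0, 1]
print(rho)   # close to the target coherence of 0.8
```

The weighting coherence and sqrt(1 - coherence^2) keeps both channels at unit variance, so only the interaural correlation (and hence the fidelity of any ITD carried by the fine structure) is manipulated, not the level.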
Affiliation(s)
- Lucas S Baltzell: Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Jayaganesh Swaminathan: Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Adrian Y Cho: Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Mathieu Lavandier: University of Lyon, ENTPE, Laboratoire Génie Civil et Bâtiment, Rue Maurice Audin, F-69518 Vaulx-en-Velin Cedex, France
- Virginia Best: Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
39
Luo L, Xu N, Wang Q, Li L. Disparity in interaural time difference improves the accuracy of neural representations of individual concurrent narrowband sounds in rat inferior colliculus and auditory cortex. J Neurophysiol 2020; 123:695-706. [PMID: 31891521] [DOI: 10.1152/jn.00284.2019]
Abstract
The central mechanisms underlying binaural unmasking for spectrally overlapping concurrent sounds, which are unresolved in the peripheral auditory system, remain largely unknown. In this study, frequency-following responses (FFRs) to two binaurally presented independent narrowband noises (NBNs) with overlapping spectra were recorded simultaneously in the inferior colliculus (IC) and auditory cortex (AC) in anesthetized rats. The results showed that for both IC FFRs and AC FFRs, introducing an interaural time difference (ITD) disparity between the two concurrent NBNs enhanced representation fidelity, as reflected by increased coherence between the responses evoked by double-NBN stimulation and the responses evoked by single NBNs. The ITD disparity effect varied across frequency bands, being more marked for higher frequency bands in the IC and for lower frequency bands in the AC. Moreover, the coherence between IC responses and AC responses was also enhanced by the ITD disparity, and this enhancement was most prominent for low-frequency bands and for the IC and AC on the same side. These results suggest a critical role of the ITD cue in the neural segregation of spectrotemporally overlapping sounds.
NEW & NOTEWORTHY: When two spectrally overlapping narrowband noises are presented at the same time at the same sound-pressure level, they mask each other. Introducing a disparity in interaural time difference between the two narrowband noises improves the accuracy of the neural representation of the individual sounds in both the inferior colliculus and the auditory cortex. Low-frequency signal transmission from the inferior colliculus to the auditory cortex on the same side is also enhanced, showing the effect of binaural unmasking.
Affiliation(s)
- Lu Luo: School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
- Na Xu: School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
- Qian Wang: School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China; Beijing Key Laboratory of Epilepsy, Epilepsy Center, Department of Functional Neurosurgery, Sanbo Brain Hospital, Capital Medical University, Beijing, China
- Liang Li: School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China; Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China; Beijing Institute for Brain Disorders, Beijing, China
40
Zamiri Abdollahi F, Delphi M, Delphi V. The Correlation Analysis Between the Spatial Hearing Questionnaire (SHQ) and the Psychophysical Measurement of Spatial Hearing. Indian J Otolaryngol Head Neck Surg 2019; 71:1658-1662. [PMID: 31750232] [DOI: 10.1007/s12070-019-01674-2]
Abstract
The aim of the present study was to examine the relationship between a psychophysical spatial hearing test (the spatial word-in-noise test) and the Spatial Hearing Questionnaire. Sixty-six adults (18-40 years old) were divided into three groups: subjects with normal hearing, subjects with mild hearing loss, and subjects with moderate hearing loss. The spatial word-in-noise test and the Persian version of the Spatial Hearing Questionnaire were administered and the results compared across these groups. Pearson correlation analysis showed a significant positive correlation between scores on the spatial word-in-noise test and the Persian version of the Spatial Hearing Questionnaire in all three groups (r = 0.64-0.89). Hearing loss can deteriorate spatial hearing ability. Both objective and subjective spatial hearing tests were shown to be effective in detecting spatial hearing disorder.
Affiliation(s)
- Maryam Delphi: Musculoskeletal Rehabilitation Research Center, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran
- Vafa Delphi: Musculoskeletal Rehabilitation Research Center, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran
41
Erdem BK, Çiprut A. Evaluation of Speech, Spatial Perception and Hearing Quality in Unilateral, Bimodal and Bilateral Cochlear Implant Users. Turk Arch Otorhinolaryngol 2019; 57:149-153. [PMID: 31620697] [DOI: 10.5152/tao.2019.4105]
Abstract
Objective The aim of the study was to conduct a scale-based evaluation of the hearing skills of unilateral, bimodal and bilateral cochlear implant (CI) users, including distinguishing, orienting to and locating speech and environmental sounds encountered in different contexts of everyday life, and to compare the scale results between groups. Methods A total of 74 cochlear implant users (30 unilateral, 30 bimodal and 14 bilateral) were included in the study. Their ages ranged from 11 to 64 years. Participants were assessed using the Speech, Spatial and Qualities of Hearing Scale (SSQ). Results Bilateral CI users rated their own hearing skills significantly better than bimodal and unilateral CI users did, and bimodal users rated theirs significantly better than unilateral CI users did. Paired comparisons showed statistically significant differences between the groups in the total scores of the Speech, Spatial, Qualities of Hearing and General SSQ (p<0.05). Conclusion Our findings show that bilateral use of cochlear implants should be recommended for those presently using bimodal and unilateral devices. Moreover, subjective tests should be used regularly along with objective tests for evaluating CI patients.
Affiliation(s)
- Büşra Koçak Erdem: Department of Audiology, Lütfi Kırdar Training and Research Hospital, İstanbul, Turkey
- Ayça Çiprut: Department of Audiology, Marmara University School of Medicine, İstanbul, Turkey
42
Muñoz RV, Aspöck L, Fels J. Spatial Release From Masking Under Different Reverberant Conditions in Young and Elderly Subjects: Effect of Moving or Stationary Maskers at Circular and Radial Conditions. J Speech Lang Hear Res 2019; 62:3582-3595. [PMID: 31525113] [DOI: 10.1044/2019_jslhr-h-19-0092]
Abstract
Purpose Normal-hearing and hard-of-hearing listeners suffer from reduced speech intelligibility in noisy and reverberant environments. Although everyday listening environments are in constant motion, most researchers have studied speech-in-noise perception only for stationary masker locations. The aim of this study was to investigate the spatial release from masking (SRM) produced by circularly and radially moving maskers under different room acoustic conditions for young and elderly subjects. Method Twelve young subjects with normal hearing and 12 elderly subjects with normal hearing or mild hearing loss were tested. Several room acoustic conditions were simulated and reproduced via headphones using binaural synthesis. The target speech stream consisted of German digit triplets, and the masker stream consisted of quasistationary noise with a matched long-term averaged speech spectrum. During the experiment, the masker was either placed at different stationary positions or moved continuously, in the latter case either on a circular trajectory spanning a 90° azimuth angle or on a radial trajectory linearly increasing the distance to the receiver from 0.5 m to 1.8 m. The absorption characteristics of the virtual room's surfaces were varied to recreate an anechoic room, a treated room with a mean reverberation time (RT60) of 0.48 s, and an untreated room with a mean RT60 of 1.26 s. Results For the circular condition, a significant difference was found between moving and stationary maskers, F(4, 44) = 20.91, p < .001, with a bigger SRM for stationary than for moving maskers. Both age groups also displayed a significant decrease in SRM across the reverberation conditions, F(2, 22) = 12.24, p < .001. For the radial condition, both age groups showed a significant decrease in SRM across the reverberation conditions, F(2, 22) = 13.62, p < .001, as well as across the moving and stationary masker conditions, F(8, 88) = 29.23, p < .001. In general, the SRM of a moving masker decreased as reverberation increased, especially for elderly subjects. Conclusions A radially moving masker led to improved SRM in an anechoic environment for both age groups, whereas a circularly moving masker caused degraded SRM, especially for elderly subjects in the highly reverberant environment. Supplemental Material https://doi.org/10.23641/asha.9795371.
Affiliation(s)
- Rhoddy Viveros Muñoz: Teaching and Research Area of Medical Acoustics, Institute of Technical Acoustics, RWTH Aachen University, Germany
- Lukas Aspöck: Chair and Institute of Technical Acoustics, RWTH Aachen University, Germany
- Janina Fels: Teaching and Research Area of Medical Acoustics, Institute of Technical Acoustics, RWTH Aachen University, Germany
43
Deng Y, Choi I, Shinn-Cunningham B, Baumgartner R. Impoverished auditory cues limit engagement of brain networks controlling spatial selective attention. Neuroimage 2019; 202:116151. [PMID: 31493531] [DOI: 10.1016/j.neuroimage.2019.116151]
Abstract
Spatial selective attention enables listeners to process a signal of interest in natural settings. However, most past studies of auditory spatial attention used impoverished spatial cues: presenting competing sounds to different ears, using only interaural differences in time (ITDs) and/or intensity (IIDs), or using non-individualized head-related transfer functions (HRTFs). Here we tested the hypothesis that impoverished spatial cues impair spatial auditory attention by only weakly engaging the relevant cortical networks. Eighteen normal-hearing listeners reported the content of one of two competing syllable streams simulated at roughly +30° and -30° azimuth. The competing streams consisted of syllables from two different-sex talkers. Spatialization was based on natural spatial cues (individualized HRTFs), individualized IIDs, or generic ITDs. We measured behavioral performance as well as electroencephalographic markers of selective attention. Behaviorally, subjects recalled target streams most accurately with natural cues. Neurally, spatial attention significantly modulated early evoked sensory response magnitudes only for natural cues, not in the conditions using only ITDs or IIDs. Consistent with this, parietal oscillatory power in the alpha band (8-14 Hz; associated with filtering out distracting events from unattended directions) showed significantly less attentional modulation with isolated spatial cues than with natural cues. Our findings support the hypothesis that spatial selective attention networks are only partially engaged by impoverished spatial auditory cues. These results not only suggest that studies using unnatural spatial cues underestimate the neural effects of spatial auditory attention but also illustrate the importance of preserving natural spatial cues in assistive listening devices to support robust attentional control.
Affiliation(s)
- Yuqi Deng: Biomedical Engineering, Boston University, Boston, MA 02215, USA
- Inyong Choi: Communication Sciences & Disorders, University of Iowa, Iowa City, IA 52242, USA
- Barbara Shinn-Cunningham: Biomedical Engineering, Boston University, Boston, MA 02215, USA; Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Robert Baumgartner: Biomedical Engineering, Boston University, Boston, MA 02215, USA; Acoustics Research Institute, Austrian Academy of Sciences, Vienna, Austria
44
Rouhbakhsh N, Mahdi J, Hwo J, Nobel B, Mousave F. Spatial hearing processing: electrophysiological documentation at subcortical and cortical levels. Int J Neurosci 2019; 129:1119-1132. [DOI: 10.1080/00207454.2019.1635129]
Affiliation(s)
- Nematollah Rouhbakhsh: HEARing Cooperation Research Centre, Melbourne, Australia; Department of Audiology and Speech Pathology, School of Health Sciences, University of Melbourne, Melbourne, Australia; National Acoustic Laboratories, Australian Hearing Hub, Macquarie University, Sydney, Australia; Department of Audiology, School of Rehabilitation, Tehran University of Medical Sciences, Pich-e Shemiran, Tehran, Iran
- John Mahdi: The New York Academy of Sciences, New York, NY, USA
- Jacob Hwo: Department of Biomedical Science, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Baran Nobel: Department of Audiology, School of Health and Rehabilitation Sciences, The University of Queensland, Queensland, Australia
- Fati Mousave: Department of Audiology, School of Health and Rehabilitation Sciences, The University of Queensland, Queensland, Australia
45
Tissieres I, Crottaz-Herbette S, Clarke S. Implicit representation of the auditory space: contribution of the left and right hemispheres. Brain Struct Funct 2019; 224:1569-1582. [PMID: 30848352] [DOI: 10.1007/s00429-019-01853-5]
Abstract
Spatial cues contribute to the ability to segregate sound sources and thus facilitate their detection and recognition. This implicit use of spatial cues can be preserved in cases of cortical spatial deafness, suggesting that partially distinct neural networks underlie the explicit sound localization and the implicit use of spatial cues. We addressed this issue by assessing 40 patients, 20 patients with left and 20 patients with right hemispheric damage, for their ability to use auditory spatial cues implicitly in a paradigm of spatial release from masking (SRM) and explicitly in sound localization. The anatomical correlates of their performance were determined with voxel-based lesion-symptom mapping (VLSM). During the SRM task, the target was always presented at the centre, whereas the masker was presented at the centre or at one of the two lateral positions on the right or left side. The SRM effect was absent in some but not all patients; the inability to perceive the target when the masker was at one of the lateral positions correlated with lesions of the left temporo-parieto-frontal cortex or of the right inferior parietal lobule and the underlying white matter. As previously reported, sound localization depended critically on the right parietal and opercular cortex. Thus, explicit and implicit use of spatial cues depends on at least partially distinct neural networks. Our results suggest that the implicit use may rely on the left-dominant position-linked representation of sound objects, which has been demonstrated in previous EEG and fMRI studies.
Affiliation(s)
- Isabel Tissieres: Service de neuropsychologie et de neuroréhabilitation, Centre Hospitalier Universitaire Vaudois (CHUV), Université de Lausanne, Lausanne, Switzerland
- Sonia Crottaz-Herbette: Service de neuropsychologie et de neuroréhabilitation, Centre Hospitalier Universitaire Vaudois (CHUV), Université de Lausanne, Lausanne, Switzerland
- Stephanie Clarke: Service de neuropsychologie et de neuroréhabilitation, Centre Hospitalier Universitaire Vaudois (CHUV), Université de Lausanne, Lausanne, Switzerland
46
Rana B, Buchholz JM. Effect of improving audibility on better-ear glimpsing using non-linear amplification. J Acoust Soc Am 2018; 144:3465. [PMID: 30599669] [DOI: 10.1121/1.5083823]
Abstract
Better-ear glimpsing (BEG) utilizes interaural level differences (ILDs) to improve speech intelligibility in noise. This spatial benefit is reduced in most hearing-impaired (HI) listeners because of their increased hearing loss at high frequencies. Although the benefit can be improved by providing increased amplification, the improvement is limited by loudness discomfort. An alternative solution is therefore to extend ILDs to low frequencies, which has been shown to provide a substantial benefit from BEG. In contrast to previous studies, which applied only linear stimulus manipulations, wide dynamic range compression was applied here to improve the audibility of soft sounds while ensuring loudness comfort for loud sounds. Performance in both speech intelligibility and BEG was measured in 13 HI listeners at three different masker levels and for different interaural stimulus manipulations. The results revealed that at low signal levels performance improved substantially with increasing masker level, but this improvement was reduced by the compressive behaviour at higher levels. Moreover, artificially extending ILDs by applying infinite (broadband) ILDs provided an extra spatial benefit in speech reception thresholds of up to 5 dB on top of that already provided by natural ILDs and interaural time differences, and this benefit increased with increasing signal level.
Affiliation(s)
- Baljeet Rana: Department of Linguistics, 16 University Avenue, Macquarie University, NSW 2109, Australia
- Jörg M Buchholz: Department of Linguistics, 16 University Avenue, Macquarie University, NSW 2109, Australia
47
Rennies J, Kidd G. Benefit of binaural listening as revealed by speech intelligibility and listening effort. J Acoust Soc Am 2018; 144:2147. [PMID: 30404476] [PMCID: PMC6185866] [DOI: 10.1121/1.5057114]
Abstract
In contrast to the well-known benefits for speech intelligibility, the advantage afforded by binaural stimulus presentation for reducing listening effort has not been thoroughly examined. This study investigated spatial release from listening effort and its relation to binaural speech intelligibility in listeners with normal hearing. Psychometric functions for the intelligibility of a frontal target talker masked by stationary speech-shaped noise were estimated for several different noise azimuths, different degrees of reverberation, and conditions maintaining only interaural level or time differences. For each of these conditions, listening effort was measured using a categorical scaling procedure. The results revealed that listening effort was significantly reduced when target and masker were spatially separated in anechoic conditions. This effect extended well into the range of signal-to-noise ratios (SNRs) at which speech intelligibility was at ceiling, and disappeared only at the highest SNRs. In reverberant conditions, spatial release from listening effort was observed for high, but not low, direct-to-reverberant ratios. The findings suggest that listening effort assessment can be a useful method for revealing the benefits of spatial separation of sources under realistic listening conditions comprising favorable SNRs and low reverberation, benefits which typically are not apparent by other means.
Affiliation(s)
- Jan Rennies: Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Gerald Kidd: Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
48
Rana B, Buchholz JM. Effect of audibility on better-ear glimpsing as a function of frequency in normal-hearing and hearing-impaired listeners. J Acoust Soc Am 2018; 143:2195. [PMID: 29716302] [DOI: 10.1121/1.5031007]
Abstract
Better-ear glimpsing (BEG) is an auditory phenomenon that helps listeners understand speech in noise by utilizing interaural level differences (ILDs). The benefit provided by BEG is limited in hearing-impaired (HI) listeners by reduced audibility at high frequencies. Rana and Buchholz [(2016). J. Acoust. Soc. Am. 140(2), 1192-1205] showed that artificially enhancing ILDs at low and mid frequencies can help HI listeners understand speech in noise, but the benefit achieved is smaller than in normal-hearing (NH) listeners. To understand how far this difference is explained by differences in audibility, audibility was carefully controlled here in ten NH and ten HI listeners, and speech reception thresholds (SRTs) in noise were measured in a spatially separated and a co-located condition as a function of frequency and sensation level. Maskers were realized by noise-vocoded speech, and signals were spatialized using artificially generated broadband ILDs. The spatial benefit provided by BEG and the SRTs improved consistently with increasing sensation level but were limited in the HI listeners by loudness discomfort. Further, the HI listeners performed similarly to NH listeners when differences in audibility were compensated. The results help to clarify the hearing aid gain required to maximize the spatial benefit provided by ILDs as a function of frequency.
Affiliation(s)
- Baljeet Rana: National Acoustic Laboratories, 16 University Avenue, Macquarie University, Sydney, New South Wales 2109, Australia
- Jörg M Buchholz: National Acoustic Laboratories, 16 University Avenue, Macquarie University, Sydney, New South Wales 2109, Australia
49
Corbin NE, Buss E, Leibold LJ. Spatial Release From Masking in Children: Effects of Simulated Unilateral Hearing Loss. Ear Hear 2018; 38:223-235. [PMID: 27787392] [PMCID: PMC5321780] [DOI: 10.1097/aud.0000000000000376]
Abstract
OBJECTIVES The purpose of this study was twofold: (1) to determine the effect of an acute simulated unilateral hearing loss on children's spatial release from masking in two-talker speech and speech-shaped noise, and (2) to develop a procedure to be used in future studies that will assess spatial release from masking in children who have permanent unilateral hearing loss. There were three main predictions. First, spatial release from masking was expected to be larger in two-talker speech than in speech-shaped noise. Second, simulated unilateral hearing loss was expected to worsen performance in all listening conditions, but particularly in the spatially separated two-talker speech masker. Third, spatial release from masking was expected to be smaller for children than for adults in the two-talker masker. DESIGN Participants were 12 children (8.7 to 10.9 years) and 11 adults (18.5 to 30.4 years) with normal bilateral hearing. Thresholds for 50%-correct recognition of Bamford-Kowal-Bench sentences were measured adaptively in continuous two-talker speech or speech-shaped noise. Target sentences were always presented from a loudspeaker at 0° azimuth. The masker stimulus was either co-located with the target or spatially separated to +90° or -90° azimuth. Spatial release from masking was quantified as the difference between thresholds obtained when the target and masker were co-located and thresholds obtained when the masker was presented from +90° or -90° azimuth. Testing was completed both with and without a moderate simulated unilateral hearing loss, created with a foam earplug and supra-aural earmuff. A repeated-measures design was used to compare performance between children and adults, and performance in the no-plug and simulated-unilateral-hearing-loss conditions. 
RESULTS All listeners benefited from spatial separation of target and masker stimuli in the azimuth plane in the no-plug listening conditions; this benefit was larger in two-talker speech than in speech-shaped noise. In the simulated-unilateral-hearing-loss conditions, a positive spatial release from masking was observed only when the masker was presented ipsilateral to the simulated unilateral hearing loss. In the speech-shaped noise masker, spatial release from masking in the no-plug condition was similar to that obtained when the masker was presented ipsilateral to the simulated unilateral hearing loss. In contrast, in the two-talker speech masker, spatial release from masking in the no-plug condition was much larger than that obtained when the masker was presented ipsilateral to the simulated unilateral hearing loss. When either masker was presented contralateral to the simulated unilateral hearing loss, spatial release from masking was negative. This pattern of results was observed for both children and adults, although children performed more poorly overall. CONCLUSIONS Children and adults with normal bilateral hearing experience greater spatial release from masking for a two-talker speech masker than for a speech-shaped noise masker. Testing in a two-talker speech masker revealed listening difficulties in the presence of disrupted binaural input that were not observed in a speech-shaped noise masker. This procedure offers promise for the assessment of spatial release from masking in children with permanent unilateral hearing loss.
Affiliation(s)
- Nicole E. Corbin: Department of Allied Health Sciences, Division of Speech and Hearing Sciences, University of North Carolina at Chapel Hill, School of Medicine, Chapel Hill, NC, USA
- Emily Buss: Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, School of Medicine, Chapel Hill, NC, USA
50
Jakien KM, Kampel SD, Gordon SY, Gallun FJ. The Benefits of Increased Sensation Level and Bandwidth for Spatial Release From Masking. Ear Hear 2018; 38:e13-e21. [PMID: 27556520] [PMCID: PMC5161636] [DOI: 10.1097/aud.0000000000000352]
Abstract
OBJECTIVE Spatial release from masking (SRM) can increase speech intelligibility in complex listening environments. The goal of the present study was to document how speech-in-speech stimuli could be best processed to encourage optimum SRM for listeners representing a range of ages and amounts of hearing loss. We examined the effects of equating stimulus audibility among listeners, presenting stimuli at uniform sensation levels (SLs), and filtering stimuli at two separate bandwidths. DESIGN Seventy-one participants completed two speech intelligibility experiments (36 listeners in experiment 1; all 71 in experiment 2) in which a target phrase and two masking phrases from the coordinate response measure (CRM) corpus were presented simultaneously via earphones using a virtual spatial array, such that the target sentence was always at 0° azimuth and the maskers were either colocated with it or positioned at ±45°. Experiments 1 and 2 examined the impacts of SL, age, and hearing loss on SRM. Experiment 2 also assessed the effects of stimulus bandwidth on SRM. RESULTS Overall, listeners' ability to achieve SRM improved with increased SL. Younger listeners with less hearing loss achieved more SRM than older or hearing-impaired listeners. It was hypothesized that SL and bandwidth would have dissociable effects on SRM. However, acoustical analysis revealed that effective audible bandwidth, defined as the highest frequency at which the stimulus was audible at both ears, was the best predictor of performance. Thus, increasing SL seemed to improve SRM by increasing the effective bandwidth rather than by increasing the level of already audible components. CONCLUSIONS Performance for all listeners, regardless of age or hearing loss, improved with an increase in overall SL and/or bandwidth, but the improvement was small relative to the benefits of spatial separation.
Affiliation(s)
- Kasey M. Jakien: Otolaryngology/Head & Neck Surgery, Oregon Health & Science University, Portland, Oregon, USA; Department of Veterans Affairs, Portland VA Medical Center, National Center for Rehabilitative Auditory Research, Portland, Oregon, USA
- Sean D. Kampel: Otolaryngology/Head & Neck Surgery, Oregon Health & Science University, Portland, Oregon, USA; Department of Veterans Affairs, Portland VA Medical Center, National Center for Rehabilitative Auditory Research, Portland, Oregon, USA
- Samuel Y. Gordon: Otolaryngology/Head & Neck Surgery, Oregon Health & Science University, Portland, Oregon, USA; Department of Veterans Affairs, Portland VA Medical Center, National Center for Rehabilitative Auditory Research, Portland, Oregon, USA
- Frederick J. Gallun: Otolaryngology/Head & Neck Surgery, Oregon Health & Science University, Portland, Oregon, USA; Department of Veterans Affairs, Portland VA Medical Center, National Center for Rehabilitative Auditory Research, Portland, Oregon, USA