1. Best V, Roverud E. Externalization of Speech When Listening With Hearing Aids. Trends Hear 2024; 28:23312165241229572. PMID: 38347733; PMCID: PMC10865954; DOI: 10.1177/23312165241229572.
Abstract
Subjective reports indicate that hearing aids can disrupt sound externalization and/or reduce the perceived distance of sounds. Here we conducted an experiment to explore this phenomenon and to quantify how frequently it occurs for different hearing-aid styles. Of particular interest were the effects of microphone position (behind the ear vs. in the ear) and dome type (closed vs. open). Participants were young adults with normal hearing or with bilateral hearing loss, who were fitted with hearing aids that allowed variations in the microphone position and the dome type. They were seated in a large sound-treated booth and presented with monosyllabic words from loudspeakers at a distance of 1.5 m. Their task was to rate the perceived externalization of each word using a rating scale that ranged from 10 (at the loudspeaker in front) to 0 (in the head) to -10 (behind the listener). On average, compared to unaided listening, hearing aids tended to reduce perceived distance and lead to more in-the-head responses. This was especially true for closed domes in combination with behind-the-ear microphones. The behavioral data along with acoustical recordings made in the ear canals of a manikin suggest that increased low-frequency ear-canal levels (with closed domes) and ambiguous spatial cues (with behind-the-ear microphones) may both contribute to breakdowns of externalization.
Affiliation(s)
- Virginia Best, Department of Speech, Language and Hearing Sciences, Boston University, Boston, MA 02215, USA
- Elin Roverud, Department of Speech, Language and Hearing Sciences, Boston University, Boston, MA 02215, USA
2. Roverud E, Villard S, Kidd G. Strength of target source segregation cues affects the outcome of speech-on-speech masking experiments. J Acoust Soc Am 2023; 153:2780. PMID: 37140176; PMCID: PMC10319449; DOI: 10.1121/10.0019307.
Abstract
In speech-on-speech listening experiments, some means for designating which talker is the "target" must be provided for the listener to perform better than chance. However, the relative strength of the segregation variables designating the target could affect the results of the experiment. Here, we examine the interaction of two source segregation variables-spatial separation and talker gender differences-and demonstrate that the relative strengths of these cues may affect the interpretation of the results. Participants listened to sentence pairs spoken by different-gender target and masker talkers, presented naturally or vocoded (degrading gender cues), either colocated or spatially separated. Target and masker words were temporally interleaved to eliminate energetic masking in either an every-other-word or randomized order of presentation. Results showed that the order of interleaving had no effect on recall performance. For natural speech with strong talker gender cues, spatial separation of sources yielded no improvement in performance. For vocoded speech with degraded talker gender cues, performance improved significantly with spatial separation of sources. These findings reveal that listeners may shift among target source segregation cues contingent on cue viability. Finally, performance was poor when the target was designated after stimulus presentation, indicating strong reliance on the cues.
Affiliation(s)
- Elin Roverud, Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Sarah Villard, Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Gerald Kidd, Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
3. Roverud E, Dubno JR, Richards VM, Kidd G. Cross-frequency weights in normal and impaired hearing: Stimulus factors, stimulus dimensions, and associations with speech recognition. J Acoust Soc Am 2021; 150:2327. PMID: 34717459; PMCID: PMC8637742; DOI: 10.1121/10.0006450.
Abstract
Previous studies of level discrimination reported that listeners with high-frequency sensorineural hearing loss (SNHL) place greater weight on high frequencies than normal-hearing (NH) listeners. It is not clear whether these results are influenced by stimulus factors (e.g., group differences in presentation levels, cross-frequency discriminability of level differences used to measure weights) and whether such weights generalize to other tasks. Here, NH and SNHL weights were measured for level, duration, and frequency discrimination of two-tone complexes after measuring just-noticeable differences for each frequency and stimulus dimension. Stimuli were presented at equal sensation level (SL) or equal sound pressure level (SPL). Results showed that, when cross-frequency discriminability was uncontrolled, weights could change depending on which frequency contained the more discriminable level difference. When cross-frequency discriminability was controlled, weights were consistent for level and duration discrimination, but not for frequency discrimination. Comparing equal-SL and equal-SPL weights indicated greater weight on the higher-level tone for level and duration discrimination. Weights were unrelated to improvements in recognition of low-pass-filtered speech with increasing cutoff frequency. These results suggest that cross-frequency weights, and NH and SNHL weighting differences, are influenced by stimulus factors and may not generalize to the use of speech cues in specific frequency regions.
Affiliation(s)
- Elin Roverud, Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Judy R Dubno, Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 550, Charleston, South Carolina 29425-5500, USA
- Virginia M Richards, Department of Cognitive Sciences, 2201 Social and Behavioral Sciences Gateway, University of California-Irvine, Irvine, California 92697-5100, USA
- Gerald Kidd, Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
4. Roverud E, Bradlow A, Kidd G. Examining the sentence superiority effect for sentences presented and reported in forwards or backwards order. Appl Psycholinguist 2020; 41:381-400. PMID: 34121781; PMCID: PMC8191368; DOI: 10.1017/s014271642000003x.
Abstract
Memory for speech benefits from linguistic structure. Recall is better for sentences than for random strings of words (the "sentence superiority effect"; SSE), and evidence suggests that ongoing speech may be organized advantageously as clauses in memory (recall by word position shows within-clause "U shape"). In this study, we examined the SSE and clause-based organization for closed-set speech materials with low semantic predictability and without typical prosody. An overall SSE was observed and accuracy by word position was enhanced at the clause boundaries for these materials. Next, we tested the effects of mental manipulation on the SSE and clause-based organization. Listeners heard word strings that were syntactic, were arranged syntactically then presented backwards, or were random draws. Participants responded to materials as presented or in reversed order, requiring mental manipulation. Clause-level organization was apparent only for materials presented in syntactic order regardless of response order. After accounting for benefits due to reductions in uncertainty for these closed-set materials, an SSE was present for syntactic materials regardless of response order, and for the syntactic backwards condition with reverse-order response (yielding a syntactically correct sentence in the response). Thus, the SSE was resistant to mental manipulation and could also be obtained following it.
5. Roverud E, Dubno JR, Kidd G. Hearing-Impaired Listeners Show Reduced Attention to High-Frequency Information in the Presence of Low-Frequency Information. Trends Hear 2020; 24:2331216520945516. PMID: 32853117; PMCID: PMC7557677; DOI: 10.1177/2331216520945516.
Abstract
Many listeners with sensorineural hearing loss have uneven hearing sensitivity across frequencies. This study addressed whether this uneven hearing loss leads to a biasing of attention to different frequency regions. Normal-hearing (NH) and hearing-impaired (HI) listeners performed a pattern discrimination task at two distant center frequencies (CFs): 750 and 3500 Hz. The patterns were sequences of pure tones in which each successive tonal element was randomly selected from one of two possible frequencies surrounding a CF. The stimuli were presented at equal sensation levels to ensure equal audibility. In addition, the frequency separation of the tonal elements within a pattern was adjusted for each listener so that equal pattern discrimination performance was obtained for each CF in quiet. After these adjustments, the pattern discrimination task was performed under conditions in which independent patterns were presented at both CFs simultaneously. The listeners were instructed to attend to the low or high CF before the stimulus (assessing selective attention to frequency with instruction) or after the stimulus (divided attention, assessing inherent frequency biases). NH listeners demonstrated approximately equal performance decrements (re: quiet) between the two CFs. HI listeners demonstrated much larger performance decrements at the 3500 Hz CF than at the 750 Hz CF in combined-presentation conditions for both selective and divided attention conditions, indicating a low-frequency attentional bias that is apparently not under subject control. Surprisingly, the magnitude of this frequency bias was not related to the degree of asymmetry in thresholds at the two CFs.
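For illustration, the two-frequency tone patterns described above could be generated along these lines. This is a hypothetical sketch, not the authors' code; the function name, tone duration, and ramp length are assumptions not specified in the abstract.

```python
import numpy as np

def tone_pattern(cf, delta_pct, n_tones=8, tone_dur=0.06, fs=16000, rng=None):
    """Build a tonal pattern in which each successive element is drawn
    at random from two frequencies surrounding the center frequency cf."""
    rng = np.random.default_rng() if rng is None else rng
    d = delta_pct / 100.0
    freqs = rng.choice([cf * (1 - d), cf * (1 + d)], size=n_tones)
    t = np.arange(int(tone_dur * fs)) / fs
    # 5-ms linear on/off ramps to avoid spectral splatter at tone edges
    ramp = np.minimum(1.0, np.minimum(t, t[::-1]) / 0.005)
    pattern = np.concatenate([np.sin(2 * np.pi * f * t) * ramp for f in freqs])
    return pattern, freqs

# Two independent patterns, one per center frequency, as in the
# combined-presentation conditions; in the actual experiment the
# frequency separation was adjusted per listener.
low_pat, low_f = tone_pattern(750, delta_pct=5)
high_pat, high_f = tone_pattern(3500, delta_pct=5)
```

A per-listener adjustment procedure would vary `delta_pct` until pattern discrimination performance was equated across the two center frequencies.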
Affiliation(s)
- Elin Roverud, Department of Speech, Language & Hearing Sciences, Boston University
- Judy R. Dubno, Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina
- Gerald Kidd, Department of Speech, Language & Hearing Sciences, Boston University
6. Rennies J, Best V, Roverud E, Kidd G. Energetic and Informational Components of Speech-on-Speech Masking in Binaural Speech Intelligibility and Perceived Listening Effort. Trends Hear 2019; 23:2331216519854597. PMID: 31172880; PMCID: PMC6557024; DOI: 10.1177/2331216519854597.
Abstract
Speech perception in complex sound fields can greatly benefit from different unmasking cues to segregate the target from interfering voices. This study investigated the role of three unmasking cues (spatial separation, gender differences, and masker time reversal) on speech intelligibility and perceived listening effort in normal-hearing listeners. Speech intelligibility and categorically scaled listening effort were measured for a female target talker masked by two competing talkers with no unmasking cues or one to three unmasking cues. In addition to natural stimuli, all measurements were also conducted with glimpsed speech—which was created by removing the time–frequency tiles of the speech mixture in which the maskers dominated the mixture—to estimate the relative amounts of informational and energetic masking as well as the effort associated with source segregation. The results showed that all unmasking cues as well as glimpsing improved intelligibility and reduced listening effort and that providing more than one cue was beneficial in overcoming informational masking. The reduction in listening effort due to glimpsing corresponded to increases in signal-to-noise ratio of 8 to 18 dB, indicating that a significant amount of listening effort was devoted to segregating the target from the maskers. Furthermore, the benefit in listening effort for all unmasking cues extended well into the range of positive signal-to-noise ratios at which speech intelligibility was at ceiling, suggesting that listening effort is a useful tool for evaluating speech-on-speech masking conditions at typical conversational levels.
Affiliation(s)
- Jan Rennies, Department of Speech, Language and Hearing Sciences, Boston University, MA, USA; Fraunhofer Institute for Digital Media Technology IDMT, Project Group Hearing, Speech and Audio Technology, Oldenburg, Germany; Cluster of Excellence Hearing4all, Carl-von-Ossietzky University, Oldenburg, Germany
- Virginia Best, Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
- Elin Roverud, Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
- Gerald Kidd, Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
7. Best V, Roverud E, Baltzell L, Rennies J, Lavandier M. The importance of a broad bandwidth for understanding "glimpsed" speech. J Acoust Soc Am 2019; 146:3215. PMID: 31795657; PMCID: PMC6847933; DOI: 10.1121/1.5131651.
Abstract
When a target talker speaks in the presence of competing talkers, the listener must not only segregate the voices but also understand the target message based on a limited set of spectrotemporal regions ("glimpses") in which the target voice dominates the acoustic mixture. Here, the hypothesis that a broad audible bandwidth is more critical for these sparse representations of speech than it is for intact speech is tested. Listeners with normal hearing were presented with sentences that were either intact, or progressively "glimpsed" according to a competing two-talker masker presented at various levels. This was achieved by using an ideal binary mask to exclude time-frequency units in the target that would be dominated by the masker in the natural mixture. In each glimpsed condition, speech intelligibility was measured for a range of low-pass conditions (cutoff frequencies from 500 to 8000 Hz). Intelligibility was poorer for sparser speech, and the bandwidth required for optimal intelligibility increased with the sparseness of the speech. The combined effects of glimpsing and bandwidth reduction were well captured by a simple metric based on the proportion of audible target glimpses retained. The findings may be relevant for understanding the impact of high-frequency hearing loss on everyday speech communication.
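The ideal-binary-mask "glimpsing" operation described above (retaining only the time-frequency units in which the target dominates the mixture) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the STFT parameters and the `glimpse` helper are assumptions made for the example.

```python
import numpy as np
from scipy.signal import stft, istft

def glimpse(target, masker, fs, lc_db=0.0, nperseg=512):
    """Apply an ideal binary mask: keep only time-frequency units in
    which the target's level exceeds the masker's by lc_db (the local
    criterion), then resynthesize the 'glimpsed' target signal."""
    _, _, T = stft(target, fs, nperseg=nperseg)
    _, _, M = stft(masker, fs, nperseg=nperseg)
    eps = 1e-12  # floor to avoid log(0) in silent units
    snr_db = 20 * np.log10(np.maximum(np.abs(T), eps) /
                           np.maximum(np.abs(M), eps))
    mask = (snr_db > lc_db).astype(float)
    _, glimpsed = istft(T * mask, fs, nperseg=nperseg)
    return glimpsed[: len(target)], mask

# Toy demo: a 300-Hz "target" tone against a 3-kHz "masker" tone.
# The mask retains the target-dominated low-frequency units, so the
# resynthesized signal closely matches the original target.
fs = 16000
t = np.arange(fs) / fs
target = np.sin(2 * np.pi * 300 * t)
masker = np.sin(2 * np.pi * 3000 * t)
glimpsed, mask = glimpse(target, masker, fs)
```

The proportion of retained units within the audible bandwidth (e.g., `mask.mean()` over the relevant rows) corresponds to the kind of glimpse-proportion metric the abstract describes.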
Affiliation(s)
- Virginia Best, Department of Speech, Language and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Elin Roverud, Department of Speech, Language and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Lucas Baltzell, Department of Speech, Language and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Jan Rennies, Department of Speech, Language and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Mathieu Lavandier, Department of Speech, Language and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
8. Best V, Swaminathan J, Kopčo N, Roverud E, Shinn-Cunningham B. A "Buildup" of Speech Intelligibility in Listeners With Normal Hearing and Hearing Loss. Trends Hear 2019; 22:2331216518807519. PMID: 30353783; PMCID: PMC6201174; DOI: 10.1177/2331216518807519.
Abstract
The perception of simple auditory mixtures is known to evolve over time. A common example is the "buildup" of stream segregation that is observed for sequences of tones alternating in pitch. Yet very little is known about how the perception of more complicated auditory scenes, such as multitalker mixtures, changes over time. Previous data are consistent with the idea that the ability to segregate a target talker from competing sounds improves rapidly when stable cues are available, which leads to improvements in speech intelligibility. This study examined the time course of this buildup in listeners with normal and impaired hearing. Five simultaneous sequences of digits, varying in length from three to six digits, were presented from five locations in the horizontal plane. A synchronized visual cue at one location indicated which sequence was the target on each trial. We observed a buildup in digit identification performance, driven primarily by reductions in confusions between the target and the maskers, that occurred over the course of three to four digits. Performance tended to be poorer in listeners with hearing loss; however, there was only weak evidence that the buildup was diminished or slowed in this group.
Affiliation(s)
- Virginia Best, Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
- Norbert Kopčo, Faculty of Science, Institute of Computer Science, P. J. Safarik University, Kosice, Slovakia
- Elin Roverud, Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
9. Kidd G, Mason CR, Best V, Roverud E, Swaminathan J, Jennings T, Clayton K, Colburn HS. Determining the energetic and informational components of speech-on-speech masking in listeners with sensorineural hearing loss. J Acoust Soc Am 2019; 145:440. PMID: 30710924; PMCID: PMC6347574; DOI: 10.1121/1.5087555.
Abstract
The ability to identify the words spoken by one talker masked by two or four competing talkers was tested in young-adult listeners with sensorineural hearing loss (SNHL). In a reference/baseline condition, masking speech was colocated with target speech, target and masker talkers were female, and the masker was intelligible. Three comparison conditions included replacing female masker talkers with males, time-reversal of masker speech, and spatial separation of sources. All three variables produced significant release from masking. To emulate energetic masking (EM), stimuli were subjected to ideal time-frequency segregation retaining only the time-frequency units where target energy exceeded masker energy. Subjects were then tested with these resynthesized "glimpsed stimuli." For either two or four maskers, thresholds only varied about 3 dB across conditions suggesting that EM was roughly equal. Compared to normal-hearing listeners from an earlier study [Kidd, Mason, Swaminathan, Roverud, Clayton, and Best, J. Acoust. Soc. Am. 140, 132-144 (2016)], SNHL listeners demonstrated both greater energetic and informational masking as well as higher glimpsed thresholds. Individual differences were correlated across masking release conditions suggesting that listeners could be categorized according to their general ability to solve the task. Overall, both peripheral and central factors appear to contribute to the higher thresholds for SNHL listeners.
Affiliation(s)
- Gerald Kidd, Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Christine R Mason, Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Virginia Best, Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Elin Roverud, Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Jayaganesh Swaminathan, Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Todd Jennings, Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Kameron Clayton, Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- H Steven Colburn, Department of Biomedical Engineering, Boston University, Boston, Massachusetts 02215, USA
10. Best V, Ahlstrom JB, Mason CR, Roverud E, Perrachione TK, Kidd G, Dubno JR. Talker identification: Effects of masking, hearing loss, and age. J Acoust Soc Am 2018; 143:1085. PMID: 29495693; PMCID: PMC5820061; DOI: 10.1121/1.5024333.
Abstract
The ability to identify who is talking is an important aspect of communication in social situations and, while empirical data are limited, it is possible that a disruption to this ability contributes to the difficulties experienced by listeners with hearing loss. In this study, talker identification was examined under both quiet and masked conditions. Subjects were grouped by hearing status (normal hearing/sensorineural hearing loss) and age (younger/older adults). Listeners first learned to identify the voices of four same-sex talkers in quiet, and then talker identification was assessed (1) in quiet, (2) in speech-shaped, steady-state noise, and (3) in the presence of a single, unfamiliar same-sex talker. Both younger and older adults with hearing loss, as well as older adults with normal hearing, generally performed more poorly than younger adults with normal hearing, although large individual differences were observed in all conditions. Regression analyses indicated that both age and hearing loss were predictors of performance in quiet, and there was some evidence for an additional contribution of hearing loss in the presence of masking. These findings suggest that both hearing loss and age may affect the ability to identify talkers in "cocktail party" situations.
Affiliation(s)
- Virginia Best, Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Jayne B Ahlstrom, Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, Charleston, South Carolina 29425, USA
- Christine R Mason, Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Elin Roverud, Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Tyler K Perrachione, Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Gerald Kidd, Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Judy R Dubno, Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, Charleston, South Carolina 29425, USA
11. Best V, Roverud E, Mason CR, Kidd G. Examination of a hybrid beamformer that preserves auditory spatial cues. J Acoust Soc Am 2017; 142:EL369. PMID: 29092558; PMCID: PMC5724719; DOI: 10.1121/1.5007279.
Abstract
A hearing-aid strategy that combines a beamforming microphone array in the high frequencies with natural binaural signals in the low frequencies was examined. This strategy attempts to balance the benefits of beamforming (improved signal-to-noise ratio) with the benefits of binaural listening (spatial awareness and location-based segregation). The crossover frequency was varied from 200 to 1200 Hz, and performance was compared to full-spectrum binaural and beamformer conditions. Speech intelligibility in the presence of noise or competing speech was measured in listeners with and without hearing loss. Results showed that the optimal crossover frequency depended on the listener and the nature of the interference.
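As a rough illustration of the crossover scheme described above, the two paths can be combined with complementary low- and high-pass filters. This is not the authors' implementation; the filter type, order, and the `hybrid` helper are assumptions made for the sketch, which also ignores the group-delay matching a real device would need at the crossover.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def hybrid(binaural_ear, beamformer, fs, crossover_hz=800, order=4):
    """Combine the natural signal at one ear below the crossover
    frequency with the beamformer output above it."""
    sos_lo = butter(order, crossover_hz, "lowpass", fs=fs, output="sos")
    sos_hi = butter(order, crossover_hz, "highpass", fs=fs, output="sos")
    return sosfilt(sos_lo, binaural_ear) + sosfilt(sos_hi, beamformer)

# Demo: low-frequency content survives via the binaural path (keeping
# interaural cues) and high-frequency content via the beamformer path
# (keeping the improved signal-to-noise ratio).
fs = 16000
t = np.arange(fs) / fs
binaural = np.sin(2 * np.pi * 200 * t)   # natural ear signal
beam = np.sin(2 * np.pi * 4000 * t)      # beamformer output
out = hybrid(binaural, beam, fs, crossover_hz=800)
```

In the study, `crossover_hz` would be varied from 200 to 1200 Hz and the resulting speech intelligibility compared against the full-spectrum binaural and beamformer conditions.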
Affiliation(s)
- Virginia Best, Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Elin Roverud, Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Christine R Mason, Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Gerald Kidd, Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
12. Best V, Roverud E, Streeter T, Mason CR, Kidd G. The Benefit of a Visually Guided Beamformer in a Dynamic Speech Task. Trends Hear 2017; 21:2331216517722304. PMID: 28758567; PMCID: PMC5542081; DOI: 10.1177/2331216517722304.
Abstract
The aim of this study was to evaluate the performance of a visually guided hearing aid (VGHA) under conditions designed to capture some aspects of "real-world" communication settings. The VGHA uses eye gaze to steer the acoustic look direction of a highly directional beamforming microphone array. Although the VGHA has been shown to enhance speech intelligibility for fixed-location, frontal targets, it is currently not known whether these benefits persist in the face of frequent changes in location of the target talker that are typical of conversational turn-taking. Participants were 14 young adults, 7 with normal hearing and 7 with bilateral sensorineural hearing impairment. Target stimuli were sequences of 12 question-answer pairs that were embedded in a mixture of competing conversations. The participant's task was to respond via a key press after each answer indicating whether it was correct or not. Spatialization of the stimuli and microphone array processing were done offline using recorded impulse responses, before presentation over headphones. The look direction of the array was steered according to the eye movements of the participant as they followed a visual cue presented on a widescreen monitor. Performance was compared for a "dynamic" condition in which the target stimulus moved between three locations, and a "fixed" condition with a single target location. The benefits of the VGHA over natural binaural listening observed in the fixed condition were reduced in the dynamic condition, largely because visual fixation was less accurate.
Affiliation(s)
- Virginia Best, Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
- Elin Roverud, Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
- Timothy Streeter, Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
- Christine R. Mason, Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
- Gerald Kidd, Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
13. Best V, Mason CR, Swaminathan J, Roverud E, Kidd G. Use of a glimpsing model to understand the performance of listeners with and without hearing loss in spatialized speech mixtures. J Acoust Soc Am 2017; 141:81. PMID: 28147587; PMCID: PMC5392092; DOI: 10.1121/1.4973620.
Abstract
In many situations, listeners with sensorineural hearing loss demonstrate reduced spatial release from masking compared to listeners with normal hearing. This deficit is particularly evident in the "symmetric masker" paradigm in which competing talkers are located to either side of a central target talker. However, there is some evidence that reduced target audibility (rather than a spatial deficit per se) under conditions of spatial separation may contribute to the observed deficit. In this study a simple "glimpsing" model (applied separately to each ear) was used to isolate the target information that is potentially available in binaural speech mixtures. Intelligibility of these glimpsed stimuli was then measured directly. Differences between normally hearing and hearing-impaired listeners observed in the natural binaural condition persisted for the glimpsed condition, despite the fact that the task no longer required segregation or spatial processing. This result is consistent with the idea that the performance of listeners with hearing loss in the spatialized mixture was limited by their ability to identify the target speech based on sparse glimpses, possibly as a result of some of those glimpses being inaudible.
Affiliation(s)
- Virginia Best
- Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Christine R Mason
- Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Jayaganesh Swaminathan
- Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Elin Roverud
- Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Gerald Kidd
- Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
14
Best V, Streeter T, Roverud E, Mason CR, Kidd G. A Flexible Question-and-Answer Task for Measuring Speech Understanding. Trends Hear 2016; 20:2331216516678706. [PMID: 27888257] [PMCID: PMC5131808] [DOI: 10.1177/2331216516678706]
Abstract
This report introduces a new speech task based on simple questions and answers. The task differs from a traditional sentence recall task in that it involves an element of comprehension and can be implemented in an ongoing fashion. It also contains two target items (the question and the answer) that may be associated with different voices and locations to create dynamic listening scenarios. A set of 227 questions was created, covering six broad categories (days of the week, months of the year, numbers, colors, opposites, and sizes). All questions and their one-word answers were spoken by 11 female and 11 male talkers. In this study, listeners were presented with question-answer pairs and asked to indicate whether the answer was true or false. Responses were given as simple button or key presses, which are quick to make and easy to score. Two preliminary experiments are presented that illustrate different ways of implementing the basic task. In the first experiment, question-answer pairs were presented in speech-shaped noise, and performance was compared across subjects, question categories, and time, to examine the different sources of variability. In the second experiment, sequences of question-answer pairs were presented amidst competing conversations in an ongoing, spatially dynamic listening scenario. Overall, the question-and-answer task appears to be feasible and could be implemented flexibly in a number of different ways.
Affiliation(s)
- Virginia Best
- Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
- Timothy Streeter
- Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
- Elin Roverud
- Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
- Christine R Mason
- Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
- Gerald Kidd
- Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
15
Kidd G, Mason CR, Swaminathan J, Roverud E, Clayton KK, Best V. Determining the energetic and informational components of speech-on-speech masking. J Acoust Soc Am 2016; 140:132. [PMID: 27475139] [PMCID: PMC5392100] [DOI: 10.1121/1.4954748]
Abstract
Identification of target speech was studied under masked conditions consisting of two or four independent speech maskers. In the reference conditions, the maskers were colocated with the target, the masker talkers were the same sex as the target, and the masker speech was intelligible. The comparison conditions, intended to provide release from masking, included different-sex target and masker talkers, time-reversal of the masker speech, and spatial separation of the maskers from the target. Significant release from masking was found for all comparison conditions. To determine whether these reductions in masking could be attributed to differences in energetic masking, ideal time-frequency segregation (ITFS) processing was applied so that the time-frequency units where the masker energy dominated the target energy were removed. The remaining target-dominated "glimpses" were reassembled as the stimulus. Speech reception thresholds measured using these resynthesized ITFS-processed stimuli were the same for the reference and comparison conditions supporting the conclusion that the amount of energetic masking across conditions was the same. These results indicated that the large release from masking found under all comparison conditions was due primarily to a reduction in informational masking. Furthermore, the large individual differences observed generally were correlated across the three masking release conditions.
Affiliation(s)
- Gerald Kidd
- Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Christine R Mason
- Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Jayaganesh Swaminathan
- Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Elin Roverud
- Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Kameron K Clayton
- Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Virginia Best
- Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
16
Roverud E, Best V, Mason CR, Swaminathan J, Kidd G. Informational Masking in Normal-Hearing and Hearing-Impaired Listeners Measured in a Nonspeech Pattern Identification Task. Trends Hear 2016; 20:2331216516638516. [PMID: 27059627] [PMCID: PMC4871212] [DOI: 10.1177/2331216516638516]
Abstract
Individuals with sensorineural hearing loss (SNHL) often experience more difficulty with listening in multisource environments than do normal-hearing (NH) listeners. While the peripheral effects of sensorineural hearing loss certainly contribute to this difficulty, differences in central processing of auditory information may also contribute. To explore this issue, it is important to account for peripheral differences between NH and these hearing-impaired (HI) listeners so that central effects in multisource listening can be examined. In the present study, NH and HI listeners performed a tonal pattern identification task at two distant center frequencies (CFs), 850 and 3500 Hz. In an attempt to control for differences in the peripheral representations of the stimuli, the patterns were presented at the same sensation level (15 dB SL), and the frequency deviation of the tones comprising the patterns was adjusted to obtain equal quiet pattern identification performance across all listeners at both CFs. Tonal sequences were then presented at both CFs simultaneously (informational masking conditions), and listeners were asked either to selectively attend to a source (CF) or to divide attention between CFs and identify the pattern at a CF designated after each trial. There were large differences between groups in the frequency deviations necessary to perform the pattern identification task. After compensating for these differences, there were small differences between NH and HI listeners in the informational masking conditions. HI listeners showed slightly greater performance asymmetry between the low and high CFs than did NH listeners, possibly due to central differences in frequency weighting between groups.
17
Roverud E, Strickland EA. The effects of ipsilateral, contralateral, and bilateral broadband noise on the mid-level hump in intensity discrimination. J Acoust Soc Am 2015; 138:3245-3261. [PMID: 26627798] [PMCID: PMC4662679] [DOI: 10.1121/1.4935515]
Abstract
Previous psychoacoustical and physiological studies indicate that the medial olivocochlear reflex (MOCR), a bilateral, sound-evoked reflex, may lead to improved sound intensity discrimination in background noise. The MOCR can decrease the range of basilar-membrane compression and can counteract effects of neural adaptation from background noise. However, the contribution of these processes to intensity discrimination is not well understood. This study examined the effect of ipsilateral, contralateral, and bilateral noise on the "mid-level hump." The mid-level hump refers to intensity discrimination Weber fractions (WFs) measured for short-duration, high-frequency tones which are poorer at mid levels than at lower or higher levels. The mid-level hump WFs may reflect a limitation due to basilar-membrane compression, and thus may be decreased by the MOCR. The noise was either short (50 ms) or long (150 ms), with the long noise intended to elicit the sluggish MOCR. For a tone in quiet, mid-level hump WFs improved with ipsilateral noise for most listeners, but not with contralateral noise. For a tone in ipsilateral noise, WFs improved with contralateral noise for most listeners, but only when both noises were long. These results are consistent with MOCR-induced WF improvements, possibly via decreases in effects of compression and neural adaptation.
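The Weber fraction (WF) central to this and the following two abstracts can be stated compactly: it is the just-detectable intensity increment ΔI relative to the pedestal intensity I, commonly reported in dB as 10·log10(ΔI/I). The sketch below is a generic textbook definition for orientation, not code or parameter values from these studies.

```python
import math

def weber_fraction_db(delta_i, pedestal_i):
    """Weber fraction in dB: 10*log10(DeltaI / I).

    Smaller (more negative) values mean finer intensity discrimination;
    the 'mid-level hump' refers to WFs that are poorer (larger) at mid
    pedestal levels than at lower or higher levels.
    """
    return 10.0 * math.log10(delta_i / pedestal_i)

# A just-detectable increment equal to the pedestal (DeltaI = I) gives 0 dB;
# a smaller just-detectable increment gives a negative (better) WF.
wf_equal = weber_fraction_db(1.0, 1.0)   # 0.0 dB
wf_half = weber_fraction_db(0.5, 1.0)    # about -3.01 dB
```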
Affiliation(s)
- Elin Roverud
- Department of Speech, Language, and Hearing Sciences, Purdue University, 715 Clinic Drive, West Lafayette, Indiana 47907, USA
- Elizabeth A Strickland
- Department of Speech, Language, and Hearing Sciences, Purdue University, 715 Clinic Drive, West Lafayette, Indiana 47907, USA
18
Roverud E, Strickland EA. Exploring the source of the mid-level hump for intensity discrimination in quiet and the effects of noise. J Acoust Soc Am 2015; 137:1318-1335. [PMID: 25786945] [PMCID: PMC4368585] [DOI: 10.1121/1.4908243]
Abstract
Intensity discrimination Weber fractions (WFs) measured for short, high-frequency tones in quiet are larger at mid levels than at lower or higher levels. The source of this "mid-level hump" is a matter of debate. One theory is that the mid-level hump reflects basilar-membrane compression, and that WFs decrease at higher levels due to spread-of-excitation cues. To test this theory, Experiment 1 measured the mid-level hump and growth-of-masking functions to estimate the basilar membrane input/output (I/O) function in the same listeners. Results showed the initial rise in WFs could be accounted for by the change in I/O function slope, but there was additional unexplained variability in WFs. Previously, Plack [(1998). J. Acoust. Soc. Am. 103(5), 2530-2538] showed that long-duration notched noise (NN) presented with the tone reduced the mid-level hump even with a temporal gap in the NN. Plack concluded the results were consistent with central profile analysis. However, simultaneous, forward, and backward NN were not examined separately, which may independently test peripheral and central mechanisms of the NN. Experiment 2 measured WFs at the mid-level hump in the presence of NN and narrowband noise of different durations and temporal positions relative to the tone. Results varied across subjects, but were consistent with more peripheral mechanisms.
Affiliation(s)
- Elin Roverud
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana 47907
- Elizabeth A Strickland
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana 47907
19
Roverud E, Strickland EA. Accounting for nonmonotonic precursor duration effects with gain reduction in the temporal window model. J Acoust Soc Am 2014; 135:1321-1334. [PMID: 24606271] [PMCID: PMC3985874] [DOI: 10.1121/1.4864783]
Abstract
The mechanisms of forward masking are not clearly understood. The temporal window model (TWM) proposes that masking occurs via a neural mechanism that integrates within a temporal window. The medial olivocochlear reflex (MOCR), a sound-evoked reflex that reduces cochlear amplifier gain, may also contribute to forward masking if the preceding sound reduces gain for the signal. Psychophysical evidence of gain reduction can be observed using a growth of masking (GOM) paradigm with an off-frequency forward masker and a precursor. The basilar membrane input/output (I/O) function is estimated from the GOM function, and the I/O function gain is reduced by the precursor. In this study, the effect of precursor duration on this gain reduction effect was examined for on- and off-frequency precursors. With on-frequency precursors, thresholds increased with increasing precursor duration, then decreased (rolled over) for longer durations. Thresholds with off-frequency precursors continued to increase with increasing precursor duration. These results are not consistent with solely neural masking, but may reflect gain reduction that selectively affects on-frequency stimuli. The TWM was modified to include history-dependent gain reduction to simulate the MOCR, called the temporal window model-gain reduction (TWM-GR). The TWM-GR predicted rollover and the differences with on- and off-frequency precursors whereas the TWM did not.
Affiliation(s)
- Elin Roverud
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana 47907-2038
- Elizabeth A Strickland
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana 47907-2038
20
Roverud E, Strickland EA. The time course of cochlear gain reduction measured using a more efficient psychophysical technique. J Acoust Soc Am 2010; 128:1203-1214. [PMID: 20815456] [PMCID: PMC2945748] [DOI: 10.1121/1.3473695]
Abstract
In a previous study it was shown that an on-frequency precursor intended to activate the medial olivocochlear reflex (MOCR) at the signal frequency reduces the gain estimated from growth-of-masking (GOM) functions. This is called the temporal effect (TE). In Expt. 1 a shorter method of measuring this change in gain is established. GOM functions were measured with an on- and off-frequency precursor presented before the masker and signal, and used to estimate Input/Output functions. The change in gain estimated in this way was very similar to that estimated from comparing two points measured with a single fixed masker level on the lower legs of the GOM functions. In Expt. 2, the TE was measured as a function of precursor duration and signal delay. For short precursor durations and short delays the TE increased (buildup) or remained constant as delay increased, then decreased. The TE also increased with precursor duration for the shortest delay. The results were fitted with a model based on the time course of the MOCR. The model fitted the data well, and predicted the buildup. This buildup is not consistent with exponential decay predicted by neural adaptation or persistence of excitation.
Affiliation(s)
- Elin Roverud
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana 47907-2038, USA.