1
Talker change detection by listeners varying in age and hearing loss. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2024; 155:2482-2491. [PMID: 38587430 PMCID: PMC11003761 DOI: 10.1121/10.0025539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 03/06/2024] [Accepted: 03/19/2024] [Indexed: 04/09/2024]
Abstract
Despite a vast literature on how speech intelligibility is affected by hearing loss and advanced age, remarkably little is known about the perception of talker-related information in these populations. Here, we assessed the ability of listeners to detect whether a change in talker occurred while listening to and identifying sentence-length sequences of words. Participants were recruited in four groups that differed in their age (younger/older) and hearing status (normal/impaired). The task was conducted in quiet or in a background of same-sex two-talker speech babble. We found that age and hearing loss had detrimental effects on talker change detection, in addition to their expected effects on word recognition. We also found subtle differences in the effects of age and hearing loss for trials in which the talker changed vs trials in which the talker did not change. These findings suggest that part of the difficulty encountered by older listeners, and by listeners with hearing loss, when communicating in group situations, may be due to a reduced ability to identify and discriminate between the participants in the conversation.
2
Individual differences in speech intelligibility at a cocktail party: A modeling perspective. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2021; 150:1076. [PMID: 34470293 PMCID: PMC8561716 DOI: 10.1121/10.0005851] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/29/2021] [Revised: 07/07/2021] [Accepted: 07/21/2021] [Indexed: 06/13/2023]
Abstract
This study aimed at predicting individual differences in speech reception thresholds (SRTs) in the presence of symmetrically placed competing talkers for young listeners with sensorineural hearing loss. An existing binaural model incorporating the individual audiogram was revised to handle severe hearing losses by (a) taking as input the target speech level at SRT in a given condition and (b) introducing a floor in the model to limit extreme negative better-ear signal-to-noise ratios. The floor value was first set using SRTs measured with stationary and modulated noises. The model was then used to account for individual variations in SRTs found in two previously published data sets that used speech maskers. The model accounted well for the variation in SRTs across listeners with hearing loss, based solely on differences in audibility. When considering listeners with normal hearing, the model could predict the best SRTs, but not the poorer SRTs, suggesting that other factors limit performance when audibility (as measured with the audiogram) is not compromised.
3
Informational masking of negative masking. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 147:798. [PMID: 32113297 PMCID: PMC7004829 DOI: 10.1121/10.0000652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2019] [Revised: 01/10/2020] [Accepted: 01/10/2020] [Indexed: 06/10/2023]
Abstract
Negative masking (NM) is a ubiquitous finding in near-"threshold" psychophysics in which the detectability of a near-threshold signal improves when added to a copy of itself, i.e., a pedestal or masker. One interpretation of NM suggests that the pedestal acts as an informative cue, thereby reducing uncertainty and improving performance relative to detection in its absence. The purpose of this study was to test this hypothesis. Intensity discrimination thresholds were measured for 100-ms, 1000-Hz near-threshold tones. In the reference condition, thresholds were measured in quiet (no masker other than the pedestal). In comparison conditions, thresholds were measured in the presence of one of two additional maskers: a notched-noise masker or a random-frequency multitone masker. The additional maskers were intended to cause different amounts of uncertainty and, in turn, to differentially influence NM. The results were generally consistent with an uncertainty-based interpretation of NM: NM was found both in quiet and in notched-noise, yet it was eliminated by the multitone masker. A competing interpretation of NM based on nonlinear transduction does not account for all of the results. Profile analysis may have been a factor in performance and this suggests that NM may be attributable to, or influenced by, multiple mechanisms.
4
Determining the energetic and informational components of speech-on-speech masking in listeners with sensorineural hearing loss. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 145:440. [PMID: 30710924 PMCID: PMC6347574 DOI: 10.1121/1.5087555] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 07/20/2018] [Revised: 11/19/2018] [Accepted: 12/18/2018] [Indexed: 05/20/2023]
Abstract
The ability to identify the words spoken by one talker masked by two or four competing talkers was tested in young-adult listeners with sensorineural hearing loss (SNHL). In a reference/baseline condition, masking speech was colocated with target speech, target and masker talkers were female, and the masker was intelligible. Three comparison conditions included replacing female masker talkers with males, time-reversal of masker speech, and spatial separation of sources. All three variables produced significant release from masking. To emulate energetic masking (EM), stimuli were subjected to ideal time-frequency segregation retaining only the time-frequency units where target energy exceeded masker energy. Subjects were then tested with these resynthesized "glimpsed stimuli." For either two or four maskers, thresholds only varied about 3 dB across conditions suggesting that EM was roughly equal. Compared to normal-hearing listeners from an earlier study [Kidd, Mason, Swaminathan, Roverud, Clayton, and Best, J. Acoust. Soc. Am. 140, 132-144 (2016)], SNHL listeners demonstrated both greater energetic and informational masking as well as higher glimpsed thresholds. Individual differences were correlated across masking release conditions suggesting that listeners could be categorized according to their general ability to solve the task. Overall, both peripheral and central factors appear to contribute to the higher thresholds for SNHL listeners.
5
Talker identification: Effects of masking, hearing loss, and age. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 143:1085. [PMID: 29495693 PMCID: PMC5820061 DOI: 10.1121/1.5024333] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 08/25/2017] [Revised: 01/24/2018] [Accepted: 01/29/2018] [Indexed: 06/08/2023]
Abstract
The ability to identify who is talking is an important aspect of communication in social situations and, while empirical data are limited, it is possible that a disruption to this ability contributes to the difficulties experienced by listeners with hearing loss. In this study, talker identification was examined under both quiet and masked conditions. Subjects were grouped by hearing status (normal hearing/sensorineural hearing loss) and age (younger/older adults). Listeners first learned to identify the voices of four same-sex talkers in quiet, and then talker identification was assessed (1) in quiet, (2) in speech-shaped, steady-state noise, and (3) in the presence of a single, unfamiliar same-sex talker. Both younger and older adults with hearing loss, as well as older adults with normal hearing, generally performed more poorly than younger adults with normal hearing, although large individual differences were observed in all conditions. Regression analyses indicated that both age and hearing loss were predictors of performance in quiet, and there was some evidence for an additional contribution of hearing loss in the presence of masking. These findings suggest that both hearing loss and age may affect the ability to identify talkers in "cocktail party" situations.
6
Examination of a hybrid beamformer that preserves auditory spatial cues. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 142:EL369. [PMID: 29092558 PMCID: PMC5724719 DOI: 10.1121/1.5007279] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 08/03/2017] [Revised: 08/31/2017] [Accepted: 09/30/2017] [Indexed: 06/01/2023]
Abstract
A hearing-aid strategy that combines a beamforming microphone array in the high frequencies with natural binaural signals in the low frequencies was examined. This strategy attempts to balance the benefits of beamforming (improved signal-to-noise ratio) with the benefits of binaural listening (spatial awareness and location-based segregation). The crossover frequency was varied from 200 to 1200 Hz, and performance was compared to full-spectrum binaural and beamformer conditions. Speech intelligibility in the presence of noise or competing speech was measured in listeners with and without hearing loss. Results showed that the optimal crossover frequency depended on the listener and the nature of the interference.
MESH Headings
- Acoustic Stimulation
- Adult
- Audiometry, Speech
- Case-Control Studies
- Comprehension
- Correction of Hearing Impairment/instrumentation
- Cues
- Equipment Design
- Female
- Hearing
- Hearing Aids
- Hearing Loss, Bilateral/diagnosis
- Hearing Loss, Bilateral/physiopathology
- Hearing Loss, Bilateral/psychology
- Hearing Loss, Bilateral/rehabilitation
- Hearing Loss, Sensorineural/diagnosis
- Hearing Loss, Sensorineural/physiopathology
- Hearing Loss, Sensorineural/psychology
- Hearing Loss, Sensorineural/rehabilitation
- Humans
- Male
- Noise/adverse effects
- Perceptual Masking
- Persons With Hearing Impairments/psychology
- Persons With Hearing Impairments/rehabilitation
- Signal Processing, Computer-Assisted
- Sound Localization
- Speech Intelligibility
- Speech Perception
7
The Benefit of a Visually Guided Beamformer in a Dynamic Speech Task. Trends Hear 2017; 21:2331216517722304. [PMID: 28758567 PMCID: PMC5542081 DOI: 10.1177/2331216517722304] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 03/10/2017] [Revised: 06/07/2017] [Accepted: 06/26/2017] [Indexed: 11/16/2022]
Abstract
The aim of this study was to evaluate the performance of a visually guided hearing aid (VGHA) under conditions designed to capture some aspects of "real-world" communication settings. The VGHA uses eye gaze to steer the acoustic look direction of a highly directional beamforming microphone array. Although the VGHA has been shown to enhance speech intelligibility for fixed-location, frontal targets, it is currently not known whether these benefits persist in the face of frequent changes in location of the target talker that are typical of conversational turn-taking. Participants were 14 young adults, 7 with normal hearing and 7 with bilateral sensorineural hearing impairment. Target stimuli were sequences of 12 question-answer pairs that were embedded in a mixture of competing conversations. The participant's task was to respond via a key press after each answer indicating whether it was correct or not. Spatialization of the stimuli and microphone array processing were done offline using recorded impulse responses, before presentation over headphones. The look direction of the array was steered according to the eye movements of the participant as they followed a visual cue presented on a widescreen monitor. Performance was compared for a "dynamic" condition in which the target stimulus moved between three locations, and a "fixed" condition with a single target location. The benefits of the VGHA over natural binaural listening observed in the fixed condition were reduced in the dynamic condition, largely because visual fixation was less accurate.
8
Use of a glimpsing model to understand the performance of listeners with and without hearing loss in spatialized speech mixtures. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 141:81. [PMID: 28147587 PMCID: PMC5392092 DOI: 10.1121/1.4973620] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Indexed: 05/13/2023]
Abstract
In many situations, listeners with sensorineural hearing loss demonstrate reduced spatial release from masking compared to listeners with normal hearing. This deficit is particularly evident in the "symmetric masker" paradigm in which competing talkers are located to either side of a central target talker. However, there is some evidence that reduced target audibility (rather than a spatial deficit per se) under conditions of spatial separation may contribute to the observed deficit. In this study a simple "glimpsing" model (applied separately to each ear) was used to isolate the target information that is potentially available in binaural speech mixtures. Intelligibility of these glimpsed stimuli was then measured directly. Differences between normally hearing and hearing-impaired listeners observed in the natural binaural condition persisted for the glimpsed condition, despite the fact that the task no longer required segregation or spatial processing. This result is consistent with the idea that the performance of listeners with hearing loss in the spatialized mixture was limited by their ability to identify the target speech based on sparse glimpses, possibly as a result of some of those glimpses being inaudible.
9
A Flexible Question-and-Answer Task for Measuring Speech Understanding. Trends Hear 2016; 20:2331216516678706. [PMID: 27888257 PMCID: PMC5131808 DOI: 10.1177/2331216516678706] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 05/19/2016] [Revised: 10/20/2016] [Accepted: 10/20/2016] [Indexed: 11/16/2022]
Abstract
This report introduces a new speech task based on simple questions and answers. The task differs from a traditional sentence recall task in that it involves an element of comprehension and can be implemented in an ongoing fashion. It also contains two target items (the question and the answer) that may be associated with different voices and locations to create dynamic listening scenarios. A set of 227 questions was created, covering six broad categories (days of the week, months of the year, numbers, colors, opposites, and sizes). All questions and their one-word answers were spoken by 11 female and 11 male talkers. In this study, listeners were presented with question-answer pairs and asked to indicate whether the answer was true or false. Responses were given as simple button or key presses, which are quick to make and easy to score. Two preliminary experiments are presented that illustrate different ways of implementing the basic task. In the first experiment, question-answer pairs were presented in speech-shaped noise, and performance was compared across subjects, question categories, and time, to examine the different sources of variability. In the second experiment, sequences of question-answer pairs were presented amidst competing conversations in an ongoing, spatially dynamic listening scenario. Overall, the question-and-answer task appears to be feasible and could be implemented flexibly in a number of different ways.
10
Determining the energetic and informational components of speech-on-speech masking. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2016; 140:132. [PMID: 27475139 PMCID: PMC5392100 DOI: 10.1121/1.4954748] [Citation(s) in RCA: 68] [Impact Index Per Article: 8.5] [Indexed: 05/13/2023]
Abstract
Identification of target speech was studied under masked conditions consisting of two or four independent speech maskers. In the reference conditions, the maskers were colocated with the target, the masker talkers were the same sex as the target, and the masker speech was intelligible. The comparison conditions, intended to provide release from masking, included different-sex target and masker talkers, time-reversal of the masker speech, and spatial separation of the maskers from the target. Significant release from masking was found for all comparison conditions. To determine whether these reductions in masking could be attributed to differences in energetic masking, ideal time-frequency segregation (ITFS) processing was applied so that the time-frequency units where the masker energy dominated the target energy were removed. The remaining target-dominated "glimpses" were reassembled as the stimulus. Speech reception thresholds measured using these resynthesized ITFS-processed stimuli were the same for the reference and comparison conditions supporting the conclusion that the amount of energetic masking across conditions was the same. These results indicated that the large release from masking found under all comparison conditions was due primarily to a reduction in informational masking. Furthermore, the large individual differences observed generally were correlated across the three masking release conditions.
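The ITFS step described in this abstract can be illustrated with a minimal sketch. It assumes the target and masker are already available as separate time-frequency representations (for example, short-time Fourier transforms) of equal shape; the function names, the 0 dB criterion, and the use of NumPy are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def ideal_binary_mask(target_tf, masker_tf, criterion_db=0.0):
    """Return True for time-frequency units where the target dominates.

    target_tf and masker_tf are arrays of equal shape (frequency x time).
    criterion_db is a hypothetical local criterion; 0 dB keeps units where
    target energy strictly exceeds masker energy.
    """
    eps = np.finfo(float).tiny  # guards against division by zero and log(0)
    target_energy = np.abs(target_tf) ** 2
    masker_energy = np.abs(masker_tf) ** 2
    local_ratio_db = 10.0 * np.log10((target_energy + eps) / (masker_energy + eps))
    return local_ratio_db > criterion_db


def apply_glimpses(mixture_tf, mask):
    """Zero out masker-dominated units, keeping only the target-dominated glimpses."""
    return np.where(mask, mixture_tf, 0.0)
```

Resynthesizing a waveform from the retained units would require an inverse transform, which is omitted here.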
11
Informational Masking in Normal-Hearing and Hearing-Impaired Listeners Measured in a Nonspeech Pattern Identification Task. Trends Hear 2016; 20:2331216516638516. [PMID: 27059627 PMCID: PMC4871212 DOI: 10.1177/2331216516638516] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 10/15/2015] [Revised: 01/26/2016] [Accepted: 02/16/2016] [Indexed: 11/16/2022]
Abstract
Individuals with sensorineural hearing loss (SNHL) often experience more difficulty with listening in multisource environments than do normal-hearing (NH) listeners. While the peripheral effects of sensorineural hearing loss certainly contribute to this difficulty, differences in central processing of auditory information may also contribute. To explore this issue, it is important to account for peripheral differences between NH and these hearing-impaired (HI) listeners so that central effects in multisource listening can be examined. In the present study, NH and HI listeners performed a tonal pattern identification task at two distant center frequencies (CFs), 850 and 3500 Hz. In an attempt to control for differences in the peripheral representations of the stimuli, the patterns were presented at the same sensation level (15 dB SL), and the frequency deviation of the tones comprising the patterns was adjusted to obtain equal quiet pattern identification performance across all listeners at both CFs. Tonal sequences were then presented at both CFs simultaneously (informational masking conditions), and listeners were asked either to selectively attend to a source (CF) or to divide attention between CFs and identify the pattern at a CF designated after each trial. There were large differences between groups in the frequency deviations necessary to perform the pattern identification task. After compensating for these differences, there were small differences between NH and HI listeners in the informational masking conditions. HI listeners showed slightly greater performance asymmetry between the low and high CFs than did NH listeners, possibly due to central differences in frequency weighting between groups.
12
Abstract
The benefit provided to listeners with sensorineural hearing loss (SNHL) by an acoustic beamforming microphone array was determined in a speech-on-speech masking experiment. Normal-hearing controls were tested as well. For the SNHL listeners, prescription-determined gain was applied to the stimuli, and performance using the beamformer was compared with that obtained using bilateral amplification. The listener identified speech from a target talker located straight ahead (0° azimuth) in the presence of four competing talkers that were either colocated with, or spatially separated from, the target. The stimuli were spatialized using measured impulse responses and presented via earphones. In the spatially separated masker conditions, the four maskers were arranged symmetrically around the target at ±15° and ±30° or at ±45° and ±90°. Results revealed that masked speech reception thresholds for spatially separated maskers were higher (poorer) on average for the SNHL than for the normal-hearing listeners. For most SNHL listeners in the wider masker separation condition, lower thresholds were obtained through the microphone array than through bilateral amplification. Large intersubject differences were found in both listener groups. The best masked speech reception thresholds overall were found for a hybrid condition that combined natural and beamforming listening in order to preserve localization for broadband sources.
13
Musical training, individual differences and the cocktail party problem. Sci Rep 2015; 5:11628. [PMID: 26112910 PMCID: PMC4481518 DOI: 10.1038/srep11628] [Citation(s) in RCA: 88] [Impact Index Per Article: 9.8] [Received: 11/24/2014] [Accepted: 06/02/2015] [Indexed: 11/09/2022]
Abstract
Are musicians better able to understand speech in noise than non-musicians? Recent findings have produced contradictory results. Here we addressed this question by asking musicians and non-musicians to understand target sentences masked by other sentences presented from different spatial locations, the classical 'cocktail party problem' in speech science. We found that musicians obtained a substantial benefit in this situation, with thresholds ~6 dB better than non-musicians. Large individual differences in performance were noted particularly for the non-musically trained group. Furthermore, in different conditions we manipulated the spatial location and intelligibility of the masking sentences, thus changing the amount of 'informational masking' (IM) while keeping the amount of 'energetic masking' (EM) relatively constant. When the maskers were unintelligible and spatially separated from the target (low in IM), musicians and non-musicians performed comparably. These results suggest that the characteristics of speech maskers and the amount of IM can influence the magnitude of the differences found between musicians and non-musicians in multiple-talker "cocktail party" environments. Furthermore, considering the task in terms of the EM-IM distinction provides a conceptual framework for future behavioral and neuroscientific studies which explore the underlying sensory and cognitive mechanisms contributing to enhanced "speech-in-noise" perception by musicians.
14
Better-ear glimpsing in hearing-impaired listeners. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2015; 137:EL213-9. [PMID: 25698053 PMCID: PMC4327925 DOI: 10.1121/1.4907737] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 10/27/2014] [Revised: 12/27/2014] [Accepted: 01/08/2015] [Indexed: 05/26/2023]
Abstract
When competing speech sounds are spatially separated, listeners can make use of the ear with the better target-to-masker ratio. Recent studies showed that listeners with normal hearing are able to efficiently make use of this "better-ear," even when it alternates between left and right ears at different times in different frequency bands, which may contribute to the ability to listen in spatialized speech mixtures. In the present study, better-ear glimpsing in listeners with bilateral sensorineural hearing impairment, who perform poorly in spatialized speech mixtures, was investigated. The results suggest that this deficit is not related to better-ear glimpsing.
15
The role of syntax in maintaining the integrity of streams of speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2014; 135:766-77. [PMID: 25234885 PMCID: PMC3986016 DOI: 10.1121/1.4861354] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 09/19/2013] [Revised: 12/13/2013] [Accepted: 12/23/2013] [Indexed: 05/21/2023]
Abstract
This study examined the ability of listeners to utilize syntactic structure to extract a target stream of speech from among competing sounds. Target talkers were identified by voice or location, which was held constant throughout a test utterance, and paired with correct or incorrect (random word order) target sentence syntax. Both voice and location provided reliable cues for identifying target speech even when other features varied unpredictably. The target sentences were masked either by predominantly energetic maskers (noise bursts) or by predominantly informational maskers (similar speech in random word order). When the maskers were noise bursts, target sentence syntax had relatively minor effects on identification performance. However, when the maskers were other talkers, correct target sentence syntax resulted in significantly better speech identification performance than incorrect syntax. Furthermore, conformance to correct syntax alone was sufficient to accurately identify the target speech. The results were interpreted as supporting the idea that the predictability of the elements comprising streams of speech, as manifested by syntactic structure, is an important factor in binding words together into coherent streams. Furthermore, these findings suggest that predictability is particularly important for maintaining the coherence of an auditory stream over time under conditions high in informational masking.
16
Perceiving sequential dependencies in auditory streams. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 134:1215-1231. [PMID: 23927120 PMCID: PMC3745531 DOI: 10.1121/1.4812276] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/16/2012] [Revised: 05/30/2013] [Accepted: 06/08/2013] [Indexed: 05/30/2023]
Abstract
This study examined the ability of human listeners to detect the presence and judge the strength of a statistical dependency among the elements comprising sequences of sounds. The statistical dependency was imposed by specifying transition matrices that determined the likelihood of occurrence of the sound elements. Markov chains were constructed from these transition matrices having states that were pure tones/noise bursts that varied along the stimulus dimensions of frequency and/or interaural time difference. Listeners reliably detected the presence of a statistical dependency in sequences of sounds varying along these stimulus dimensions. Furthermore, listeners were able to discriminate the relative strength of the dependency in pairs of successive sound sequences. Random variation along an irrelevant stimulus dimension had small but significant adverse effects on performance. A much greater decrement in performance was found when the sound sequences were concurrent. Likelihood ratios were computed based on the transition matrices to specify Ideal Observer performance for the experimental conditions. Preliminary modeling efforts were made based on degradations of Ideal Observer performance intended to represent human observer limitations. This experimental approach appears to be useful for examining auditory "stream" formation and maintenance over time based on the predictability of the constituent sound elements.
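The sequence construction and likelihood-ratio computation mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: states are integer indices standing in for sound elements (such as tone frequencies), the comparison matrix `p_ind` represents a hypothetical no-dependency alternative, and the function names are invented for this example rather than taken from the study.

```python
import numpy as np


def generate_sequence(transition, length, rng):
    """Draw a Markov chain of state indices from a row-stochastic transition matrix."""
    n_states = transition.shape[0]
    state = rng.integers(n_states)  # assumed: uniform choice of the first state
    sequence = [state]
    for _ in range(length - 1):
        # Row `state` gives the probabilities of each possible next state.
        state = rng.choice(n_states, p=transition[state])
        sequence.append(state)
    return np.array(sequence)


def log_likelihood_ratio(sequence, p_dep, p_ind):
    """Log likelihood ratio of the observed transitions under p_dep versus p_ind."""
    prev_states, next_states = sequence[:-1], sequence[1:]
    log_dep = np.log(p_dep[prev_states, next_states])
    log_ind = np.log(p_ind[prev_states, next_states])
    return float(np.sum(log_dep - log_ind))
```

A positive value favors the dependent matrix and a negative value favors the alternative; an ideal observer in this simplified setting would respond according to the sign of the ratio.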
17
Spatial release from masking as a function of the spectral overlap of competing talkers. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 133:3677-3680. [PMID: 23742322 PMCID: PMC3689785 DOI: 10.1121/1.4803517] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 02/03/2013] [Revised: 03/28/2013] [Accepted: 04/08/2013] [Indexed: 05/29/2023]
Abstract
This study tested the hypothesis that the reduced spatial release from speech-on-speech masking typically observed in listeners with sensorineural hearing loss results from increased energetic masking. Target sentences were presented simultaneously with a speech masker, and the spectral overlap between the pair (and hence the energetic masking) was systematically varied. The results are consistent with increased energetic masking in listeners with hearing loss that limits performance when listening in speech mixtures. However, listeners with hearing loss did not exhibit reduced spatial release from masking when stimuli were filtered into narrow bands.
18
Design and preliminary testing of a visually guided hearing aid. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 133:EL202-EL207. [PMID: 23464129 PMCID: PMC3585754 DOI: 10.1121/1.4791710] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 11/13/2012] [Accepted: 01/28/2013] [Indexed: 06/01/2023]
Abstract
An approach to hearing aid design is described, and preliminary acoustical and perceptual measurements are reported, in which an acoustic beam-forming microphone array is coupled to an eye-glasses-mounted eye-tracker. This visually guided hearing aid (VGHA), currently a laboratory-based prototype, senses direction of gaze using the eye tracker, and an interface converts those values into control signals that steer the acoustic beam accordingly. Preliminary speech intelligibility measurements with noise and speech maskers revealed near-normal or better-than-normal spatial release from masking with the VGHA. Although not yet a wearable prosthesis, the principle underlying the device is supported by these findings.
19
The influence of non-spatial factors on measures of spatial release from masking. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2012; 131:3103-10. [PMID: 22501083 PMCID: PMC3339507 DOI: 10.1121/1.3693656] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Indexed: 05/16/2023]
Abstract
This study tested the hypothesis that the reduction in spatial release from masking (SRM) resulting from sensorineural hearing loss in competing speech mixtures is influenced by the characteristics of the interfering speech. A frontal speech target was presented simultaneously with two intelligible or two time-reversed (unintelligible) speech maskers that were either colocated with the target or were symmetrically separated from the target in the horizontal plane. The difference in SRM between listeners with hearing impairment and listeners with normal hearing was substantially larger for the forward maskers (deficit of 5.8 dB) than for the reversed maskers (deficit of 1.6 dB). This was driven by the fact that all listeners, regardless of hearing abilities, performed similarly (and poorly) in the colocated condition with intelligible maskers. The same conditions were then tested in listeners with normal hearing using headphone stimuli that were degraded by noise vocoding. Reducing the number of available spectral channels systematically reduced the measured SRM, and again, more so for forward (reduction of 3.8 dB) than for reversed speech maskers (reduction of 1.8 dB). The results suggest that non-spatial factors can strongly influence both the magnitude of SRM and the apparent deficit in SRM for listeners with impaired hearing.
|
20
|
Contextual effects in the identification of nonspeech auditory patterns. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2011; 130:3926-38. [PMID: 22225048 PMCID: PMC3253596 DOI: 10.1121/1.3658442] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 02/06/2011] [Revised: 10/05/2011] [Accepted: 10/07/2011] [Indexed: 05/31/2023]
Abstract
This study investigated the benefit of a priori cues in a masked nonspeech pattern identification experiment. Targets were narrowband sequences of tone bursts forming six easily identifiable frequency patterns selected randomly on each trial. The frequency band containing the target was randomized. Maskers were also narrowband sequences of tone bursts chosen randomly on every trial. Targets and maskers were presented monaurally in mutually exclusive frequency bands, producing large amounts of informational masking. Cuing the masker produced a significant improvement in performance, while holding the target frequency band constant provided no benefit. The cue providing the greatest benefit was a copy of the masker presented ipsilaterally before the target-plus-masker. The masker cue presented contralaterally, and a notched-noise cue produced smaller benefits. One possible mechanism underlying these findings is auditory "enhancement" in which the neural response to the target is increased relative to the masker by differential prior stimulation of the target and masker frequency regions. A second possible mechanism provides a benefit to performance by comparing the spectrotemporal correspondence of the cue and target-plus-masker and is effective for either ipsilateral or contralateral cue presentation. These effects improve identification performance by emphasizing spectral contrasts in sequences or streams of sounds.
|
21
|
Spatial release from masking in normally hearing and hearing-impaired listeners as a function of the temporal overlap of competing talkers. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2011; 129:1616-25. [PMID: 21428524 PMCID: PMC3078033 DOI: 10.1121/1.3533733] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Received: 04/30/2010] [Revised: 12/07/2010] [Accepted: 12/13/2010] [Indexed: 05/21/2023]
Abstract
Listeners with sensorineural hearing loss are poorer than listeners with normal hearing at understanding one talker in the presence of another. This deficit is more pronounced when competing talkers are spatially separated, implying a reduced "spatial benefit" in hearing-impaired listeners. This study tested the hypothesis that this deficit is due to increased masking specifically during the simultaneous portions of competing speech signals. Monosyllabic words were compressed to a uniform duration and concatenated to create target and masker sentences with three levels of temporal overlap: 0% (non-overlapping in time), 50% (partially overlapping), or 100% (completely overlapping). Listeners with hearing loss performed particularly poorly in the 100% overlap condition, consistent with the idea that simultaneous speech sounds are most problematic for these listeners. However, spatial release from masking was reduced in all overlap conditions, suggesting that increased masking during periods of temporal overlap is only one factor limiting spatial unmasking in hearing-impaired listeners.
|
22
|
Stimulus factors influencing spatial release from speech-on-speech masking. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2010; 128:1965-78. [PMID: 20968368 PMCID: PMC2981113 DOI: 10.1121/1.3478781] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.8] [Indexed: 05/13/2023]
Abstract
This study examined spatial release from masking (SRM) when a target talker was masked by competing talkers or by other types of sounds. The focus was on the role of interaural time differences (ITDs) and time-varying interaural level differences (ILDs) under conditions varying in the strength of informational masking (IM). In the first experiment, a target talker was masked by two other talkers that were either colocated with the target or were symmetrically spatially separated from the target with the stimuli presented through loudspeakers. The sounds were filtered into different frequency regions to restrict the available interaural cues. The largest SRM occurred for the broadband condition followed by a low-pass condition. However, even the highest frequency bandpass-filtered condition (3-6 kHz) yielded a significant SRM. In the second experiment the stimuli were presented via earphones. The listeners identified the speech of a target talker masked by one or two other talkers or noises when the maskers were colocated with the target or were perceptually separated by ITDs. The results revealed a complex pattern of masking in which the factors affecting performance in colocated and spatially separated conditions are to a large degree independent.
|
23
|
The intelligibility of pointillistic speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2009; 126:EL196-EL201. [PMID: 20000894 PMCID: PMC2792325 DOI: 10.1121/1.3258062] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 08/29/2009] [Accepted: 10/09/2009] [Indexed: 05/28/2023]
Abstract
A form of processed speech is described that is highly discriminable in a closed-set identification format. The processing renders speech into a set of sinusoidal pulses played synchronously across frequency. The processing and results from several experiments are described. The number and width of frequency analysis channels and tone-pulse duration were variables. In one condition, various proportions of the tones were randomly removed. The processed speech was remarkably resilient to these manipulations. This type of speech may be useful for examining multitalker listening situations in which a high degree of stimulus control is required.
|
24
|
Abstract
The benefit of wearing hearing aids in multitalker, reverberant listening environments was evaluated in a study of speech-on-speech masking with two groups of listeners with hearing loss (younger/older). Listeners selectively attended a known spatial location in two room conditions (low/high reverberation) and identified target speech in the presence of two competing talkers that were either co-located or symmetrically spatially separated from the target. The amount of spatial release from masking (SRM) with bilateral aids was similar to that when listening unaided at or near an equivalent sensation level and was negatively correlated with the amount of hearing loss. When using a single aid, SRM was reduced and was related to the level of the stimulus in the unaided ear. Increased reverberation also reduced SRM in all listening conditions. Results suggest a complex interaction between hearing loss, hearing aid use, reverberation, and performance in auditory selective attention tasks.
|
25
|
Listening to every other word: examining the strength of linkage variables in forming streams of speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2008; 124:3793-802. [PMID: 19206805 PMCID: PMC2676624 DOI: 10.1121/1.2998980] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.9] [Indexed: 05/13/2023]
Abstract
In a variation on a procedure originally developed by Broadbent [(1952). "Failures of attention in selective listening," J. Exp. Psychol. 44, 428-433] listeners were presented with two sentences spoken in a sequential, interleaved-word format. Sentence one (target) comprised the odd-numbered words in the sequence and sentence two (masker) comprised the even-numbered words in the sequence. The task was to report the words in sentence one. The goal was to determine the effectiveness of cues linking the words of the target (or masker) over time. Three such "linkage variables" were examined: (1) fixed talker, (2) fixed perceived interaural location, and (3) correct syntactic structure. All of the linkage variables provided a significant advantage when applied to the target compared to the baseline condition in which the linkage variables were randomized. However, these linkage variables were not effective when applied to the masker. Word position effects were found such that performance in the baseline condition declined, and the advantages of the linkage variables increased, for the words near the end of the sentence. Overall, this approach appears to be useful for examining interference in speech recognition that has little or no peripheral component. The results suggest that variables that link target words together improve their resiliency to interference and/or their recall.
|
26
|
Effects of sensorineural hearing loss on visually guided attention in a multitalker environment. J Assoc Res Otolaryngol 2008; 10:142-9. [PMID: 19009321 DOI: 10.1007/s10162-008-0146-7] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.5] [Received: 07/18/2008] [Accepted: 10/17/2008] [Indexed: 11/29/2022]
Abstract
This study asked whether or not listeners with sensorineural hearing loss have an impaired ability to use top-down attention to enhance speech intelligibility in the presence of interfering talkers. Listeners were presented with a target string of spoken digits embedded in a mixture of five spatially separated speech streams. The benefit of providing simple visual cues indicating when and/or where the target would occur was measured in listeners with hearing loss, listeners with normal hearing, and a control group of listeners with normal hearing who were tested at a lower target-to-masker ratio to equate their baseline (no cue) performance with the hearing-loss group. All groups received robust benefits from the visual cues. The magnitude of the spatial-cue benefit, however, was significantly smaller in listeners with hearing loss. Results suggest that reduced utility of selective attention for resolving competition between simultaneous sounds contributes to the communication difficulties experienced by listeners with hearing loss in everyday listening situations.
|
27
|
The effects of hearing loss and age on the benefit of spatial separation between multiple talkers in reverberant rooms. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2008; 124:3064-75. [PMID: 19045792 PMCID: PMC2736722 DOI: 10.1121/1.2980441] [Citation(s) in RCA: 124] [Impact Index Per Article: 7.8] [Indexed: 05/12/2023]
Abstract
This study investigated the interaction between hearing loss, reverberation, and age on the benefit of spatially separating multiple masking talkers from a target talker. Four listener groups were tested based on hearing status and age. On every trial listeners heard three different sentences spoken simultaneously by different female talkers. Listeners reported keywords from the target sentence, which was presented at a fixed and known location. Maskers were colocated with the target or presented from spatially separated and symmetrically placed loudspeakers, creating a situation with no simple "better-ear." Reverberation was also varied. The target-to-masker ratio at threshold for identification of the fixed-level target was measured by adapting the level of the maskers. On average, listeners with hearing loss showed less spatial release from masking than normal-hearing listeners. Age was a significant factor although small differences in hearing sensitivity across age groups may have contributed to this effect. Spatial release was reduced in the more reverberant room condition but in most cases a significant advantage remained. These results provide evidence for a large benefit of spatial separation in a multitalker situation that is likely due to perceptual factors. However, this benefit is significantly reduced by both hearing loss and reverberation.
|
28
|
Informational masking increases the costs of monitoring multiple channels. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2008; 124:EL223-EL229. [PMID: 19062790 PMCID: PMC2677339 DOI: 10.1121/1.2968302] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 05/05/2008] [Accepted: 07/17/2008] [Indexed: 05/27/2023]
Abstract
This study examined the costs of simultaneously monitoring two frequency regions. Listeners detected low- and high-frequency tones in a 2I4AFC procedure. On every trial, each signal was presented in either the first or second interval independently. Comparison of thresholds in single- and dual-signal conditions provided an estimate of the costs. Thresholds were obtained in quiet, in notched-filtered noise, and in randomized multitone maskers. No cost was found in quiet, whereas large costs were found for the masked conditions, especially for the multitone masker. These results suggest that costs of dividing attention in frequency depend on both signal and nonsignal channels.
|
29
|
Tuning in the spatial dimension: evidence from a masked speech identification task. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2008; 124:1146-58. [PMID: 18681603 PMCID: PMC2809679 DOI: 10.1121/1.2945710] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.8] [Received: 02/28/2007] [Revised: 05/23/2008] [Accepted: 05/28/2008] [Indexed: 05/23/2023]
Abstract
Spatial release from masking was studied in a three-talker soundfield listening experiment. The target talker was presented at 0 degrees azimuth and the maskers were either colocated or symmetrically positioned around the target, with a different masker talker on each side. The symmetric placement greatly reduced any "better ear" listening advantage. When the maskers were separated from the target by ±15 degrees, the average spatial release from masking was 8 dB. Wider separations increased the release to more than 12 dB. This large effect was eliminated when binaural cues and perceived spatial separation were degraded by covering one ear with an earplug and earmuff. Increasing reverberation in the room increased the target-to-masker ratio (TM) for the separated, but not colocated, conditions, reducing the release from masking, although a significant advantage of spatial separation remained. Time reversing the masker speech improved performance in both the colocated and spatially separated cases but lowered TM the most for the colocated condition, also resulting in a reduction in the spatial release from masking. Overall, the spatial tuning observed appears to depend on the presence of interaural differences that improve the perceptual segregation of sources and facilitate the focus of attention at a point in space.
|
30
|
The extent to which a position-based explanation accounts for binaural release from informational masking. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2008; 124:439-449. [PMID: 18646988 PMCID: PMC2587211 DOI: 10.1121/1.2924127] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 12/17/2007] [Revised: 04/14/2008] [Accepted: 04/15/2008] [Indexed: 05/26/2023]
Abstract
Detection was measured for a 500 Hz tone masked by noise (an "energetic" masker) or sets of ten randomly drawn tones (an "informational" masker). Presenting the maskers diotically and the target tone with a variety of interaural differences (interaural amplitude ratios and/or interaural time delays) resulted in reduced detection thresholds relative to when the target was presented diotically ("binaural release from masking"). Thresholds observed when time and amplitude differences applied to the target were "reinforcing" (favored the same ear, resulting in a lateralized position for the target) were not significantly different from thresholds obtained when differences were "opposing" (favored opposite ears, resulting in a centered position for the target). This irrelevance of differences in the perceived location of the target is a classic result for energetic maskers but had not previously been shown for informational maskers. However, this parallelism between the patterns of binaural release for energetic and informational maskers was not accompanied by high correlations between the patterns for individual listeners, supporting the idea that the mechanisms for binaural release from energetic and informational masking are fundamentally different.
|
31
|
The ability to listen with independent ears. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2007; 122:2814-2825. [PMID: 18189571 DOI: 10.1121/1.2780143] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 05/25/2023]
Abstract
In three experiments, listeners identified speech processed into narrow bands and presented to the right ("target") ear. The ability of listeners to ignore (or even use) conflicting contralateral stimulation was examined by presenting various maskers to the target ear ("ipsilateral") and nontarget ear ("contralateral"). Theoretically, an absence of contralateral interference would imply selectively attending to only the target ear; the presence of interference from the contralateral stimulus would imply that listeners were unable to treat the stimuli at the two ears independently; and improved performance in the presence of informative contralateral stimulation would imply that listeners can process the signals at both ears and keep them separate rather than combining them. Experiments showed evidence of the ability to selectively process (or respond to) only the target ear in some, but not all, conditions. No evidence was found for improved performance due to contralateral stimulation. The pattern of interference found across experiments supports an interaction of stimulus-based factors (auditory grouping) and task-based factors (demand for processing resources) and suggests that listeners may not always be able to listen to the "better" ear even when it would be beneficial to do so.
|
32
|
The advantage of knowing where to listen. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2005; 118:3804-15. [PMID: 16419825 DOI: 10.1121/1.2109187] [Citation(s) in RCA: 173] [Impact Index Per Article: 9.1] [Indexed: 05/06/2023]
Abstract
This study examined the role of focused attention along the spatial (azimuthal) dimension in a highly uncertain multitalker listening situation. The task of the listener was to identify key words from a target talker in the presence of two other talkers simultaneously uttering similar sentences. When the listener had no a priori knowledge about target location, or which of the three sentences was the target sentence, performance was relatively poor, near the value expected simply from choosing to focus attention on only one of the three locations. When the target sentence was cued before the trial, but location was uncertain, performance improved significantly relative to the uncued case. When spatial location information was provided before the trial, performance improved significantly for both cued and uncued conditions. If the location of the target was certain, proportion correct identification performance was higher than 0.9 independent of whether the target was cued beforehand. In contrast to studies in which known versus unknown spatial locations were compared for relatively simple stimuli and tasks, the results of the current experiments suggest that the focus of attention along the spatial dimension can play a very significant role in solving the "cocktail party" problem.
|
33
|
Informational masking for simultaneous nonspeech stimuli: psychometric functions for fixed and randomly mixed maskers. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2005; 118:2482-97. [PMID: 16266169 DOI: 10.1121/1.2032748] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 05/05/2023]
Abstract
Sensitivity d' and response bias beta were measured as a function of target level for the detection of a 1000-Hz tone in multitone maskers using a one interval, two-alternative forced-choice (1I-2AFC) paradigm. Ten such maskers, each with eight randomly selected components in the region 200-5000 Hz, with 800-1250 Hz excluded to form a protected zone, were presented under two conditions: the fixed condition, in which the same eight-component masker is used throughout an experimental run, and the random condition, in which an eight-component masker is chosen randomly trial-to-trial from the given set of ten such maskers. Differences between the results obtained with these two conditions help characterize the listener's susceptibility to informational masking (IM). The d' results show great intersubject variability, but can be reasonably well fit by simple energy-detector models in which internal noise and filter bandwidth are used as fitting parameters. In contrast, the beta results are not well fit by these models. In addition to presentation of new data and its relation to energy-detector models, this paper provides comments on a variety of issues, problems, and research needs in the IM area.
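For readers unfamiliar with the two measures named in this abstract, the following sketch shows the standard equal-variance Gaussian signal detection formulas for sensitivity d' and likelihood-ratio bias beta, computed from a hit rate and a false-alarm rate. It assumes that textbook model and is not taken from the paper. The function name and the example rates are hypothetical.

```python
from math import exp
from statistics import NormalDist


def dprime_and_beta(hit_rate, false_alarm_rate):
    """Return (d', beta) under the equal-variance Gaussian model.

    Both rates must lie strictly between 0 and 1, otherwise the
    inverse normal transform is undefined.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa
    # Likelihood ratio at the criterion; beta = 1 means no bias.
    beta = exp((z_fa ** 2 - z_hit ** 2) / 2)
    return d_prime, beta


# Hypothetical rates, for illustration only.
d_prime, beta = dprime_and_beta(hit_rate=0.84, false_alarm_rate=0.16)
print(f"d' = {d_prime:.2f}, beta = {beta:.2f}")
```

Rates of exactly 0 or 1 are usually adjusted before applying these formulas; that correction is left out here for brevity.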
|
34
|
Binaural release from informational masking in a speech identification task. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2005; 118:1614-25. [PMID: 16240821 DOI: 10.1121/1.1984876] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Indexed: 05/04/2023]
Abstract
Binaural release from informational masking (IM) was examined in a speech identification task. Target and masker sentences were processed into mutually exclusive frequency bands, thus limiting energetic masking (EM), and presented over headphones. In a baseline condition, both were presented monotically to the same ear (T(m)M(m)). Despite minimal frequency overlap between target and masker, the presence of the masker resulted in reduced performance, or IM. Presenting the target monotically and the masker diotically (T(m)M(o)) resulted in a release from IM. Release was also obtained by imposing interaural differences in level (ILDs) and in time (ITDs) on the maskers (T(m)M(ILD), T(m)M(ITD)). Any masker with a perceived lateral position that differed from that of a truly monaural stimulus resulted in a similar amount of release from IM relative to T(m)M(m). For binaural targets and maskers (T(o)M(ILD), T(o)M(ITD)), release was seen whenever ITDs or ILDs differed between target and masker. These results suggest that binaural cues can be very effective in reducing IM. Because mechanisms based on differences in perceived location make predictions that are similar to those of nonlocation-based binaural mechanisms, a variant of the equalization-cancellation model is also considered.
|
35
|
Combining energetic and informational masking for speech identification. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2005; 118:982-92. [PMID: 16158654 DOI: 10.1121/1.1953167] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Indexed: 05/04/2023]
Abstract
This study examined combinations of energetic and informational maskers in speech identification. Speech targets and maskers (speech or noise) were processed and filtered into sets of 15 narrow frequency bands. The target was the sum of eight randomly selected bands. More masking occurred for speech maskers than for spectrally matched noise maskers regardless of whether the masker bands overlapped the target bands. The greater effect of the speech maskers was interpreted as due to informational masking. When the masker was comprised of nonoverlapping bands of speech, the addition of bands of noise overlapping the speech masker, but not the speech target, reduced the overall amount of masking. Surprisingly, presenting the noise to the ear contralateral to the target and masker produced an even greater release from masking. The contralateral noise was apparently sufficient to cause a slight change in the image of the ipsilateral speech masker, possibly pulling it away from the target enough to allow the focus of attention on the target. This finding is consistent with the interpretation that in some conditions small binaural differences may be sufficient to cause, or significantly strengthen, the perceptual segregation of sounds.
|
36
|
The effect of spatial separation on informational masking of speech in normal-hearing and hearing-impaired listeners. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2005; 117:2169-80. [PMID: 15898658 DOI: 10.1121/1.1861598] [Citation(s) in RCA: 112] [Impact Index Per Article: 5.9] [Indexed: 05/02/2023]
Abstract
The ability to understand speech in a multi-source environment containing informational masking may depend on the perceptual arrangement of signal and masker objects in space. In normal-hearing listeners, Arbogast et al. [J. Acoust. Soc. Am. 112, 2086-2098 (2002)] found an 18-dB spatial release from a primarily informational masker, compared to 7 dB for a primarily energetic masker. This article extends the earlier work to include the study of listeners with sensorineural hearing loss. Listeners performed closed-set speech recognition in two spatial conditions: 0 degrees and 90 degrees separation between signal and masker. Three maskers were tested: (1) the different-band sentence masker was designed to be primarily informational; (2) the different-band noise masker was a control for the different-band sentence; and (3) the same-band noise masker was designed to be primarily energetic. The spatial release from the different-band sentence was larger than for the other maskers, but was smaller (10 dB) for the hearing-impaired group than for the normal-hearing group (15 dB). The smaller benefit for the hearing-impaired listeners can be partially explained by masker sensation level. However, the results suggest that hearing-impaired listeners can use the perceptual effect of spatial separation to improve speech recognition in the presence of a primarily informational masker.
|
37
|
Multiple bursts, multiple looks, and stream coherence in the release from informational masking. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2003; 114:2835-2845. [PMID: 14650018 DOI: 10.1121/1.1621864] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.2] [Indexed: 05/24/2023]
Abstract
In the simultaneous multitone masking paradigm introduced by Neff and Green [Percept. Psychophys. 41, 409-415 (1987)] the masker typically is a small number of tones having frequencies and levels that are randomly drawn on every presentation. Large amounts of masking for a pure-tone signal often occur that are thought to reflect central, rather than peripheral, limitations on processing. Previous work from this laboratory has indicated that playing a rapid succession of randomly drawn multitone maskers in each observation interval dramatically reduces the amount of masking that is observed relative to a single burst (SB). In this multiple-bursts-different (MBD) procedure, the signal tone is the only constant frequency component during the sequence of bursts and tends to perceptually segregate from the masker. In this study, the number of masker bursts and the interburst interval (IBI) were varied. The goals were to determine how the release from masking relative to the SB condition depends on the number of bursts and to examine whether increasing the IBI would cause each burst to be processed independently. If the latter were true, it might disrupt the perception of signal stream coherence, thereby diminishing the MBD advantage. However, multiple independent looks could also lead to an improvement in performance. For those subjects showing large amounts of informational masking in the SB condition, substantial reduction in masked thresholds occurred as the number of masker bursts increased, while masking increased as IBI lengthened. The results were not consistent with a simple version of a multiple-look model in which the information from each burst was combined optimally, but instead appear to be attributable to mechanisms involved in the perceptual organization of sounds.
|
38
|
Informational masking and musical training. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2003; 114:1543-9. [PMID: 14514207 DOI: 10.1121/1.1598197] [Citation(s) in RCA: 86] [Impact Index Per Article: 4.1] [Indexed: 05/08/2023]
Abstract
The relationship between musical training and informational masking was studied for 24 young adult listeners with normal hearing. The listeners were divided into two groups based on musical training. In one group, the listeners had little or no musical training; the other group was comprised of highly trained, currently active musicians. The hypothesis was that musicians may be less susceptible to informational masking, which is thought to reflect central, rather than peripheral, limitations on the processing of sound. Masked thresholds were measured in two conditions, similar to those used by Kidd et al. [J. Acoust. Soc. Am. 95, 3475-3480 (1994)]. In both conditions the signal was comprised of a series of repeated tone bursts at 1 kHz. The masker was comprised of a series of multitone bursts, gated with the signal. In one condition the frequencies of the masker were selected randomly for each burst; in the other condition the masker frequencies were selected randomly for the first burst of each interval and then remained constant throughout the interval. The difference in thresholds between the two conditions was taken as a measure of informational masking. Frequency selectivity, using the notched-noise method, was also estimated in the two groups. The results showed no difference in frequency selectivity between the two groups, but showed a large and significant difference in the amount of informational masking between musically trained and untrained listeners. This informational masking task, which requires no knowledge specific to musical training (such as note or interval names) and is generally not susceptible to systematic short- or medium-term training effects, may provide a basis for further studies of analytic listening abilities in different populations.
|
39
|
Abstract
Simultaneous tones that are harmonically related tend to be grouped perceptually to form a unitary auditory image. A partial that is mistuned stands out from the other tones, and harmonic complexes with different fundamental frequencies can readily be perceived as separate auditory objects. These phenomena are evidence for the strong role of harmonicity in perceptual grouping and segregation of sounds. This study measured the discriminability of harmonicity directly. In a two interval, two alternative forced-choice (2I2AFC) paradigm, the listener chose which of two sounds, signal or foil, was composed of tones that more closely matched an exact harmonic relationship. In one experiment, the signal was varied from perfectly harmonic to highly inharmonic by adding frequency perturbation to each component. The foil always had 100% perturbation. Group mean performance decreased from greater than 90% correct for 0% signal perturbation to near chance for 80% signal perturbation. In the second experiment, adding a masker presented simultaneously with the signals and foils disrupted harmonicity. Both monaural and dichotic conditions were tested. Signal level was varied relative to masker level to obtain psychometric functions from which slopes and midpoints were estimated. Dichotic presentation of these audible stimuli improved performance by 3-10 dB, due primarily to a release from "informational masking" by the perceptual segregation of the signal from the masker.
|
40
|
Informational masking: counteracting the effects of stimulus uncertainty by decreasing target-masker similarity. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2003; 114:368-379. [PMID: 12880048 DOI: 10.1121/1.1577562] [Citation(s) in RCA: 128] [Impact Index Per Article: 6.1] [Indexed: 05/24/2023]
Abstract
Previous work has indicated that target-masker similarity, as well as stimulus uncertainty, influences the amount of informational masking that occurs in detection, discrimination, and recognition tasks. In each of five experiments reported in this paper, the detection threshold for a tonal target in random multitone maskers presented simultaneously with the target tone was measured for two conditions using the same set of five listeners. In one condition, the target was constructed to be "similar" (S) to the masker; in the other condition, it was constructed to be "dissimilar" (D) to the masker. The specific masker varied across experiments, but was constant for the two conditions. Target-masker similarity varied in dimensions such as duration, perceived location, direction of frequency glide, and spectro-temporal coherence. Group-mean results show large decreases in the amount of masking for the D condition relative to the S condition. In addition, individual differences (a hallmark of informational masking) are found to be much greater in the S condition than in the D condition. Furthermore, listener vulnerability to informational masking is found to be consistent to at least a moderate degree across experiments.
|
41
|
|
42
|
CS-dependent response probability in an auditory masked-detection task: considerations based on models of Pavlovian conditioning. THE QUARTERLY JOURNAL OF EXPERIMENTAL PSYCHOLOGY. B, COMPARATIVE AND PHYSIOLOGICAL PSYCHOLOGY 2003; 56:193-205. [PMID: 12791568 DOI: 10.1080/02724990244000052] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 04/20/2023]
Abstract
Experimental studies were performed using a Pavlovian-conditioned eyeblink response to measure detection of a variable-sound-level tone (T) in a fixed-sound-level masking noise (N) in rabbits. Results showed an increase in the asymptotic probability of conditioned responses (CRs) to the reinforced TN trials and a decrease in the asymptotic rate of eyeblink responses to the non-reinforced N presentations as a function of the sound level of the T. These observations are consistent with expected behaviour in an auditory masked detection task, but they are not consistent with predictions from a traditional application of the Rescorla-Wagner or Pearce models of associative learning. To implement these models, one typically considers only the actual stimuli and reinforcement on each trial. We found that by considering perceptual interactions and concepts from signal detection theory, these models could predict the CS dependence on the sound level of the T. In these alternative implementations, the animals' response probabilities were used as a guide in making assumptions about the "effective stimuli".
|
43
|
Informational masking caused by contralateral stimulation. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2003; 113:1594-1603. [PMID: 12656394 DOI: 10.1121/1.1547440] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 05/24/2023]
Abstract
Although informational masking is thought to reflect central mechanisms, the effects are generally much stronger when the target and masker are presented to the same ear than when they are presented to different ears. However, the results of a recent study by Brungart and Simpson [J. Acoust. Soc. Am. 112, 2985-2995 (2002)] indicated that a speech masker that is presented contralateral to a speech signal can produce substantial amounts of informational masking when a second speech masker is played simultaneously in the same ear as the signal. In this study, we conducted a series of experiments that paralleled those of Brungart and Simpson but used a pure-tone signal and multitone informational maskers in a detection task. Both the signal and the maskers were played as sequences of short bursts in each observation interval. The maskers were arranged in two types of spectrotemporal patterns. One type of pattern, called "multiple-bursts same" (MBS), has previously been shown to produce very large amounts of informational masking while the other type of pattern, called "multiple-bursts different" (MBD), has been shown to produce very small amounts of informational masking. Several conditions of ipsilateral, contralateral, and combined presentation of these maskers were tested. The results showed that presentation of the MBS masker in the contralateral ear produced a substantial amount of informational masking when the MBD masker was simultaneously presented to the ipsilateral ear. The results supported the earlier findings of Brungart and Simpson indicating that listeners are unable to selectively focus their attention on a single ear in some complex dichotic listening conditions. These results suggest that this contralateral masking effect is not restricted to speech and may reflect more general limitations on processing capacity. 
Further, it was concluded that the magnitude of the contralateral masking effect was related both to the informational masking value of the contralateral masker and the complexity of the stimulus and/or task in the ear in which the signal was presented.
|
44
|
The effect of spatial separation on informational and energetic masking of speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2002; 112:2086-98. [PMID: 12430820 DOI: 10.1121/1.1510141] [Citation(s) in RCA: 163] [Impact Index Per Article: 7.4] [Indexed: 05/20/2023]
Abstract
The effect of spatial separation of sources on the masking of a speech signal was investigated for three types of maskers, ranging from energetic to informational. Normal-hearing listeners performed a closed-set speech identification task in the presence of a masker at various signal-to-noise ratios. Stimuli were presented in a quiet sound field. The signal was played from 0 degrees azimuth and a masker was played either from the same location or from 90 degrees to the right. Signals and maskers were derived from sentences that were preprocessed by a modified cochlear-implant simulation program that filtered each sentence into 15 frequency bands, extracted the envelopes from each band, and used these envelopes to modulate pure tones at the center frequencies of the bands. In each trial, the signal was generated by summing together eight randomly selected frequency bands from the preprocessed signal sentence. Three maskers were derived from the preprocessed masker sentences: (1) different-band sentence, which was generated by summing together six randomly selected frequency bands out of the seven bands not present in the signal (resulting in primarily informational masking); (2) different-band noise, which was generated by convolving the different-band sentence with Gaussian noise; and (3) same-band noise, which was generated by summing the same eight bands from the preprocessed masker sentence that were used in the signal sentence and convolving the result with Gaussian noise (resulting in primarily energetic masking). Results revealed that in the different-band sentence masker, the effect of spatial separation averaged 18 dB (at 51% correct), while in the different-band and same-band noise maskers the effect was less than 10 dB. These results suggest that, in these conditions, the advantage due to spatial separation of sources is greater for informational masking than for energetic masking.
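The band bookkeeping described in this abstract can be sketched as follows. This is a minimal illustration of which of the 15 frequency bands go to the signal and to each masker on one trial. The function name and the use of integer band indices are assumptions, and the envelope extraction, tone modulation, and noise convolution steps are not modeled.

```python
import random


def allocate_bands(rng, n_bands=15, n_signal=8, n_masker=6):
    """Pick band indices for the signal and the maskers on one trial.

    Returns (signal, different_band, same_band) as sorted lists of
    band indices. Band indices are a hypothetical representation.
    """
    bands = list(range(n_bands))

    # Signal: eight randomly selected bands out of fifteen.
    signal = sorted(rng.sample(bands, n_signal))

    # Different-band maskers: six bands drawn at random from the
    # seven bands that the signal does not occupy.
    remaining = [b for b in bands if b not in signal]
    different_band = sorted(rng.sample(remaining, n_masker))

    # Same-band noise masker: the same eight bands as the signal.
    same_band = list(signal)

    return signal, different_band, same_band
```

With the default arguments, the different-band masker never shares a band with the signal, which is why the abstract describes it as producing primarily informational masking. The same-band masker overlaps the signal completely, which corresponds to the primarily energetic case.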
|
45
|
Primary motor cortex neuronal discharge during reach-to-grasp: controlling the hand as a unit. Arch Ital Biol 2002; 140:229-36. [PMID: 12173526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/26/2023]
Abstract
This study has begun to test the hypothesis that aspects of hand/object shape are represented in the discharge of primary motor cortex (M1) neurons. Two monkeys were trained in a visually cued reach-to-grasp task, in which object properties and grasp forces were systematically varied. Behavioral analyses show that the reach and grasp force production were constant across the objects. The discharge of M1 neurons was highly modulated during the reach and grasp. Multiple linear regression models revealed that the M1 discharge was highly dependent on the object grasped, with object class, volume, orientation and grasp force as significant predictors. These findings are interpreted as evidence that the CNS controls the hand as a unit.
|
46
|
Abstract
Measures of energetic and informational masking were obtained from 46 listeners with sensorineural hearing loss. The task was to detect the presence of a sequence of eight contiguous 60-ms bursts of a pure tone embedded in masker bursts that were played synchronously with the signal. The masker was either a sequence of Gaussian noise bursts (energetic masker) or a sequence of random-frequency 2-tone bursts (informational masker). The 2-tone maskers were of two types: one type that normally tends to produce large amounts of informational masking and a second type that normally tends to produce very little informational masking. The two informational maskers are called "multiple-bursts same" (MBS), because the same frequency components are present in each burst of a sequence, and "multiple-bursts different" (MBD), because different frequency components are presented in each burst of a sequence. The difference in masking observed for these two maskers is thought to occur because the signal perceptually segregates from the masker in the MBD condition but fuses with the masker in MBS. In the present study, the effectiveness of the MBD masker, measured as the signal-to-masker ratio at masked threshold, increased with increasing hearing loss. In contrast, the signal-to-masker ratio at masked threshold for the MBS masker changed much less as a function of hearing loss. These results suggest that sensorineural hearing loss interferes with the ability of the listener to perceptually segregate individual components of complex sounds. The results from the energetic masking condition, which included critical ratio estimates for all listeners and auditory filter characteristics for a subset of the listeners, indicated that increasing hearing loss also reduced frequency selectivity at the signal frequency. 
Overall, these results suggest that the increased susceptibility to masking observed in listeners with sensorineural hearing loss is a consequence of both peripheral and central processes.
|
47
|
Similarity, uncertainty, and masking in the identification of nonspeech auditory patterns. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2002; 111:1367-1376. [PMID: 11931314 DOI: 10.1121/1.1448342] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.2] [Indexed: 05/23/2023]
Abstract
This study examined whether increasing the similarity between informational maskers and signals would increase the amount of masking obtained in a nonspeech pattern identification task. The signals were contiguous sequences of pure-tone bursts arranged in six narrow-band spectro-temporal patterns. The informational maskers were sequences of multitone bursts played synchronously with the signal tones. The listener's task was to identify the patterns in a 1-interval 6-alternative forced-choice procedure. Three types of multitone maskers were generated according to different randomization rules. For the least signal-like informational masker, the components in each multitone burst were chosen at random within the frequency range of 200-6500 Hz, excluding a "protected region" around the signal frequencies. For the intermediate masker, the frequency components in the first burst were chosen quasirandomly, but the components in successive bursts were constrained to fall in narrow frequency bands around the frequencies of the components in the initial burst. Within the narrow bands the frequencies were randomized. This masker was considered to be more similar to the signal patterns because it consisted of a set of narrow-band sequences any one of which might be mistaken for a signal pattern. The most signal-like masker was similar to the intermediate masker in that it consisted of a set of synchronously played narrow-band sequences, but the variation in frequency within each sequence was sinusoidal, completing roughly one period in a sequence. This masker consisted of discernible patterns but not patterns that were part of the set of signals. In addition, masking produced by Gaussian noise bursts--thought to produce primarily peripherally based "energetic masking"--was measured and compared to the informational masking results. 
For the three informational maskers, more masking was produced by the maskers comprised of narrow-band sequences than for the masker in which the frequencies were not constrained to narrow bands. Also, the slopes of the performance-level functions for the three informational maskers were much shallower than for the Gaussian noise masker or for no masker. The findings provided qualified support for the hypothesis that increasing the similarity between signals and maskers, or parts of the maskers, causes greater informational masking. However, it is also possible that the greater masking was a consequence of increasing the number of perceptual "streams" that had to be evaluated by the listener.
|
48
|
Binaural detection with narrowband and wideband reproducible noise maskers: I. Results for human. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2002; 111:336-345. [PMID: 11831806 DOI: 10.1121/1.1423929] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Indexed: 05/23/2023]
Abstract
This study investigated binaural detection of tonal targets (500 Hz) using sets of individual masker waveforms with two different bandwidths. Previous studies of binaural detection with wideband noise maskers show that responses to individual noise waveforms are correlated between diotic (N0S0) and dichotic (N0S(pi)) conditions [Gilkey et al., J. Acoust. Soc. Am. 78, 1207-1219 (1985)]; however, results for narrowband maskers are not correlated across interaural configurations [Isabelle and Colburn, J. Acoust. Soc. Am. 89, 352-359 (1991)]. This study was designed to allow direct comparison, in detail, of responses across bandwidths and interaural configurations. Subjects were tested on a binaural detection task using both narrowband (100-Hz bandwidth) and wideband (100 Hz to 3 kHz) noise maskers that had identical spectral components in the 100-Hz frequency band surrounding the tone frequency. The results of this study were consistent with the previous studies: N0S0 and N0S(pi) responses were more strongly correlated for wideband maskers than for narrowband maskers. Differences in the results for these two bandwidths suggest that binaural detection is not determined solely by the masker spectrum within the critical band centered on the target frequency, but rather that remote frequencies must be included in the analysis and modeling of binaural detection with wideband maskers. Results across the set of individual noises obtained with the fixed-level testing were comparable to those obtained with a tracking procedure which was similar to the procedure used in a companion study of rabbit subjects [Zheng et al., J. Acoust. Soc. Am. 111, 346-356 (2002)].
|
49
|
Binaural detection with narrowband and wideband reproducible noise maskers: II. Results for rabbit. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2002; 111:346-356. [PMID: 11831807 DOI: 10.1121/1.1423930] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Indexed: 05/23/2023]
Abstract
Binaural detection with narrowband and wideband noise maskers was examined by using a Pavlovian-conditioned eyeblink response in rabbits. The target was a tone at 500 Hz, and the maskers were ten individual noise samples having one of two bandwidths, 200 Hz (410 Hz to 610 Hz) or 2900 Hz (100 Hz to 3 kHz). The narrowband noise maskers were created by filtering the wideband noise maskers such that the two sets of maskers had identical spectra in the 200-Hz frequency region surrounding the tone. The responses across the set of noise maskers were compared across bandwidths and across interaural configurations (N0S0 and N0S(pi)). Responses across the set of noise waveforms were not strongly correlated across bandwidths; this result is inconsistent with models for binaural detection that depend only upon the narrow band of energy centered at the frequency of the target tone. Responses were correlated across interaural configurations for the wideband masker condition, but not for the narrowband masker. All of these results were consistent with the companion study of human listeners [Evilsizer et al., J. Acoust. Soc. Am. 111, 336-345 (2002)] and with the results of human studies of binaural detection that used only wideband [Gilkey et al., J. Acoust. Soc. Am. 78, 1207-1219 (1985)] or narrowband [Isabelle and Colburn, J. Acoust. Soc. Am. 89, 352-359 (1991)] individual noise maskers.
|
50
|
Studies of binaural detection in the rabbit (Oryctolagus cuniculus) with Pavlovian conditioning. Behav Neurosci 2001. [PMID: 11439454 DOI: 10.1037//0735-7044.115.3.650] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/08/2022]
Abstract
A Pavlovian conditioned eyeblink response in rabbits (Oryctolagus cuniculus) was used to study psychoacoustical phenomena previously demonstrated in human listeners and other animals. This article contains the results of a tone-in-noise detection study to examine 2 psychoacoustical phenomena in rabbit and in human listeners: (a) the binaural masking level difference (BMLD) and (b) differential performance across reproducible noise masker waveforms. The rabbits demonstrated a BMLD comparable in size to other species. Significant differences in performance across reproducible noise masker waveforms were seen in the rabbits. This performance was compared with the performance of human listeners using the same set of waveforms.
|