1. MacLean J, Stirn J, Sisson A, Bidelman GM. Short- and long-term neuroplasticity interact during the perceptual learning of concurrent speech. Cereb Cortex 2024; 34:bhad543. [PMID: 38212291] [PMCID: PMC10839853] [DOI: 10.1093/cercor/bhad543]
Abstract
Plasticity from auditory experience shapes the brain's encoding and perception of sound. However, whether such long-term plasticity alters the trajectory of short-term plasticity during speech processing has yet to be investigated. Here, we explored the neural mechanisms and interplay between short- and long-term neuroplasticity for rapid auditory perceptual learning of concurrent speech sounds in young, normal-hearing musicians and nonmusicians. Participants learned to identify double-vowel mixtures during ~45-min training sessions recorded simultaneously with high-density electroencephalography (EEG). We analyzed frequency-following responses (FFRs) and event-related potentials (ERPs) to investigate neural correlates of learning at subcortical and cortical levels, respectively. Although both groups showed rapid perceptual learning, musicians showed faster behavioral decisions than nonmusicians overall. Learning-related changes were not apparent in brainstem FFRs. However, plasticity was highly evident in cortex, where ERPs revealed unique hemispheric asymmetries between groups suggestive of different neural strategies (musicians: right hemisphere bias; nonmusicians: left hemisphere). Source reconstruction and the early (150-200 ms) time course of these effects localized learning-induced cortical plasticity to auditory-sensory brain areas. Our findings reinforce the domain-general benefits of musicianship but reveal that successful speech sound learning is driven by a critical interplay between long- and short-term mechanisms of auditory plasticity, which first emerge at a cortical level.
Affiliation(s)
- Jessica MacLean
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Program in Neuroscience, Indiana University, Bloomington, IN, USA
- Jack Stirn
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Alexandria Sisson
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Gavin M Bidelman
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Program in Neuroscience, Indiana University, Bloomington, IN, USA
- Cognitive Science Program, Indiana University, Bloomington, IN, USA
2. MacLean J, Stirn J, Sisson A, Bidelman GM. Short- and long-term experience-dependent neuroplasticity interact during the perceptual learning of concurrent speech. bioRxiv 2023:2023.09.26.559640. [PMID: 37808665] [PMCID: PMC10557636] [DOI: 10.1101/2023.09.26.559640]
Abstract
Plasticity from auditory experiences shapes brain encoding and perception of sound. However, whether such long-term plasticity alters the trajectory of short-term plasticity during speech processing has yet to be investigated. Here, we explored the neural mechanisms and interplay between short- and long-term neuroplasticity for rapid auditory perceptual learning of concurrent speech sounds in young, normal-hearing musicians and nonmusicians. Participants learned to identify double-vowel mixtures during ∼45-min training sessions recorded simultaneously with high-density EEG. We analyzed frequency-following responses (FFRs) and event-related potentials (ERPs) to investigate neural correlates of learning at subcortical and cortical levels, respectively. While both groups showed rapid perceptual learning, musicians showed faster behavioral decisions than nonmusicians overall. Learning-related changes were not apparent in brainstem FFRs. However, plasticity was highly evident in cortex, where ERPs revealed unique hemispheric asymmetries between groups suggestive of different neural strategies (musicians: right hemisphere bias; nonmusicians: left hemisphere). Source reconstruction and the early (150-200 ms) time course of these effects localized learning-induced cortical plasticity to auditory-sensory brain areas. Our findings confirm domain-general benefits of musicianship but reveal that successful speech sound learning is driven by a critical interplay between long- and short-term mechanisms of auditory plasticity that first emerge at a cortical level.
3. Gilday OD, Praegel B, Maor I, Cohen T, Nelken I, Mizrahi A. Surround suppression in mouse auditory cortex underlies auditory edge detection. PLoS Comput Biol 2023; 19:e1010861. [PMID: 36656876] [PMCID: PMC9888713] [DOI: 10.1371/journal.pcbi.1010861]
Abstract
Surround suppression (SS) is a fundamental property of sensory processing throughout the brain. In the auditory system, the early processing stream encodes sounds along a one-dimensional physical space: frequency. Previous studies in the auditory system have shown SS to manifest as bandwidth tuning around the preferred frequency. We asked whether bandwidth tuning can be found around frequencies away from the preferred frequency. We exploited the simplicity of spectral representation of sounds to study SS by manipulating both sound frequency and bandwidth. We recorded single unit spiking activity from the auditory cortex (ACx) of awake mice in response to an array of broadband stimuli with varying central frequencies and bandwidths. Our recordings revealed that a significant portion of neuronal response profiles had a preferred bandwidth that varied in a regular way with the sound's central frequency. To gain insight into the possible mechanism underlying these responses, we modelled neuronal activity using a variation of the "Mexican hat" function often used to model SS. The model accounted for response properties of single neurons with high accuracy. Our data and model show that these responses in ACx obey simple rules resulting from the presence of lateral inhibitory sidebands, mostly above the excitatory band of the neuron, that result in sensitivity to the location of top frequency edges, invariant to other spectral attributes. Our work offers a simple explanation for auditory edge detection and possibly other computations of spectral content in sounds.
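The "Mexican hat" response profile mentioned above can be sketched as a difference of Gaussians. The parameter values and log-frequency axis below are hypothetical, and this symmetric variant ignores the paper's finding that inhibition sits mostly above the excitatory band:

```python
import numpy as np

def mexican_hat(f, f_pref, a_exc=1.0, a_inh=0.6, s_exc=0.3, s_inh=0.9):
    """Difference of Gaussians: a narrow excitatory band centered on the
    preferred frequency minus a broader, weaker inhibitory surround."""
    exc = a_exc * np.exp(-((f - f_pref) ** 2) / (2 * s_exc ** 2))
    inh = a_inh * np.exp(-((f - f_pref) ** 2) / (2 * s_inh ** 2))
    return exc - inh

# Evaluate on an arbitrary log-frequency axis (units of octaves)
freqs = np.linspace(0.0, 4.0, 401)
resp = mexican_hat(freqs, f_pref=2.0)
```

With a narrower excitatory than inhibitory width (s_exc < s_inh), the profile has a central peak flanked by suppressive (negative) sidebands, the shape such models fit to single-unit tuning data.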
Affiliation(s)
- Omri David Gilday
- The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Benedikt Praegel
- The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Department of Neurobiology, The Hebrew University of Jerusalem, Jerusalem, Israel
- Ido Maor
- The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Department of Neurobiology, The Hebrew University of Jerusalem, Jerusalem, Israel
- Tav Cohen
- Department of Neurobiology, The Hebrew University of Jerusalem, Jerusalem, Israel
- Israel Nelken
- The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Department of Neurobiology, The Hebrew University of Jerusalem, Jerusalem, Israel
- Adi Mizrahi
- The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Department of Neurobiology, The Hebrew University of Jerusalem, Jerusalem, Israel
4. Gohari N, Hosseini Dastgerdi Z, Bernstein LJ, Alain C. Neural correlates of concurrent sound perception: A review and guidelines for future research. Brain Cogn 2022; 163:105914. [PMID: 36155348] [DOI: 10.1016/j.bandc.2022.105914]
Abstract
The perception of concurrent sound sources depends on processes (i.e., auditory scene analysis) that fuse and segregate acoustic features according to harmonic relations, temporal coherence, and binaural cues (encompassing dichotic pitch, location differences, and simulated echoes). The object-related negativity (ORN) and P400 are electrophysiological indices of concurrent sound perception. Here, we review the different paradigms used to study concurrent sound perception and the brain responses obtained from these paradigms. Recommendations regarding the design and recording parameters of the ORN and P400 are made, and their clinical applications in assessing central auditory processing ability in different populations are discussed.
Affiliation(s)
- Nasrin Gohari
- Department of Audiology, School of Rehabilitation, Hamadan University of Medical Sciences, Hamadan, Iran.
- Zahra Hosseini Dastgerdi
- Department of Audiology, School of Rehabilitation, Isfahan University of Medical Sciences, Isfahan, Iran
- Lori J Bernstein
- Department of Supportive Care, University Health Network, and Department of Psychiatry, University of Toronto, Toronto, Canada
- Claude Alain
- Rotman Research Institute, Baycrest Centre for Geriatric Care & Department of Psychology, University of Toronto, Canada
5. Cortical Processing of Binaural Cues as Shown by EEG Responses to Random-Chord Stereograms. J Assoc Res Otolaryngol 2021; 23:75-94. [PMID: 34904205] [PMCID: PMC8783002] [DOI: 10.1007/s10162-021-00820-4]
Abstract
Spatial hearing facilitates the perceptual organization of complex soundscapes into accurate mental representations of sound sources in the environment. Yet, the role of binaural cues in auditory scene analysis (ASA) has received relatively little attention in recent neuroscientific studies employing novel, spectro-temporally complex stimuli. This may be because a stimulation paradigm that provides binaurally derived grouping cues of sufficient spectro-temporal complexity has not yet been established for neuroscientific ASA experiments. Random-chord stereograms (RCS) are a class of auditory stimuli that exploit spectro-temporal variations in the interaural envelope correlation of noise-like sounds with interaurally coherent fine structure; they evoke salient auditory percepts that emerge only under binaural listening. Here, our aim was to assess the usability of the RCS paradigm for indexing binaural processing in the human brain. To this end, we recorded EEG responses to RCS stimuli from 12 normal-hearing subjects. The stimuli consisted of an initial 3-s noise segment with interaurally uncorrelated envelopes, followed by another 3-s segment, where envelope correlation was modulated periodically according to the RCS paradigm. Modulations were applied either across the entire stimulus bandwidth (wideband stimuli) or in temporally shifting frequency bands (ripple stimulus). Event-related potentials and inter-trial phase coherence analyses of the EEG responses showed that the introduction of the 3- or 5-Hz wideband modulations produced a prominent change-onset complex and ongoing synchronized responses to the RCS modulations. In contrast, the ripple stimulus elicited a change-onset response but no response to ongoing RCS modulation. Frequency-domain analyses revealed increased spectral power at the fundamental frequency and the first harmonic of wideband RCS modulations. RCS stimulation yields robust EEG measures of binaurally driven auditory reorganization and has potential to provide a flexible stimulation paradigm suitable for isolating binaural effects in ASA experiments.
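The inter-trial phase coherence (ITPC) measure used in such analyses can be sketched as the magnitude of the mean unit-length phase vector across trials. The sampling rate, trial count, and 5 Hz target frequency below are invented for illustration, not taken from the study:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 60 trials of a 5 Hz response sampled at 250 Hz for 2 s,
# phase-locked to stimulus onset, plus additive background noise.
fs, f_stim, n_trials, n_samp = 250, 5.0, 60, 500
t = np.arange(n_samp) / fs
trials = np.sin(2 * np.pi * f_stim * t) + rng.normal(0, 1.0, (n_trials, n_samp))

# ITPC per frequency bin: normalize each trial's spectrum to unit magnitude
# (keeping only phase), average across trials, and take the magnitude.
# 1 = perfect phase locking; ~1/sqrt(n_trials) = random phases.
spec = np.fft.rfft(trials, axis=1)
itpc = np.abs(np.mean(spec / np.abs(spec), axis=0))
freqs = np.fft.rfftfreq(n_samp, 1.0 / fs)
```

Phase-locked activity at the modulation frequency stands out as a coherence peak even when single-trial amplitude spectra are noise-dominated, which is why ITPC complements ordinary spectral power here.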
6. Swanborough H, Staib M, Frühholz S. Neurocognitive dynamics of near-threshold voice signal detection and affective voice evaluation. Sci Adv 2020; 6(50):eabb3884. [PMID: 33310844] [PMCID: PMC7732184] [DOI: 10.1126/sciadv.abb3884]
Abstract
Communication and voice signal detection in noisy environments are universal tasks for many species. The fundamental problem of detecting voice signals in noise (VIN) is underinvestigated, especially in its temporal dynamics. We investigated VIN as a dynamic signal-to-noise ratio (SNR) problem to determine the neurocognitive dynamics of subthreshold evidence accrual and near-threshold voice signal detection. Experiment 1 showed that dynamic VIN, including a varying SNR and subthreshold sensory evidence accrual, is superior to similar conditions with nondynamic SNRs or with acoustically matched sounds. Furthermore, voice signals with affective meaning have a detection advantage during VIN. Experiment 2 demonstrated that VIN is driven by an effective neural integration in an auditory cortical-limbic network at and beyond the near-threshold detection point, which is preceded by activity in subcortical auditory nuclei. This demonstrates the superior recognition advantage of communication signals in dynamic noise contexts, especially when carrying socio-affective meaning.
Affiliation(s)
- Huw Swanborough
- Cognitive and Affective Neuroscience Unit, Department of Psychology, University of Zurich, Zurich, Switzerland
- Neuroscience Center Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
- Matthias Staib
- Cognitive and Affective Neuroscience Unit, Department of Psychology, University of Zurich, Zurich, Switzerland
- Neuroscience Center Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
- Sascha Frühholz
- Cognitive and Affective Neuroscience Unit, Department of Psychology, University of Zurich, Zurich, Switzerland
- Neuroscience Center Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
- Department of Psychology, University of Oslo, Oslo, Norway
7.
Abstract
Being able to pick out particular sounds, such as speech, against a background of other sounds represents one of the key tasks performed by the auditory system. Understanding how this happens is important because speech recognition in noise is particularly challenging for older listeners and for people with hearing impairments. Central to this ability is the capacity of neurons to adapt to the statistics of sounds reaching the ears, which helps to generate noise-tolerant representations of sounds in the brain. In more complex auditory scenes, such as a cocktail party where the background noise comprises other voices, sound features associated with each source have to be grouped together and segregated from those belonging to other sources. This depends on precise temporal coding and modulation of cortical response properties when attending to a particular speaker in a multi-talker environment. Furthermore, the neural processing underlying auditory scene analysis is shaped by experience over multiple timescales.
8. Yellamsetty A, Bidelman GM. Brainstem correlates of concurrent speech identification in adverse listening conditions. Brain Res 2019; 1714:182-192. [PMID: 30796895] [DOI: 10.1016/j.brainres.2019.02.025]
Abstract
When two voices compete, listeners can segregate and identify concurrent speech sounds using pitch (fundamental frequency, F0) and timbre (harmonic) cues. Speech perception is also hindered by the signal-to-noise ratio (SNR). How clear and degraded concurrent speech sounds are represented at early, pre-attentive stages of the auditory system is not well understood. To this end, we measured scalp-recorded frequency-following responses (FFR) from the EEG while human listeners heard two concurrently presented, steady-state (time-invariant) vowels whose F0 differed by zero or four semitones (ST) presented diotically in either clean (no noise) or noise-degraded (+5 dB SNR) conditions. Listeners also performed a speeded double vowel identification task in which they were required to identify both vowels correctly. Behavioral results showed that speech identification accuracy increased with F0 differences between vowels, and this perceptual F0 benefit was larger for clean compared to noise-degraded (+5 dB SNR) stimuli. Neurophysiological data demonstrated more robust FFR F0 amplitudes for single compared to double vowels and considerably weaker responses in noise. F0 amplitudes showed speech-on-speech masking effects, along with a non-linear constructive interference at 0 ST, and suppression effects at 4 ST. Correlations showed that FFR F0 amplitudes failed to predict listeners' identification accuracy. In contrast, FFR F1 amplitudes were associated with faster reaction times, although this correlation was limited to noise conditions. The limited number of brain-behavior associations suggests subcortical activity mainly reflects exogenous processing rather than perceptual correlates of concurrent speech perception. Collectively, our results demonstrate that FFRs reflect pre-attentive coding of concurrent auditory stimuli that only weakly predict the success of identifying concurrent speech.
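An FFR F0-amplitude measure of the kind analyzed here can be sketched as the spectral magnitude of the trial-averaged response at the stimulus fundamental. The sampling rate, F0, trial count, and SNR below are hypothetical, not the study's recording parameters:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical FFR: 100 ms epochs at 10 kHz with phase-locked energy at a
# vowel fundamental (F0 = 100 Hz) buried in much larger background noise.
fs, f0 = 10_000, 100.0
t = np.arange(1000) / fs
trials = 0.2 * np.sin(2 * np.pi * f0 * t) + rng.normal(0, 1.0, (200, t.size))

# Average across trials first: phase-locked energy survives while the
# unlocked noise cancels. Then read the amplitude spectrum at F0.
avg = trials.mean(axis=0)
spec = 2.0 * np.abs(np.fft.rfft(avg)) / t.size
freqs = np.fft.rfftfreq(t.size, 1.0 / fs)
f0_amp = spec[int(np.argmin(np.abs(freqs - f0)))]
```

Averaging before the FFT is what makes the measure selective for phase-locked (stimulus-driven) activity, the property that lets FFR F0 amplitude index subcortical encoding of the vowel fundamentals.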
Affiliation(s)
- Anusha Yellamsetty
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; Department of Communication Sciences & Disorders, University of South Florida, USA
- Gavin M Bidelman
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, USA
9. Yellamsetty A, Bidelman GM. Low- and high-frequency cortical brain oscillations reflect dissociable mechanisms of concurrent speech segregation in noise. Hear Res 2018; 361:92-102. [PMID: 29398142] [DOI: 10.1016/j.heares.2018.01.006]
Abstract
Parsing simultaneous speech requires that listeners use pitch-guided segregation, which can be affected by the signal-to-noise ratio (SNR) in the auditory scene. The interaction of these two cues may occur at multiple levels within the cortex. The aims of the current study were to assess the correspondence between oscillatory brain rhythms and behavior, and to determine how listeners exploit pitch and SNR cues to successfully segregate concurrent speech. We recorded electrical brain activity while participants heard double-vowel stimuli whose fundamental frequencies (F0s) differed by zero or four semitones (STs) presented in either clean or noise-degraded (+5 dB SNR) conditions. We found that behavioral identification was more accurate for vowel mixtures with larger pitch separations, but F0 benefit interacted with noise. Time-frequency analysis decomposed the EEG into different spectrotemporal frequency bands. Low-frequency (θ, β) responses were elevated when speech did not contain pitch cues (0 ST > 4 ST) or was noisy, suggesting a correlate of increased listening effort and/or memory demands. Contrastively, γ power increments were observed for changes in both pitch (0 ST > 4 ST) and SNR (clean > noise), suggesting high-frequency bands carry information related to acoustic features and the quality of speech representations. Brain-behavior associations corroborated these effects; modulations in low-frequency rhythms predicted the speed of listeners' perceptual decisions, with higher bands predicting identification accuracy. Results are consistent with the notion that neural oscillations reflect both automatic (pre-perceptual) and controlled (post-perceptual) mechanisms of speech processing that are largely divisible into high- and low-frequency bands of human brain rhythms.
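The band-limited power comparisons described above can be sketched with a simple FFT-based bandpass. The band edges below are conventional EEG definitions assumed for illustration, not necessarily the study's exact parameters:

```python
import numpy as np

def band_power(x, fs, lo, hi):
    """Power of x restricted to [lo, hi] Hz, via an ideal FFT bandpass:
    zero out all spectral bins outside the band, invert, take mean square."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, 1.0 / fs)
    spec[(freqs < lo) | (freqs > hi)] = 0.0
    return np.mean(np.fft.irfft(spec, n=x.size) ** 2)

# Hypothetical band definitions (Hz) for the bands named in the abstract
BANDS = {"theta": (4, 8), "beta": (13, 30), "gamma": (30, 80)}

fs = 500
t = np.arange(2 * fs) / fs
eeg = np.sin(2 * np.pi * 6 * t)        # pure 6 Hz (theta-band) test signal

powers = {name: band_power(eeg, fs, lo, hi) for name, (lo, hi) in BANDS.items()}
```

Real EEG pipelines typically use tapered filters or wavelets rather than an ideal spectral mask, but the principle is the same: each band's power is the energy of the signal after restricting it to that frequency range.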
Affiliation(s)
- Anusha Yellamsetty
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
- Gavin M Bidelman
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, USA
10. Bidelman GM, Yellamsetty A. Noise and pitch interact during the cortical segregation of concurrent speech. Hear Res 2017; 351:34-44. [PMID: 28578876] [DOI: 10.1016/j.heares.2017.05.008]
Abstract
Behavioral studies reveal listeners exploit intrinsic differences in voice fundamental frequency (F0) to segregate concurrent speech sounds, the so-called "F0-benefit." More favorable signal-to-noise ratio (SNR) in the environment, an extrinsic acoustic factor, similarly benefits the parsing of simultaneous speech. Here, we examined the neurobiological substrates of these two cues in the perceptual segregation of concurrent speech mixtures. We recorded event-related brain potentials (ERPs) while listeners performed a speeded double-vowel identification task. Listeners heard two concurrent vowels whose F0 differed by zero or four semitones presented in either clean (no noise) or noise-degraded (+5 dB SNR) conditions. Behaviorally, listeners were more accurate in correctly identifying both vowels for larger F0 separations, but F0-benefit was more pronounced at more favorable SNRs (i.e., pitch × SNR interaction). Analysis of the ERPs revealed that only the P2 wave (∼200 ms) showed a similar F0 × SNR interaction as behavior and was correlated with listeners' perceptual F0-benefit. Neural classifiers applied to the ERPs further suggested that speech sounds are segregated neurally within 200 ms based on SNR whereas segregation based on pitch occurs later in time (400-700 ms). The earlier timing of extrinsic SNR compared to intrinsic F0-based segregation implies that the cortical extraction of speech from noise is more efficient than differentiating speech based on pitch cues alone, which may recruit additional cortical processes. Findings indicate that noise and pitch differences interact relatively early in cerebral cortex and that the brain arrives at the identities of concurrent speech mixtures as early as ∼200 ms.
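The time-resolved neural classification described above can be sketched with a leave-one-out nearest-centroid decoder on simulated ERP trials. All numbers below are illustrative, and nearest-centroid is a stand-in: the study's actual classifier and parameters are not specified here:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical ERPs: two conditions, 80 trials each, 350 samples (~700 ms at
# 500 Hz). The conditions differ only in a window around 200 ms, so a sliding
# decoder should separate them there but not at earlier latencies.
fs, n_trials, n_samp = 500, 80, 350
t = np.arange(n_samp) / fs
effect = np.exp(-((t - 0.2) ** 2) / (2 * 0.02 ** 2))   # difference @ ~200 ms
cond_a = rng.normal(0, 1, (n_trials, n_samp))
cond_b = rng.normal(0, 1, (n_trials, n_samp)) + 2 * effect

def window_accuracy(a, b, i0, i1):
    """Leave-one-out nearest-centroid classification on one time window."""
    xa, xb = a[:, i0:i1], b[:, i0:i1]
    correct = 0
    for own, other in ((xa, xb), (xb, xa)):
        for k in range(len(own)):
            c_own = (own.sum(0) - own[k]) / (len(own) - 1)  # centroid w/o trial k
            c_other = other.mean(0)
            if np.linalg.norm(own[k] - c_own) < np.linalg.norm(own[k] - c_other):
                correct += 1
    return correct / (len(a) + len(b))

acc_200 = window_accuracy(cond_a, cond_b, int(0.18 * fs), int(0.22 * fs))
acc_050 = window_accuracy(cond_a, cond_b, int(0.03 * fs), int(0.07 * fs))
```

Sliding such a window across the epoch yields a decoding-accuracy time course; the latency at which accuracy first exceeds chance is the kind of evidence behind the "segregated within 200 ms" claim.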
Affiliation(s)
- Gavin M Bidelman
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, 38152, USA; Institute for Intelligent Systems, University of Memphis, Memphis, TN, 38152, USA; University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, 38163, USA
- Anusha Yellamsetty
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, 38152, USA
11. Wilbiks JMP, Dyson BJ. The Dynamics and Neural Correlates of Audio-Visual Integration Capacity as Determined by Temporal Unpredictability, Proactive Interference, and SOA. PLoS One 2016; 11:e0168304. [PMID: 27977790] [PMCID: PMC5158043] [DOI: 10.1371/journal.pone.0168304]
Abstract
Across five experiments, we challenge the idea that the capacity of audio-visual integration need be fixed at one item. We observe that the conditions under which audio-visual integration is most likely to exceed one occur when stimulus change operates at a slow rather than fast rate of presentation and when the task is of intermediate difficulty, such as when low levels of proactive interference (3 rather than 8 interfering visual presentations) are combined with the temporal unpredictability of the critical frame (Experiment 2), or when high levels of proactive interference are combined with the temporal predictability of the critical frame (Experiment 4). Neural data suggest that capacity might also be determined by the quality of perceptual information entering working memory. Experiment 5 supported the proposition that audio-visual integration was at play during the previous experiments. The data are consistent with the dynamic nature usually associated with cross-modal binding, and while audio-visual integration capacity likely cannot exceed uni-modal capacity estimates, performance may be better than being able to associate only one visual stimulus with one auditory stimulus.
Affiliation(s)
- Jonathan M. P. Wilbiks
- Department of Psychology, Ryerson University, Toronto, Ontario, Canada
- Department of Psychology, Mount Allison University, Sackville, New Brunswick, Canada
- Benjamin J. Dyson
- Department of Psychology, Ryerson University, Toronto, Ontario, Canada
- School of Psychology, University of Sussex, Falmer, United Kingdom
12. Mehta AH, Yasin I, Oxenham AJ, Shamma S. Neural correlates of attention and streaming in a perceptually multistable auditory illusion. J Acoust Soc Am 2016; 140:2225. [PMID: 27794350] [PMCID: PMC5849028] [DOI: 10.1121/1.4963902]
Abstract
In a complex acoustic environment, acoustic cues and attention interact in the formation of streams within the auditory scene. In this study, a variant of the "octave illusion" [Deutsch (1974). Nature 251, 307-309] was used to investigate the neural correlates of auditory streaming, and to elucidate the effects of attention on the interaction between sequential and concurrent sound segregation in humans. By directing subjects' attention to different frequencies and ears, it was possible to elicit several different illusory percepts with the identical stimulus. The first experiment tested the hypothesis that the illusion depends on the ability of listeners to perceptually stream the target tones from within the alternating sound sequences. In the second experiment, concurrent psychophysical measures and electroencephalography recordings provided neural correlates of the various percepts elicited by the multistable stimulus. The results show that the perception and neural correlates of the auditory illusion can be manipulated robustly by attentional focus and that the illusion is constrained in much the same way as auditory stream segregation, suggesting common underlying mechanisms.
Affiliation(s)
- Anahita H Mehta
- Ear Institute, University College London, 332 Gray's Inn Road, London WC1X 8EE, United Kingdom
- Ifat Yasin
- Department of Computer Science, University College London, 66-72 Gower Street, London WC1E 6BT, United Kingdom
- Andrew J Oxenham
- Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
- Shihab Shamma
- Institute for Systems Research, 2203 A.V. Williams Building, University of Maryland, College Park, Maryland 20742, USA
13. Leung AWS, Jolicoeur P, Alain C. Attentional Capacity Limits Gap Detection during Concurrent Sound Segregation. J Cogn Neurosci 2015. [PMID: 26226073] [DOI: 10.1162/jocn_a_00849]
Abstract
Detecting a brief silent interval (i.e., a gap) is more difficult when listeners perceive two concurrent sounds rather than one, as in a sound containing a mistuned harmonic among otherwise in-tune harmonics. This impairment in gap detection may reflect the interaction of low-level encoding or the division of attention between two sound objects, both of which could interfere with signal detection. To distinguish between these two alternatives, we compared ERPs during active and passive listening with complex harmonic tones that could include a gap, a mistuned harmonic, both features, or neither. During active listening, participants indicated whether they heard a gap irrespective of mistuning. During passive listening, participants watched a subtitled muted movie of their choice while the same sounds were presented. Gap detection was impaired when the complex sounds included a mistuned harmonic that popped out as a separate object. The ERP analysis revealed an early gap-related activity that was little affected by mistuning during the active or passive listening condition. However, during active listening, there was a marked decrease in the late positive wave that was thought to index attention and response-related processes. These results suggest that the limitation in detecting the gap is related to attentional processing, possibly divided attention induced by the concurrent sound objects, rather than deficits in preattentional sensory encoding.
Affiliation(s)
- Ada W S Leung
- University of Alberta
- Rotman Research Institute, Baycrest Centre for Geriatric Care, Toronto, Canada
- Pierre Jolicoeur
- Université de Montréal
- Centre de Recherche en Neuropsychologie et Cognition (CERNEC), Montréal, Canada
- BRAMS (International Laboratory for Brain, Music, and Sound Research), Montréal, Canada
- Centre de Recherche de l'Institut Universitaire de Gériatrie de Montréal (CRIUGM)
- Claude Alain
- Rotman Research Institute, Baycrest Centre for Geriatric Care, Toronto, Canada
- University of Toronto
14. Bendixen A, Háden GP, Németh R, Farkas D, Török M, Winkler I. Newborn Infants Detect Cues of Concurrent Sound Segregation. Dev Neurosci 2015; 37:172-81. [DOI: 10.1159/000370237]
Abstract
Separating concurrent sounds is fundamental for a veridical perception of one's auditory surroundings. Sound components that are harmonically related and start at the same time are usually grouped into a common perceptual object, whereas components that are not in harmonic relation or have different onset times are more likely to be perceived in terms of separate objects. Here we tested whether neonates are able to pick up the cues supporting this sound organization principle. We presented newborn infants with a series of complex tones with their harmonics in tune (creating the percept of a unitary sound object) and with manipulated variants, which gave the impression of two concurrently active sound sources. The manipulated variant had either one mistuned partial (single-cue condition) or the onset of this mistuned partial was also delayed (double-cue condition). Tuned and manipulated sounds were presented in random order with equal probabilities. Recording the neonates' electroencephalographic responses allowed us to evaluate their processing of the sounds. Results show that, in both conditions, mistuned sounds elicited a negative displacement of the event-related potential (ERP) relative to tuned sounds from 360 to 400 ms after sound onset. The mistuning-related ERP component resembles the object-related negativity (ORN) component in adults, which is associated with concurrent sound segregation. Delayed onset additionally led to a negative displacement from 160 to 200 ms, which was probably more related to the physical parameters of the sounds than to their perceptual segregation. The elicitation of an ORN-like response in newborn infants suggests that neonates possess the basic capabilities of segregating concurrent sounds by detecting inharmonic relations between the co-occurring sounds.
|
15
|
Bidelman GM, Alain C. Hierarchical neurocomputations underlying concurrent sound segregation: Connecting periphery to percept. Neuropsychologia 2015; 68:38-50. [DOI: 10.1016/j.neuropsychologia.2014.12.020] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2014] [Revised: 12/18/2014] [Accepted: 12/22/2014] [Indexed: 10/24/2022]
|
16
|
Vuvan DT, Podolak OM, Schmuckler MA. Memory for musical tones: the impact of tonality and the creation of false memories. Front Psychol 2014; 5:582. [PMID: 24971071 PMCID: PMC4054327 DOI: 10.3389/fpsyg.2014.00582] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2013] [Accepted: 05/25/2014] [Indexed: 11/24/2022] Open
Abstract
Although the relation between tonality and musical memory has been fairly well-studied, less is known regarding the contribution of tonal-schematic expectancies to this relation. Three experiments investigated the influence of tonal expectancies on memory for single tones in a tonal melodic context. In the first experiment, listener responses indicated superior recognition of both expected and unexpected targets in a major tonal context compared with moderately expected targets. Importantly, and in support of previous work on false memories, listener responses also revealed a higher false alarm rate for expected than unexpected targets. These results indicate roles for tonal-schematic congruency as well as distinctiveness in memory for melodic tones. The second experiment utilized minor melodies, which weakened tonal expectancies since the minor tonality can be represented in three forms simultaneously. Finally, tonal expectancies were abolished entirely in the third experiment through the use of atonal melodies. Accordingly, the expectancy-based results observed in the first experiment were disrupted in the second experiment, and disappeared in the third experiment. These results are discussed in light of schema theory, musical expectancy, and classic memory work on the availability and distinctiveness heuristics.
Affiliation(s)
- Dominique T Vuvan
- Department of Psychology, International Laboratory for Brain, Music, and Sound Research, Université de Montréal, Montreal, QC, Canada
- Olivia M Podolak
- Department of Psychology, University of Toronto Scarborough, Toronto, ON, Canada
- Mark A Schmuckler
- Department of Psychology, University of Toronto Scarborough, Toronto, ON, Canada
|
17
|
Alain C, Zendel BR, Hutka S, Bidelman GM. Turning down the noise: The benefit of musical training on the aging auditory brain. Hear Res 2014. [DOI: 10.1016/j.heares.2013.06.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
|
18
|
McMullan AR, Hambrook DA, Tata MS. Brain dynamics encode the spectrotemporal boundaries of auditory objects. Hear Res 2013; 304:77-90. [DOI: 10.1016/j.heares.2013.06.009] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/11/2013] [Revised: 06/14/2013] [Accepted: 06/24/2013] [Indexed: 10/26/2022]
|
19
|
Cornella M, Leung S, Grimm S, Escera C. Regularity encoding and deviance detection of frequency modulated sweeps: Human middle- and long-latency auditory evoked potentials. Psychophysiology 2013; 50:1275-81. [DOI: 10.1111/psyp.12137] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2013] [Accepted: 07/01/2013] [Indexed: 11/29/2022]
Affiliation(s)
- Miriam Cornella
- Institute for Brain, Cognition and Behavior (IR3C) and Cognitive Neuroscience Research Group, Department of Psychiatry and Clinical Psychobiology; University of Barcelona; Catalonia Spain
- Sumie Leung
- Institute for Brain, Cognition and Behavior (IR3C) and Cognitive Neuroscience Research Group, Department of Psychiatry and Clinical Psychobiology; University of Barcelona; Catalonia Spain
- Sabine Grimm
- Institute for Brain, Cognition and Behavior (IR3C) and Cognitive Neuroscience Research Group, Department of Psychiatry and Clinical Psychobiology; University of Barcelona; Catalonia Spain
- Carles Escera
- Institute for Brain, Cognition and Behavior (IR3C) and Cognitive Neuroscience Research Group, Department of Psychiatry and Clinical Psychobiology; University of Barcelona; Catalonia Spain
|
20
|
Alain C, Zendel BR, Hutka S, Bidelman GM. Turning down the noise: the benefit of musical training on the aging auditory brain. Hear Res 2013; 308:162-73. [PMID: 23831039 DOI: 10.1016/j.heares.2013.06.008] [Citation(s) in RCA: 84] [Impact Index Per Article: 7.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/09/2013] [Revised: 06/19/2013] [Accepted: 06/24/2013] [Indexed: 11/29/2022]
Abstract
Age-related decline in hearing abilities is a ubiquitous part of aging, and commonly impacts speech understanding, especially when there are competing sound sources. While such age effects are partially due to changes within the cochlea, difficulties typically exist beyond measurable hearing loss, suggesting that central brain processes, as opposed to simple peripheral mechanisms (e.g., hearing sensitivity), play a critical role in governing hearing abilities late into life. Current training regimens aimed at improving central auditory processing abilities have had limited success in promoting listening benefits. Interestingly, recent studies suggest that in young adults, musical training positively modifies neural mechanisms, providing robust, long-lasting improvements to hearing abilities as well as to non-auditory tasks that engage cognitive control. These results offer the encouraging possibility that musical training might be used to counteract age-related changes in auditory cognition commonly observed in older adults. Here, we reviewed studies that have examined the effects of age and musical experience on auditory cognition with an emphasis on auditory scene analysis. We infer that musical training may offer potential benefits to complex listening and might be utilized as a means to delay or even attenuate declines in auditory perception and cognition that often emerge later in life.
Affiliation(s)
- Claude Alain
- Rotman Research Institute, Baycrest Centre for Geriatric Care, Canada; Department of Psychology, University of Toronto, Canada.
- Benjamin Rich Zendel
- International Laboratory for Brain, Music and Sound Research (BRAMS), Département de Psychologie, Université de Montréal, Québec, Canada; Centre de Recherche, Institut Universitaire de Gériatrie de Montréal, Québec, Canada
- Stefanie Hutka
- Rotman Research Institute, Baycrest Centre for Geriatric Care, Canada; Department of Psychology, University of Toronto, Canada
- Gavin M Bidelman
- Institute for Intelligent Systems & School of Communication Sciences and Disorders, University of Memphis, USA
|
21
|
Grimm S, Escera C, Slabu L, Costa-Faidella J. Electrophysiological evidence for the hierarchical organization of auditory change detection in the human brain. Psychophysiology 2011; 48:377-84. [DOI: 10.1111/j.1469-8986.2010.01073.x] [Citation(s) in RCA: 114] [Impact Index Per Article: 8.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|
22
|
Butcher A, Govenlock SW, Tata MS. A lateralized auditory evoked potential elicited when auditory objects are defined by spatial motion. Hear Res 2011; 272:58-68. [DOI: 10.1016/j.heares.2010.10.019] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/30/2010] [Revised: 10/21/2010] [Accepted: 10/28/2010] [Indexed: 11/26/2022]
|
23
|
Park JY, Park H, Kim JI, Park HJ. Consonant chords stimulate higher EEG gamma activity than dissonant chords. Neurosci Lett 2011; 488:101-5. [DOI: 10.1016/j.neulet.2010.11.011] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2010] [Revised: 10/28/2010] [Accepted: 11/03/2010] [Indexed: 10/18/2022]
|
24
|
Sanders LD, Zobel BH, Freyman RL, Keen R. Manipulations of listeners' echo perception are reflected in event-related potentials. J Acoust Soc Am 2011; 129:301-309. [PMID: 21303011 PMCID: PMC3055288 DOI: 10.1121/1.3514518] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/03/2010] [Revised: 10/12/2010] [Accepted: 10/13/2010] [Indexed: 05/30/2023]
Abstract
To gain information from complex auditory scenes, it is necessary to determine which of the many loudness, pitch, and timbre changes originate from a single source. Grouping sound into sources based on spatial information is complicated by reverberant energy bouncing off multiple surfaces and reaching the ears from directions other than the source's location. The ability to localize sounds despite these echoes has been explored with the precedence effect: Identical sounds presented from two locations with a short stimulus onset asynchrony (e.g., 1-5 ms) are perceived as a single source with a location dominated by the lead sound. Importantly, echo thresholds, the shortest onset asynchrony at which a listener reports hearing the lag sound as a separate source about half of the time, can be manipulated by presenting sound pairs in contexts. Event-related brain potentials elicited by physically identical sounds in contexts that resulted in listeners reporting either one or two sources were compared. Sound pairs perceived as two sources elicited a larger anterior negativity 100-250 ms after onset, previously termed the object-related negativity, and a larger posterior positivity 250-500 ms. These results indicate that the models of room acoustics listeners form based on recent experience with the spatiotemporal properties of sound modulate perceptual as well as later higher-level processing.
Affiliation(s)
- Lisa D Sanders
- Neuroscience and Behavior Program, Department of Psychology, University of Massachusetts, Amherst, Massachusetts 01003, USA.
|
25
|
Temporal coherence and attention in auditory scene analysis. Trends Neurosci 2010; 34:114-23. [PMID: 21196054 DOI: 10.1016/j.tins.2010.11.002] [Citation(s) in RCA: 292] [Impact Index Per Article: 20.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2010] [Revised: 11/03/2010] [Accepted: 11/05/2010] [Indexed: 11/23/2022]
Abstract
Humans and other animals can attend to one of multiple sounds and follow it selectively over time. The neural underpinnings of this perceptual feat remain mysterious. Some studies have concluded that sounds are heard as separate streams when they activate well-separated populations of central auditory neurons, and that this process is largely pre-attentive. Here, we argue instead that stream formation depends primarily on temporal coherence between responses that encode various features of a sound source. Furthermore, we postulate that only when attention is directed towards a particular feature (e.g. pitch) do all other temporally coherent features of that source (e.g. timbre and location) become bound together as a stream that is segregated from the incoherent features of other sources.
|
26
|
Shamma SA, Micheyl C. Behind the scenes of auditory perception. Curr Opin Neurobiol 2010; 20:361-6. [PMID: 20456940 DOI: 10.1016/j.conb.2010.03.009] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2010] [Revised: 03/16/2010] [Accepted: 03/29/2010] [Indexed: 11/30/2022]
Abstract
'Auditory scenes' often contain contributions from multiple acoustic sources. These are usually heard as separate auditory 'streams', which can be selectively followed over time. How and where these auditory streams are formed in the auditory system is one of the most fascinating questions facing auditory scientists today. Findings published within the past two years indicate that both cortical and subcortical processes contribute to the formation of auditory streams, and they raise important questions concerning the roles of primary and secondary areas of auditory cortex in this phenomenon. In addition, these findings underline the importance of taking into account the relative timing of neural responses, and the influence of selective attention, in the search for neural correlates of the perception of auditory streams.
Affiliation(s)
- Shihab A Shamma
- Department of Electrical and Computer Engineering & Institute for Systems Research, University of Maryland College Park, United States.
|
27
|
Pitch, harmonicity and concurrent sound segregation: psychoacoustical and neurophysiological findings. Hear Res 2009; 266:36-51. [PMID: 19788920 DOI: 10.1016/j.heares.2009.09.012] [Citation(s) in RCA: 91] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/08/2009] [Revised: 09/23/2009] [Accepted: 09/24/2009] [Indexed: 11/18/2022]
Abstract
Harmonic complex tones are a particularly important class of sounds found in both speech and music. Although these sounds contain multiple frequency components, they are usually perceived as a coherent whole, with a pitch corresponding to the fundamental frequency (F0). However, when two or more harmonic sounds occur concurrently, e.g., at a cocktail party or in a symphony, the auditory system must separate harmonics and assign them to their respective F0s so that a coherent and veridical representation of the different sounds sources is formed. Here we review both psychophysical and neurophysiological (single-unit and evoked-potential) findings, which provide some insight into how, and how well, the auditory system accomplishes this task. A survey of computational models designed to estimate multiple F0s and segregate concurrent sources is followed by a review of the empirical literature on the perception and neural coding of concurrent harmonic sounds, including vowels, as well as findings obtained using single complex tones with mistuned harmonics.
|
28
|
Furness D. Abstracts of the British Society of Audiology Short Papers Meeting on Experimental Studies of Hearing and Deafness September 2006, Cambridge University, UK. Int J Audiol 2009. [DOI: 10.1080/14992020701521790] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|
29
|
Alain C, Quan J, McDonald K, Van Roon P. Noise-induced increase in human auditory evoked neuromagnetic fields. Eur J Neurosci 2009; 30:132-42. [PMID: 19558607 DOI: 10.1111/j.1460-9568.2009.06792.x] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Noise is usually detrimental to auditory perception. However, recent psychophysical studies have shown that low levels of broadband noise may improve signal detection. Here, we measured auditory evoked fields (AEFs) while participants listened passively to low-pitched and high-pitched tones (Experiment 1) or complex sounds that included a tuned or a mistuned component that yielded the perception of concurrent sound objects (Experiment 2). In both experiments, stimuli were embedded in low or intermediate levels of Gaussian noise or presented without background noise. For each participant, the AEFs were modeled with a pair of dipoles in the superior temporal plane, and the effects of noise were examined on the resulting source waveforms. In both experiments, the N1m was larger when the stimuli were embedded in low background noise than in the no-noise control condition. Complex sounds with a mistuned component generated an object-related negativity that was larger in the low-noise condition. The results show that low-level background noise facilitates AEFs associated with sound onset and can be beneficial for sorting out concurrent sound objects. We suggest that noise-induced increases in transient evoked responses may be mediated via efferent feedback connections between the auditory cortex and lower auditory centers.
Affiliation(s)
- Claude Alain
- Rotman Research Institute, Baycrest Centre for Geriatric Care, Toronto, Ontario, Canada
|
30
|
It all sounds the same to me: sequential ERP and behavioral effects during pitch and harmonicity judgments. Cogn Affect Behav Neurosci 2008; 8:329-43. [PMID: 18814469 DOI: 10.3758/cabn.8.3.329] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
The representation of complex sounds was examined by comparing both behavioral and event-related brain potentials (ERPs) to the change or repetition of fundamental frequency (f0) and harmonicity. In the pitch task, participants were asked to categorize the incoming stimulus as either low or high, regardless of harmonicity, and in the harmonicity task, participants indicated whether the stimulus was tuned or mistuned, regardless of pitch. Over three experiments, participants were faster in responding to pitch than to harmonicity. As a result of this asymmetry, behavioral and ERP data showed that irrelevant changes in harmonicity had little impact on performance during the pitch task, whereas harmonicity judgments were impeded by irrelevant changes in f0. These data are consistent with both general horse-race accounts of processing and specific accounts of mistuning detection that posit prior f0 registration. In addition, ERP components N2 and P3 were modulated by both intertrial contingency and task instructions, revealing the further influence of top-down mechanisms on concurrent sound segregation.
|
31
|
From sounds to meaning: the role of attention during auditory scene analysis. Curr Opin Otolaryngol Head Neck Surg 2008. [DOI: 10.1097/moo.0b013e32830e2096] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
|
32
|
Lee AKC, Shinn-Cunningham BG. Effects of frequency disparities on trading of an ambiguous tone between two competing auditory objects. J Acoust Soc Am 2008; 123:4340-4351. [PMID: 18537385 PMCID: PMC9014251 DOI: 10.1121/1.2908282] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/24/2007] [Revised: 03/17/2008] [Accepted: 03/19/2008] [Indexed: 05/26/2023]
Abstract
Listeners are relatively good at estimating the true content of each physical source in a sound mixture in most everyday situations. However, if there is a spectrotemporal element that logically could belong to more than one object, the correct way to group that element can be ambiguous. Many psychoacoustic experiments have implicitly assumed that when a sound mixture contains ambiguous sound elements, the ambiguous elements "trade" between competing sources, such that the elements contribute more to one object in conditions when they contribute less to others. However, few studies have directly tested whether such trading occurs. While some studies found trading, trading failed in some recent studies in which spatial cues were manipulated to alter the perceptual organization. The current study extended this work by exploring whether trading occurs for similar sound mixtures when frequency content, rather than spatial cues, was manipulated to alter grouping. Unlike when spatial cues were manipulated, results are roughly consistent with trading. Together, results suggest that the degree to which trading is obeyed depends on how stimuli are manipulated to affect object formation.
Affiliation(s)
- Adrian K C Lee
- Hearing Research Center, Boston University, Boston, Massachusetts 02215, USA
|
33
|
Kalluri S, Depireux DA, Shamma SA. Perception and cortical neural coding of harmonic fusion in ferrets. J Acoust Soc Am 2008; 123:2701-16. [PMID: 18529189 PMCID: PMC2677325 DOI: 10.1121/1.2902178] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/11/2023]
Abstract
This study examined the perception and cortical representation of harmonic complex tones, from the perspective of the spectral fusion evoked by such sounds. Experiment 1 tested whether ferrets spontaneously distinguish harmonic from inharmonic tones. In baseline sessions, ferrets detected a pure tone terminating a sequence of inharmonic tones. After they reached proficiency, a small fraction of the inharmonic tones were replaced with harmonic tones. Some of the animals confused the harmonic tones with the pure tones at twice the false-alarm rate. Experiment 2 sought correlates of harmonic fusion in single neurons of primary auditory cortex and anterior auditory field, by comparing responses to harmonic tones with those to inharmonic tones in the awake alert ferret. The effects of spectro-temporal filtering were accounted for by using the measured spectrotemporal receptive field to predict responses and by seeking correlates of fusion in the predictability of responses. Only 12% of units sampled distinguished harmonic tones from inharmonic tones, a small percentage that is consistent with the relatively weak ability of the ferrets to spontaneously discriminate harmonic tones from inharmonic tones in Experiment 1.
Affiliation(s)
- Sridhar Kalluri
- Institute for Systems Research, University of Maryland, College Park, Maryland 20742, USA.
|
34
|
Bidet-Caulet A, Fischer C, Bauchet F, Aguera PE, Bertrand O. Neural substrate of concurrent sound perception: direct electrophysiological recordings from human auditory cortex. Front Hum Neurosci 2008; 1:5. [PMID: 18958219 PMCID: PMC2525982 DOI: 10.3389/neuro.09.005.2007] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2007] [Accepted: 01/03/2008] [Indexed: 12/04/2022] Open
Abstract
In everyday life, consciously or not, we are constantly disentangling the multiple auditory sources contributing to our acoustical environment. To better understand the neural mechanisms involved in concurrent sound processing, we manipulated sound onset asynchrony to induce the segregation or grouping of two concurrent sounds. Each sound consisted of amplitude-modulated tones at different carrier and modulation frequencies, allowing a cortical tagging of each sound. Electrophysiological recordings were carried out in epileptic patients with pharmacologically resistant partial epilepsy, implanted with depth electrodes in the temporal cortex. Patients were presented with the stimuli while they performed an auditory distracting task. We found that transient and steady-state evoked responses, and induced gamma oscillatory activities were enhanced in the case of onset synchrony. These effects were mainly located in the Heschl's gyrus for steady-state responses whereas they were found in the lateral superior temporal gyrus for evoked transient responses and induced gamma oscillations. They can be related to distinct neural mechanisms such as frequency selectivity and habituation. These results in the auditory cortex provide an anatomically refined description of the neurophysiological components which might be involved in the perception of concurrent sounds.
|
35
|
Is a change as good with a rest? Task-dependent effects of inter-trial contingency on concurrent sound segregation. Brain Res 2008; 1189:135-44. [PMID: 18078900 DOI: 10.1016/j.brainres.2007.10.093] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2007] [Revised: 09/07/2007] [Accepted: 10/27/2007] [Indexed: 11/21/2022]
|
36
|
Abstract
The auditory system can segregate sounds that overlap in time and frequency, if the sounds differ in acoustic properties such as fundamental frequency (f0). However, the neural mechanisms that underlie this ability are poorly understood. Responses of neurons in the inferior colliculus (IC) of the anesthetized chinchilla were measured. The stimuli were harmonic tones, presented alone (single harmonic tones) and in the presence of a second harmonic tone with a different f0 (double harmonic tones). Responses to single harmonic tones exhibited no stimulus-related temporal pattern, or in some cases, a simple envelope modulated at f0. Responses to double harmonic tones exhibited complex slowly modulated discharge patterns. The discharge pattern varied with the difference in f0 and with characteristic frequency. The discharge pattern also varied with the relative levels of the two tones; complex temporal patterns were observed when levels were equal, but as the level difference increased, the discharge pattern reverted to that associated with single harmonic tones. The results indicated that IC neurons convey information about simultaneous sounds in their temporal discharge patterns and that the patterns are produced by interactions between adjacent components in the spectrum. The representation is "low-resolution," in that it does not convey information about single resolved components from either individual sound.
Affiliation(s)
- Donal G Sinex
- Department of Psychology, Utah State University, Logan, UT 84322-2810, USA.
|
37
|
Johnson BW, Hautus MJ, Duff DJ, Clapp WC. Sequential processing of interaural timing differences for sound source segregation and spatial localization: Evidence from event-related cortical potentials. Psychophysiology 2007; 44:541-51. [PMID: 17521376 DOI: 10.1111/j.1469-8986.2007.00535.x] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
Cortical processing of interaural timing differences (ITDs) was investigated with event-related potential (ERP) measurements in 16 human participants who were required in separate tasks to detect or to spatially localize dichotically embedded pitches. ITDs elicited three ERP components labeled ORN, N2, and P400. The ORN occurred at a latency of 150-250 ms and was elicited by ITDs regardless of location or task. In contrast, the N2 response (250-350 ms) was strongly modulated by location and showed larger amplitudes for the localization task than for the detection task. Finally, ITDs in the detection task elicited a P400 at a latency of 400-500 ms, but this response was entirely absent from ERPs elicited by identical stimuli in the localization task. These results are consistent with a sequential model of auditory perception in which segregation of concurrent sounds is followed by domain-specific processing of object location and identity.
Affiliation(s)
- Blake W Johnson
- Research Centre for Cognitive Neuroscience, Department of Psychology, University of Auckland, Auckland, New Zealand.
|
38
|
Alain C, McDonald KL. Age-related differences in neuromagnetic brain activity underlying concurrent sound perception. J Neurosci 2007; 27:1308-14. [PMID: 17287505 PMCID: PMC6673581 DOI: 10.1523/jneurosci.5433-06.2007] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
Deficits in parsing concurrent auditory events are believed to contribute to older adults' difficulties in understanding speech in adverse listening conditions (e.g., cocktail party). To explore the level at which aging impairs sound segregation, we measured auditory evoked fields (AEFs) using magnetoencephalography while young, middle-aged, and older adults were presented with complex sounds that either had all of their harmonics in tune or had the third harmonic mistuned by 4 or 16% of its original value. During the recording, participants were asked to ignore the stimuli and watch a muted subtitled movie of their choice. For each participant, the AEFs were modeled with a pair of dipoles in the superior temporal plane, and the effects of age and mistuning were examined on the amplitude and latency of the resulting source waveforms. Mistuned stimuli generated an early positivity (60-100 ms), an object-related negativity (ORN) (140-180 ms) that overlapped the N1 and P2 waves, and a positive displacement that peaked at approximately 230 ms (P230) after sound onset. The early mistuning-related enhancement was similar in all three age groups, whereas the subsequent modulations (ORN and P230) were reduced in older adults. These age differences in auditory cortical activity were associated with a reduced likelihood of hearing two sounds as a function of mistuning. The results reveal that inharmonicity is rapidly and automatically registered in all three age groups but that the perception of concurrent sounds declines with age.
Affiliation(s)
- Claude Alain
- Rotman Research Institute, Baycrest Centre for Geriatric Care, Toronto, Ontario, Canada M6A 2E1.
|
39
|
Alain C. Breaking the wave: effects of attention and learning on concurrent sound perception. Hear Res 2007; 229:225-36. [PMID: 17303355 DOI: 10.1016/j.heares.2007.01.011] [Citation(s) in RCA: 84] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/13/2006] [Revised: 12/06/2006] [Accepted: 01/03/2007] [Indexed: 11/19/2022]
Abstract
The auditory surrounding is often complex with many sound sources active simultaneously. Yet listeners are proficient in breaking apart the composite acoustic wave reaching the ears. This achievement is thought to be the result of bottom-up as well as top-down processes that reflect listeners' experience and knowledge of the auditory environment. Here, specific findings concerning the role of bottom-up and top-down (schema-driven) processes on concurrent sound perception are reviewed, with particular emphasis on studies that have used scalp recording of event-related brain potentials. Findings from several studies indicate that frequency periodicity, upon which concurrent sound perception partly depends, is quickly and automatically registered in primary auditory cortex. Moreover, success in identifying concurrent vowels is accompanied by enhanced neural activity, as revealed by functional magnetic resonance imaging, in thalamus, primary auditory cortex and planum temporale. Lastly, listeners' ability to segregate concurrent vowels improves with training and these neuroplastic changes occur rapidly, demonstrating the flexibility of human speech segregation mechanisms. Together, these studies suggest that the primary auditory cortex and the planum temporale play an important role in concurrent sound perception, and reveal a link between thalamo-cortical activation and the successful separation and identification of speech sounds presented simultaneously.
Affiliation(s)
- Claude Alain
- Rotman Research Institute, Baycrest Centre for Geriatric Care, Department of Psychology, University of Toronto, Toronto, ONT, Canada.
40
Hiraumi H, Nagamine T, Morita T, Naito Y, Fukuyama H, Ito J. Right hemispheric predominance in the segregation of mistuned partials. Eur J Neurosci 2005; 22:1821-4. [PMID: 16197525 DOI: 10.1111/j.1460-9568.2005.04350.x] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
To elucidate the central mechanisms of sound segregation, we compared responses to a harmonic sound and a mistuned sound using a whole-head magnetoencephalography system. The harmonic sound was composed of a 200-Hz tone and its 2nd to 12th harmonics. The mistuned sound had, instead of the 600-Hz harmonic, a 696-Hz tone. In the right hemisphere, the amplitude of N100m responses evoked by the mistuned sound was significantly larger and the peak latency significantly longer than that evoked by the harmonic sound, suggesting that the right hemisphere plays a more important role than the left in detecting mistuned partials.
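The stimulus recipe in this abstract is concrete enough to sketch. Below is a minimal NumPy example that synthesizes both sounds: a harmonic complex (200 Hz plus its 2nd-12th harmonics) and its mistuned counterpart with 696 Hz substituted for the 600-Hz partial. The sampling rate, duration, and equal-amplitude partials are assumptions for illustration; the abstract does not specify them.

```python
import numpy as np

FS = 44_100   # sampling rate in Hz (assumed; not given in the abstract)
DUR = 0.5     # tone duration in seconds (assumed)
F0 = 200.0    # fundamental frequency reported in the abstract

t = np.arange(int(FS * DUR)) / FS

def complex_tone(partials_hz):
    """Sum equal-amplitude sinusoids at the given frequencies, normalized to +/-1."""
    tone = sum(np.sin(2 * np.pi * f * t) for f in partials_hz)
    return tone / np.max(np.abs(tone))

# Harmonic sound: the 200-Hz fundamental plus harmonics 2-12 (400-2400 Hz).
harmonics = [F0 * k for k in range(1, 13)]
harmonic_tone = complex_tone(harmonics)

# Mistuned sound: identical, except the 600-Hz partial is replaced by 696 Hz
# (a 16% mistuning of the 3rd harmonic).
mistuned = [696.0 if f == 600.0 else f for f in harmonics]
mistuned_tone = complex_tone(mistuned)
```

Comparing N100m responses to these two waveforms is what the study did; presenting them (e.g., via `sounddevice` or writing a WAV file) is left to the reader.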
Affiliation(s)
- Harukazu Hiraumi
- Department of Otolaryngology - Head and Neck Surgery, Graduate School of Medicine, Kyoto University, 54, Shogoin, Kyoto, 606-8507, Japan.
41
Chait M, Poeppel D, Simon JZ. Neural response correlates of detection of monaurally and binaurally created pitches in humans. Cereb Cortex 2006; 16:835-48. [PMID: 16151180 DOI: 10.1093/cercor/bhj027] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
Abstract
Recent magnetoencephalography (MEG) and functional magnetic resonance imaging studies of human auditory cortex are pointing to brain areas on lateral Heschl's gyrus as the 'pitch-processing center'. Here we describe results of a combined MEG-psychophysical study designed to investigate the timing of the formation of the percept of pitch and the generality of the hypothesized 'pitch-center'. We compared the cortical and behavioral responses to Huggins pitch (HP), a stimulus requiring binaural processing to elicit a pitch percept, with responses to tones embedded in noise (TN)-perceptually similar but physically very different signals. The stimuli were crafted to separate the electrophysiological responses to onset of the pitch percept from the onset of the initial stimulus. Our results demonstrate that responses to monaural pitch stimuli are affected by cross-correlational processes in the binaural pathway. Additionally, we show that MEG illuminates processes not simply observable in behavior. Crucially, the MEG data show that, although physically disparate, both HP and TN are mapped onto similar representations by 150 ms post-onset, and provide critical new evidence that the 'pitch onset response' reflects central pitch mechanisms, in agreement with models postulating a single, central pitch extractor.
Affiliation(s)
- Maria Chait
- Neuroscience and Cognitive Science Program, University of Maryland, College Park, MD 20742-7505, USA.
42
McDonald KL, Alain C. Contribution of harmonicity and location to auditory object formation in free field: evidence from event-related brain potentials. J Acoust Soc Am 2005; 118:1593-604. [PMID: 16240820 DOI: 10.1121/1.2000747] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
The contribution of location and harmonicity cues in sound segregation was investigated using behavioral reports and source waveforms derived from the scalp-recorded evoked potentials. Participants were presented with sounds composed of multiple harmonics in a free-field environment. The third harmonic was either tuned or mistuned and could be presented from the same or different location from the remaining harmonics. Presenting the third harmonic at a different location than the remaining harmonics increased the likelihood of hearing the tuned or slightly (i.e., 2%) mistuned harmonic as a separate object. Partials mistuned by 16% of their original value "popped out" of the complex, and this pop-out was paralleled by an object-related negativity (ORN) superimposed on the N1 and P2 components. For the 2% mistuned stimuli, the ORN was present only when the mistuned harmonic was presented at a different location than the remaining harmonics. Presenting the tuned harmonic at a different location also yielded changes in neural activity between 150 and 250 ms after sound onset. The behavioral and electrophysiological results indicate that listeners can segregate sounds based on harmonicity or location alone. The results also indicate that a conjunction of harmonicity and location cues contributes to sound segregation primarily when harmonicity is ambiguous.
Affiliation(s)
- Kelly L McDonald
- Rotman Research Institute, Baycrest Centre for Geriatric Care, Toronto, Ontario, Canada
43
Alain C, Reinke K, He Y, Wang C, Lobaugh N. Hearing Two Things at Once: Neurophysiological Indices of Speech Segregation and Identification. J Cogn Neurosci 2005; 17:811-8. [PMID: 15904547 DOI: 10.1162/0898929053747621] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
The discrimination of concurrent sounds is paramount to speech perception. During social gatherings, listeners must extract information from a composite acoustic wave, which sums multiple individual voices that are simultaneously active. The observers' ability to identify two simultaneously presented vowels improves with increasing separation between the fundamental frequencies (f0) of the two vowels. Event-related potentials to stimuli presented during attend and ignore conditions revealed activity between 130 and 170 msec after sound onset that reflected the f0 differences between the two vowels. Another, more posterior and right-lateralized, negative wave maximal at 250 msec, and a central-parietal slow negativity were observed only during vowel identification and may index stimulus categorization. This sequence of neural events supports a multistage model of auditory scene analysis in which the spectral pattern of each vowel constituent is automatically extracted and then matched against representations of those vowels in working memory.
Affiliation(s)
- Claude Alain
- Rotman Research Institute, Baycrest Centre for Geriatric Care, Toronto, Ontario, Canada.
44
Dyson BJ, Alain C, He Y. I've heard it all before: perceptual invariance represented by early cortical auditory-evoked responses. Brain Res Cogn Brain Res 2005; 23:457-60. [PMID: 15820654 DOI: 10.1016/j.cogbrainres.2004.11.012] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2004] [Revised: 11/22/2004] [Accepted: 11/24/2004] [Indexed: 11/18/2022]
Abstract
Sensitivity to acoustic invariance is critical for establishing stable representations in a shifting world of sound. By recording early auditory cortical responses to complex sounds in human listeners and categorising these responses according to the maintenance or change of stimulus attributes across consecutive presentations, we show that repetition within a constantly varying acoustic context produces enhanced neural responding in auditory cortices.
Affiliation(s)
- Benjamin J Dyson
- The Rotman Research Institute, Baycrest Centre for Geriatric Care, 3560 Bathurst Street, Toronto, Ontario, Canada M6A 2E1.
45
Abstract
Humans normally listen in mixed environments, in which sounds originating from more than one source overlap in time and in frequency. The auditory system is able to extract information specific to the individual sources that contribute to the composite signal and process the information for each source separately; this is called “auditory scene analysis” or “sound-source determination.” Sounds that are simultaneously present but generated independently tend to differ along relatively simple acoustic dimensions. These dimensions may be temporal, as when sounds begin or end asynchronously, or spectral, as when the sounds have different fundamental frequencies. Psychophysical experiments have identified some of the ways in which human listeners use these dimensions to isolate sources of sound. A simple but useful stimulus, a harmonic complex tone with or without a mistuned component, can be used for parametric investigation of the processing of spectral structure. This “mistuned tone” stimulus has been used in several psychophysical experiments, and more recently in studies that specifically address the neural mechanisms that underlie segregation based on harmonicity. Studies of the responses of single neurons in the chinchilla auditory system to mistuned tones are reviewed here in detail. The results of those experiments support the view that neurons in the inferior colliculus (IC) exhibit responses to mistuned tones that are larger and temporally more complex than the same neurons’ responses to harmonic tones. Mistuning does not produce comparable changes in the discharge patterns of auditory nerve (AN) fibers, indicating that a major transformation in the neural representation of harmonic structure occurs in the auditory brainstem. The brainstem processing that accomplishes this transformation may contribute to the segregation of competing sounds and ultimately to the identification of sound sources.
Affiliation(s)
- Donal G Sinex
- Department of Psychology, Utah State University, Logan, Utah 84322, USA
46
Hautus MJ, Johnson BW. Object-related brain potentials associated with the perceptual segregation of a dichotically embedded pitch. J Acoust Soc Am 2005; 117:275-280. [PMID: 15704420 DOI: 10.1121/1.1828499] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/24/2023]
Abstract
The cortical mechanisms of perceptual segregation of concurrent sound sources were examined, based on binaural detection of interaural timing differences. Auditory event-related potentials were measured from 11 healthy subjects. Binaural stimuli were created by introducing a dichotic delay of 500-µs duration to a narrow frequency region within a broadband noise, resulting in the perception of a centrally located noise and a right-lateralized pitch (dichotic pitch). In separate listening conditions, subjects actively discriminated and responded to randomly interleaved binaural and control stimuli, or ignored random stimuli while watching silent cartoons. In a third listening condition subjects ignored stimuli presented in homogeneous blocks. For all listening conditions, the dichotic pitch stimulus elicited an object-related negativity (ORN) at a latency of about 150-250 ms after stimulus onset. When subjects were required to actively respond to stimuli, the ORN was followed by a P400 wave with a latency of about 320-420 ms. These results support and extend a two-stage model of auditory scene analysis in which acoustic streams are automatically parsed into component sound sources based on source-relevant cues, followed by a controlled process involving identification and generation of a behavioral response.
Affiliation(s)
- Michael J Hautus
- Department of Psychology, University of Auckland, Auckland, New Zealand.
47
Abstract
In everyday life we often listen to one sound, such as someone's voice, in a background of competing sounds. To do this, we must assign simultaneously occurring frequency components to the correct source, and organize sounds appropriately over time. The physical cues that we exploit to do so are well-established; more recent research has focussed on the underlying neural bases, where most progress has been made in the study of a form of sequential organization known as "auditory streaming". Listeners' sensitivity to streaming cues can be captured in the responses of neurons in the primary auditory cortex, and in EEG wave components with a short latency (<200 ms). However, streaming can be strongly affected by attention, suggesting that this early processing either receives input from non-auditory areas, or feeds into processes that do.
Affiliation(s)
- Robert P Carlyon
- MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge CB2 2EF, UK.
48
Abstract
Objects are the building blocks of experience, but what do we mean by an object? Increasingly, neuroscientists refer to 'auditory objects', yet it is not clear what properties these should possess, how they might be represented in the brain, or how they might relate to the more familiar objects of vision. The concept of an auditory object challenges our understanding of object perception. Here, we offer a critical perspective on the concept and its basis in the brain.
Affiliation(s)
- Timothy D Griffiths
- Auditory Group, University of Newcastle Medical School, Newcastle-upon-Tyne NE2 4HH, UK.