1
Kalra L, Altman S, Bee MA. Perceptually salient differences in a species recognition cue do not promote auditory streaming in eastern grey treefrogs (Hyla versicolor). J Comp Physiol A Neuroethol Sens Neural Behav Physiol 2024:10.1007/s00359-024-01702-9. [PMID: 38733407 DOI: 10.1007/s00359-024-01702-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2023] [Revised: 04/17/2024] [Accepted: 04/18/2024] [Indexed: 05/13/2024]
Abstract
Auditory streaming underlies a receiver's ability to organize complex mixtures of auditory input into distinct perceptual "streams" that represent different sound sources in the environment. During auditory streaming, sounds produced by the same source are integrated through time into a single, coherent auditory stream that is perceptually segregated from other concurrent sounds. Based on human psychoacoustic studies, one hypothesis regarding auditory streaming is that any sufficiently salient perceptual difference may lead to stream segregation. Here, we used the eastern grey treefrog, Hyla versicolor, to test this hypothesis in the context of vocal communication in a non-human animal. In this system, females choose their mate based on perceiving species-specific features of a male's pulsatile advertisement calls in social environments (choruses) characterized by mixtures of overlapping vocalizations. We employed an experimental paradigm from human psychoacoustics to design interleaved pulsatile sequences (ABAB…) that mimicked key features of the species' advertisement call, and in which alternating pulses differed in pulse rise time, which is a robust species recognition cue in eastern grey treefrogs. Using phonotaxis assays, we found no evidence that perceptually salient differences in pulse rise time promoted the segregation of interleaved pulse sequences into distinct auditory streams. These results do not support the hypothesis that any perceptually salient acoustic difference can be exploited as a cue for stream segregation in all species. We discuss these findings in the context of cues used for species recognition and auditory streaming.
Affiliation(s)
- Lata Kalra
- Department of Ecology, Evolution, and Behavior, University of Minnesota, Saint Paul, MN, 55108, USA
- Shoshana Altman
- Department of Ecology, Evolution, and Behavior, University of Minnesota, Saint Paul, MN, 55108, USA
- Mark A Bee
- Department of Ecology, Evolution, and Behavior, University of Minnesota, Saint Paul, MN, 55108, USA
2
Rajasingam SL, Summers RJ, Roberts B. The dynamics of auditory stream segregation: Effects of sudden changes in frequency, level, or modulation. The Journal of the Acoustical Society of America 2021; 149:3769. [PMID: 34241493 DOI: 10.1121/10.0005049] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/19/2020] [Accepted: 05/03/2021] [Indexed: 06/13/2023]
Abstract
Three experiments explored the effects of abrupt changes in stimulus properties on streaming dynamics. Listeners monitored 20-s-long low- and high-frequency (LHL-) tone sequences and reported the number of streams heard throughout. Experiments 1 and 2 used pure tones and examined the effects of changing triplet base frequency and level, respectively. Abrupt changes in base frequency (±3-12 semitones) caused significant magnitude-related falls in segregation (resetting), regardless of transition direction, but an asymmetry occurred for changes in level (±12 dB). Rising-level transitions usually decreased segregation significantly, whereas falling-level transitions had little or no effect. Experiment 3 used pure tones (unmodulated) and narrowly spaced (±25 Hz) tone pairs (dyads); the two evoke similar excitation patterns, but dyads are strongly modulated with a distinctive timbre. Dyad-only sequences induced a strongly segregated percept, limiting scope for further build-up. Alternation between groups of pure tones and dyads produced large, asymmetric changes in streaming. Dyad-to-pure transitions caused substantial resetting, but pure-to-dyad transitions sometimes elicited even greater segregation than for the corresponding interval in dyad-only sequences (overshoot). The results indicate that abrupt changes in timbre can strongly affect the likelihood of stream segregation without introducing significant peripheral-channeling cues. These asymmetric effects of transition direction are reminiscent of subtractive adaptation in vision.
Affiliation(s)
- Saima L Rajasingam
- Department of Vision and Hearing Sciences, Anglia Ruskin University, Cambridge CB1 1PT, United Kingdom
- Robert J Summers
- School of Psychology, Aston University, Birmingham B4 7ET, United Kingdom
- Brian Roberts
- School of Psychology, Aston University, Birmingham B4 7ET, United Kingdom
3
Cai H, Dent ML. Attention capture in birds performing an auditory streaming task. PLoS One 2020; 15:e0235420. [PMID: 32589692 PMCID: PMC7319309 DOI: 10.1371/journal.pone.0235420] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/04/2020] [Accepted: 06/15/2020] [Indexed: 11/19/2022] Open
Abstract
Numerous animal models have been used to investigate the neural mechanisms of auditory processing in complex acoustic environments, but it is unclear whether an animal’s auditory attention is functionally similar to a human’s in processing competing auditory scenes. Here we investigated the effects of attention capture in birds performing an objective auditory streaming paradigm. The classical ABAB… patterned pure tone sequences were modified and used for the task. We trained the birds to selectively attend to a target stream and only respond to the deviant appearing in the target stream, even though their attention may be captured by a deviant in the background stream. When no deviant appeared in the background stream, the birds experience the buildup of streaming process in a qualitatively similar way as they did in a subjective paradigm. Although the birds were trained to selectively attend to the target stream, they failed to avoid the involuntary attention switch caused by the background deviant, especially when the background deviant was sequentially unpredictable. Their global performance deteriorated more with increasingly salient background deviants, where the buildup process was reset by the background distractor. Moreover, sequential predictability of the background deviant facilitated the recovery of the buildup process after attention capture. This is the first study that addresses the perceptual consequences of the joint effects of top-down and bottom-up attention in behaving animals.
Affiliation(s)
- Huaizhen Cai
- Department of Psychology, University at Buffalo, The State University of New York, Buffalo, New York, United States of America
- Micheal L. Dent
- Department of Psychology, University at Buffalo, The State University of New York, Buffalo, New York, United States of America
4
Auditory streaming and bistability paradigm extended to a dynamic environment. Hear Res 2019; 383:107807. [PMID: 31622836 DOI: 10.1016/j.heares.2019.107807] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/25/2019] [Revised: 09/19/2019] [Accepted: 10/01/2019] [Indexed: 11/23/2022]
Abstract
We explore stream segregation with temporally modulated acoustic features using behavioral experiments and modelling. The auditory streaming paradigm in which alternating high- A and low-frequency tones B appear in a repeating ABA-pattern, has been shown to be perceptually bistable for extended presentations (order of minutes). For a fixed, repeating stimulus, perception spontaneously changes (switches) at random times, every 2-15 s, between an integrated interpretation with a galloping rhythm and segregated streams. Streaming in a natural auditory environment requires segregation of auditory objects with features that evolve over time. With the relatively idealized ABA-triplet paradigm, we explore perceptual switching in a non-static environment by considering slowly and periodically varying stimulus features. Our previously published model captures the dynamics of auditory bistability and predicts here how perceptual switches are entrained, tightly locked to the rising and falling phase of modulation. In psychoacoustic experiments we find that entrainment depends on both the period of modulation and the intrinsic switch characteristics of individual listeners. The extended auditory streaming paradigm with slowly modulated stimulus features presented here will be of significant interest for future imaging and neurophysiology experiments by reducing the need for subjective perceptual reports of ongoing perception.
5
Elie JE, Theunissen FE. Invariant neural responses for sensory categories revealed by the time-varying information for communication calls. PLoS Comput Biol 2019; 15:e1006698. [PMID: 31557151 PMCID: PMC6762074 DOI: 10.1371/journal.pcbi.1006698] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 12/04/2018] [Accepted: 06/08/2019] [Indexed: 12/20/2022] Open
Abstract
Although information theoretic approaches have been used extensively in the analysis of the neural code, they have yet to be used to describe how information is accumulated in time while sensory systems are categorizing dynamic sensory stimuli such as speech sounds or visual objects. Here, we present a novel method to estimate the cumulative information for stimuli or categories. We further define a time-varying categorical information index that, by comparing the information obtained for stimuli versus categories of these same stimuli, quantifies invariant neural representations. We use these methods to investigate the dynamic properties of avian cortical auditory neurons recorded in zebra finches that were listening to a large set of call stimuli sampled from the complete vocal repertoire of this species. We found that the time-varying rates carry 5 times more information than the mean firing rates even in the first 100 ms. We also found that cumulative information has slow time constants (100–600 ms) relative to the typical integration time of single neurons, reflecting the fact that the behaviorally informative features of auditory objects are time-varying sound patterns. When we correlated firing rates and information values, we found that average information correlates with average firing rate but that higher-rates found at the onset response yielded similar information values as the lower-rates found in the sustained response: the onset and sustained response of avian cortical auditory neurons provide similar levels of independent information about call identity and call-type. Finally, our information measures allowed us to rigorously define categorical neurons; these categorical neurons show a high degree of invariance for vocalizations within a call-type. Peak invariance is found around 150 ms after stimulus onset. Surprisingly, call-type invariant neurons were found in both primary and secondary avian auditory areas. 
Just as the recognition of faces requires neural representations that are invariant to scale and rotation, the recognition of behaviorally relevant auditory objects, such as spoken words, requires neural representations that are invariant to the speaker uttering the word and to his or her location. Here, we used information theory to investigate the time course of the neural representation of bird communication calls and of behaviorally relevant categories of these same calls: the call-types of the bird’s repertoire. We found that neurons in both the primary and secondary avian auditory cortex exhibit invariant responses to call renditions within a call-type, suggestive of a potential role for extracting the meaning of these communication calls. We also found that time plays an important role: first, neural responses carry significantly more information when represented by temporal patterns calculated at the small time scale of 10 ms than when measured as average rates and, second, this information accumulates in a non-redundant fashion up to long integration times of 600 ms. This rich temporal neural representation is matched to the temporal richness found in the communication calls of this species.
Affiliation(s)
- Julie E. Elie
- Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, California, United States of America
- Department of Bioengineering, University of California Berkeley, Berkeley, California, United States of America
- Frédéric E. Theunissen
- Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, California, United States of America
- Department of Psychology, University of California Berkeley, Berkeley, California, United States of America
6
Rajasingam SL, Summers RJ, Roberts B. Stream biasing by different induction sequences: Evaluating stream capture as an account of the segregation-promoting effects of constant-frequency inducers. The Journal of the Acoustical Society of America 2018; 144:3409. [PMID: 30599694 DOI: 10.1121/1.5082300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2018] [Accepted: 11/19/2018] [Indexed: 06/09/2023]
Abstract
Stream segregation for a test sequence comprising high-frequency (H) and low-frequency (L) pure tones, presented in a galloping rhythm, is much greater when preceded by a constant-frequency induction sequence matching one subset than by an inducer configured like the test sequence; this difference persists for several seconds. It has been proposed that constant-frequency inducers promote stream segregation by capturing the matching subset of test-sequence tones into an on-going, pre-established stream. This explanation was evaluated using 2-s induction sequences followed by longer test sequences (12-20 s). Listeners reported the number of streams heard throughout the test sequence. Experiment 1 used LHL- sequences and one or other subset of inducer tones was attenuated (0-24 dB in 6-dB steps, and ∞). Greater attenuation usually caused a progressive increase in segregation, towards that following the constant-frequency inducer. Experiment 2 used HLH- sequences and the L inducer tones were raised or lowered in frequency relative to their test-sequence counterparts (ΔfI = 0, 0.5, 1.0, or 1.5 × ΔfT ). Either change greatly increased segregation. These results are concordant with the notion of attention switching to new sounds but contradict the stream-capture hypothesis, unless a "proto-object" corresponding to the continuing subset is assumed to form during the induction sequence.
Affiliation(s)
- Saima L Rajasingam
- Psychology, School of Life and Health Sciences, Aston University, Birmingham B4 7ET, United Kingdom
- Robert J Summers
- Psychology, School of Life and Health Sciences, Aston University, Birmingham B4 7ET, United Kingdom
- Brian Roberts
- Psychology, School of Life and Health Sciences, Aston University, Birmingham B4 7ET, United Kingdom
7
Ruggles DR, Tausend AN, Shamma SA, Oxenham AJ. Cortical markers of auditory stream segregation revealed for streaming based on tonotopy but not pitch. The Journal of the Acoustical Society of America 2018; 144:2424. [PMID: 30404514 PMCID: PMC6909992 DOI: 10.1121/1.5065392] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/05/2018] [Revised: 10/05/2018] [Accepted: 10/08/2018] [Indexed: 06/08/2023]
Abstract
The brain decomposes mixtures of sounds, such as competing talkers, into perceptual streams that can be attended to individually. Attention can enhance the cortical representation of streams, but it is unknown what acoustic features the enhancement reflects, or where in the auditory pathways attentional enhancement is first observed. Here, behavioral measures of streaming were combined with simultaneous low- and high-frequency envelope-following responses (EFR) that are thought to originate primarily from cortical and subcortical regions, respectively. Repeating triplets of harmonic complex tones were presented with alternating fundamental frequencies. The tones were filtered to contain either low-numbered spectrally resolved harmonics, or only high-numbered unresolved harmonics. The behavioral results confirmed that segregation can be based on either tonotopic or pitch cues. The EFR results revealed no effects of streaming or attention on subcortical responses. Cortical responses revealed attentional enhancement under conditions of streaming, but only when tonotopic cues were available, not when streaming was based only on pitch cues. The results suggest that the attentional modulation of phase-locked responses is dominated by tonotopically tuned cortical neurons that are insensitive to pitch or periodicity cues.
Affiliation(s)
- Dorea R Ruggles
- Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
- Alexis N Tausend
- Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
- Shihab A Shamma
- Electrical and Computer Engineering Department & Institute for Systems, University of Maryland, College Park, Maryland 20740, USA
- Andrew J Oxenham
- Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
8
Cai H, Screven LA, Dent ML. Behavioral measurements of auditory streaming and build-up by budgerigars (Melopsittacus undulatus). The Journal of the Acoustical Society of America 2018; 144:1508. [PMID: 30424658 DOI: 10.1121/1.5054297] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 06/14/2018] [Accepted: 08/27/2018] [Indexed: 06/09/2023]
Abstract
The perception of the build-up of auditory streaming has been widely investigated in humans, while it is unknown whether animals experience a similar perception when hearing high (H) and low (L) tonal pattern sequences. The paradigm previously used in European starlings (Sturnus vulgaris) was adopted in two experiments to address the build-up of auditory streaming in budgerigars (Melopsittacus undulatus). In experiment 1, different numbers of repetitions of low-high-low triplets were used in five conditions to study the build-up process. In experiment 2, 5 and 15 repetitions of high-low-high triplets were used to investigate the effects of repetition rate, frequency separation, and frequency range of the two tones on the birds' streaming perception. Similar to humans, budgerigars subjectively experienced the build-up process in auditory streaming; faster repetition rates and larger frequency separations enhanced the streaming perception, and these results were consistent across the two frequency ranges. Response latency analysis indicated that the budgerigars needed a longer amount of time to respond to stimuli that elicited a salient streaming perception. These results indicate, for the first time using a behavioral paradigm, that budgerigars experience a build-up of auditory streaming in a manner similar to humans.
Affiliation(s)
- Huaizhen Cai
- Department of Psychology, University at Buffalo, The State University of New York, Buffalo, New York 14260, USA
- Laurel A Screven
- Department of Psychology, University at Buffalo, The State University of New York, Buffalo, New York 14260, USA
- Micheal L Dent
- Department of Psychology, University at Buffalo, The State University of New York, Buffalo, New York 14260, USA
9
Knyazeva S, Selezneva E, Gorkin A, Aggelopoulos NC, Brosch M. Neuronal Correlates of Auditory Streaming in Monkey Auditory Cortex for Tone Sequences without Spectral Differences. Front Integr Neurosci 2018; 12:4. [PMID: 29440999 PMCID: PMC5797536 DOI: 10.3389/fnint.2018.00004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/04/2017] [Accepted: 01/16/2018] [Indexed: 11/13/2022] Open
Abstract
This study finds a neuronal correlate of auditory perceptual streaming in the primary auditory cortex for sequences of tone complexes that have the same amplitude spectrum but a different phase spectrum. Our finding is based on microelectrode recordings of multiunit activity from 270 cortical sites in three awake macaque monkeys. The monkeys were presented with repeated sequences of a tone triplet that consisted of an A tone, a B tone, another A tone and then a pause. The A and B tones were composed of unresolved harmonics formed by adding the harmonics in cosine phase, in alternating phase, or in random phase. A previous psychophysical study on humans revealed that when the A and B tones are similar, humans integrate them into a single auditory stream; when the A and B tones are dissimilar, humans segregate them into separate auditory streams. We found that the similarity of neuronal rate responses to the triplets was highest when all A and B tones had cosine phase. Similarity was intermediate when the A tones had cosine phase and the B tones had alternating phase. Similarity was lowest when the A tones had cosine phase and the B tones had random phase. The present study corroborates and extends previous reports, showing similar correspondences between neuronal activity in the primary auditory cortex and auditory streaming of sound sequences. It also is consistent with Fishman’s population separation model of auditory streaming.
Affiliation(s)
- Stanislava Knyazeva
- Speziallabor Primatenneurobiologie, Leibniz-Institute für Neurobiologie, Magdeburg, Germany
- Elena Selezneva
- Speziallabor Primatenneurobiologie, Leibniz-Institute für Neurobiologie, Magdeburg, Germany
- Alexander Gorkin
- Speziallabor Primatenneurobiologie, Leibniz-Institute für Neurobiologie, Magdeburg, Germany; Laboratory of Psychophysiology, Institute of Psychology, Moscow, Russia
- Michael Brosch
- Speziallabor Primatenneurobiologie, Leibniz-Institute für Neurobiologie, Magdeburg, Germany; Center for Behavioral Brain Sciences, Otto-von-Guericke-University, Magdeburg, Germany
10
Itatani N, Klump GM. Interaction of spatial and non-spatial cues in auditory stream segregation in the European starling. Eur J Neurosci 2017; 51:1191-1200. [PMID: 28922512 DOI: 10.1111/ejn.13716] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/04/2017] [Revised: 09/14/2017] [Accepted: 09/14/2017] [Indexed: 11/29/2022]
Abstract
Integrating sounds from the same source and segregating sounds from different sources in an acoustic scene are an essential function of the auditory system. Naturally, the auditory system simultaneously makes use of multiple cues. Here, we investigate the interaction between spatial cues and frequency cues in stream segregation of European starlings (Sturnus vulgaris) using an objective measure of perception. Neural responses to streaming sounds were recorded, while the bird was performing a behavioural task that results in a higher sensitivity during a one-stream than a two-stream percept. Birds were trained to detect an onset time shift of a B tone in an ABA- triplet sequence in which A and B could differ in frequency and/or spatial location. If the frequency difference or spatial separation between the signal sources or both were increased, the behavioural time shift detection performance deteriorated. Spatial separation had a smaller effect on the performance compared to the frequency difference and both cues additively affected the performance. Neural responses in the primary auditory forebrain were affected by the frequency and spatial cues. However, frequency and spatial cue differences being sufficiently large to elicit behavioural effects did not reveal correlated neural response differences. The difference between the neuronal response pattern and behavioural response is discussed with relation to the task given to the bird. Perceptual effects of combining different cues in auditory scene analysis indicate that these cues are analysed independently and given different weights suggesting that the streaming percept arises consecutively to initial cue analysis.
Affiliation(s)
- Naoya Itatani
- Animal Physiology and Behavior Group, Department for Neuroscience, School for Medicine and Health Sciences, Carl-von-Ossietzky University Oldenburg, 26111, Oldenburg, Germany; Cluster of Excellence Hearing4all, Carl-von-Ossietzky University Oldenburg, Oldenburg, Germany
- Georg M Klump
- Animal Physiology and Behavior Group, Department for Neuroscience, School for Medicine and Health Sciences, Carl-von-Ossietzky University Oldenburg, 26111, Oldenburg, Germany; Cluster of Excellence Hearing4all, Carl-von-Ossietzky University Oldenburg, Oldenburg, Germany
11
Rankin J, Osborn Popp PJ, Rinzel J. Stimulus Pauses and Perturbations Differentially Delay or Promote the Segregation of Auditory Objects: Psychoacoustics and Modeling. Front Neurosci 2017; 11:198. [PMID: 28473747 PMCID: PMC5397483 DOI: 10.3389/fnins.2017.00198] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 11/14/2016] [Accepted: 03/23/2017] [Indexed: 11/21/2022] Open
Abstract
Segregating distinct sound sources is fundamental for auditory perception, as in the cocktail party problem. In a process called the build-up of stream segregation, distinct sound sources that are perceptually integrated initially can be segregated into separate streams after several seconds. Previous research concluded that abrupt changes in the incoming sounds during build-up—for example, a step change in location, loudness or timing—reset the percept to integrated. Following this reset, the multisecond build-up process begins again. Neurophysiological recordings in auditory cortex (A1) show fast (subsecond) adaptation, but unified mechanistic explanations for the bias toward integration, multisecond build-up and resets remain elusive. Combining psychoacoustics and modeling, we show that initial unadapted A1 responses bias integration, that the slowness of build-up arises naturally from competition downstream, and that recovery of adaptation can explain resets. An early bias toward integrated perceptual interpretations arising from primary cortical stages that encode low-level features and feed into competition downstream could also explain similar phenomena in vision. Further, we report a previously overlooked class of perturbations that promote segregation rather than integration. Our results challenge current understanding for perturbation effects on the emergence of sound source segregation, leading to a new hypothesis for differential processing downstream of A1. Transient perturbations can momentarily redirect A1 responses as input to downstream competition units that favor segregation.
Affiliation(s)
- James Rankin
- Department of Mathematics, University of Exeter, Exeter, UK; Center for Neural Science, New York University, New York, NY, USA
- John Rinzel
- Center for Neural Science, New York University, New York, NY, USA; Courant Institute of Mathematical Sciences, New York, NY, USA
12
Itatani N, Klump GM. Animal models for auditory streaming. Philos Trans R Soc Lond B Biol Sci 2017; 372:rstb.2016.0112. [PMID: 28044022 DOI: 10.1098/rstb.2016.0112] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Accepted: 07/31/2016] [Indexed: 11/12/2022] Open
Abstract
Sounds in the natural environment need to be assigned to acoustic sources to evaluate complex auditory scenes. Separating sources will affect the analysis of auditory features of sounds. As the benefits of assigning sounds to specific sources accrue to all species communicating acoustically, the ability for auditory scene analysis is widespread among different animals. Animal studies allow for a deeper insight into the neuronal mechanisms underlying auditory scene analysis. Here, we will review the paradigms applied in the study of auditory scene analysis and streaming of sequential sounds in animal models. We will compare the psychophysical results from the animal studies to the evidence obtained in human psychophysics of auditory streaming, i.e. in a task commonly used for measuring the capability for auditory scene analysis. Furthermore, the neuronal correlates of auditory streaming will be reviewed in different animal models and the observations of the neurons' response measures will be related to perception. The across-species comparison will reveal whether similar demands in the analysis of acoustic scenes have resulted in similar perceptual and neuronal processing mechanisms in the wide range of species being capable of auditory scene analysis. This article is part of the themed issue 'Auditory and visual scene analysis'.
Affiliation(s)
- Naoya Itatani
- Cluster of Excellence Hearing4all, Animal Physiology and Behaviour Group, Department of Neuroscience, School of Medicine and Health Sciences, Carl von Ossietzky University Oldenburg, 26111 Oldenburg, Germany
- Georg M Klump
- Cluster of Excellence Hearing4all, Animal Physiology and Behaviour Group, Department of Neuroscience, School of Medicine and Health Sciences, Carl von Ossietzky University Oldenburg, 26111 Oldenburg, Germany
13
Abstract
Stream segregation enables a listener to disentangle multiple competing sequences of sounds. A recent study from our laboratory demonstrated that cortical neurons in anesthetized cats exhibit spatial stream segregation (SSS) by synchronizing preferentially to one of two sequences of noise bursts that alternate between two source locations. Here, we examine the emergence of SSS along the ascending auditory pathway. Extracellular recordings were made in anesthetized rats from the inferior colliculus (IC), the nucleus of the brachium of the IC (BIN), the medial geniculate body (MGB), and the primary auditory cortex (A1). Stimuli consisted of interleaved sequences of broadband noise bursts that alternated between two source locations. At stimulus presentation rates of 5 and 10 bursts per second, at which human listeners report robust SSS, neural SSS is weak in the central nucleus of the IC (ICC), it appears in the nucleus of the brachium of the IC (BIN) and in approximately two-thirds of neurons in the ventral MGB (MGBv), and is prominent throughout A1. The enhancement of SSS at the cortical level reflects both increased spatial sensitivity and increased forward suppression. We demonstrate that forward suppression in A1 does not result from synaptic inhibition at the cortical level. Instead, forward suppression might reflect synaptic depression in the thalamocortical projection. Together, our findings indicate that auditory streams are increasingly segregated along the ascending auditory pathway as distinct mutually synchronized neural populations.
SIGNIFICANCE STATEMENT: Listeners are capable of disentangling multiple competing sequences of sounds that originate from distinct sources. This stream segregation is aided by differences in spatial location between the sources. A possible substrate of spatial stream segregation (SSS) has been described in the auditory cortex, but the mechanisms leading to those cortical responses are unknown. Here, we investigated SSS in three levels of the ascending auditory pathway with extracellular unit recordings in anesthetized rats. We found that neural SSS emerges within the ascending auditory pathway as a consequence of sharpening of spatial sensitivity and increasing forward suppression. Our results highlight brainstem mechanisms that culminate in SSS at the level of the auditory cortex.
14
Functional magnetic resonance imaging confirms forward suppression for rapidly alternating sounds in human auditory cortex but not in the inferior colliculus. Hear Res 2016; 335:25-32. [PMID: 26899342 DOI: 10.1016/j.heares.2016.02.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/19/2015] [Revised: 02/08/2016] [Accepted: 02/15/2016] [Indexed: 11/21/2022]
Abstract
Forward suppression at the level of the auditory cortex has been suggested to subserve auditory stream segregation. Recent results in non-streaming stimulation contexts have indicated that forward suppression can also be observed in the inferior colliculus; whether this holds for streaming-related contexts remains unclear. Here, we used cardiac-gated fMRI to examine forward suppression in the inferior colliculus (and the rest of the human auditory pathway) in response to canonical streaming stimuli (rapid tone sequences comprised of either one repetitive tone or two alternating tones). The first stimulus is typically perceived as a single stream, the second as two interleaved streams. In different experiments using either pure tones differing in frequency or bandpass-filtered noise differing in inter-aural time differences, we observed stronger auditory cortex activation in response to alternating vs. repetitive stimulation, consistent with the presence of forward suppression. In contrast, activity in the inferior colliculus and other subcortical nuclei did not significantly differ between alternating and monotonic stimuli. This finding could be explained by active amplification of forward suppression in auditory cortex, by a low rate (or absence) of cells showing forward suppression in inferior colliculus, or both.
15
Dent ML, Martin AK, Flaherty MM, Neilans EG. Cues for auditory stream segregation of birdsong in budgerigars and zebra finches: Effects of location, timing, amplitude, and frequency. J Acoust Soc Am 2016; 139:674-83. [PMID: 26936551 DOI: 10.1121/1.4941322] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 05/20/2023]
Abstract
Deciphering the auditory scene is a problem faced by many organisms. However, when faced with numerous overlapping sounds from multiple locations, listeners are still able to attribute the individual sound objects to their individual sound-producing sources. Here, the characteristics of sounds important for integrating versus segregating in birds were determined. Budgerigars and zebra finches were trained using operant conditioning procedures on an identification task to peck one key when they heard a whole zebra finch song and to peck another when they heard a zebra finch song missing a middle syllable. Once the birds were trained to a criterion performance level on those stimuli, probe trials were introduced on a small proportion of trials. The probe songs contained modifications of the incomplete training song's missing syllable. When a bird responded as if the probe were a whole song, this suggested that it had streamed the altered syllable together with the rest of the song. When a bird responded as if the probe were a non-whole song, this suggested that it had segregated the altered probe from the rest of the song. Results show that some features, such as location and intensity, are more important for segregating than other features, such as timing and frequency.
Affiliation(s)
- Micheal L Dent
  - Department of Psychology, University at Buffalo, the State University of New York, Buffalo, New York 14260, USA
- Amanda K Martin
  - Department of Psychology, University at Buffalo, the State University of New York, Buffalo, New York 14260, USA
- Mary M Flaherty
  - Department of Psychology, University at Buffalo, the State University of New York, Buffalo, New York 14260, USA
- Erikson G Neilans
  - Department of Psychology, University at Buffalo, the State University of New York, Buffalo, New York 14260, USA

16
Mate Searching Animals as Model Systems for Understanding Perceptual Grouping. Psychological Mechanisms in Animal Communication 2016. [DOI: 10.1007/978-3-319-48690-1_4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/12/2022]
17
Riecke L, Sack AT, Schroeder CE. Endogenous Delta/Theta Sound-Brain Phase Entrainment Accelerates the Buildup of Auditory Streaming. Curr Biol 2015; 25:3196-201. [PMID: 26628008 DOI: 10.1016/j.cub.2015.10.045] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.7] [Received: 09/02/2015] [Revised: 10/01/2015] [Accepted: 10/19/2015] [Indexed: 11/30/2022]
Abstract
In many natural listening situations, meaningful sounds (e.g., speech) fluctuate in slow rhythms among other sounds. When a slow rhythmic auditory stream is selectively attended, endogenous delta (1‒4 Hz) oscillations in auditory cortex may shift their timing so that higher-excitability neuronal phases become aligned with salient events in that stream [1, 2]. As a consequence of this stream-brain phase entrainment [3], these events are processed and perceived more readily than temporally non-overlapping events [4-11], essentially enhancing the neural segregation between the attended stream and temporally noncoherent streams [12]. Stream-brain phase entrainment is robust to acoustic interference [13-20] provided that target stream-evoked rhythmic activity can be segregated from noncoherent activity evoked by other sounds [21], a process that usually builds up over time [22-27]. However, it has remained unclear whether stream-brain phase entrainment functionally contributes to this buildup of rhythmic streams or whether it is merely an epiphenomenon of it. Here, we addressed this issue directly by experimentally manipulating endogenous stream-brain phase entrainment in human auditory cortex with non-invasive transcranial alternating current stimulation (TACS) [28-30]. We assessed the consequences of these manipulations on the perceptual buildup of the target stream (the time required to recognize its presence in a noisy background), using behavioral measures in 20 healthy listeners performing a naturalistic listening task. Experimentally induced cyclic 4-Hz variations in stream-brain phase entrainment reliably caused a cyclic 4-Hz pattern in perceptual buildup time. Our findings demonstrate that strong endogenous delta/theta stream-brain phase entrainment accelerates the perceptual emergence of task-relevant rhythmic streams in noisy environments.
Affiliation(s)
- Lars Riecke
  - Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, 6229 Maastricht, the Netherlands
- Alexander T Sack
  - Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, 6229 Maastricht, the Netherlands
- Charles E Schroeder
  - Cognitive Neuroscience and Schizophrenia Program, Nathan Kline Institute for Psychiatric Research, Orangeburg, NY 10962, USA; Departments of Neurosurgery and Psychiatry, Columbia University College of Physicians and Surgeons, New York, NY 10032-2695, USA

18
Farley BJ, Noreña AJ. Membrane potential dynamics of populations of cortical neurons during auditory streaming. J Neurophysiol 2015; 114:2418-30. [PMID: 26269558 DOI: 10.1152/jn.00545.2015] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 06/03/2015] [Accepted: 08/12/2015] [Indexed: 11/22/2022] [Open access]
Abstract
How a mixture of acoustic sources is perceptually organized into discrete auditory objects remains unclear. One current hypothesis postulates that perceptual segregation of different sources is related to the spatiotemporal separation of cortical responses induced by each acoustic source or stream. In the present study, the dynamics of subthreshold membrane potential activity were measured across the entire tonotopic axis of the rodent primary auditory cortex during the auditory streaming paradigm using voltage-sensitive dye imaging. Consistent with the proposed hypothesis, we observed enhanced spatiotemporal segregation of cortical responses to alternating tone sequences as their frequency separation or presentation rate was increased, both manipulations known to promote stream segregation. However, across most streaming paradigm conditions tested, a substantial cortical region maintaining a response to both tones coexisted with more peripheral cortical regions responding more selectively to one of them. We propose that these coexisting subthreshold representation types could provide neural substrates to support the flexible switching between the integrated and segregated streaming percepts.
Affiliation(s)
- Arnaud J Noreña
  - Aix Marseille Université, Centre National de la Recherche Scientifique, Marseille, France; and

19
Stream segregation in the anesthetized auditory cortex. Hear Res 2015; 328:48-58. [PMID: 26163899 PMCID: PMC4582803 DOI: 10.1016/j.heares.2015.07.004] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 02/18/2015] [Revised: 06/25/2015] [Accepted: 07/01/2015] [Indexed: 02/07/2023]
Abstract
Auditory stream segregation describes the way that sounds are perceptually segregated into groups or streams on the basis of perceptual attributes such as pitch or spectral content. For sequences of pure tones, segregation depends on the tones' proximity in frequency and time. In the auditory cortex (and elsewhere) responses to sequences of tones are dependent on stimulus conditions in a similar way to the perception of these stimuli. However, although highly dependent on stimulus conditions, perception is also clearly influenced by factors unrelated to the stimulus, such as attention. Exactly how ‘bottom-up’ sensory processes and non-sensory ‘top-down’ influences interact is still not clear. Here, we recorded responses to alternating tones (ABAB …) of varying frequency difference (FD) and rate of presentation (PR) in the auditory cortex of anesthetized guinea-pigs. These data complement previous studies, in that top-down processing resulting from conscious perception should be absent or at least considerably attenuated. Under anesthesia, the responses of cortical neurons to the tone sequences adapted rapidly, in a manner sensitive to both the FD and PR of the sequences. While the responses to tones at frequencies more distant from neuron best frequencies (BFs) decreased as the FD increased, the responses to tones near to BF increased, consistent with a release from adaptation, or forward suppression. Increases in PR resulted in reductions in responses to all tones, but the reduction was greater for tones further from BF. Although asymptotically adapted responses to tones showed behavior that was qualitatively consistent with perceptual stream segregation, responses reached asymptote within 2 s, and responses to all tones were very weak at high PRs (>12 tones per second). A signal-detection model, driven by the cortical population response, made decisions that were dependent on both FD and PR in ways consistent with perceptual stream segregation. 
This included showing a range of conditions over which decisions could be made either in favor of perceptual integration or segregation, depending on the model ‘decision criterion’. However, the rate of ‘build-up’ was more rapid than seen perceptually, and at high PR responses to tones were sometimes so weak as to be undetectable by the model. Under anesthesia, adaptation occurs rapidly, and at high PRs tones are generally poorly represented, which compromises the interpretation of the experiment. However, within these limitations, these results complement experiments in awake animals and humans. They generally support the hypothesis that ‘bottom-up’ sensory processing plays a major role in perceptual organization, and that processes underlying stream segregation are active in the absence of attention.
- We recorded responses of cortical neurons to sequences of tones under anesthesia.
- Fully adapted responses correlated reasonably with perceptual stream segregation.
- Responses to tone sequences were weak during rapid tone presentation (>12 Hz).
- Adaptation under anesthesia is too rapid to account for perceptual ‘build-up’.
- Neural correlates of stream segregation are not reliant on top-down influences.
20
Eggermont JJ. Animal models of auditory temporal processing. Int J Psychophysiol 2015; 95:202-15. [DOI: 10.1016/j.ijpsycho.2014.03.011] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 09/09/2013] [Revised: 03/27/2014] [Accepted: 03/27/2014] [Indexed: 10/25/2022]
21
Steele SA, Tranchina D, Rinzel J. An alternating renewal process describes the buildup of perceptual segregation. Front Comput Neurosci 2015; 8:166. [PMID: 25620927 PMCID: PMC4286718 DOI: 10.3389/fncom.2014.00166] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 07/26/2014] [Accepted: 12/02/2014] [Indexed: 12/05/2022] [Open access]
Abstract
For some ambiguous scenes perceptual conflict arises between integration and segregation. Initially, all stimulus features seem integrated. Then abruptly, perhaps after a few seconds, a segregated percept emerges. For example, segregation of acoustic features into streams may require several seconds. In behavioral experiments, when a subject's reports of stream segregation are averaged over repeated trials, one obtains a buildup function, a smooth time course for segregation probability. The buildup function has been said to reflect an underlying mechanism of evidence accumulation or adaptation. During long duration stimuli perception may alternate between integration and segregation. We present a statistical model based on an alternating renewal process (ARP) that generates buildup functions without an accumulative process. In our model, perception alternates during a trial between different groupings, as in perceptual bistability, with random and independent dominance durations sampled from different percept-specific probability distributions. Using this theory, we describe the short-term dynamics of buildup observed on short trials in terms of the long-term statistics of percept durations for the two alternating perceptual organizations. Our statistical-dynamics model describes well the buildup functions and alternations in simulations of pseudo-mechanistic neuronal network models with percept-selective populations competing through mutual inhibition. Even though the competition model can show history dependence through slow adaptation, our statistical switching model, that neglects history, predicts well the buildup function. We propose that accumulation is not a necessary feature to produce buildup. Generally, if alternations between two states exhibit independent durations with stationary statistics then the associated buildup function can be described by the statistical dynamics of an ARP.
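The idea in this abstract, that a smooth buildup function can arise from alternation alone, can be illustrated with a small simulation. The following is a minimal sketch and not the authors' code: the gamma-distributed dominance durations, the shape parameter, and the mean durations are all assumptions chosen for illustration, and the function names are hypothetical. Each trial starts in the integrated percept, and the buildup function is estimated as the fraction of trials that are in the segregated percept at a given time.

```python
import random


def segregated_at(t, rng, shape=2.0, mean_int=4.0, mean_seg=6.0):
    """Simulate one trial and report whether the percept at time t is segregated.

    Dominance durations are drawn independently from a gamma distribution whose
    mean depends on the current percept (an assumed choice of distribution).
    """
    clock = 0.0
    segregated = False  # every trial starts in the integrated percept
    while True:
        mean = mean_seg if segregated else mean_int
        clock += rng.gammavariate(shape, mean / shape)  # mean = shape * scale
        if clock > t:
            return segregated
        segregated = not segregated  # switch percept; no memory of past durations


def buildup(times, n_trials=2000, seed=0):
    """Estimate P(segregated at t) for each t by averaging over simulated trials."""
    rng = random.Random(seed)
    return [
        sum(segregated_at(t, rng) for _ in range(n_trials)) / n_trials
        for t in times
    ]


if __name__ == "__main__":
    for t, p in zip([0.0, 1.0, 2.0, 5.0, 10.0, 200.0],
                    buildup([0.0, 1.0, 2.0, 5.0, 10.0, 200.0])):
        print(f"t = {t:6.1f} s   P(segregated) ~ {p:.2f}")
```

With these assumed parameters the estimate starts at zero and, at long times, settles near mean_seg / (mean_int + mean_seg), the stationary fraction of time spent in the segregated percept, even though nothing in the simulation accumulates evidence.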
Affiliation(s)
- Sara A Steele
  - Center for Neural Science, New York University, New York, NY, USA
- Daniel Tranchina
  - Courant Institute for Mathematical Sciences, New York University, New York, NY, USA; Department of Biology, New York University, New York, NY, USA
- John Rinzel
  - Center for Neural Science, New York University, New York, NY, USA; Courant Institute for Mathematical Sciences, New York University, New York, NY, USA

22
Kettler L, Wagner H. Influence of double stimulation on sound-localization behavior in barn owls. J Comp Physiol A Neuroethol Sens Neural Behav Physiol 2014; 200:1033-44. [PMID: 25352361 DOI: 10.1007/s00359-014-0953-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 07/04/2014] [Revised: 09/01/2014] [Accepted: 10/08/2014] [Indexed: 11/28/2022]
Abstract
Barn owls do not immediately approach a source after they hear a sound, but wait for a second sound before they strike. This represents a gain in striking behavior by avoiding responses to random incidents. However, the first stimulus is also expected to change the threshold for perceiving the subsequent second sound, thus possibly introducing some costs. We mimicked this situation in a behavioral double-stimulus paradigm utilizing saccadic head turns of owls. The first stimulus served as an adapter, was presented in frontal space, and did not elicit a head turn. The second stimulus, emitted from a peripheral source, elicited the head turn. The time interval between both stimuli was varied. Data obtained with double stimulation were compared with data collected with a single stimulus from the same positions as the second stimulus in the double-stimulus paradigm. Sound-localization performance was quantified by the response latency, accuracy, and precision of the head turns. Response latency was increased with double stimuli, while accuracy and precision were decreased. The effect depended on the inter-stimulus interval. These results suggest that waiting for a second stimulus may indeed impose costs on sound localization by adaptation and this reduces the gain obtained by waiting for a second stimulus.
Affiliation(s)
- Lutz Kettler
  - Department of Zoology and Animal Physiology, Aachen University, Worringerweg 3, 52074, Aachen, Germany

23
Menardy F, Giret N, Del Negro C. The presence of an audience modulates responses to familiar call stimuli in the male zebra finch forebrain. Eur J Neurosci 2014; 40:3338-50. [PMID: 25145963 DOI: 10.1111/ejn.12696] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 04/02/2014] [Revised: 06/23/2014] [Accepted: 07/15/2014] [Indexed: 12/22/2022]
Abstract
The ability to recognize familiar individuals is crucial for establishing social relationships. The zebra finch, a highly social songbird species that forms lifelong pair bonds, uses a vocalization, the distance call, to identify its mate. However, in males, this ability depends on social conditions, requiring the presence of an audience. To evaluate whether the presence of bystanders modulates the auditory processing underlying recognition abilities, we assessed, by using a lightweight telemetry system, whether electrophysiological responses driven by familiar and unfamiliar female calls in a high-level auditory area [the caudomedial nidopallium (NCM)] were modulated by the presence of conspecific males. Males had experienced the call of their mate for several months and the call of a familiar female for several days. When they were exposed to female calls in the presence of two male conspecifics, NCM neurons showed greater responses to the playback of familiar female calls, including the mate's call, than to unfamiliar ones. In contrast, no such discrimination was observed in males when they were alone or when call-evoked responses were collected under anaesthesia. Together, these results suggest that NCM neuronal activity is profoundly influenced by social conditions, providing new evidence that the properties of NCM neurons are not simply determined by the acoustic structure of auditory stimuli. They also show that neurons in the NCM form part of a network that can be shaped by experience and that probably plays an important role in the emergence of communication sound recognition.
Affiliation(s)
- F Menardy
  - CNPS, UMR CNRS 8195, University Paris-Sud, 91405, Orsay, France

24
Christiansen SK, Oxenham AJ. Assessing the effects of temporal coherence on auditory stream formation through comodulation masking release. J Acoust Soc Am 2014; 135:3520-3529. [PMID: 24907815 PMCID: PMC4048442 DOI: 10.1121/1.4872300] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 09/13/2013] [Revised: 03/29/2014] [Accepted: 04/07/2014] [Indexed: 05/29/2023]
Abstract
Recent studies of auditory streaming have suggested that repeated synchronous onsets and offsets over time, referred to as "temporal coherence," provide a strong grouping cue between acoustic components, even when they are spectrally remote. This study uses a measure of auditory stream formation, based on comodulation masking release (CMR), to assess the conditions under which a loss of temporal coherence across frequency can lead to auditory stream segregation. The measure relies on the assumption that the CMR, produced by flanking bands remote from the masker and target frequency, only occurs if the masking and flanking bands form part of the same perceptual stream. The masking and flanking bands consisted of sequences of narrowband noise bursts, and the temporal coherence between the masking and flanking bursts was manipulated in two ways: (a) By introducing a fixed temporal offset between the flanking and masking bands that varied from zero to 60 ms and (b) by presenting the flanking and masking bursts at different temporal rates, so that the asynchronies varied from burst to burst. The results showed reduced CMR in all conditions where the flanking and masking bands were temporally incoherent, in line with expectations of the temporal coherence hypothesis.
Affiliation(s)
- Andrew J Oxenham
  - Departments of Psychology and Otolaryngology, University of Minnesota, Minneapolis, Minnesota 55455

25
Noda T, Kanzaki R, Takahashi H. Stimulus phase locking of cortical oscillation for auditory stream segregation in rats. PLoS One 2013; 8:e83544. [PMID: 24376715 PMCID: PMC3869811 DOI: 10.1371/journal.pone.0083544] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 04/30/2013] [Accepted: 11/06/2013] [Indexed: 11/19/2022] [Open access]
Abstract
The phase of cortical oscillations contains rich information and is valuable for encoding sound stimuli. Here we hypothesized that oscillatory phase modulation, instead of amplitude modulation, is a neural correlate of auditory streaming. Our behavioral evaluation provided compelling evidence, for the first time, that rats are able to organize auditory streams. Local field potentials (LFPs) were investigated in cortical layer IV or deeper in the primary auditory cortex of anesthetized rats. In response to ABA- sequences with different inter-tone intervals and frequency differences, neurometric functions were characterized with phase locking as well as the band-specific amplitude evoked by test tones. Our results demonstrated that under large frequency differences and short inter-tone intervals, the neurometric function based on stimulus phase locking in higher frequency bands, particularly the gamma band, could better describe van Noorden's perceptual boundary than the LFP amplitude. Furthermore, the gamma-band neurometric function showed a build-up-like effect within around 3 seconds from sequence onset. These findings suggest that phase locking and amplitude have different roles in neural computation, and support our hypothesis that temporal modulation of cortical oscillations should be considered a neurophysiological mechanism of auditory streaming, in addition to forward suppression, tonotopic separation, and multi-second adaptation.
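The abstract does not state which phase-locking measure was used, so the following is only a sketch of one common definition, the length of the mean resultant vector of the oscillation phases sampled at each stimulus presentation. The function name and the example phases are hypothetical. A value near 1 means the phases cluster tightly across presentations, and a value near 0 means they are spread evenly around the circle.

```python
import cmath
import math


def phase_locking_value(phases):
    """Return the mean resultant length of a set of phases given in radians.

    Each phase is mapped to a unit vector on the complex plane; the result is
    the magnitude of the average vector, which lies between 0 and 1.
    """
    phases = list(phases)
    if not phases:
        raise ValueError("at least one phase is required")
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


if __name__ == "__main__":
    # Tightly clustered phases give a value close to 1.
    print(phase_locking_value([0.30, 0.32, 0.28, 0.31]))
    # Phases spread evenly around the circle give a value close to 0.
    print(phase_locking_value([0.0, math.pi / 2, math.pi, 3 * math.pi / 2]))
```

In practice the phases would come from a band-limited LFP signal (for example, a gamma-band filtered trace) sampled at tone onsets; that preprocessing step is outside this sketch.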
Affiliation(s)
- Takahiro Noda
  - Research Center for Advanced Science and Technology, The University of Tokyo, Tokyo, Japan
- Ryohei Kanzaki
  - Research Center for Advanced Science and Technology, The University of Tokyo, Tokyo, Japan
  - Department of Mechano-Informatics, Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan
- Hirokazu Takahashi
  - Research Center for Advanced Science and Technology, The University of Tokyo, Tokyo, Japan
  - Department of Mechano-Informatics, Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan
  - Precursory Research for Embryonic Science and Technology, Japan Science and Technology Agency, Saitama, Japan

26
Noda T, Kanzaki R, Takahashi H. Amplitude and phase-locking adaptation of neural oscillation in the rat auditory cortex in response to tone sequence. Neurosci Res 2013; 79:52-60. [PMID: 24239971 DOI: 10.1016/j.neures.2013.11.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 05/11/2013] [Revised: 11/01/2013] [Accepted: 11/05/2013] [Indexed: 11/29/2022]
Abstract
Sensory adaptation allows stimulus sensitivity to be dynamically modulated according to stimulus statistics and plays pivotal roles in efficient neural computation. Here, it is hypothesized that in the auditory cortex, phase locking of local field potentials (LFPs) to test tones exhibits an adaptation property, i.e., phase-locking adaptation, which is distinct from the amplitude adaptation of oscillatory components. Series of alternating tone sequences were applied in which the inter-tone interval (ITI) and frequency difference (ΔF) between successive tones were varied. Then, adaptation was characterized by the temporal evolution of the band-specific amplitude and phase locking evoked by the test tones. Differences as well as similarities were revealed between amplitude and phase-locking adaptations. First, both amplitude and phase-locking adaptations were enhanced by short ITIs and small ΔFs. Second, the amplitude adaptation was more effective in a higher frequency band, while the phase-locking adaptation was more effective in a lower frequency band. Third, as with the adaptation of multiunit activities (MUAs), the amplitude adaptation occurred mainly within a second, while the phase-locking showed multi-second adaptation specifically in the gamma band for short ITI and small ΔF conditions. Fourth, the amplitude adaptation and phase-locking adaptation were co-modulated in a within-second time scale, while this co-modulation was not observed in a multi-second time scale. These findings suggest that the amplitude and phase-locking adaptations have different mechanisms and functions. The phase-locking adaptation is likely to play more crucial roles in encoding a temporal structure of stimulus than the amplitude adaptation.
Affiliation(s)
- Takahiro Noda
  - Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8904, Japan
- Ryohei Kanzaki
  - Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8904, Japan
- Hirokazu Takahashi
  - Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8904, Japan; Precursory Research for Embryonic Science and Technology, Japan Science and Technology Agency, 4-1-8 Honcho, Kawaguchi, Saitama 332-0012, Japan

27
Abstract
In a complex auditory scene, a "cocktail party" for example, listeners can disentangle multiple competing sequences of sounds. A recent psychophysical study in our laboratory demonstrated a robust spatial component of stream segregation showing ∼8° acuity. Here, we recorded single- and multiple-neuron responses from the primary auditory cortex of anesthetized cats while presenting interleaved sound sequences that human listeners would experience as segregated streams. Sequences of broadband sounds alternated between pairs of locations. Neurons synchronized preferentially to sounds from one or the other location, thereby segregating competing sound sequences. Neurons favoring one source location or the other tended to aggregate within the cortex, suggestive of modular organization. The spatial acuity of stream segregation was as narrow as ∼10°, markedly sharper than the broad spatial tuning for single sources that is well known in the literature. Spatial sensitivity was sharpest among neurons having high characteristic frequencies. Neural stream segregation was predicted well by a parameter-free model that incorporated single-source spatial sensitivity and a measured forward-suppression term. We found that the forward suppression was not due to post discharge adaptation in the cortex and, therefore, must have arisen in the subcortical pathway or at the level of thalamocortical synapses. A linear-classifier analysis of single-neuron responses to rhythmic stimuli like those used in our psychophysical study yielded thresholds overlapping those of human listeners. Overall, the results indicate that the ascending auditory system does the work of segregating auditory streams, bringing them to discrete modules in the cortex for selection by top-down processes.
28
Noise-invariant neurons in the avian auditory cortex: hearing the song in noise. PLoS Comput Biol 2013; 9:e1002942. [PMID: 23505354 PMCID: PMC3591274 DOI: 10.1371/journal.pcbi.1002942] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.1] [Received: 01/04/2012] [Accepted: 01/10/2013] [Indexed: 11/30/2022] [Open access]
Abstract
Given the extraordinary ability of humans and animals to recognize communication signals over a background of noise, describing noise-invariant neural responses is critical not only to pinpoint the brain regions that are mediating our robust perceptions but also to understand the neural computations that are performing these tasks and the underlying circuitry. Although invariant neural responses, such as rotation-invariant face cells, are well described in the visual system, high-level auditory neurons that can represent the same behaviorally relevant signal in a range of listening conditions have yet to be discovered. Here we found neurons in a secondary area of the avian auditory cortex that exhibit noise-invariant responses in the sense that they responded with similar spike patterns to song stimuli presented in silence and over a background of naturalistic noise. By characterizing the neurons' tuning in terms of their responses to modulations in the temporal and spectral envelope of the sound, we then show that noise invariance is partly achieved by selectively responding to long sounds with sharp spectral structure. Finally, to demonstrate that such computations could explain noise invariance, we designed a biologically inspired noise-filtering algorithm that can be used to separate song or speech from noise. This novel noise-filtering method performs as well as other state-of-the-art de-noising algorithms and could be used in clinical or consumer-oriented applications. Our biologically inspired model also shows how high-level noise-invariant responses could be created from neural responses typically found in primary auditory cortex.
Birds and humans excel at the task of detecting important sounds, such as song and speech, in difficult listening environments such as in a large bird colony or in a crowded bar. How our brains achieve such a feat remains a mystery to both neuroscientists and audio engineers. In our research, we found a population of neurons in the brain of songbirds that are able to extract a song signal from a background of noise. We explain how the neurons are able to perform this task and show how a biologically inspired algorithm could outperform the best noise-reduction methods proposed by engineers.
|
29
|
Ponnath A, Hoke KL, Farris HE. Stimulus change detection in phasic auditory units in the frog midbrain: frequency and ear specific adaptation. J Comp Physiol A Neuroethol Sens Neural Behav Physiol 2013; 199:295-313. [PMID: 23344947 DOI: 10.1007/s00359-013-0794-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 09/04/2012] [Revised: 01/05/2013] [Accepted: 01/07/2013] [Indexed: 12/31/2022]
Abstract
Neural adaptation, a reduction in the response to a maintained stimulus, is an important mechanism for detecting stimulus change. Contributing to change detection is the fact that adaptation is often stimulus specific: adaptation to a particular stimulus reduces excitability to a specific subset of stimuli, while the ability to respond to other stimuli is unaffected. Phasic cells (e.g., cells responding to stimulus onset) are good candidates for detecting the most rapid changes in natural auditory scenes, as they exhibit fast and complete adaptation to an initial stimulus presentation. We made recordings of single phasic auditory units in the frog midbrain to determine if adaptation was specific to stimulus frequency and ear of input. In response to an instantaneous frequency step in a tone, 28% of phasic cells exhibited frequency specific adaptation based on a relative frequency change (delta-f=±16%). Frequency specific adaptation was not limited to frequency steps, however, as adaptation was also overcome during continuous frequency modulated stimuli and in response to spectral transients interrupting tones. The results suggest that adaptation is separated for peripheral (e.g., frequency) channels. This was tested directly using dichotic stimuli. In 45% of binaural phasic units, adaptation was ear specific: adaptation to stimulation of one ear did not affect responses to stimulation of the other ear. Thus, adaptation exhibited specificity for stimulus frequency and lateralization at the level of the midbrain. This mechanism could be employed to detect rapid stimulus change within and between sound sources in complex acoustic environments.
Affiliation(s)
- Abhilash Ponnath
- Neuroscience Center, Department of Otorhinolaryngology, Louisiana State University Health Sciences Center, 2020 Gravier St., New Orleans, LA 70112, USA
|
30
|
Knudsen DP, Gentner TQ. Active recognition enhances the representation of behaviorally relevant information in single auditory forebrain neurons. J Neurophysiol 2013; 109:1690-703. [PMID: 23303858 DOI: 10.1152/jn.00461.2012] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 11/22/2022]
Abstract
Sensory systems are dynamic. They must process a wide range of natural signals that facilitate adaptive behaviors in a manner that depends on an organism's constantly changing goals. A full understanding of the sensory physiology that underlies adaptive natural behaviors must therefore account for the activity of sensory systems in light of these behavioral goals. Here we present a novel technique that combines in vivo electrophysiological recording from awake, freely moving songbirds with operant conditioning techniques that allow control over birds' recognition of conspecific song, a widespread natural behavior in songbirds. We show that engaging in a vocal recognition task alters the response properties of neurons in the caudal mesopallium (CM), an avian analog of mammalian auditory cortex, in European starlings. Compared with awake, passive listening, active engagement of subjects in an auditory recognition task results in neurons responding to fewer song stimuli and a decrease in the trial-to-trial variability in their driven firing rates. Mean firing rates also change during active recognition, but not uniformly. Relative to nonengaged listening, active recognition causes increases in the driven firing rates in some neurons, decreases in other neurons, and stimulus-specific changes in other neurons. These changes lead to both an increase in stimulus selectivity and an increase in the information conveyed by the neurons about the animals' behavioral task. This study demonstrates the behavioral dependence of neural responses in the avian auditory forebrain and introduces the starling as a model for real-time monitoring of task-related neural processing of complex auditory objects.
Affiliation(s)
- Daniel P Knudsen
- Neurosciences Graduate Program, University of California San Diego, La Jolla, CA, USA
|
31
|
Large-scale synchronized activity during vocal deviance detection in the zebra finch auditory forebrain. J Neurosci 2012; 32:10594-608. [PMID: 22855809 DOI: 10.1523/jneurosci.6045-11.2012] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Indexed: 11/21/2022]
Abstract
Auditory systems bias responses to sounds that are unexpected on the basis of recent stimulus history, a phenomenon that has been widely studied using sequences of unmodulated tones (mismatch negativity; stimulus-specific adaptation). Such a paradigm, however, does not directly reflect problems that neural systems normally solve for adaptive behavior. We recorded multiunit responses in the caudomedial auditory forebrain of anesthetized zebra finches (Taeniopygia guttata) at 32 sites simultaneously, to contact calls that recur probabilistically at a rate that is used in communication. Neurons in secondary, but not primary, auditory areas respond preferentially to calls when they are unexpected (deviant) compared with the same calls when they are expected (standard). This response bias is predominantly due to sites more often not responding to standard events than to deviant events. When two call stimuli alternate between standard and deviant roles, most sites exhibit a response bias to deviant events of both stimuli. This suggests that biases are not based on a use-dependent decrease in response strength but involve a more complex mechanism that is sensitive to auditory deviance per se. Furthermore, between many secondary sites, responses are tightly synchronized, a phenomenon that is driven by internal neuronal interactions rather than by the timing of stimulus acoustic features. We hypothesize that this deviance-sensitive, internally synchronized network of neurons is involved in the involuntary capturing of attention by unexpected and behaviorally potentially relevant events in natural auditory scenes.
|
32
|
Dolležal LV, Beutelmann R, Klump GM. Stream segregation in the perception of sinusoidally amplitude-modulated tones. PLoS One 2012; 7:e43615. [PMID: 22984436 PMCID: PMC3440405 DOI: 10.1371/journal.pone.0043615] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 01/03/2012] [Accepted: 07/26/2012] [Indexed: 11/25/2022]
Abstract
Amplitude modulation can serve as a cue for segregating streams of sounds from different sources. Here we evaluate stream segregation in humans using ABA- sequences of sinusoidally amplitude modulated (SAM) tones. A and B represent SAM tones with the same carrier frequency (1000, 4000 Hz) and modulation depth (30, 100%). The modulation frequency of the A signals (fmodA) was 30, 100 or 300 Hz, respectively. The modulation frequency of the B signals was up to four octaves higher (Δfmod). Three different ABA- tone patterns varying in tone duration and stimulus onset asynchrony were presented to evaluate the effect of forward suppression. Subjects indicated their 1- or 2-stream percept on a touch screen at the end of each ABA- sequence (presentation time 5 or 15 s). Tone pattern, fmodA, Δfmod, carrier frequency, modulation depth and presentation time significantly affected the percentage of a 2-stream percept. The human psychophysical results are compared to responses of avian forebrain neurons evoked by different ABA- SAM tone conditions [1] that were broadly overlapping those of the present study. The neurons also showed significant effects of tone pattern and Δfmod that were comparable to effects observed in the present psychophysical study. Depending on the carrier frequency, modulation frequency, modulation depth and the width of the auditory filters, SAM tones may provide mainly temporal cues (sidebands fall within the range of the filter), spectral cues (sidebands fall outside the range of the filter) or possibly both. A computational model based on excitation pattern differences was used to predict the 50% threshold of 2-stream responses. In conditions for which the model predicts a considerably larger 50% threshold of 2-stream responses (i.e., larger Δfmod at threshold) than was observed, it is unlikely that spectral cues can provide an explanation of stream segregation by SAM.
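As a rough illustration of the stimuli described in this abstract, the sketch below builds one ABA- triplet of SAM tones. The waveform is the textbook SAM definition; the sample rate, the silent gap equal to one tone duration, and all function names are assumptions made for illustration and are not taken from the study.

```python
import math


def sam_tone(carrier_hz, mod_hz, depth, duration_s, sample_rate=44100):
    """Sinusoidally amplitude-modulated tone as a list of samples.

    depth is the modulation depth as a fraction (1.0 means 100%).
    """
    n = int(round(duration_s * sample_rate))
    return [
        (1.0 + depth * math.sin(2 * math.pi * mod_hz * i / sample_rate))
        * math.sin(2 * math.pi * carrier_hz * i / sample_rate)
        for i in range(n)
    ]


def mod_freq_b(fmod_a_hz, delta_octaves):
    """Modulation frequency of the B tone, delta_octaves above the A tone."""
    return fmod_a_hz * 2.0 ** delta_octaves


def aba_triplet(carrier_hz, fmod_a_hz, delta_octaves, depth, tone_s,
                sample_rate=44100):
    """One A-B-A-silence unit; A and B share carrier frequency and depth."""
    a = sam_tone(carrier_hz, fmod_a_hz, depth, tone_s, sample_rate)
    b = sam_tone(carrier_hz, mod_freq_b(fmod_a_hz, delta_octaves), depth,
                 tone_s, sample_rate)
    # Assumption: the "-" gap lasts exactly one tone duration.
    silence = [0.0] * len(a)
    return a + b + a + silence
```

With these definitions, `mod_freq_b(30, 4)` gives 480 Hz, the B modulation frequency four octaves above a 30 Hz A tone.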
Affiliation(s)
- Lena-Vanessa Dolležal
- Animal Physiology and Behavior Group, Department of Biology and Environmental Sciences, Carl von Ossietzky University Oldenburg, Oldenburg, Germany.
|
33
|
Menardy F, Touiki K, Dutrieux G, Bozon B, Vignal C, Mathevon N, Del Negro C. Social experience affects neuronal responses to male calls in adult female zebra finches. Eur J Neurosci 2012; 35:1322-36. [PMID: 22512260 DOI: 10.1111/j.1460-9568.2012.08047.x] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Indexed: 12/29/2022]
Abstract
Plasticity studies have consistently shown that behavioural relevance can change the neural representation of sounds in the auditory system, but what occurs in the context of natural acoustic communication where significance could be acquired through social interaction remains to be explored. The zebra finch, a highly social songbird species that forms lifelong pair bonds and uses a vocalization, the distance call, to identify its mate, offers an opportunity to address this issue. Here, we recorded spiking activity in females while presenting distance calls that differed in their degree of familiarity: calls produced by the mate, by a familiar male, or by an unfamiliar male. We focused on the caudomedial nidopallium (NCM), a secondary auditory forebrain region. Both the mate's call and the familiar call evoked responses that differed in magnitude from responses to the unfamiliar call. This distinction between responses was seen both in single unit recordings from anesthetized females and in multiunit recordings from awake freely moving females. In contrast, control females that had not heard them previously displayed responses of similar magnitudes to all three calls. In addition, more cells showed highly selective responses in mated than in control females, suggesting that experience-dependent plasticity in call-evoked responses resulted in enhanced discrimination of auditory stimuli. Our results as a whole demonstrate major changes in the representation of natural vocalizations in the NCM within the context of individual recognition. The functional properties of NCM neurons may thus change continuously to adapt to the social environment.
Affiliation(s)
- F Menardy
- CNPS, UMR CNRS 8195, Paris-Sud University, Orsay, France
|
34
|
Fishman YI, Micheyl C, Steinschneider M. Neural mechanisms of rhythmic masking release in monkey primary auditory cortex: implications for models of auditory scene analysis. J Neurophysiol 2012; 107:2366-82. [PMID: 22323627 DOI: 10.1152/jn.01010.2011] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Indexed: 11/22/2022]
Abstract
The ability to detect and track relevant acoustic signals embedded in a background of other sounds is crucial for hearing in complex acoustic environments. This ability is exemplified by a perceptual phenomenon known as "rhythmic masking release" (RMR). To demonstrate RMR, a sequence of tones forming a target rhythm is intermingled with physically identical "Distracter" sounds that perceptually mask the rhythm. The rhythm can be "released from masking" by adding "Flanker" tones in adjacent frequency channels that are synchronous with the Distracters. RMR represents a special case of auditory stream segregation, whereby the target rhythm is perceptually segregated from the background of Distracters when they are accompanied by the synchronous Flankers. The neural basis of RMR is unknown. Previous studies suggest the involvement of primary auditory cortex (A1) in the perceptual organization of sound patterns. Here, we recorded neural responses to RMR sequences in A1 of awake monkeys in order to identify neural correlates and potential mechanisms of RMR. We also tested whether two current models of stream segregation, when applied to these responses, could account for the perceptual organization of RMR sequences. Results suggest a key role for suppression of Distracter-evoked responses by the simultaneous Flankers in the perceptual restoration of the target rhythm in RMR. Furthermore, predictions of stream segregation models paralleled the psychoacoustics of RMR in humans. These findings reinforce the view that preattentive or "primitive" aspects of auditory scene analysis may be explained by relatively basic neural mechanisms at the cortical level.
Affiliation(s)
- Yonatan I Fishman
- Department of Neurology, Albert Einstein College of Medicine, Kennedy Center, 1410 Pelham Parkway, Bronx, NY 10461, USA.
|
35
|
Abstract
Twenty years ago, a new conceptual paradigm known as 'receiver psychology' was introduced to explain the evolution of animal communication systems. This paradigm advanced the idea that psychological processes in the receiver's nervous system influence a signal's detectability, discriminability and memorability, and thereby serve as powerful sources of selection shaping signal design. While advancing our understanding of signal diversity, more recent studies make clear that receiver psychology, as a paradigm, has been structured too narrowly and does not incorporate many of the perceptual and cognitive processes of signal reception that operate between sensory transduction and a receiver's response. Consequently, the past two decades of research on receiver psychology have emphasized considerations of signal evolution but failed to ask key questions about the mechanisms of signal reception and their evolution. The primary aim of this essay is to advocate for a broader receiver psychology paradigm that more explicitly includes a research focus on receivers' psychological landscapes. We review recent experimental studies of hearing and sound communication to illustrate how considerations of several general perceptual and cognitive processes will facilitate future research on animal signalling systems. We also emphasize how a rigorous comparative approach to receiver psychology is critical to explicating the full range of perceptual and cognitive processes involved in receiving and responding to signals.
Affiliation(s)
- Cory T. Miller
- Department of Psychology, University of California, San Diego
- Mark A. Bee
- Department of Ecology, Evolution and Behavior, University of Minnesota, Twin Cities
|
36
|
Haywood NR, Roberts B. Effects of inducer continuity on auditory stream segregation: comparison of physical and perceived continuity in different contexts. J Acoust Soc Am 2011; 130:2917-2927. [PMID: 22087920 DOI: 10.1121/1.3643811] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/31/2023]
Abstract
The factors influencing the stream segregation of discrete tones and the perceived continuity of discrete tones as continuing through an interrupting masker are well understood as separate phenomena. Two experiments tested whether perceived continuity can influence the build-up of stream segregation by manipulating the perception of continuity during an induction sequence and measuring streaming in a subsequent test sequence comprising three triplets of low and high frequency tones (LHL-…). For experiment 1, a 1.2-s standard induction sequence comprising six 100-ms L-tones strongly promoted segregation, whereas a single extended L-inducer (1.1 s plus 100-ms silence) did not. Segregation was similar to that following the single extended inducer when perceived continuity was evoked by inserting noise bursts between the individual tones. Reported segregation increased when the noise level was reduced such that perceived continuity no longer occurred. Experiment 2 presented a 1.3-s continuous inducer created by bridging the 100-ms silence between an extended L-inducer and the first test-sequence tone. This configuration strongly promoted segregation. Segregation was also increased by filling the silence after the extended inducer with noise, such that it was perceived like a bridging inducer. Like physical continuity, perceived continuity can promote or reduce test-sequence streaming, depending on stimulus context.
Affiliation(s)
- Nicholas R Haywood
- Psychology, School of Life and Health Sciences, Aston University, Birmingham B4 7ET, United Kingdom.
|
37
|
Dykstra AR, Halgren E, Thesen T, Carlson CE, Doyle W, Madsen JR, Eskandar EN, Cash SS. Widespread Brain Areas Engaged during a Classical Auditory Streaming Task Revealed by Intracranial EEG. Front Hum Neurosci 2011; 5:74. [PMID: 21886615 PMCID: PMC3154443 DOI: 10.3389/fnhum.2011.00074] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 05/24/2011] [Accepted: 07/19/2011] [Indexed: 11/30/2022]
Abstract
The auditory system must constantly decompose the complex mixture of sound arriving at the ear into perceptually independent streams constituting accurate representations of individual sources in the acoustic environment. How the brain accomplishes this task is not well understood. The present study combined a classic behavioral paradigm with direct cortical recordings from neurosurgical patients with epilepsy in order to further describe the neural correlates of auditory streaming. Participants listened to sequences of pure tones alternating in frequency and indicated whether they heard one or two "streams." The intracranial EEG was simultaneously recorded from sub-dural electrodes placed over temporal, frontal, and parietal cortex. Like healthy subjects, patients heard one stream when the frequency separation between tones was small and two when it was large. Robust evoked-potential correlates of frequency separation were observed over widespread brain areas. Waveform morphology was highly variable across individual electrode sites both within and across gross brain regions. Surprisingly, few evoked-potential correlates of perceptual organization were observed after controlling for physical stimulus differences. The results indicate that the cortical areas engaged during the streaming task are more complex and widespread than has been demonstrated by previous work, and that, by-and-large, correlates of bistability during streaming are probably located on a spatial scale not assessed - or in a brain area not examined - by the present study.
Affiliation(s)
- Andrew R. Dykstra
- Program in Speech and Hearing Bioscience and Technology, Harvard-MIT Division of Health Sciences and Technology, Cambridge, MA, USA
- Cortical Physiology Laboratory, Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Eric Halgren
- Department of Radiology, University of California San Diego, San Diego, CA, USA
- Department of Neurosciences, University of California San Diego, San Diego, CA, USA
- Thomas Thesen
- Comprehensive Epilepsy Center, New York University School of Medicine, New York, NY, USA
- Chad E. Carlson
- Comprehensive Epilepsy Center, New York University School of Medicine, New York, NY, USA
- Werner Doyle
- Comprehensive Epilepsy Center, New York University School of Medicine, New York, NY, USA
- Joseph R. Madsen
- Department of Neurosurgery, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Emad N. Eskandar
- Department of Neurosurgery, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Sydney S. Cash
- Cortical Physiology Laboratory, Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
|
38
|
Nityananda V, Bee MA. Finding your mate at a cocktail party: frequency separation promotes auditory stream segregation of concurrent voices in multi-species frog choruses. PLoS One 2011; 6:e21191. [PMID: 21698268 PMCID: PMC3115990 DOI: 10.1371/journal.pone.0021191] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Received: 03/10/2011] [Accepted: 05/22/2011] [Indexed: 11/18/2022]
Abstract
Vocal communication in crowded social environments is a difficult problem for both humans and nonhuman animals. Yet many important social behaviors require listeners to detect, recognize, and discriminate among signals in a complex acoustic milieu comprising the overlapping signals of multiple individuals, often of multiple species. Humans exploit a relatively small number of acoustic cues to segregate overlapping voices (as well as other mixtures of concurrent sounds, like polyphonic music). By comparison, we know little about how nonhuman animals are adapted to solve similar communication problems. One important cue enabling source segregation in human speech communication is that of frequency separation between concurrent voices: differences in frequency promote perceptual segregation of overlapping voices into separate "auditory streams" that can be followed through time. In this study, we show that frequency separation (ΔF) also enables frogs to segregate concurrent vocalizations, such as those routinely encountered in mixed-species breeding choruses. We presented female gray treefrogs (Hyla chrysoscelis) with a pulsed target signal (simulating an attractive conspecific call) in the presence of a continuous stream of distractor pulses (simulating an overlapping, unattractive heterospecific call). When the ΔF between target and distractor was small (e.g., ≤3 semitones), females exhibited low levels of responsiveness, indicating a failure to recognize the target as an attractive signal when the distractor had a similar frequency. Subjects became increasingly more responsive to the target, as indicated by shorter latencies for phonotaxis, as the ΔF between target and distractor increased (e.g., ΔF = 6-12 semitones). These results support the conclusion that gray treefrogs, like humans, can exploit frequency separation as a perceptual cue to segregate concurrent voices in noisy social environments. The ability of these frogs to segregate concurrent voices based on frequency separation may involve ancient hearing mechanisms for source segregation shared with humans and other vertebrates.
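The ΔF values in this abstract are expressed in semitones. A minimal sketch of the standard conversion between a semitone separation and a frequency ratio follows; the 1000 Hz target frequency and the function names are hypothetical values chosen for illustration, not figures reported in the abstract.

```python
import math


def semitones_between(f1_hz, f2_hz):
    """Frequency separation of f2 relative to f1, in semitones."""
    return 12.0 * math.log2(f2_hz / f1_hz)


def shift_by_semitones(f_hz, semitones):
    """Frequency obtained by shifting f_hz up by the given semitones."""
    return f_hz * 2.0 ** (semitones / 12.0)


target_hz = 1000.0  # hypothetical target frequency, for illustration only
for delta_f in (3, 6, 12):
    # Distractor frequency at each semitone separation above the target.
    print(delta_f, round(shift_by_semitones(target_hz, delta_f), 1))
```

A ΔF of 12 semitones corresponds to one octave, that is, a doubling of frequency, while 3 semitones is a ratio of roughly 1.19.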
Affiliation(s)
- Vivek Nityananda
- Department of Ecology, Evolution, and Behavior, University of Minnesota, Twin Cities, St. Paul, Minnesota, United States of America
- Mark A. Bee
- Department of Ecology, Evolution, and Behavior, University of Minnesota, Twin Cities, St. Paul, Minnesota, United States of America
|
39
|
Objective and subjective psychophysical measures of auditory stream integration and segregation. J Assoc Res Otolaryngol 2010; 11:709-24. [PMID: 20658165 DOI: 10.1007/s10162-010-0227-2] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Received: 11/20/2009] [Accepted: 06/30/2010] [Indexed: 10/19/2022]
Abstract
The perceptual organization of sound sequences into auditory streams involves the integration of sounds into one stream and the segregation of sounds into separate streams. "Objective" psychophysical measures of auditory streaming can be obtained using behavioral tasks where performance is facilitated by segregation and hampered by integration, or vice versa. Traditionally, these two types of tasks have been tested in separate studies involving different listeners, procedures, and stimuli. Here, we tested subjects in two complementary temporal-gap discrimination tasks involving similar stimuli and procedures. One task was designed so that performance in it would be facilitated by perceptual integration; the other, so that performance would be facilitated by perceptual segregation. Thresholds were measured in both tasks under a wide range of conditions produced by varying three stimulus parameters known to influence stream formation: frequency separation, tone-presentation rate, and sequence length. In addition to these performance-based measures, subjective judgments of perceived segregation were collected in the same listeners under corresponding stimulus conditions. The patterns of results obtained in the two temporal-discrimination tasks, and the relationships between thresholds and perceived-segregation judgments, were mostly consistent with the hypothesis that stream segregation helped performance in one task and impaired performance in the other task. The tasks and stimuli described here may prove useful in future behavioral or neurophysiological experiments, which seek to manipulate and measure neural correlates of auditory streaming while minimizing differences between the physical stimuli.
|