1
Kalra L, Altman S, Bee MA. Perceptually salient differences in a species recognition cue do not promote auditory streaming in eastern grey treefrogs (Hyla versicolor). J Comp Physiol A Neuroethol Sens Neural Behav Physiol 2024. PMID: 38733407. DOI: 10.1007/s00359-024-01702-9.
Abstract
Auditory streaming underlies a receiver's ability to organize complex mixtures of auditory input into distinct perceptual "streams" that represent different sound sources in the environment. During auditory streaming, sounds produced by the same source are integrated through time into a single, coherent auditory stream that is perceptually segregated from other concurrent sounds. Based on human psychoacoustic studies, one hypothesis regarding auditory streaming is that any sufficiently salient perceptual difference may lead to stream segregation. Here, we used the eastern grey treefrog, Hyla versicolor, to test this hypothesis in the context of vocal communication in a non-human animal. In this system, females choose their mate based on perceiving species-specific features of a male's pulsatile advertisement calls in social environments (choruses) characterized by mixtures of overlapping vocalizations. We employed an experimental paradigm from human psychoacoustics to design interleaved pulsatile sequences (ABAB…) that mimicked key features of the species' advertisement call, and in which alternating pulses differed in pulse rise time, which is a robust species recognition cue in eastern grey treefrogs. Using phonotaxis assays, we found no evidence that perceptually salient differences in pulse rise time promoted the segregation of interleaved pulse sequences into distinct auditory streams. These results do not support the hypothesis that any perceptually salient acoustic difference can be exploited as a cue for stream segregation in all species. We discuss these findings in the context of cues used for species recognition and auditory streaming.
Affiliation(s)
- Lata Kalra
- Department of Ecology, Evolution, and Behavior, University of Minnesota, Saint Paul, MN, 55108, USA.
- Shoshana Altman
- Department of Ecology, Evolution, and Behavior, University of Minnesota, Saint Paul, MN, 55108, USA
- Mark A Bee
- Department of Ecology, Evolution, and Behavior, University of Minnesota, Saint Paul, MN, 55108, USA
2
Banno T, Shirley H, Fishman YI, Cohen YE. Changes in neural readout of response magnitude during auditory streaming do not correlate with behavioral choice in the auditory cortex. Cell Rep 2023; 42:113493. PMID: 38039133. PMCID: PMC10784988. DOI: 10.1016/j.celrep.2023.113493.
Abstract
A fundamental goal of the auditory system is to group stimuli from the auditory environment into a perceptual unit (i.e., "stream") or segregate the stimuli into multiple different streams. Although previous studies have clarified the psychophysical and neural mechanisms that may underlie this ability, the relationship between these mechanisms remains elusive. Here, we recorded multiunit activity (MUA) from the auditory cortex of monkeys while they participated in an auditory-streaming task consisting of interleaved low- and high-frequency tone bursts. As the streaming stimulus unfolded over time, MUA amplitude habituated; the magnitude of this habituation was correlated with the frequency difference between the tone bursts. An ideal-observer model could classify these time- and frequency-dependent changes into reports of "one stream" or "two streams" in a manner consistent with the behavioral literature. However, because classification was not modulated by the monkeys' behavioral choices, this MUA habituation may not directly reflect perceptual reports.
Affiliation(s)
- Taku Banno
- Department of Otorhinolaryngology - Head and Neck Surgery, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA
- Harry Shirley
- Department of Otorhinolaryngology - Head and Neck Surgery, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA
- Yonatan I Fishman
- Departments of Neurology and Neuroscience, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Yale E Cohen
- Department of Otorhinolaryngology - Head and Neck Surgery, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA; Department of Neuroscience, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA.
3
Harley HE, Fellner W, Frances C, Thomas A, Losch B, Newton K, Feuerbach D. Information-seeking across auditory scenes by an echolocating dolphin. Anim Cogn 2022; 25:1109-1131. PMID: 36018473. DOI: 10.1007/s10071-022-01679-5.
Abstract
Dolphins gain information through echolocation, a publicly accessible sensory system in which dolphins produce clicks and process returning echoes, thereby both investigating and contributing to auditory scenes. How their knowledge of these scenes contributes to their echoic information-seeking is unclear. Here, we investigate their top-down cognitive processes in an echoic matching-to-sample task in which targets and auditory scenes vary in their decipherability and shift from being completely unfamiliar to familiar. A blindfolded adult male dolphin investigated a target sample positioned in front of a hydrophone to allow recording of clicks, a measure of information-seeking and effort; the dolphin received fish for choosing an object identical to the sample from 3 alternatives. We presented 20 three-object sets, unfamiliar in the first five 18-trial sessions with each set. Performance accuracy and click counts varied widely across sets. Click counts of the four lowest-performance-accuracy/low-discriminability sets (X = 41%) and the four highest-performance-accuracy/high-discriminability sets (X = 91%) were similar at the start of the first sessions and then decreased for both kinds of scenes, although the decrease was substantially greater for low-discriminability sets. In four challenging-but-doable sets, the number of clicks remained relatively steady across the 5 sessions. Reduced echoic effort with low-discriminability sets was not due to overall motivation: the differential relationship between click number and object-set discriminability was maintained when difficult and easy trials were interleaved and when objects from originally difficult scenes were grouped with more discriminable objects. These data suggest that dolphins calibrate their echoic information-seeking effort based on their knowledge and expectations of auditory scenes.
Affiliation(s)
- Heidi E Harley
- Division of Social Sciences, New College of Florida, 5800 Bay Shore Road, Sarasota, FL, 34243, USA; The Seas, Epcot®, Walt Disney World® Resorts, Lake Buena Vista, FL, USA.
- Wendi Fellner
- The Seas, Epcot®, Walt Disney World® Resorts, Lake Buena Vista, FL, USA
- Candice Frances
- Division of Social Sciences, New College of Florida, 5800 Bay Shore Road, Sarasota, FL, 34243, USA; Basque Center on Cognition, Brain and Language, Donostia, Spain
- Amber Thomas
- Division of Social Sciences, New College of Florida, 5800 Bay Shore Road, Sarasota, FL, 34243, USA; The Seas, Epcot®, Walt Disney World® Resorts, Lake Buena Vista, FL, USA
- Barbara Losch
- The Seas, Epcot®, Walt Disney World® Resorts, Lake Buena Vista, FL, USA
- Katherine Newton
- Division of Social Sciences, New College of Florida, 5800 Bay Shore Road, Sarasota, FL, 34243, USA; Department of Fisheries and Wildlife, Oregon State University, Corvallis, USA
- David Feuerbach
- The Seas, Epcot®, Walt Disney World® Resorts, Lake Buena Vista, FL, USA
4
Johnson JS, Niwa M, O'Connor KN, Sutter ML. Amplitude modulation encoding in the auditory cortex: comparisons between the primary and middle lateral belt regions. J Neurophysiol 2020; 124:1706-1726. PMID: 33026929. DOI: 10.1152/jn.00171.2020.
Abstract
In macaques, the middle lateral auditory cortex (ML) is a belt region adjacent to the primary auditory cortex (A1) and believed to be at a hierarchically higher level. Although ML single-unit responses have been studied for several auditory stimuli, the ability of ML cells to encode amplitude modulation (AM)-an ability that has been widely studied in A1-has not yet been characterized. Here, we compared the responses of A1 and ML neurons to amplitude-modulated (AM) noise in awake macaques. Although several of the basic properties of A1 and ML responses to AM noise were similar, we found several key differences. ML neurons were less likely to phase lock, did not phase lock as strongly, and were more likely to respond in a nonsynchronized fashion than A1 cells, consistent with a temporal-to-rate transformation as information ascends the auditory hierarchy. ML neurons tended to have lower temporally (phase-locking) based best modulation frequencies than A1 neurons. Neurons that decreased their firing rate in response to AM noise relative to their firing rate in response to unmodulated noise became more common at the level of ML than they were in A1. In both A1 and ML, we found a prevalent class of neurons that usually have enhanced rate responses relative to responses to the unmodulated noise at lower modulation frequencies and suppressed rate responses relative to responses to the unmodulated noise at middle modulation frequencies. NEW & NOTEWORTHY: ML neurons synchronized less than A1 neurons, consistent with a hierarchical temporal-to-rate transformation. Both A1 and ML had a class of modulation transfer functions previously unreported in the cortex with a low-modulation-frequency (MF) peak, a middle-MF trough, and responses similar to unmodulated noise responses at high MFs. The results support a hierarchical shift toward a two-pool opponent code, where subtraction of neural activity between two populations of oppositely tuned neurons encodes AM.
Affiliation(s)
- Jeffrey S Johnson
- Center for Neuroscience, University of California, Davis, California
- Mamiko Niwa
- Center for Neuroscience, University of California, Davis, California
- Kevin N O'Connor
- Center for Neuroscience, University of California, Davis, California; Department of Neurobiology, Physiology and Behavior, University of California, Davis, California
- Mitchell L Sutter
- Center for Neuroscience, University of California, Davis, California; Department of Neurobiology, Physiology and Behavior, University of California, Davis, California
5
Chakrabarty D, Elhilali M. A Gestalt inference model for auditory scene segregation. PLoS Comput Biol 2019; 15:e1006711. PMID: 30668568. PMCID: PMC6358108. DOI: 10.1371/journal.pcbi.1006711.
Abstract
Our current understanding of how the brain segregates auditory scenes into meaningful objects is in line with a Gestaltism framework. These Gestalt principles suggest a theory of how different attributes of the soundscape are extracted then bound together into separate groups that reflect different objects or streams present in the scene. These cues are thought to reflect the underlying statistical structure of natural sounds in a similar way that statistics of natural images are closely linked to the principles that guide figure-ground segregation and object segmentation in vision. In the present study, we leverage inference in stochastic neural networks to learn emergent grouping cues directly from natural soundscapes including speech, music and sounds in nature. The model learns a hierarchy of local and global spectro-temporal attributes reminiscent of simultaneous and sequential Gestalt cues that underlie the organization of auditory scenes. These mappings operate at multiple time scales to analyze an incoming complex scene and are then fused using a Hebbian network that binds together coherent features into perceptually-segregated auditory objects. The proposed architecture successfully emulates a wide range of well established auditory scene segregation phenomena and quantifies the complementary role of segregation and binding cues in driving auditory scene segregation.
Affiliation(s)
- Debmalya Chakrabarty
- Laboratory for Computational Audio Processing, Center for Speech and Language Processing, Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
- Mounya Elhilali
- Laboratory for Computational Audio Processing, Center for Speech and Language Processing, Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
6
Cai H, Screven LA, Dent ML. Behavioral measurements of auditory streaming and build-up by budgerigars (Melopsittacus undulatus). J Acoust Soc Am 2018; 144:1508. PMID: 30424658. DOI: 10.1121/1.5054297.
Abstract
The perception of the build-up of auditory streaming has been widely investigated in humans, while it is unknown whether animals experience a similar perception when hearing high (H) and low (L) tonal pattern sequences. The paradigm previously used in European starlings (Sturnus vulgaris) was adopted in two experiments to address the build-up of auditory streaming in budgerigars (Melopsittacus undulatus). In experiment 1, different numbers of repetitions of low-high-low triplets were used in five conditions to study the build-up process. In experiment 2, 5 and 15 repetitions of high-low-high triplets were used to investigate the effects of repetition rate, frequency separation, and frequency range of the two tones on the birds' streaming perception. Similar to humans, budgerigars subjectively experienced the build-up process in auditory streaming; faster repetition rates and larger frequency separations enhanced the streaming perception, and these results were consistent across the two frequency ranges. Response latency analysis indicated that the budgerigars needed a longer amount of time to respond to stimuli that elicited a salient streaming perception. These results indicate, for the first time using a behavioral paradigm, that budgerigars experience a build-up of auditory streaming in a manner similar to humans.
Affiliation(s)
- Huaizhen Cai
- Department of Psychology, University at Buffalo, The State University of New York, Buffalo, New York 14260, USA
- Laurel A Screven
- Department of Psychology, University at Buffalo, The State University of New York, Buffalo, New York 14260, USA
- Micheal L Dent
- Department of Psychology, University at Buffalo, The State University of New York, Buffalo, New York 14260, USA
7
Lai J, Bartlett EL. Masking Differentially Affects Envelope-following Responses in Young and Aged Animals. Neuroscience 2018; 386:150-165. PMID: 29953908. PMCID: PMC6076866. DOI: 10.1016/j.neuroscience.2018.06.004.
Abstract
Age-related hearing decline typically includes threshold shifts as well as reduced wave I auditory brainstem response (ABR) amplitudes due to cochlear synaptopathy/neuropathy, which may compromise precise coding of suprathreshold speech envelopes. This is supported by findings with older listeners, who have difficulties in envelope and speech processing, especially in noise. However, separating the effects of threshold elevation, synaptopathy, and degradation by noise on physiological representations may be difficult. In the present study, the effects of notched, low- and high-pass noise on envelope-following responses (EFRs) in aging were compared when sound levels (aged: 85-dB SPL; young: 60- to 80-dB SPL) were matched between groups peripherally, by matching wave I ABR amplitudes, or centrally by matching EFR amplitudes. Low-level notched noise reduced EFRs to sinusoidally amplitude-modulated (SAM) tones in young animals for notch widths up to 2 octaves. High-pass noise above the carrier frequency reduced EFRs. Young animals showed EFR reductions at lower noise levels. Low-pass noise did not reduce EFRs in either young or aged animals. High-pass noise may affect EFR amplitudes in young animals more than aged by reducing the contributions of high-frequency-sensitive inputs. EFRs to SAM tones in modulated noise (NAM) suggest that neurons of young animals can synchronize to NAM at lower sound levels and maintain dual AM representations better than older animals. The overall results show that EFR amplitudes are strongly influenced by aging and the presence of a competing sound that likely reduces or shifts the pool of responsive neurons.
Affiliation(s)
- Jesyin Lai
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA; Oregon Hearing Research Center, Oregon Health and Science University, Portland, OR 97239, USA
- Edward L Bartlett
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA; Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN 47907, USA.
8
Noda T, Takahashi H. Behavioral evaluation of auditory stream segregation in rats. Neurosci Res 2018; 141:52-62. PMID: 29580889. DOI: 10.1016/j.neures.2018.03.007.
Abstract
Perceptual organization of sound sequences into separate sound sources or streams is called auditory stream segregation. Neural substrates for this process in both the spectral and temporal domains remain to be elucidated. Despite abundant knowledge about their auditory physiology, behavioral evidence for auditory streaming in rodents is still limited. We provided behavioral evidence for auditory streaming in the go/no-go discrimination task, but not in the two-alternative choice task. In the go/no-go discrimination phase, rats were able to discriminate different rhythms corresponding to segregated or integrated tone sequences in both short inter-tone interval (ITI) and long ITI conditions. Nevertheless, performance was poorer in the long ITI group. In probe testing, which assessed the ability to discriminate one of the segregated tone sequences from ABA- tone sequences, the detection rate increased with the difference in frequency (ΔF) for short (100 ms), but not long (200 ms) ITIs. Our results indicate that auditory streaming in rats on both the spectral and temporal features in the ABA- tone paradigm is qualitatively analogous to that observed in human psychophysics studies. This suggests that rodents are a valuable model for investigating the neural substrates of auditory streaming.
Affiliation(s)
- Takahiro Noda
- Research Center for Advanced Science and Technology, The University of Tokyo, Tokyo, Japan
- Hirokazu Takahashi
- Research Center for Advanced Science and Technology, The University of Tokyo, Tokyo, Japan.
9
A Crucial Test of the Population Separation Model of Auditory Stream Segregation in Macaque Primary Auditory Cortex. J Neurosci 2017; 37:10645-10655. PMID: 28954867. DOI: 10.1523/jneurosci.0792-17.2017.
Abstract
An important aspect of auditory scene analysis is auditory stream segregation-the organization of sound sequences into perceptual streams reflecting different sound sources in the environment. Several models have been proposed to account for stream segregation. According to the "population separation" (PS) model, alternating ABAB tone sequences are perceived as a single stream or as two separate streams when "A" and "B" tones activate the same or distinct frequency-tuned neuronal populations in primary auditory cortex (A1), respectively. A crucial test of the PS model is whether it can account for the observation that A and B tones are generally perceived as a single stream when presented synchronously, rather than in an alternating pattern, even if they are widely separated in frequency. Here, we tested the PS model by recording neural responses to alternating (ALT) and synchronous (SYNC) tone sequences in A1 of male macaques. Consistent with predictions of the PS model, a greater effective tonotopic separation of A and B tone responses was observed under ALT than under SYNC conditions, thus paralleling the perceptual organization of the sequences. While other models of stream segregation, such as temporal coherence, are not excluded by the present findings, we conclude that PS is sufficient to account for the perceptual organization of ALT and SYNC sequences and thus remains a viable model of auditory stream segregation. SIGNIFICANCE STATEMENT: According to the population separation (PS) model of auditory stream segregation, sounds that activate the same or separate neural populations in primary auditory cortex (A1) are perceived as one or two streams, respectively. It is unclear, however, whether the PS model can account for the perception of sounds as a single stream when they are presented synchronously. Here, we tested the PS model by recording neural responses to alternating (ALT) and synchronous (SYNC) tone sequences in macaque A1. A greater effective separation of tonotopic activity patterns was observed under ALT than under SYNC conditions, thus paralleling the perceptual organization of the sequences. Based on these findings, we conclude that PS remains a plausible neurophysiological model of auditory stream segregation.
10
Comparison of perceptual properties of auditory streaming between spectral and amplitude modulation domains. Hear Res 2017; 350:244-250. PMID: 28323019. DOI: 10.1016/j.heares.2017.03.006.
Abstract
The two-tone sequence (ABA_), which comprises two different sounds (A and B) and a silent gap, has been used to investigate how the auditory system organizes sequential sounds depending on various stimulus conditions or brain states. Auditory streaming can be evoked by differences not only in the tone frequency ("spectral cue": ΔF_TONE, TONE condition) but also in the amplitude modulation rate ("AM cue": ΔF_AM, AM condition). The aim of the present study was to explore the relationship between the perceptual properties of auditory streaming for the TONE and AM conditions. A sequence with a long duration (400 repetitions of ABA_) was used to examine the property of the bistability of streaming. The ratio of feature differences that evoked an equivalent probability of the segregated percept was close to the ratio of the Q-values of the auditory and modulation filters, consistent with a "channeling theory" of auditory streaming. On the other hand, for values of ΔF_AM and ΔF_TONE evoking equal probabilities of the segregated percept, the number of perceptual switches was larger for the TONE condition than for the AM condition, indicating that the mechanism(s) that determine the bistability of auditory streaming are different between or sensitive to the two domains. Nevertheless, the number of switches for individual listeners was positively correlated between the spectral and AM domains. The results suggest a possibility that the neural substrates for spectral and AM processes share a common switching mechanism but differ in location and/or in the properties of neural activity or the strength of internal noise at each level.
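This abstract, like several others in this listing, relies on the ABA_ two-tone paradigm. As a purely illustrative aid, the sketch below generates such a sequence in Python; it is not code from any of the cited studies, and the tone duration, ramp length, semitone separation, and sampling rate are placeholder values rather than parameters taken from the papers.

```python
import numpy as np

def pure_tone(freq_hz, dur_s, fs=44100, ramp_s=0.005):
    """Pure tone with raised-cosine onset/offset ramps to avoid spectral splatter."""
    t = np.arange(int(dur_s * fs)) / fs
    tone = np.sin(2 * np.pi * freq_hz * t)
    n_ramp = int(ramp_s * fs)
    ramp = 0.5 * (1 - np.cos(np.pi * np.arange(n_ramp) / n_ramp))
    tone[:n_ramp] *= ramp
    tone[-n_ramp:] *= ramp[::-1]
    return tone

def aba_sequence(f_a=1000.0, delta_semitones=6.0, tone_dur=0.1, n_triplets=20, fs=44100):
    """ABA_ sequence: A and B pure tones separated by delta_semitones,
    arranged as A-B-A followed by a silent gap of one tone duration."""
    f_b = f_a * 2.0 ** (delta_semitones / 12.0)  # frequency separation in semitones
    a = pure_tone(f_a, tone_dur, fs)
    b = pure_tone(f_b, tone_dur, fs)
    gap = np.zeros(int(tone_dur * fs))
    triplet = np.concatenate([a, b, a, gap])
    return np.tile(triplet, n_triplets)
```

Varying delta_semitones and tone_dur reproduces, in sketch form, the frequency-separation and tempo manipulations that drive the integrated versus segregated percepts discussed in these studies.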
11
Itatani N, Klump GM. Animal models for auditory streaming. Philos Trans R Soc Lond B Biol Sci 2017; 372:20160112. PMID: 28044022. DOI: 10.1098/rstb.2016.0112.
Abstract
Sounds in the natural environment need to be assigned to acoustic sources to evaluate complex auditory scenes. Separating sources will affect the analysis of auditory features of sounds. As the benefits of assigning sounds to specific sources accrue to all species communicating acoustically, the ability for auditory scene analysis is widespread among different animals. Animal studies allow for a deeper insight into the neuronal mechanisms underlying auditory scene analysis. Here, we will review the paradigms applied in the study of auditory scene analysis and streaming of sequential sounds in animal models. We will compare the psychophysical results from the animal studies to the evidence obtained in human psychophysics of auditory streaming, i.e. in a task commonly used for measuring the capability for auditory scene analysis. Furthermore, the neuronal correlates of auditory streaming will be reviewed in different animal models and the observations of the neurons' response measures will be related to perception. The across-species comparison will reveal whether similar demands in the analysis of acoustic scenes have resulted in similar perceptual and neuronal processing mechanisms in the wide range of species being capable of auditory scene analysis. This article is part of the themed issue 'Auditory and visual scene analysis'.
Affiliation(s)
- Naoya Itatani
- Cluster of Excellence Hearing4all, Animal Physiology and Behaviour Group, Department of Neuroscience, School of Medicine and Health Sciences, Carl von Ossietzky University Oldenburg, 26111 Oldenburg, Germany
- Georg M Klump
- Cluster of Excellence Hearing4all, Animal Physiology and Behaviour Group, Department of Neuroscience, School of Medicine and Health Sciences, Carl von Ossietzky University Oldenburg, 26111 Oldenburg, Germany
12
Eggermont JJ. Animal models of auditory temporal processing. Int J Psychophysiol 2015; 95:202-215. DOI: 10.1016/j.ijpsycho.2014.03.011.
13
Kettler L, Wagner H. Influence of double stimulation on sound-localization behavior in barn owls. J Comp Physiol A Neuroethol Sens Neural Behav Physiol 2014; 200:1033-1044. PMID: 25352361. DOI: 10.1007/s00359-014-0953-8.
Abstract
Barn owls do not immediately approach a source after they hear a sound, but wait for a second sound before they strike. This represents a gain in striking behavior by avoiding responses to random incidents. However, the first stimulus is also expected to change the threshold for perceiving the subsequent second sound, thus possibly introducing some costs. We mimicked this situation in a behavioral double-stimulus paradigm utilizing saccadic head turns of owls. The first stimulus served as an adapter, was presented in frontal space, and did not elicit a head turn. The second stimulus, emitted from a peripheral source, elicited the head turn. The time interval between both stimuli was varied. Data obtained with double stimulation were compared with data collected with a single stimulus from the same positions as the second stimulus in the double-stimulus paradigm. Sound-localization performance was quantified by the response latency, accuracy, and precision of the head turns. Response latency was increased with double stimuli, while accuracy and precision were decreased. The effect depended on the inter-stimulus interval. These results suggest that waiting for a second stimulus may indeed impose costs on sound localization by adaptation and this reduces the gain obtained by waiting for a second stimulus.
Affiliation(s)
- Lutz Kettler
- Department of Zoology and Animal Physiology, Aachen University, Worringerweg 3, 52074, Aachen, Germany
14
Neural correlates of auditory streaming in an objective behavioral task. Proc Natl Acad Sci U S A 2014; 111:10738-10743. PMID: 25002519. DOI: 10.1073/pnas.1321487111.
Abstract
Segregating streams of sounds from sources in complex acoustic scenes is crucial for perception in real world situations. We analyzed an objective psychophysical measure of stream segregation obtained while simultaneously recording from forebrain neurons in European starlings to investigate neural correlates of segregating a stream of A tones from a stream of B tones presented at one-half the rate. The objective measure, sensitivity for time shift detection of the B tone, was higher when the A and B tones were of the same frequency (one stream) compared with when there was a 6- or 12-semitone difference between them (two streams). The sensitivity for representing time shifts in spiking patterns was correlated with the behavioral sensitivity. The spiking patterns reflected the stimulus characteristics but not the behavioral response, indicating that the birds' primary cortical field represents the segregated streams, but not the decision process.
15
Dolležal LV, Brechmann A, Klump GM, Deike S. Evaluating auditory stream segregation of SAM tone sequences by subjective and objective psychoacoustical tasks, and brain activity. Front Neurosci 2014; 8:119. PMID: 24936170. PMCID: PMC4047832. DOI: 10.3389/fnins.2014.00119.
Abstract
Auditory stream segregation refers to a segregated percept of signal streams with different acoustic features. Different approaches have been pursued in studies of stream segregation. In psychoacoustics, stream segregation has mostly been investigated with a subjective task asking the subjects to report their percept. Few studies have applied an objective task in which stream segregation is evaluated indirectly by determining thresholds for a percept that depends on whether auditory streams are segregated or not. Furthermore, both perceptual measures and physiological measures of brain activity have been employed but only little is known about their relation. How the results from different tasks and measures are related is evaluated in the present study using examples relying on the ABA- stimulation paradigm that apply the same stimuli. We presented A and B signals that were sinusoidally amplitude modulated (SAM) tones providing purely temporal, spectral or both types of cues to evaluate perceptual stream segregation and its physiological correlate. Which types of cues are most prominent was determined by the choice of carrier and modulation frequencies (fmod) of the signals. In the subjective task subjects reported their percept and in the objective task we measured their sensitivity for detecting time-shifts of B signals in an ABA- sequence. As a further measure of processes underlying stream segregation we employed functional magnetic resonance imaging (fMRI). SAM tone parameters were chosen to evoke an integrated (1-stream), a segregated (2-stream), or an ambiguous percept by adjusting the fmod difference between A and B tones (Δfmod). The results of both psychoacoustical tasks are significantly correlated. BOLD responses in fMRI depend on Δfmod between A and B SAM tones. The effect of Δfmod, however, differs between auditory cortex and frontal regions suggesting differences in representation related to the degree of perceptual ambiguity of the sequences.
Affiliation(s)
- Lena-Vanessa Dolležal
- Animal Physiology and Behavior Group, Department for Neuroscience, School for Medicine and Health Sciences, Center of Excellence "Hearing4all," Carl von Ossietzky University Oldenburg, Oldenburg, Germany
- André Brechmann
- Special Lab Non-invasive Brain Imaging, Leibniz Institute for Neurobiology, Magdeburg, Germany
- Georg M Klump
- Animal Physiology and Behavior Group, Department for Neuroscience, School for Medicine and Health Sciences, Center of Excellence "Hearing4all," Carl von Ossietzky University Oldenburg, Oldenburg, Germany
- Susann Deike
- Special Lab Non-invasive Brain Imaging, Leibniz Institute for Neurobiology, Magdeburg, Germany
16
Alain C, Zendel BR, Hutka S, Bidelman GM. Turning down the noise: The benefit of musical training on the aging auditory brain. Hear Res 2014. DOI: 10.1016/j.heares.2013.06.008.
17
Noda T, Kanzaki R, Takahashi H. Stimulus phase locking of cortical oscillation for auditory stream segregation in rats. PLoS One 2013; 8:e83544. PMID: 24376715. PMCID: PMC3869811. DOI: 10.1371/journal.pone.0083544.
Abstract
The phase of cortical oscillations contains rich information and is valuable for encoding sound stimuli. Here we hypothesized that oscillatory phase modulation, instead of amplitude modulation, is a neural correlate of auditory streaming. Our behavioral evaluation provided compelling evidence for the first time that rats are able to organize auditory streams. Local field potentials (LFPs) were investigated in cortical layer IV or deeper in the primary auditory cortex of anesthetized rats. In response to ABA- sequences with different inter-tone intervals and frequency differences, neurometric functions were characterized with phase locking as well as the band-specific amplitude evoked by test tones. Our results demonstrated that under large frequency differences and short inter-tone intervals, the neurometric function based on stimulus phase locking in higher frequency bands, particularly the gamma band, could better describe van Noorden's perceptual boundary than the LFP amplitude. Furthermore, the gamma-band neurometric function showed a build-up-like effect within around 3 seconds from sequence onset. These findings suggest that phase locking and amplitude have different roles in neural computation, and support our hypothesis that temporal modulation of cortical oscillations should be considered to be neurophysiological mechanisms of auditory streaming, in addition to forward suppression, tonotopic separation, and multi-second adaptation.
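The neurometric measure referred to here is stimulus phase locking of the LFP. The authors' own analysis pipeline is not reproduced in this listing; as a point of reference only, the sketch below implements the textbook phase-locking value across trials, and it assumes the band-pass filtering into the band of interest (e.g., gamma) has already been done.

```python
import numpy as np
from scipy.signal import hilbert

def phase_locking_value(trials):
    """Phase-locking value (PLV) across trials.
    trials: array of shape (n_trials, n_samples) holding band-pass-filtered
    LFP segments aligned to stimulus onset. Returns one PLV per sample:
    0 means random phase across trials, 1 means identical phase on every trial."""
    phases = np.angle(hilbert(trials, axis=1))   # instantaneous phase per trial
    return np.abs(np.mean(np.exp(1j * phases), axis=0))
```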
Affiliation(s)
- Takahiro Noda
- Research Center for Advanced Science and Technology, The University of Tokyo, Tokyo, Japan
- Ryohei Kanzaki
- Research Center for Advanced Science and Technology, The University of Tokyo, Tokyo, Japan
- Department of Mechano-Informatics, Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan
- Hirokazu Takahashi
- Research Center for Advanced Science and Technology, The University of Tokyo, Tokyo, Japan
- Department of Mechano-Informatics, Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan
- Precursory Research for Embryonic Science and Technology, Japan Science and Technology Agency, Saitama, Japan
18
Alain C, Zendel BR, Hutka S, Bidelman GM. Turning down the noise: the benefit of musical training on the aging auditory brain. Hear Res 2013; 308:162-173. PMID: 23831039. DOI: 10.1016/j.heares.2013.06.008.
Abstract
Age-related decline in hearing abilities is a ubiquitous part of aging, and commonly impacts speech understanding, especially when there are competing sound sources. While such age effects are partially due to changes within the cochlea, difficulties typically exist beyond measurable hearing loss, suggesting that central brain processes, as opposed to simple peripheral mechanisms (e.g., hearing sensitivity), play a critical role in governing hearing abilities late into life. Current training regimens aimed to improve central auditory processing abilities have experienced limited success in promoting listening benefits. Interestingly, recent studies suggest that in young adults, musical training positively modifies neural mechanisms, providing robust, long-lasting improvements to hearing abilities as well as to non-auditory tasks that engage cognitive control. These results offer the encouraging possibility that musical training might be used to counteract age-related changes in auditory cognition commonly observed in older adults. Here, we reviewed studies that have examined the effects of age and musical experience on auditory cognition with an emphasis on auditory scene analysis. We infer that musical training may offer potential benefits to complex listening and might be utilized as a means to delay or even attenuate declines in auditory perception and cognition that often emerge later in life.
Affiliation(s)
- Claude Alain
- Rotman Research Institute, Baycrest Centre for Geriatric Care, Canada; Department of Psychology, University of Toronto, Canada.
- Benjamin Rich Zendel
- International Laboratory for Brain, Music and Sound Research (BRAMS), Département de Psychologie, Université de Montréal, Québec, Canada; Centre de Recherche, Institut Universitaire de Gériatrie de Montréal, Québec, Canada
- Stefanie Hutka
- Rotman Research Institute, Baycrest Centre for Geriatric Care, Canada; Department of Psychology, University of Toronto, Canada
- Gavin M Bidelman
- Institute for Intelligent Systems & School of Communication Sciences and Disorders, University of Memphis, USA
19
Mill RW, Bőhm TM, Bendixen A, Winkler I, Denham SL. Modelling the emergence and dynamics of perceptual organisation in auditory streaming. PLoS Comput Biol 2013; 9:e1002925. PMID: 23516340. PMCID: PMC3597549. DOI: 10.1371/journal.pcbi.1002925.
Abstract
Many sound sources can only be recognised from the pattern of sounds they emit, and not from the individual sound events that make up their emission sequences. Auditory scene analysis addresses the difficult task of interpreting the sound world in terms of an unknown number of discrete sound sources (causes) with possibly overlapping signals, and therefore of associating each event with the appropriate source. There are potentially many different ways in which incoming events can be assigned to different causes, which means that the auditory system has to choose between them. This problem has been studied for many years using the auditory streaming paradigm, and recently it has become apparent that instead of making one fixed perceptual decision, given sufficient time, auditory perception switches back and forth between the alternatives—a phenomenon known as perceptual bi- or multi-stability. We propose a new model of auditory scene analysis at the core of which is a process that seeks to discover predictable patterns in the ongoing sound sequence. Representations of predictable fragments are created on the fly, and are maintained, strengthened or weakened on the basis of their predictive success, and conflict with other representations. Auditory perceptual organisation emerges spontaneously from the nature of the competition between these representations. We present detailed comparisons between the model simulations and data from an auditory streaming experiment, and show that the model accounts for many important findings, including: the emergence of, and switching between, alternative organisations; the influence of stimulus parameters on perceptual dominance, switching rate and perceptual phase durations; and the build-up of auditory streaming. The principal contribution of the model is to show that a two-stage process of pattern discovery and competition between incompatible patterns can account for both the contents (perceptual organisations) and the dynamics of human perception in auditory streaming.

The sound waves produced by objects in the environment mix together before reaching the ears. Before we can make sense of an auditory scene, our brains must solve the puzzle of how to disassemble the sound waveform into groupings that correspond to the original source signals. How is this feat accomplished? We propose that the auditory system continually scans the structure of incoming signals in search of clues to indicate which pieces belong together. For instance, sound events may belong together if they have similar features, or form part of a clear temporal pattern. However this process is complicated by lack of knowledge of future events and the many possible ways in which even a simple sound sequence can be decomposed. The biological solution is multistability: one possible interpretation of a sound is perceived initially, which then gives way to another interpretation, and so on. We propose a model of auditory multistability, in which fragmental descriptions of the signal compete and cooperate to explain the sound scene. We demonstrate, using simplified experimental stimuli, that the model can account for both the contents (perceptual organisations) and the dynamics of human perception in auditory streaming.
Affiliation(s)
- Robert W. Mill
- MRC Institute of Hearing Research, Nottingham, United Kingdom
- Tamás M. Bőhm
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, MTA, Budapest, Hungary
- Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary
- István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, MTA, Budapest, Hungary
- Institute for Psychology, University of Szeged, Szeged, Hungary
- Susan L. Denham
- Cognition Institute and School of Psychology, University of Plymouth, Plymouth, United Kingdom
20
Knudsen DP, Gentner TQ. Active recognition enhances the representation of behaviorally relevant information in single auditory forebrain neurons. J Neurophysiol 2013; 109:1690-1703. PMID: 23303858. DOI: 10.1152/jn.00461.2012.
Abstract
Sensory systems are dynamic. They must process a wide range of natural signals that facilitate adaptive behaviors in a manner that depends on an organism's constantly changing goals. A full understanding of the sensory physiology that underlies adaptive natural behaviors must therefore account for the activity of sensory systems in light of these behavioral goals. Here we present a novel technique that combines in vivo electrophysiological recording from awake, freely moving songbirds with operant conditioning techniques that allow control over birds' recognition of conspecific song, a widespread natural behavior in songbirds. We show that engaging in a vocal recognition task alters the response properties of neurons in the caudal mesopallium (CM), an avian analog of mammalian auditory cortex, in European starlings. Compared with awake, passive listening, active engagement of subjects in an auditory recognition task results in neurons responding to fewer song stimuli and a decrease in the trial-to-trial variability in their driven firing rates. Mean firing rates also change during active recognition, but not uniformly. Relative to nonengaged listening, active recognition causes increases in the driven firing rates in some neurons, decreases in other neurons, and stimulus-specific changes in other neurons. These changes lead to both an increase in stimulus selectivity and an increase in the information conveyed by the neurons about the animals' behavioral task. This study demonstrates the behavioral dependence of neural responses in the avian auditory forebrain and introduces the starling as a model for real-time monitoring of task-related neural processing of complex auditory objects.
Affiliation(s)
- Daniel P Knudsen
- Neurosciences Graduate Program, University of California San Diego, La Jolla, CA, USA
21
Oberfeld D, Stahn P. Sequential grouping modulates the effect of non-simultaneous masking on auditory intensity resolution. PLoS One 2012; 7:e48054. PMID: 23110174. PMCID: PMC3480468. DOI: 10.1371/journal.pone.0048054.
Abstract
The presence of non-simultaneous maskers can result in strong impairment in auditory intensity resolution relative to a condition without maskers, and causes a complex pattern of effects that is difficult to explain on the basis of peripheral processing. We suggest that the failure of selective attention to the target tones is a useful framework for understanding these effects. Two experiments tested the hypothesis that the sequential grouping of the targets and the maskers into separate auditory objects facilitates selective attention and therefore reduces the masker-induced impairment in intensity resolution. In Experiment 1, a condition favoring the processing of the maskers and the targets as two separate auditory objects due to grouping by temporal proximity was contrasted with the usual forward masking setting where the masker and the target presented within each observation interval of the two-interval task can be expected to be grouped together. As expected, the former condition resulted in a significantly smaller masker-induced elevation of the intensity difference limens (DLs). In Experiment 2, embedding the targets in an isochronous sequence of maskers led to a significantly smaller DL-elevation than control conditions not favoring the perception of the maskers as a separate auditory stream. The observed effects of grouping are compatible with the assumption that a precise representation of target intensity is available at the decision stage, but that this information is used only in a suboptimal fashion due to limitations of selective attention. The data can be explained within a framework of object-based attention. The results impose constraints on physiological models of intensity discrimination. We discuss candidate structures for physiological correlates of the psychophysical data.
Affiliation(s)
- Daniel Oberfeld
- Department of Psychology, Section Experimental Psychology, Johannes Gutenberg-Universität Mainz, Mainz, Germany.
22
Dolležal LV, Beutelmann R, Klump GM. Stream segregation in the perception of sinusoidally amplitude-modulated tones. PLoS One 2012; 7:e43615. PMID: 22984436. PMCID: PMC3440405. DOI: 10.1371/journal.pone.0043615.
Abstract
Amplitude modulation can serve as a cue for segregating streams of sounds from different sources. Here we evaluate stream segregation in humans using ABA- sequences of sinusoidally amplitude modulated (SAM) tones. A and B represent SAM tones with the same carrier frequency (1000, 4000 Hz) and modulation depth (30, 100%). The modulation frequency of the A signals (fmodA) was 30, 100 or 300 Hz, respectively. The modulation frequency of the B signals was up to four octaves higher (Δfmod). Three different ABA- tone patterns varying in tone duration and stimulus onset asynchrony were presented to evaluate the effect of forward suppression. Subjects indicated their 1- or 2-stream percept on a touch screen at the end of each ABA- sequence (presentation time 5 or 15 s). Tone pattern, fmodA, Δfmod, carrier frequency, modulation depth and presentation time significantly affected the percentage of a 2-stream percept. The human psychophysical results are compared to responses of avian forebrain neurons evoked by different ABA- SAM tone conditions [1] that were broadly overlapping those of the present study. The neurons also showed significant effects of tone pattern and Δfmod that were comparable to effects observed in the present psychophysical study. Depending on the carrier frequency, modulation frequency, modulation depth and the width of the auditory filters, SAM tones may provide mainly temporal cues (sidebands fall within the range of the filter), spectral cues (sidebands fall outside the range of the filter) or possibly both. A computational model based on excitation pattern differences was used to predict the 50% threshold of 2-stream responses. In conditions for which the model predicts a considerably larger 50% threshold of 2-stream responses (i.e., larger Δfmod at threshold) than was observed, it is unlikely that spectral cues can provide an explanation of stream segregation by SAM.
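As a companion to the ABA_ sketch under entry 10, the following minimal Python sketch shows how a single SAM tone of the kind described here can be generated; carrier frequency, modulation frequency, modulation depth, and duration are the free parameters varied across such studies, and the default values below are placeholders rather than the exact stimulus settings of this paper.

```python
import numpy as np

def sam_tone(f_carrier=1000.0, f_mod=30.0, mod_depth=1.0, dur_s=0.1, fs=44100):
    """Sinusoidally amplitude-modulated (SAM) tone: a carrier sinusoid whose
    amplitude follows (1 + mod_depth * sin(2*pi*f_mod*t)).
    mod_depth is a fraction (1.0 = 100% modulation)."""
    t = np.arange(int(dur_s * fs)) / fs
    envelope = 1.0 + mod_depth * np.sin(2.0 * np.pi * f_mod * t)
    signal = envelope * np.sin(2.0 * np.pi * f_carrier * t)
    return signal / (1.0 + mod_depth)  # normalize peak so depths stay comparable
```

A and B signals for an ABA- sequence would then be SAM tones sharing a carrier but differing in f_mod by the desired Δfmod.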
Affiliation(s)
- Lena-Vanessa Dolležal
- Animal Physiology and Behavior Group, Department of Biology and Environmental Sciences, Carl von Ossietzky University Oldenburg, Oldenburg, Germany.
23
Kashino M, Kondo HM. Functional brain networks underlying perceptual switching: auditory streaming and verbal transformations. Philos Trans R Soc Lond B Biol Sci 2012; 367:977-987. PMID: 22371619. DOI: 10.1098/rstb.2011.0370.
Abstract
Recent studies have shown that auditory scene analysis involves distributed neural sites below, in, and beyond the auditory cortex (AC). However, it remains unclear what role each site plays and how they interact in the formation and selection of auditory percepts. We addressed this issue through perceptual multistability phenomena, namely, spontaneous perceptual switching in auditory streaming (AS) for a sequence of repeated triplet tones, and perceptual changes for a repeated word, known as verbal transformations (VTs). An event-related fMRI analysis revealed brain activity timelocked to perceptual switching in the cerebellum for AS, in frontal areas for VT, and the AC and thalamus for both. The results suggest that motor-based prediction, produced by neural networks outside the auditory system, plays essential roles in the segmentation of acoustic sequences both in AS and VT. The frequency of perceptual switching was determined by a balance between the activation of two sites, which are proposed to be involved in exploring novel perceptual organization and stabilizing current perceptual organization. The effect of the gene polymorphism of catechol-O-methyltransferase (COMT) on individual variations in switching frequency suggests that the balance of exploration and stabilization is modulated by catecholamines such as dopamine and noradrenalin. These mechanisms would support the noteworthy flexibility of auditory scene analysis.
Affiliation(s)
- Makio Kashino
- NTT Communication Science Laboratories, NTT Corporation, 3-1 Morinosato Wakamiya, Atsugi, Kanagawa 243-0198, Japan.
24
Fishman YI, Micheyl C, Steinschneider M. Neural mechanisms of rhythmic masking release in monkey primary auditory cortex: implications for models of auditory scene analysis. J Neurophysiol 2012; 107:2366-2382. PMID: 22323627. DOI: 10.1152/jn.01010.2011.
Abstract
The ability to detect and track relevant acoustic signals embedded in a background of other sounds is crucial for hearing in complex acoustic environments. This ability is exemplified by a perceptual phenomenon known as "rhythmic masking release" (RMR). To demonstrate RMR, a sequence of tones forming a target rhythm is intermingled with physically identical "Distracter" sounds that perceptually mask the rhythm. The rhythm can be "released from masking" by adding "Flanker" tones in adjacent frequency channels that are synchronous with the Distracters. RMR represents a special case of auditory stream segregation, whereby the target rhythm is perceptually segregated from the background of Distracters when they are accompanied by the synchronous Flankers. The neural basis of RMR is unknown. Previous studies suggest the involvement of primary auditory cortex (A1) in the perceptual organization of sound patterns. Here, we recorded neural responses to RMR sequences in A1 of awake monkeys in order to identify neural correlates and potential mechanisms of RMR. We also tested whether two current models of stream segregation, when applied to these responses, could account for the perceptual organization of RMR sequences. Results suggest a key role for suppression of Distracter-evoked responses by the simultaneous Flankers in the perceptual restoration of the target rhythm in RMR. Furthermore, predictions of stream segregation models paralleled the psychoacoustics of RMR in humans. These findings reinforce the view that preattentive or "primitive" aspects of auditory scene analysis may be explained by relatively basic neural mechanisms at the cortical level.
Affiliation(s)
- Yonatan I Fishman
- Department of Neurology, Albert Einstein College of Medicine, Kennedy Center, 1410 Pelham Parkway, Bronx, NY 10461, USA.
25
Dykstra AR, Halgren E, Thesen T, Carlson CE, Doyle W, Madsen JR, Eskandar EN, Cash SS. Widespread Brain Areas Engaged during a Classical Auditory Streaming Task Revealed by Intracranial EEG. Front Hum Neurosci 2011; 5:74. [PMID: 21886615 PMCID: PMC3154443 DOI: 10.3389/fnhum.2011.00074] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2011] [Accepted: 07/19/2011] [Indexed: 11/30/2022] Open
Abstract
The auditory system must constantly decompose the complex mixture of sound arriving at the ear into perceptually independent streams constituting accurate representations of individual sources in the acoustic environment. How the brain accomplishes this task is not well understood. The present study combined a classic behavioral paradigm with direct cortical recordings from neurosurgical patients with epilepsy in order to further describe the neural correlates of auditory streaming. Participants listened to sequences of pure tones alternating in frequency and indicated whether they heard one or two "streams." The intracranial EEG was simultaneously recorded from subdural electrodes placed over temporal, frontal, and parietal cortex. Like healthy subjects, patients heard one stream when the frequency separation between tones was small and two when it was large. Robust evoked-potential correlates of frequency separation were observed over widespread brain areas. Waveform morphology was highly variable across individual electrode sites, both within and across gross brain regions. Surprisingly, few evoked-potential correlates of perceptual organization were observed after controlling for physical stimulus differences. The results indicate that the cortical areas engaged during the streaming task are more complex and widespread than has been demonstrated by previous work, and that, by and large, correlates of bistability during streaming are probably located on a spatial scale not assessed, or in a brain area not examined, by the present study.
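For illustration, a minimal NumPy sketch of the classic alternating-tone streaming stimulus used in this kind of paradigm follows; the frequencies, presentation rate, and sequence length are assumptions chosen for illustration rather than the study's actual values.

```python
# Minimal sketch of an alternating-tone (ABAB...) streaming stimulus.
# Frequencies, rates, and durations are illustrative only.
import numpy as np

FS = 44100

def pure_tone(freq, dur, fs=FS, ramp_s=0.005):
    """Pure tone with linear onset/offset ramps."""
    t = np.arange(int(dur * fs)) / fs
    y = np.sin(2 * np.pi * freq * t)
    n = int(ramp_s * fs)
    env = np.ones_like(y)
    env[:n] = np.linspace(0.0, 1.0, n)
    env[-n:] = np.linspace(1.0, 0.0, n)
    return y * env

def abab_sequence(f_a=500.0, delta_f_semitones=6.0, tone_dur=0.1,
                  onset_asynchrony=0.125, n_tones=40):
    """A tones at f_a, B tones delta_f_semitones above, strictly alternating."""
    f_b = f_a * 2 ** (delta_f_semitones / 12.0)
    n_soa = int(onset_asynchrony * FS)
    out = np.zeros(n_tones * n_soa)
    for i in range(n_tones):
        f = f_a if i % 2 == 0 else f_b
        burst = pure_tone(f, tone_dur)
        out[i * n_soa:i * n_soa + burst.size] += burst
    return out

small_df = abab_sequence(delta_f_semitones=1.0)   # typically heard as one stream
large_df = abab_sequence(delta_f_semitones=12.0)  # typically heard as two streams
```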
Affiliation(s)
- Andrew R. Dykstra
- Program in Speech and Hearing Bioscience and Technology, Harvard-MIT Division of Health Sciences and Technology, Cambridge, MA, USA
- Cortical Physiology Laboratory, Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Eric Halgren
- Department of Radiology, University of California San Diego, San Diego, CA, USA
- Department of Neurosciences, University of California San Diego, San Diego, CA, USA
- Thomas Thesen
- Comprehensive Epilepsy Center, New York University School of Medicine, New York, NY, USA
- Chad E. Carlson
- Comprehensive Epilepsy Center, New York University School of Medicine, New York, NY, USA
- Werner Doyle
- Comprehensive Epilepsy Center, New York University School of Medicine, New York, NY, USA
- Joseph R. Madsen
- Department of Neurosurgery, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Emad N. Eskandar
- Department of Neurosurgery, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Sydney S. Cash
- Cortical Physiology Laboratory, Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
26
Arnott SR, Bardouille T, Ross B, Alain C. Neural generators underlying concurrent sound segregation. Brain Res 2011; 1387:116-24. [PMID: 21362407 DOI: 10.1016/j.brainres.2011.02.062] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2010] [Revised: 02/11/2011] [Accepted: 02/19/2011] [Indexed: 11/25/2022]
Abstract
Although an object-based account of auditory attention has become an increasingly popular model for understanding how temporally overlapping sounds are segregated, relatively little is known about the cortical circuit that supports this ability. In the present study, we applied a beamformer spatial filter to magnetoencephalography (MEG) data recorded during an auditory paradigm that used inharmonicity to promote the formation of multiple auditory objects. Using this unconstrained, data-driven approach, the evoked field component linked with the perception of multiple auditory objects (i.e., the object-related negativity, ORNm) was found to be associated with bilateral auditory cortex sources that were distinct from those coinciding with the P1m, N1m, and P2m responses elicited by sound onset. The right-hemispheric ORNm source in particular was consistently positioned anterior to the other sources across two experiments. These findings are consistent with earlier proposals that the detection of multiple auditory objects is associated with generators in the auditory cortex, and they further suggest that these neural populations are distinct from those generating the long-latency evoked responses that reflect the detection of sound onset.
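As a purely illustrative aside, the sketch below shows one simple way of building an inharmonic stimulus of the sort alluded to above: a harmonic complex in which a single component is mistuned away from the harmonic series, which listeners tend to hear as a separate object. The fundamental, number of harmonics, and amount of mistuning are assumptions, not the stimulus values used in the study.

```python
# Sketch of using inharmonicity to promote hearing two concurrent objects:
# a harmonic complex in which one component is mistuned off the harmonic grid.
# Values are illustrative, not those of the study.
import numpy as np

FS = 44100

def complex_tone(f0=200.0, n_harmonics=10, dur=0.5,
                 mistuned_harmonic=None, mistuning_pct=0.0, fs=FS):
    t = np.arange(int(dur * fs)) / fs
    y = np.zeros_like(t)
    for h in range(1, n_harmonics + 1):
        f = h * f0
        if h == mistuned_harmonic:
            f *= 1.0 + mistuning_pct / 100.0  # shift this component away from h * f0
        y += np.sin(2 * np.pi * f * t)
    return y / n_harmonics

harmonic = complex_tone()                                        # tends to be heard as one object
mistuned = complex_tone(mistuned_harmonic=3, mistuning_pct=8.0)  # mistuned component tends to
                                                                 # "pop out" as a second object
```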
Affiliation(s)
- Stephen R Arnott
- Rotman Research Institute, Baycrest Centre, Toronto, Ontario, Canada M6A 2E1.
27
Ma L, Micheyl C, Yin P, Oxenham AJ, Shamma SA. Behavioral measures of auditory streaming in ferrets (Mustela putorius). J Comp Psychol 2010; 124:317-30. [PMID: 20695663 DOI: 10.1037/a0018273] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
An important aspect of the analysis of auditory "scenes" relates to the perceptual organization of sound sequences into auditory "streams." In this study, we adapted two auditory perception tasks, used in recent human psychophysical studies, to obtain behavioral measures of auditory streaming in ferrets (Mustela putorius). One task involved the detection of shifts in the frequency of tones within an alternating tone sequence. The other task involved the detection of a stream of regularly repeating target tones embedded within a randomly varying multitone background. In both tasks, performance was measured as a function of various stimulus parameters, which previous psychophysical studies in humans have shown to influence auditory streaming. Ferret performance in the two tasks was found to vary as a function of these parameters in a way that is qualitatively consistent with the human data. These results suggest that auditory streaming occurs in ferrets, and that the two tasks described here may provide a valuable tool in future behavioral and neurophysiological studies of the phenomenon.
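For illustration, the following NumPy sketch builds a stimulus in the spirit of the second task described above: a regularly repeating target tone embedded in a background of tones with random frequencies and jittered onsets. All parameters are illustrative assumptions, and the construction simplifies the original paradigm.

```python
# Sketch of a regular target tone embedded in a randomly varying multitone
# background. Parameters are illustrative only and simplify the original task.
import numpy as np

FS = 44100
rng = np.random.default_rng(0)

def tone(freq, dur, fs=FS):
    """Hanning-windowed pure tone."""
    t = np.arange(int(dur * fs)) / fs
    return np.sin(2 * np.pi * freq * t) * np.hanning(t.size)

def target_in_background(f_target=1000.0, n_bursts=20, soa=0.15,
                         tone_dur=0.05, bg_tones_per_burst=3,
                         f_lo=300.0, f_hi=4000.0):
    n_soa = int(soa * FS)
    n_tone = int(tone_dur * FS)
    out = np.zeros(n_bursts * n_soa + n_tone)
    for i in range(n_bursts):
        start = i * n_soa
        out[start:start + n_tone] += tone(f_target, tone_dur)   # regular target
        for _ in range(bg_tones_per_burst):                     # jittered background tones
            f_bg = np.exp(rng.uniform(np.log(f_lo), np.log(f_hi)))
            j = start + rng.integers(0, n_soa)
            out[j:j + n_tone] += tone(f_bg, tone_dur)
    return out

stimulus = target_in_background()
```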
Affiliation(s)
- Ling Ma
- Neural Systems Laboratory, Department of Bioengineering, University of Maryland, College Park, MD 20742, USA.
28
Itatani N, Klump GM. Neural Correlates of Auditory Streaming of Harmonic Complex Sounds With Different Phase Relations in the Songbird Forebrain. J Neurophysiol 2011; 105:188-99. [DOI: 10.1152/jn.00496.2010] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
It has been suggested that successively presented sounds that are perceived as separate auditory streams are represented by separate populations of neurons. In most cases, spectral separation into different peripheral filters has been identified as the cue for segregation. However, stream segregation based on temporal cues is also possible without spectral separation. Here we presented sequences of ABA- triplet stimuli providing only temporal cues to neurons in the European starling auditory forebrain. A and B sounds (125 ms duration) were harmonic complexes (fundamentals of 100, 200, or 400 Hz; center frequency and bandwidth chosen to fit the neurons' tuning characteristics) with identical amplitude spectra but different phase relations between components (cosine, alternating, or random phase), presented at different rates. Differences in both the rate responses and the temporal response patterns of the neurons when stimulated with harmonic complexes having different phase relations provide the first evidence for a mechanism allowing a separate neural representation of such stimuli. Recording sites responding to frequencies above 1 kHz showed enhanced rate and temporal differences compared with those responding at lower frequencies. These results demonstrate a neural correlate of streaming by temporal cues arising from variation in phase, with striking parallels to observations in previous psychophysical studies.
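For illustration, a minimal NumPy sketch follows of harmonic complexes with identical amplitude spectra but different component phase relations, the temporal manipulation at the heart of this study; the fundamental, number of components, and duration are assumptions, and the exact phase conventions of the study may differ.

```python
# Sketch of harmonic complexes with identical amplitude spectra but different
# component phase relations (cosine, alternating, random). Values are illustrative.
import numpy as np

FS = 44100
rng = np.random.default_rng(1)

def harmonic_complex(f0=200.0, n_harmonics=20, dur=0.125,
                     phase_mode="cosine", fs=FS):
    t = np.arange(int(dur * fs)) / fs
    y = np.zeros_like(t)
    for h in range(1, n_harmonics + 1):
        if phase_mode == "cosine":
            phi = 0.0                          # all components in cosine phase -> peaky waveform
        elif phase_mode == "alternating":
            phi = 0.0 if h % 2 else np.pi / 2  # phase alternates between components -> flatter envelope
        else:  # "random"
            phi = rng.uniform(0, 2 * np.pi)    # random phase -> noise-like fine structure
        y += np.cos(2 * np.pi * h * f0 * t + phi)
    return y / n_harmonics

A = harmonic_complex(phase_mode="cosine")
B = harmonic_complex(phase_mode="random")  # same magnitude spectrum, different temporal structure
```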
Affiliation(s)
- Naoya Itatani
- Animal Physiology and Behaviour Group, Institute for Biology and Environmental Sciences, Carl von Ossietzky University Oldenburg, Oldenburg, Germany
- Georg M. Klump
- Animal Physiology and Behaviour Group, Institute for Biology and Environmental Sciences, Carl von Ossietzky University Oldenburg, Oldenburg, Germany
29
Objective and subjective psychophysical measures of auditory stream integration and segregation. J Assoc Res Otolaryngol 2010; 11:709-24. [PMID: 20658165 DOI: 10.1007/s10162-010-0227-2] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/20/2009] [Accepted: 06/30/2010] [Indexed: 10/19/2022] Open
Abstract
The perceptual organization of sound sequences into auditory streams involves the integration of sounds into one stream and the segregation of sounds into separate streams. "Objective" psychophysical measures of auditory streaming can be obtained using behavioral tasks where performance is facilitated by segregation and hampered by integration, or vice versa. Traditionally, these two types of tasks have been tested in separate studies involving different listeners, procedures, and stimuli. Here, we tested subjects in two complementary temporal-gap discrimination tasks involving similar stimuli and procedures. One task was designed so that performance in it would be facilitated by perceptual integration; the other, so that performance would be facilitated by perceptual segregation. Thresholds were measured in both tasks under a wide range of conditions produced by varying three stimulus parameters known to influence stream formation: frequency separation, tone-presentation rate, and sequence length. In addition to these performance-based measures, subjective judgments of perceived segregation were collected in the same listeners under corresponding stimulus conditions. The patterns of results obtained in the two temporal-discrimination tasks, and the relationships between thresholds and perceived-segregation judgments, were mostly consistent with the hypothesis that stream segregation helped performance in one task and impaired performance in the other task. The tasks and stimuli described here may prove useful in future behavioral or neurophysiological experiments that seek to manipulate and measure neural correlates of auditory streaming while minimizing differences between the physical stimuli.
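As an illustrative sketch only, the code below builds a tone sequence with the kind of temporal manipulation used in gap (anisochrony) discrimination tasks: the final B tone is delayed by a small amount relative to its nominal position, and the listener's task would be to detect that delay. The frequencies, rates, and size of the delay are assumptions, not the study's parameters.

```python
# Sketch of a temporal-gap (delay) manipulation: the last B tone of an
# alternating A-B sequence is shifted late by delta_t. Values are illustrative.
import numpy as np

FS = 44100

def tone(freq, dur, fs=FS):
    """Hanning-windowed pure tone."""
    t = np.arange(int(dur * fs)) / fs
    return np.sin(2 * np.pi * freq * t) * np.hanning(t.size)

def ab_sequence_with_gap(f_a=500.0, f_b=800.0, tone_dur=0.05,
                         soa=0.1, n_pairs=6, delta_t=0.02):
    """A and B tones alternate at a fixed rate; the final B tone is delayed
    by delta_t, the quantity the listener must detect."""
    n_soa = int(soa * FS)
    n_gap = int(delta_t * FS)
    n_tone = int(tone_dur * FS)
    onsets = []
    for i in range(n_pairs):
        onsets.append((2 * i) * n_soa)       # A tone
        b_on = (2 * i + 1) * n_soa           # B tone
        if i == n_pairs - 1:
            b_on += n_gap                    # delayed final B tone
        onsets.append(b_on)
    out = np.zeros(onsets[-1] + n_tone)
    for k, on in enumerate(onsets):
        f = f_a if k % 2 == 0 else f_b
        out[on:on + n_tone] += tone(f, tone_dur)
    return out

standard = ab_sequence_with_gap(delta_t=0.0)    # isochronous reference
signal   = ab_sequence_with_gap(delta_t=0.02)   # contains the gap to be detected
```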
30
Bee MA, Micheyl C, Oxenham AJ, Klump GM. Neural adaptation to tone sequences in the songbird forebrain: patterns, determinants, and relation to the build-up of auditory streaming. J Comp Physiol A Neuroethol Sens Neural Behav Physiol 2010; 196:543-57. [PMID: 20563587 DOI: 10.1007/s00359-010-0542-4] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/16/2010] [Revised: 05/08/2010] [Accepted: 05/28/2010] [Indexed: 11/29/2022]
Abstract
Neural responses to tones in the mammalian primary auditory cortex (A1) exhibit adaptation over the course of several seconds. Important questions remain about the taxonomic distribution of multi-second adaptation and its possible roles in hearing. It has been hypothesized that neural adaptation could explain the gradual "build-up" of auditory stream segregation. We investigated the influence of several stimulus-related factors on neural adaptation in the avian homologue of mammalian A1 (field L2) in starlings (Sturnus vulgaris). We presented awake birds with sequences of repeated triplets of two interleaved tones (ABA-ABA-...) in which we varied the frequency separation between the A and B tones (ΔF), the stimulus onset asynchrony (time from tone onset to tone onset within a triplet), and tone duration. We found that stimulus onset asynchrony generally had larger effects on adaptation than ΔF and tone duration over the parameter range tested. Using a simple model, we show how time-dependent changes in neural responses can be transformed into neurometric functions that make testable predictions about the dependence of the build-up of stream segregation on various spectral and temporal stimulus properties.
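The following is a toy NumPy sketch, not the model used in the paper, of how multi-second adaptation of tone responses could be read out as a neurometric "build-up" function: at a site tuned to the A frequency the B tone evokes a weaker response (the more so the larger ΔF), both responses adapt over seconds, and the probability of reporting "two streams" is modelled as the probability that the adapted B response has fallen below a fixed threshold. All parameter values and functional forms are assumptions for illustration.

```python
# Toy adaptation-to-neurometric sketch (illustrative only, not the authors' model).
import numpy as np

def adapted_response(t, r0, tau=2.0, floor_frac=0.25):
    """Onset response r0 decaying exponentially toward floor_frac * r0
    over seconds of repeated stimulation (multi-second adaptation)."""
    return r0 * (floor_frac + (1.0 - floor_frac) * np.exp(-t / tau))

def p_two_streams(t, df_semitones, tuning_slope=0.15,
                  threshold=0.35, sigma=0.08):
    """Neurometric build-up curve: at a site tuned to the A frequency the
    B tone evokes a weaker onset response (smaller for larger dF); as both
    responses adapt, the B response drops below a fixed decision threshold
    and the modelled probability of hearing 'two streams' rises."""
    r_b0 = np.exp(-tuning_slope * df_semitones)  # B onset response (A response = 1)
    r_b = adapted_response(t, r_b0)
    # Soft (logistic) version of P(r_b < threshold) under response noise sigma.
    return 1.0 / (1.0 + np.exp((r_b - threshold) / sigma))

t = np.linspace(0.0, 10.0, 101)                        # seconds since sequence onset
buildup_small_df = p_two_streams(t, df_semitones=1.0)  # slow, weak build-up
buildup_large_df = p_two_streams(t, df_semitones=9.0)  # fast, strong build-up
```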
Affiliation(s)
- Mark A Bee
- Department of Ecology, Evolution, and Behavior, University of Minnesota, 100 Ecology, 1987 Upper Buford Circle, St. Paul, MN 55108, USA.
31
Shamma SA, Micheyl C. Behind the scenes of auditory perception. Curr Opin Neurobiol 2010; 20:361-6. [PMID: 20456940 DOI: 10.1016/j.conb.2010.03.009] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2010] [Revised: 03/16/2010] [Accepted: 03/29/2010] [Indexed: 11/30/2022]
Abstract
'Auditory scenes' often contain contributions from multiple acoustic sources. These are usually heard as separate auditory 'streams', which can be selectively followed over time. How and where these auditory streams are formed in the auditory system is one of the most fascinating questions facing auditory scientists today. Findings published within the past two years indicate that both cortical and subcortical processes contribute to the formation of auditory streams, and they raise important questions concerning the roles of primary and secondary areas of auditory cortex in this phenomenon. In addition, these findings underline the importance of taking into account the relative timing of neural responses, and the influence of selective attention, in the search for neural correlates of the perception of auditory streams.
Affiliation(s)
- Shihab A Shamma
- Department of Electrical and Computer Engineering & Institute for Systems Research, University of Maryland, College Park, United States.
32
Schadwinkel S, Gutschalk A. Activity associated with stream segregation in human auditory cortex is similar for spatial and pitch cues. Cereb Cortex 2010; 20:2863-73. [PMID: 20237241 DOI: 10.1093/cercor/bhq037] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
Streaming is a perceptual mechanism by which the brain segregates information from multiple sound sources in our environment and assigns them to distinct auditory streams. Examples of streaming cues are differences in frequency spectrum, pitch, or space, and potential neural correlates of streaming based on spectral and pitch cues have been identified in the auditory cortex. Here, magnetoencephalography (MEG) and functional magnetic resonance imaging (fMRI) were used to evaluate whether the response enhancement in auditory cortex associated with streaming represents a general pattern that is independent of the stimulus cue. Interaural time differences (ITDs) were used as a spatial streaming cue and were compared with streaming based on fundamental frequency (f0) differences. The MEG results showed enhancement of the P1m after 60-90 ms that was similar during streaming based on ITD and on pitch. Sustained fMRI activity was enhanced at identical sites in Heschl's gyrus and planum temporale for both cues; no topographical specificity for space or pitch was found for the streaming-associated enhancement. These results support the hypothesis of an early convergence of the neural representations of auditory streams that is independent of the acoustic cue on which the streaming is based.
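For illustration, the sketch below shows the basic signal manipulation behind an ITD streaming cue: a mono tone is turned into a stereo signal in which one ear's channel is delayed by a few hundred microseconds. The tone frequency, duration, and ITD magnitude are assumptions, not the stimulus values used in the study.

```python
# Sketch of imposing an interaural time difference (ITD) on a tone by delaying
# one channel relative to the other. Values are illustrative only.
import numpy as np

FS = 44100

def tone(freq, dur, fs=FS):
    """Hanning-windowed pure tone."""
    t = np.arange(int(dur * fs)) / fs
    return np.sin(2 * np.pi * freq * t) * np.hanning(t.size)

def with_itd(mono, itd_s, fs=FS):
    """Return an (n, 2) stereo array in which one channel lags the other by
    abs(itd_s); positive itd_s delays the right channel (sound leads in the
    left ear and is lateralized toward the left)."""
    delay = int(round(abs(itd_s) * fs))
    pad = np.zeros(delay)
    early = np.concatenate([mono, pad])
    late = np.concatenate([pad, mono])
    if itd_s >= 0:
        left, right = early, late   # right ear lags -> lateralized left
    else:
        left, right = late, early   # left ear lags -> lateralized right
    return np.stack([left, right], axis=1)

a_tone = with_itd(tone(400.0, 0.1), itd_s=+500e-6)  # +500 microseconds ITD
b_tone = with_itd(tone(400.0, 0.1), itd_s=-500e-6)  # opposite ITD for the B tones
```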
Affiliation(s)
- Stefan Schadwinkel
- Department of Neurology, University of Heidelberg, Im Neuenheimer Feld 400, Heidelberg, Germany.