1
Hamersky GR, Shaheen LA, Espejo ML, Wingert JC, David SV. Reduced Neural Responses to Natural Foreground versus Background Sounds in the Auditory Cortex. J Neurosci 2025; 45:e0121242024. [PMID: 39837664 PMCID: PMC11884389 DOI: 10.1523/jneurosci.0121-24.2024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 11/27/2024] [Accepted: 12/03/2024] [Indexed: 01/23/2025] Open
Abstract
In everyday hearing, listeners face the challenge of understanding behaviorally relevant foreground stimuli (speech, vocalizations) in complex backgrounds (environmental, mechanical noise). Prior studies have shown that high-order areas of human auditory cortex (AC) preattentively form an enhanced representation of foreground stimuli in the presence of background noise. This enhancement requires identifying and grouping the features that comprise the background so they can be removed from the foreground representation. To study the cortical computations supporting this process, we recorded single-unit activity in AC of male and female ferrets during the presentation of concurrent natural sounds from these two categories. In contrast to expectations from studies in high-order AC, single-unit responses to foreground sounds were strongly reduced relative to the paired background in primary and secondary AC. The degree of reduction could not be explained by a neuron's preference for the foreground or background stimulus in isolation but could be partially explained by spectrotemporal statistics that distinguish foreground and background categories. Responses to synthesized sounds with statistics either matched or randomized relative to natural sounds showed progressively decreased reduction of foreground responses as natural sound statistics were removed. These results challenge the expectation that cortical foreground representations emerge directly from a mixed representation in the auditory periphery. Instead, they suggest the early AC maintains a robust representation of background noise. Strong background representations may produce a distributed code, facilitating selection of foreground signals from a relatively small subpopulation of AC neurons at later processing stages.
Affiliation(s)
- Gregory R Hamersky
  - Neuroscience Graduate Program, Oregon Health and Science University, Portland, Oregon 97239
  - Oregon Hearing Research Center, Oregon Health and Science University, Portland, Oregon 97239
- Luke A Shaheen
  - Oregon Hearing Research Center, Oregon Health and Science University, Portland, Oregon 97239
- Mateo López Espejo
  - Neuroscience Graduate Program, Oregon Health and Science University, Portland, Oregon 97239
  - Oregon Hearing Research Center, Oregon Health and Science University, Portland, Oregon 97239
- Jereme C Wingert
  - Oregon Hearing Research Center, Oregon Health and Science University, Portland, Oregon 97239
  - Behavioral and Systems Neuroscience Graduate Program, Oregon Health and Science University, Portland, Oregon 97239
- Stephen V David
  - Oregon Hearing Research Center, Oregon Health and Science University, Portland, Oregon 97239
2
Mukherjee S, Babadi B, Shamma S. Sparse high-dimensional decomposition of non-primary auditory cortical receptive fields. PLoS Comput Biol 2025; 21:e1012721. [PMID: 39746112 PMCID: PMC11774495 DOI: 10.1371/journal.pcbi.1012721] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2024] [Revised: 01/28/2025] [Accepted: 12/16/2024] [Indexed: 01/04/2025] Open
Abstract
Characterizing neuronal responses to natural stimuli remains a central goal in sensory neuroscience. In auditory cortical neurons, the stimulus selectivity of elicited spiking activity is summarized by a spectrotemporal receptive field (STRF) that relates neuronal responses to the stimulus spectrogram. Though effective in characterizing primary auditory cortical responses, STRFs of non-primary auditory neurons can be quite intricate, reflecting their mixed selectivity. The complexity of non-primary STRFs hence impedes understanding how acoustic stimulus representations are transformed along the auditory pathway. Here, we focus on the relationship between ferret primary auditory cortex (A1) and a secondary region, dorsal posterior ectosylvian gyrus (PEG). We propose estimating receptive fields in PEG with respect to a well-established high-dimensional computational model of primary-cortical stimulus representations. These "cortical receptive fields" (CortRF) are estimated greedily to identify the salient primary-cortical features modulating spiking responses and in turn related to corresponding spectrotemporal features. Hence, they provide biologically plausible hierarchical decompositions of STRFs in PEG. Such CortRF analysis was applied to PEG neuronal responses to speech and temporally orthogonal ripple combination (TORC) stimuli and, for comparison, to A1 neuronal responses. CortRFs of PEG neurons captured their selectivity to more complex spectrotemporal features than A1 neurons; moreover, CortRF models were more predictive of PEG (but not A1) responses to speech. Our results thus suggest that secondary-cortical stimulus representations can be computed as sparse combinations of primary-cortical features that facilitate encoding natural stimuli. Thus, by adding the primary-cortical representation, we can account for PEG single-unit responses to natural sounds better than bypassing it and considering as input the auditory spectrogram. 
These results confirm with explicit details the presumed hierarchical organization of the auditory cortex.
Affiliation(s)
- Shoutik Mukherjee
  - Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland, United States of America
  - Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America
- Behtash Babadi
  - Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland, United States of America
  - Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America
- Shihab Shamma
  - Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland, United States of America
  - Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America
  - Laboratoire des Systèmes Perceptifs, Département d'Études Cognitives, École Normale Supérieure, Paris Sciences et Lettres University, Paris, France
3
Rançon U, Masquelier T, Cottereau BR. A general model unifying the adaptive, transient and sustained properties of ON and OFF auditory neural responses. PLoS Comput Biol 2024; 20:e1012288. [PMID: 39093852 PMCID: PMC11324186 DOI: 10.1371/journal.pcbi.1012288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Revised: 08/14/2024] [Accepted: 06/29/2024] [Indexed: 08/04/2024] Open
Abstract
Sounds are temporal stimuli decomposed into numerous elementary components by the auditory nervous system. For instance, a temporal to spectro-temporal transformation modelling the frequency decomposition performed by the cochlea is a widely adopted first processing step in today's computational models of auditory neural responses. Similarly, increments and decrements in sound intensity (i.e., of the raw waveform itself or of its spectral bands) constitute critical features of the neural code, with high behavioural significance. However, despite the growing attention of the scientific community on auditory OFF responses, their relationship with transient ON, sustained responses and adaptation remains unclear. In this context, we propose a new general model, based on a pair of linear filters, named AdapTrans, that captures both sustained and transient ON and OFF responses into a unifying and easy to expand framework. We demonstrate that filtering audio cochleagrams with AdapTrans permits to accurately render known properties of neural responses measured in different mammal species such as the dependence of OFF responses on the stimulus fall time and on the preceding sound duration. Furthermore, by integrating our framework into gold standard and state-of-the-art machine learning models that predict neural responses from audio stimuli, following a supervised training on a large compilation of electrophysiology datasets (ready-to-deploy PyTorch models and pre-processed datasets shared publicly), we show that AdapTrans systematically improves the prediction accuracy of estimated responses within different cortical areas of the rat and ferret auditory brain. Together, these results motivate the use of our framework for computational and systems neuroscientists willing to increase the plausibility and performances of their models of audition.
Affiliation(s)
- Ulysse Rançon
  - CerCo UMR 5549, CNRS – Université Toulouse III, Toulouse, France
- Benoit R. Cottereau
  - CerCo UMR 5549, CNRS – Université Toulouse III, Toulouse, France
  - IPAL, CNRS IRL62955, Singapore, Singapore
4
López Espejo M, David SV. A sparse code for natural sound context in auditory cortex. Curr Res Neurobiol 2023; 6:100118. [PMID: 38152461 PMCID: PMC10749876 DOI: 10.1016/j.crneur.2023.100118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Revised: 10/27/2023] [Accepted: 11/14/2023] [Indexed: 12/29/2023] Open
Abstract
Accurate sound perception can require integrating information over hundreds of milliseconds or even seconds. Spectro-temporal models of sound coding by single neurons in auditory cortex indicate that the majority of sound-evoked activity can be attributed to stimuli with a few tens of milliseconds. It remains uncertain how the auditory system integrates information about sensory context on a longer timescale. Here we characterized long-lasting contextual effects in auditory cortex (AC) using a diverse set of natural sound stimuli. We measured context effects as the difference in a neuron's response to a single probe sound following two different context sounds. Many AC neurons showed context effects lasting longer than the temporal window of a traditional spectro-temporal receptive field. The duration and magnitude of context effects varied substantially across neurons and stimuli. This diversity of context effects formed a sparse code across the neural population that encoded a wider range of contexts than any constituent neuron. Encoding model analysis indicates that context effects can be explained by activity in the local neural population, suggesting that recurrent local circuits support a long-lasting representation of sensory context in auditory cortex.
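As a rough illustration of the context-effect measure described in this abstract (the difference in a neuron's response to one probe sound following two different context sounds), a minimal Python sketch follows. The array layout, the function name `context_effect`, and the toy firing rates are assumptions made for illustration; this is not the authors' analysis code.

```python
import numpy as np

def context_effect(resp_after_a, resp_after_b):
    """Difference between trial-averaged probe responses after two contexts.

    Each input is a (trials, time_bins) array of firing rates recorded for
    the same probe sound; this layout is an assumption of the sketch.
    """
    mean_a = np.asarray(resp_after_a, dtype=float).mean(axis=0)
    mean_b = np.asarray(resp_after_b, dtype=float).mean(axis=0)
    return mean_a - mean_b

# Hypothetical toy data: 2 trials x 3 time bins for each context.
after_a = np.array([[4.0, 6.0, 2.0], [6.0, 8.0, 2.0]])
after_b = np.array([[1.0, 3.0, 2.0], [3.0, 5.0, 2.0]])
effect = context_effect(after_a, after_b)
```

In this toy case the difference is nonzero in the first two time bins and zero in the third, which is the kind of time-limited context effect whose duration the abstract refers to.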
Affiliation(s)
- Mateo López Espejo
  - Neuroscience Graduate Program, Oregon Health & Science University, Portland, OR, USA
- Stephen V. David
  - Otolaryngology, Oregon Health & Science University, Portland, OR, USA
5
Grijseels DM, Prendergast BJ, Gorman JC, Miller CT. The neurobiology of vocal communication in marmosets. Ann N Y Acad Sci 2023; 1528:13-28. [PMID: 37615212 PMCID: PMC10592205 DOI: 10.1111/nyas.15057] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 08/25/2023]
Abstract
An increasingly popular animal model for studying the neural basis of social behavior, cognition, and communication is the common marmoset (Callithrix jacchus). Interest in this New World primate across neuroscience is now being driven by their proclivity for prosociality across their repertoire, high volubility, and rapid development, as well as their amenability to naturalistic testing paradigms and freely moving neural recording and imaging technologies. The complement of these characteristics set marmosets up to be a powerful model of the primate social brain in the years to come. Here, we focus on vocal communication because it is the area that has both made the most progress and illustrates the prodigious potential of this species. We review the current state of the field with a focus on the various brain areas and networks involved in vocal perception and production, comparing the findings from marmosets to other animals, including humans.
Affiliation(s)
- Dori M Grijseels
  - Cortical Systems and Behavior Laboratory, University of California, San Diego, La Jolla, California, USA
- Brendan J Prendergast
  - Cortical Systems and Behavior Laboratory, University of California, San Diego, La Jolla, California, USA
- Julia C Gorman
  - Cortical Systems and Behavior Laboratory, University of California, San Diego, La Jolla, California, USA
  - Neurosciences Graduate Program, University of California, San Diego, La Jolla, California, USA
- Cory T Miller
  - Cortical Systems and Behavior Laboratory, University of California, San Diego, La Jolla, California, USA
  - Neurosciences Graduate Program, University of California, San Diego, La Jolla, California, USA
6
Pennington JR, David SV. A convolutional neural network provides a generalizable model of natural sound coding by neural populations in auditory cortex. PLoS Comput Biol 2023; 19:e1011110. [PMID: 37146065 DOI: 10.1371/journal.pcbi.1011110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2022] [Revised: 05/17/2023] [Accepted: 04/17/2023] [Indexed: 05/07/2023] Open
Abstract
Convolutional neural networks (CNNs) can provide powerful and flexible models of neural sensory processing. However, the utility of CNNs in studying the auditory system has been limited by their requirement for large datasets and the complex response properties of single auditory neurons. To address these limitations, we developed a population encoding model: a CNN that simultaneously predicts activity of several hundred neurons recorded during presentation of a large set of natural sounds. This approach defines a shared spectro-temporal space and pools statistical power across neurons. Population models of varying architecture performed consistently and substantially better than traditional linear-nonlinear models on data from primary and non-primary auditory cortex. Moreover, population models were highly generalizable. The output layer of a model pre-trained on one population of neurons could be fit to data from novel single units, achieving performance equivalent to that of neurons in the original fit data. This ability to generalize suggests that population encoding models capture a complete representational space across neurons in an auditory cortical field.
Affiliation(s)
- Jacob R Pennington
  - Washington State University, Vancouver, Washington, United States of America
- Stephen V David
  - Oregon Hearing Research Center, Oregon Health and Science University, Oregon, United States of America
7
Bigelow J, Morrill RJ, Olsen T, Hasenstaub AR. Visual modulation of firing and spectrotemporal receptive fields in mouse auditory cortex. Curr Res Neurobiol 2022; 3:100040. [PMID: 36518337 PMCID: PMC9743056 DOI: 10.1016/j.crneur.2022.100040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2022] [Revised: 04/26/2022] [Accepted: 05/06/2022] [Indexed: 10/18/2022] Open
Abstract
Recent studies have established significant anatomical and functional connections between visual areas and primary auditory cortex (A1), which may be important for cognitive processes such as communication and spatial perception. These studies have raised two important questions: First, which cell populations in A1 respond to visual input and/or are influenced by visual context? Second, which aspects of sound encoding are affected by visual context? To address these questions, we recorded single-unit activity across cortical layers in awake mice during exposure to auditory and visual stimuli. Neurons responsive to visual stimuli were most prevalent in the deep cortical layers and included both excitatory and inhibitory cells. The overwhelming majority of these neurons also responded to sound, indicating unimodal visual neurons are rare in A1. Other neurons for which sound-evoked responses were modulated by visual context were similarly excitatory or inhibitory but more evenly distributed across cortical layers. These modulatory influences almost exclusively affected sustained sound-evoked firing rate (FR) responses or spectrotemporal receptive fields (STRFs); transient FR changes at stimulus onset were rarely modified by visual context. Neuron populations with visually modulated STRFs and sustained FR responses were mostly non-overlapping, suggesting spectrotemporal feature selectivity and overall excitability may be differentially sensitive to visual context. The effects of visual modulation were heterogeneous, increasing and decreasing STRF gain in roughly equal proportions of neurons. Our results indicate visual influences are surprisingly common and diversely expressed throughout layers and cell types in A1, affecting nearly one in five neurons overall.
Affiliation(s)
- James Bigelow
  - Coleman Memorial Laboratory, University of California, San Francisco, USA
  - Department of Otolaryngology–Head and Neck Surgery, University of California, San Francisco, 94143, USA
- Ryan J. Morrill
  - Coleman Memorial Laboratory, University of California, San Francisco, USA
  - Neuroscience Graduate Program, University of California, San Francisco, USA
  - Department of Otolaryngology–Head and Neck Surgery, University of California, San Francisco, 94143, USA
- Timothy Olsen
  - Coleman Memorial Laboratory, University of California, San Francisco, USA
  - Department of Otolaryngology–Head and Neck Surgery, University of California, San Francisco, 94143, USA
- Andrea R. Hasenstaub
  - Coleman Memorial Laboratory, University of California, San Francisco, USA
  - Neuroscience Graduate Program, University of California, San Francisco, USA
  - Department of Otolaryngology–Head and Neck Surgery, University of California, San Francisco, 94143, USA
8
Wu Z, Rockwell H, Zhang Y, Tang S, Lee TS. Complexity and diversity in sparse code priors improve receptive field characterization of Macaque V1 neurons. PLoS Comput Biol 2021; 17:e1009528. [PMID: 34695120 PMCID: PMC8589190 DOI: 10.1371/journal.pcbi.1009528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2020] [Revised: 11/12/2021] [Accepted: 10/05/2021] [Indexed: 11/18/2022] Open
Abstract
System identification techniques, namely projection pursuit regression models (PPRs) and convolutional neural networks (CNNs), provide state-of-the-art performance in predicting visual cortical neurons' responses to arbitrary input stimuli. However, the constituent kernels recovered by these methods are often noisy and lack coherent structure, making it difficult to understand the underlying component features of a neuron's receptive field. In this paper, we show that using a dictionary of diverse kernels with complex shapes learned from natural scenes based on efficient coding theory, as the front-end for PPRs and CNNs, can improve their performance in neuronal response prediction as well as algorithmic data efficiency and convergence speed. Extensive experimental results also indicate that these sparse-code kernels provide important information on the component features of a neuron's receptive field. In addition, we find that models with the complex-shaped sparse code front-end are significantly better than models with a standard orientation-selective Gabor filter front-end for modeling V1 neurons that have been found to exhibit complex pattern selectivity. We show that the relative performance difference due to these two front-ends can be used to produce a sensitive metric for detecting complex selectivity in V1 neurons.
Affiliation(s)
- Ziniu Wu
  - Center for the Neural Basis of Cognition and Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
  - Department of Mathematics, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Harold Rockwell
  - Center for the Neural Basis of Cognition and Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Yimeng Zhang
  - Center for the Neural Basis of Cognition and Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
  - Computer Science Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Shiming Tang
  - Center for Life Sciences, Peking University, Beijing, China
- Tai Sing Lee
  - Center for the Neural Basis of Cognition and Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
  - Computer Science Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
9
Montes-Lourido P, Kar M, David SV, Sadagopan S. Neuronal selectivity to complex vocalization features emerges in the superficial layers of primary auditory cortex. PLoS Biol 2021; 19:e3001299. [PMID: 34133413 PMCID: PMC8238193 DOI: 10.1371/journal.pbio.3001299] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/15/2020] [Revised: 06/28/2021] [Accepted: 05/24/2021] [Indexed: 01/11/2023] Open
Abstract
Early in auditory processing, neural responses faithfully reflect acoustic input. At higher stages of auditory processing, however, neurons become selective for particular call types, eventually leading to specialized regions of cortex that preferentially process calls at the highest auditory processing stages. We previously proposed that an intermediate step in how nonselective responses are transformed into call-selective responses is the detection of informative call features. But how neural selectivity for informative call features emerges from nonselective inputs, whether feature selectivity gradually emerges over the processing hierarchy, and how stimulus information is represented in nonselective and feature-selective populations remain open questions. In this study, using unanesthetized guinea pigs (GPs), a highly vocal and social rodent, as an animal model, we characterized the neural representation of calls in 3 auditory processing stages: the thalamus (ventral medial geniculate body (vMGB)), and thalamorecipient (L4) and superficial layers (L2/3) of primary auditory cortex (A1). We found that neurons in vMGB and A1 L4 did not exhibit call-selective responses and responded throughout the call durations. However, A1 L2/3 neurons showed high call selectivity with about a third of neurons responding to only 1 or 2 call types. These A1 L2/3 neurons only responded to restricted portions of calls suggesting that they were highly selective for call features. Receptive fields of these A1 L2/3 neurons showed complex spectrotemporal structures that could underlie their high call feature selectivity. Information theoretic analysis revealed that in A1 L4, stimulus information was distributed over the population and was spread out over the call durations. In contrast, in A1 L2/3, individual neurons showed brief bursts of high stimulus-specific information and conveyed high levels of information per spike.
These data demonstrate that a transformation in the neural representation of calls occurs between A1 L4 and A1 L2/3, leading to the emergence of a feature-based representation of calls in A1 L2/3. Our data thus suggest that observed cortical specializations for call processing emerge in A1 and set the stage for further mechanistic studies.
Affiliation(s)
- Pilar Montes-Lourido
  - Department of Neurobiology, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Manaswini Kar
  - Department of Neurobiology, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
  - Center for Neuroscience, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Stephen V. David
  - Department of Otolaryngology, Oregon Health and Science University, Portland, Oregon, United States of America
- Srivatsun Sadagopan
  - Department of Neurobiology, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
  - Center for Neuroscience, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
  - Department of Bioengineering, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
  - Center for the Neural Basis of Cognition, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
10
Fehrman C, Robbins TD, Meliza CD. Nonlinear effects of intrinsic dynamics on temporal encoding in a model of avian auditory cortex. PLoS Comput Biol 2021; 17:e1008768. [PMID: 33617539 PMCID: PMC7932506 DOI: 10.1371/journal.pcbi.1008768] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/30/2020] [Revised: 03/04/2021] [Accepted: 02/04/2021] [Indexed: 11/18/2022] Open
Abstract
Neurons exhibit diverse intrinsic dynamics, which govern how they integrate synaptic inputs to produce spikes. Intrinsic dynamics are often plastic during development and learning, but the effects of these changes on stimulus encoding properties are not well known. To examine this relationship, we simulated auditory responses to zebra finch song using a linear-dynamical cascade model, which combines a linear spectrotemporal receptive field with a dynamical, conductance-based neuron model, then used generalized linear models to estimate encoding properties from the resulting spike trains. We focused on the effects of a low-threshold potassium current (KLT) that is present in a subset of cells in the zebra finch caudal mesopallium and is affected by early auditory experience. We found that KLT affects both spike adaptation and the temporal filtering properties of the receptive field. The direction of the effects depended on the temporal modulation tuning of the linear (input) stage of the cascade model, indicating a strongly nonlinear relationship. These results suggest that small changes in intrinsic dynamics in tandem with differences in synaptic connectivity can have dramatic effects on the tuning of auditory neurons. Experience-dependent developmental plasticity involves changes not only to synaptic connections, but to voltage-gated currents as well. Using biophysical models, it is straightforward to predict the effects of this intrinsic plasticity on the firing patterns of individual neurons, but it remains difficult to understand the consequences for sensory coding. We investigated this in the context of the zebra finch auditory cortex, where early exposure to a complex acoustic environment causes increased expression of a low-threshold potassium current. We simulated responses to song using a detailed biophysical model and then characterized encoding properties using generalized linear models. 
This analysis revealed that this potassium current has strong, nonlinear effects on how the model encodes the song’s temporal structure, and that the sign of these effects depends on the temporal tuning of the synaptic inputs. This nonlinearity gives intrinsic plasticity broad scope as a mechanism for developmental learning in the auditory system.
Affiliation(s)
- Christof Fehrman
  - Psychology Department, University of Virginia, Charlottesville, Virginia, United States of America
- Tyler D. Robbins
  - Cognitive Science Program, University of Virginia, Charlottesville, Virginia, United States of America
- C. Daniel Meliza
  - Psychology Department, University of Virginia, Charlottesville, Virginia, United States of America
  - Neuroscience Graduate Program, University of Virginia, Charlottesville, Virginia, United States of America
11
Saderi D, Schwartz ZP, Heller CR, Pennington JR, David SV. Dissociation of task engagement and arousal effects in auditory cortex and midbrain. eLife 2021; 10:e60153. [PMID: 33570493 PMCID: PMC7909948 DOI: 10.7554/elife.60153] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 06/17/2020] [Accepted: 02/10/2021] [Indexed: 12/18/2022] Open
Abstract
Both generalized arousal and engagement in a specific task influence sensory neural processing. To isolate effects of these state variables in the auditory system, we recorded single-unit activity from primary auditory cortex (A1) and inferior colliculus (IC) of ferrets during a tone detection task, while monitoring arousal via changes in pupil size. We used a generalized linear model to assess the influence of task engagement and pupil size on sound-evoked activity. In both areas, these two variables affected independent neural populations. Pupil size effects were more prominent in IC, while pupil and task engagement effects were equally likely in A1. Task engagement was correlated with larger pupil; thus, some apparent effects of task engagement should in fact be attributed to fluctuations in pupil size. These results indicate a hierarchy of auditory processing, where generalized arousal enhances activity in midbrain, and effects specific to task engagement become more prominent in cortex.
Affiliation(s)
- Daniela Saderi
  - Oregon Hearing Research Center, Oregon Health and Science University, Portland, United States
  - Neuroscience Graduate Program, Oregon Health and Science University, Portland, United States
- Zachary P Schwartz
  - Oregon Hearing Research Center, Oregon Health and Science University, Portland, United States
  - Neuroscience Graduate Program, Oregon Health and Science University, Portland, United States
- Charles R Heller
  - Oregon Hearing Research Center, Oregon Health and Science University, Portland, United States
  - Neuroscience Graduate Program, Oregon Health and Science University, Portland, United States
- Jacob R Pennington
  - Department of Mathematics and Statistics, Washington State University, Vancouver, United States
- Stephen V David
  - Oregon Hearing Research Center, Oregon Health and Science University, Portland, United States
12
Pennington JR, David SV. Complementary Effects of Adaptation and Gain Control on Sound Encoding in Primary Auditory Cortex. eNeuro 2020; 7:ENEURO.0205-20.2020. [PMID: 33109632 PMCID: PMC7675144 DOI: 10.1523/eneuro.0205-20.2020] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/12/2020] [Revised: 08/15/2020] [Accepted: 09/05/2020] [Indexed: 11/24/2022] Open
Abstract
An important step toward understanding how the brain represents complex natural sounds is to develop accurate models of auditory coding by single neurons. A commonly used model is the linear-nonlinear spectro-temporal receptive field (STRF; LN model). The LN model accounts for many features of auditory tuning, but it cannot account for long-lasting effects of sensory context on sound-evoked activity. Two mechanisms that may support these contextual effects are short-term plasticity (STP) and contrast-dependent gain control (GC), which have inspired expanded versions of the LN model. Both models improve performance over the LN model, but they have never been compared directly. Thus, it is unclear whether they account for distinct processes or describe one phenomenon in different ways. To address this question, we recorded activity of neurons in primary auditory cortex (A1) of awake ferrets during presentation of natural sounds. We then fit models incorporating one nonlinear mechanism (GC or STP) or both (GC+STP) using this single dataset, and measured the correlation between the models' predictions and the recorded neural activity. Both the STP and GC models performed significantly better than the LN model, but the GC+STP model outperformed both individual models. We also quantified the equivalence of STP and GC model predictions and found only modest similarity. Consistent results were observed for a dataset collected in clean and noisy acoustic contexts. These results establish general methods for evaluating the equivalence of arbitrarily complex encoding models and suggest that the STP and GC models describe complementary processes in the auditory system.
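As a rough illustration of the LN model named in this abstract, the sketch below applies a linear spectro-temporal filter to a spectrogram and then passes the result through a static nonlinearity. The function name `ln_predict`, the list-of-lists data layout, and the choice of rectification as the nonlinearity are assumptions made for illustration; the abstract does not specify the form used in the paper.

```python
def ln_predict(spectrogram, strf, baseline=0.0):
    """Predict a firing rate with a linear-nonlinear (LN) model.

    spectrogram: list of frequency channels, each a list of T samples.
    strf: list of frequency channels, each a list of lag weights (lag 0 first).
    Both layouts are illustrative assumptions.
    """
    n_time = len(spectrogram[0])
    rate = []
    for t in range(n_time):
        # Linear stage: weighted sum over frequency channels and time lags.
        drive = baseline
        for channel, weights in zip(spectrogram, strf):
            for lag, w in enumerate(weights):
                if t - lag >= 0:
                    drive += w * channel[t - lag]
        # Nonlinear stage: rectification (an assumed static nonlinearity).
        rate.append(max(0.0, drive))
    return rate


# Two channels, four time bins; channel 0 is excitatory, channel 1 inhibitory.
example_rate = ln_predict([[1, 0, 0, 0], [0, 1, 0, 0]],
                          [[1.0, 0.5], [-2.0, 0.0]])
```

Because the filter weights are fixed, a model of this form has no memory of stimulus context beyond the filter length, which is the limitation that the STP and GC extensions described above are meant to address.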
Affiliation(s)
- Jacob R Pennington
- Department of Mathematics, Washington State University, Vancouver, WA, 98686
- Stephen V David
- Department of Otolaryngology, Oregon Health and Science University, Portland, OR, 97239
13
Rahman M, Willmore BDB, King AJ, Harper NS. Simple transformations capture auditory input to cortex. Proc Natl Acad Sci U S A 2020; 117:28442-28451. [PMID: 33097665 PMCID: PMC7668077 DOI: 10.1073/pnas.1922033117] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 11/23/2022] Open
Abstract
Sounds are processed by the ear and central auditory pathway. These processing steps are biologically complex, and many aspects of the transformation from sound waveforms to cortical response remain unclear. To understand this transformation, we combined models of the auditory periphery with various encoding models to predict auditory cortical responses to natural sounds. The cochlear models ranged from detailed biophysical simulations of the cochlea and auditory nerve to simple spectrogram-like approximations of the information processing in these structures. For three different stimulus sets, we tested the capacity of these models to predict the time course of single-unit neural responses recorded in ferret primary auditory cortex. We found that simple models based on a log-spaced spectrogram with approximately logarithmic compression perform similarly to the best-performing biophysically detailed models of the auditory periphery, and more consistently well over diverse natural and synthetic sounds. Furthermore, we demonstrated that including approximations of the three categories of auditory nerve fiber in these simple models can substantially improve prediction, particularly when combined with a network encoding model. Our findings imply that the properties of the auditory periphery and central pathway may together result in a simpler than expected functional transformation from ear to cortex. Thus, much of the detailed biological complexity seen in the auditory periphery does not appear to be important for understanding the cortical representation of sound.
Affiliation(s)
- Monzilur Rahman
- Department of Physiology, Anatomy and Genetics, University of Oxford, OX1 3PT Oxford, United Kingdom
- Ben D B Willmore
- Department of Physiology, Anatomy and Genetics, University of Oxford, OX1 3PT Oxford, United Kingdom
- Andrew J King
- Department of Physiology, Anatomy and Genetics, University of Oxford, OX1 3PT Oxford, United Kingdom
- Nicol S Harper
- Department of Physiology, Anatomy and Genetics, University of Oxford, OX1 3PT Oxford, United Kingdom
14
Gourévitch B, Martin C, Postal O, Eggermont JJ. Oscillations in the auditory system and their possible role. Neurosci Biobehav Rev 2020; 113:507-528. [PMID: 32298712 DOI: 10.1016/j.neubiorev.2020.03.030] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 12/18/2019] [Revised: 03/25/2020] [Accepted: 03/30/2020] [Indexed: 12/26/2022]
Abstract
Neural oscillations are thought to have various roles in brain processing, such as attention modulation, neuronal communication, motor coordination, memory consolidation, decision-making, or feature binding. The role of oscillations in the auditory system is less clear, especially due to the large discrepancy between human and animal studies. Here we describe many methodological issues that confound the results of oscillation studies in the auditory field. Moreover, we discuss the relationship between neural entrainment and oscillations, which remains unclear. Finally, we aim to identify which kinds of oscillations could be specific or salient to the auditory areas and their processing. We suggest that the role of oscillations might dramatically differ between the primary auditory cortex and the more associative auditory areas. Despite the moderate presence of intrinsic low frequency oscillations in the primary auditory cortex, rhythmic components in the input seem crucial for auditory processing. This allows the phase entrainment between the oscillatory phase and rhythmic input, which is an integral part of stimulus selection within the auditory system.
15
Streaming of Repeated Noise in Primary and Secondary Fields of Auditory Cortex. J Neurosci 2020; 40:3783-3798. [PMID: 32273487 DOI: 10.1523/jneurosci.2105-19.2020] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 08/28/2019] [Revised: 02/06/2020] [Accepted: 02/11/2020] [Indexed: 11/21/2022] Open
Abstract
Statistical regularities in natural sounds facilitate the perceptual segregation of auditory sources, or streams. Repetition is one cue that drives stream segregation in humans, but the neural basis of this perceptual phenomenon remains unknown. We demonstrated a similar perceptual ability in animals by training ferrets of both sexes to detect a stream of repeating noise samples (foreground) embedded in a stream of random samples (background). During passive listening, we recorded neural activity in primary auditory cortex (A1) and secondary auditory cortex (posterior ectosylvian gyrus, PEG). We used two context-dependent encoding models to test for evidence of streaming of the repeating stimulus. The first was based on average evoked activity per noise sample and the second on the spectro-temporal receptive field. Both approaches tested whether differences in neural responses to repeating versus random stimuli were better modeled by scaling the response to both streams equally (global gain) or by separately scaling the response to the foreground versus background stream (stream-specific gain). Consistent with previous observations of adaptation, we found an overall reduction in global gain when the stimulus began to repeat. However, when we measured stream-specific changes in gain, responses to the foreground were enhanced relative to the background. This enhancement was stronger in PEG than A1. In A1, enhancement was strongest in units with low sparseness (i.e., broad sensory tuning) and with tuning selective for the repeated sample. Enhancement of responses to the foreground relative to the background provides evidence for stream segregation that emerges in A1 and is refined in PEG.
SIGNIFICANCE STATEMENT: To interact with the world successfully, the brain must parse behaviorally important information from a complex sensory environment. Complex mixtures of sounds often arrive at the ears simultaneously or in close succession, yet they are effortlessly segregated into distinct perceptual sources. This process breaks down in hearing-impaired individuals and speech recognition devices. By identifying the underlying neural mechanisms that facilitate perceptual segregation, we can develop strategies for ameliorating hearing loss and improving speech recognition technology in the presence of background noise. Here, we present evidence to support a hierarchical process, present in primary auditory cortex and refined in secondary auditory cortex, in which sound repetition facilitates segregation.
16
Heelan C, Lee J, O’Shea R, Lynch L, Brandman DM, Truccolo W, Nurmikko AV. Decoding speech from spike-based neural population recordings in secondary auditory cortex of non-human primates. Commun Biol 2019; 2:466. [PMID: 31840111 PMCID: PMC6906475 DOI: 10.1038/s42003-019-0707-9] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 06/07/2019] [Accepted: 11/15/2019] [Indexed: 11/21/2022] Open
Abstract
Direct electronic communication with sensory areas of the neocortex is a challenging ambition for brain-computer interfaces. Here, we report the first successful neural decoding of English words with high intelligibility from intracortical spike-based neural population activity recorded from the secondary auditory cortex of macaques. We acquired 96-channel full-broadband population recordings using intracortical microelectrode arrays in the rostral and caudal parabelt regions of the superior temporal gyrus (STG). We leveraged a new neural processing toolkit to investigate the choice of decoding algorithm, neural preprocessing, audio representation, channel count, and array location on neural decoding performance. The presented spike-based machine learning neural decoding approach may further be useful in informing future encoding strategies to deliver direct auditory percepts to the brain as specific patterns of microstimulation.
Affiliation(s)
- Christopher Heelan
- School of Engineering, Brown University, Providence, RI, USA
- Connexon Systems, Providence, RI, USA
- Jihun Lee
- School of Engineering, Brown University, Providence, RI, USA
- Ronan O’Shea
- School of Engineering, Brown University, Providence, RI, USA
- Laurie Lynch
- School of Engineering, Brown University, Providence, RI, USA
- David M. Brandman
- Department of Surgery (Neurosurgery), Dalhousie University, Halifax, Nova Scotia, Canada
- Wilson Truccolo
- Department of Neuroscience, Brown University, Providence, RI, USA
- Carney Institute for Brain Science, Brown University, Providence, RI, USA
- Arto V. Nurmikko
- School of Engineering, Brown University, Providence, RI, USA
- Carney Institute for Brain Science, Brown University, Providence, RI, USA
17
Lopez Espejo M, Schwartz ZP, David SV. Spectral tuning of adaptation supports coding of sensory context in auditory cortex. PLoS Comput Biol 2019; 15:e1007430. [PMID: 31626624 PMCID: PMC6821137 DOI: 10.1371/journal.pcbi.1007430] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 01/30/2019] [Revised: 10/30/2019] [Accepted: 09/23/2019] [Indexed: 12/19/2022] Open
Abstract
Perception of vocalizations and other behaviorally relevant sounds requires integrating acoustic information over hundreds of milliseconds. Sound-evoked activity in auditory cortex typically has much shorter latency, but the acoustic context, i.e., sound history, can modulate sound-evoked activity over longer periods. Contextual effects are attributed to modulatory phenomena, such as stimulus-specific adaptation and contrast gain control. However, an encoding model that links context to natural sound processing has yet to be established. We tested whether a model in which spectrally tuned inputs undergo adaptation mimicking short-term synaptic plasticity (STP) can account for contextual effects during natural sound processing. Single-unit activity was recorded from primary auditory cortex of awake ferrets during presentation of noise with natural temporal dynamics and fully natural sounds. Encoding properties were characterized by a standard linear-nonlinear spectro-temporal receptive field (LN) model and variants that incorporated STP-like adaptation. In the adapting models, STP was applied either globally across all input spectral channels or locally to subsets of channels. For most neurons, models incorporating local STP predicted neural activity as well as or better than LN and global STP models. The strength of nonlinear adaptation varied across neurons. Within neurons, adaptation was generally stronger for spectral channels with excitatory than inhibitory gain. Neurons showing improved STP model performance also tended to undergo stimulus-specific adaptation, suggesting a common mechanism for these phenomena. When STP models were compared between passive and active behavior conditions, response gain often changed, but average STP parameters were stable. Thus, spectrally and temporally heterogeneous adaptation, subserved by a mechanism with STP-like dynamics, may support representation of the complex spectro-temporal patterns that comprise natural sounds across wide-ranging sensory contexts.
Affiliation(s)
- Mateo Lopez Espejo
- Neuroscience Graduate Program, Oregon Health and Science University, Portland, OR, United States of America
- Zachary P. Schwartz
- Neuroscience Graduate Program, Oregon Health and Science University, Portland, OR, United States of America
- Stephen V. David
- Oregon Hearing Research Center, Oregon Health and Science University, Portland, OR, United States of America
18
Shi Q, Gupta P, Boukhvalova AK, Singer JH, Butts DA. Functional characterization of retinal ganglion cells using tailored nonlinear modeling. Sci Rep 2019; 9:8713. [PMID: 31213620 PMCID: PMC6581951 DOI: 10.1038/s41598-019-45048-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 11/11/2018] [Accepted: 05/31/2019] [Indexed: 01/30/2023] Open
Abstract
The mammalian retina encodes the visual world in action potentials generated by 20-50 functionally and anatomically-distinct types of retinal ganglion cell (RGC). Individual RGC types receive synaptic input from distinct presynaptic circuits; therefore, their responsiveness to specific features in the visual scene arises from the information encoded in synaptic input and shaped by postsynaptic signal integration and spike generation. Unfortunately, there is a dearth of tools for characterizing the computations reflected in RGC spike output. Therefore, we developed a statistical model, the separable Nonlinear Input Model, to characterize the excitatory and suppressive components of RGC receptive fields. We recorded RGC responses to a correlated noise ("cloud") stimulus in an in vitro preparation of mouse retina and found that our model accurately predicted RGC responses at high spatiotemporal resolution. It identified multiple receptive fields reflecting the main excitatory and suppressive components of the response of each neuron. Significantly, our model accurately identified ON-OFF cells and distinguished their distinct ON and OFF receptive fields, and it demonstrated a diversity of suppressive receptive fields in the RGC population. In total, our method offers a rich description of RGC computation and sets a foundation for relating it to retinal circuitry.
Affiliation(s)
- Qing Shi
- Department of Biology, University of Maryland, College Park, MD, United States
- Pranjal Gupta
- Department of Biology, University of Maryland, College Park, MD, United States
- Joshua H Singer
- Department of Biology, University of Maryland, College Park, MD, United States
- Program in Neuroscience and Cognitive Science, University of Maryland, College Park, MD, United States
- Daniel A Butts
- Department of Biology, University of Maryland, College Park, MD, United States
- Program in Neuroscience and Cognitive Science, University of Maryland, College Park, MD, United States
19
Norman-Haignere SV, McDermott JH. Neural responses to natural and model-matched stimuli reveal distinct computations in primary and nonprimary auditory cortex. PLoS Biol 2018; 16:e2005127. [PMID: 30507943 PMCID: PMC6292651 DOI: 10.1371/journal.pbio.2005127] [Citation(s) in RCA: 58] [Impact Index Per Article: 8.3] [Received: 12/15/2017] [Revised: 12/13/2018] [Accepted: 11/08/2018] [Indexed: 11/19/2022] Open
Abstract
A central goal of sensory neuroscience is to construct models that can explain neural responses to natural stimuli. As a consequence, sensory models are often tested by comparing neural responses to natural stimuli with model responses to those stimuli. One challenge is that distinct model features are often correlated across natural stimuli, and thus model features can predict neural responses even if they do not in fact drive them. Here, we propose a simple alternative for testing a sensory model: we synthesize a stimulus that yields the same model response as each of a set of natural stimuli, and test whether the natural and "model-matched" stimuli elicit the same neural responses. We used this approach to test whether a common model of auditory cortex, in which spectrogram-like peripheral input is processed by linear spectrotemporal filters, can explain fMRI responses in humans to natural sounds. Prior studies have shown that this model has good predictive power throughout auditory cortex, but this finding could reflect feature correlations in natural stimuli. We observed that fMRI responses to natural and model-matched stimuli were nearly equivalent in primary auditory cortex (PAC) but that nonprimary regions, including those selective for music or speech, showed highly divergent responses to the two sound sets. This dissociation between primary and nonprimary regions was less clear from model predictions due to the influence of feature correlations across natural stimuli. Our results provide a signature of hierarchical organization in human auditory cortex, and suggest that nonprimary regions compute higher-order stimulus properties that are not well captured by traditional models. Our methodology enables stronger tests of sensory models and could be broadly applied in other domains.
Affiliation(s)
- Sam V. Norman-Haignere
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Zuckerman Institute of Mind, Brain and Behavior, Columbia University, New York, New York, United States of America
- Laboratoire des Systèmes Perceptifs, Département d’Études Cognitives, ENS, PSL University, CNRS, Paris, France
- Josh H. McDermott
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Program in Speech and Hearing Biosciences and Technology, Harvard University, Cambridge, Massachusetts, United States of America
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
20
Maheswaranathan N, Kastner DB, Baccus SA, Ganguli S. Inferring hidden structure in multilayered neural circuits. PLoS Comput Biol 2018; 14:e1006291. [PMID: 30138312 PMCID: PMC6124781 DOI: 10.1371/journal.pcbi.1006291] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 03/20/2017] [Revised: 09/05/2018] [Accepted: 06/09/2018] [Indexed: 01/26/2023] Open
Abstract
A central challenge in sensory neuroscience involves understanding how neural circuits shape computations across cascaded cell layers. Here we attempt to reconstruct the response properties of experimentally unobserved neurons in the interior of a multilayered neural circuit, using cascaded linear-nonlinear (LN-LN) models. We combine non-smooth regularization with proximal consensus algorithms to overcome difficulties in fitting such models that arise from the high dimensionality of their parameter space. We apply this framework to retinal ganglion cell processing, learning LN-LN models of retinal circuitry consisting of thousands of parameters, using 40 minutes of responses to white noise. Our models demonstrate a 53% improvement in predicting ganglion cell spikes over classical linear-nonlinear (LN) models. Internal nonlinear subunits of the model match properties of retinal bipolar cells in both receptive field structure and number. Subunits have consistently high thresholds, suppressing all but a small fraction of inputs, leading to sparse activity patterns in which only one subunit drives ganglion cell spiking at any time. From the model’s parameters, we predict that the removal of visual redundancies through stimulus decorrelation across space, a central tenet of efficient coding theory, originates primarily from bipolar cell synapses. Furthermore, the composite nonlinear computation performed by retinal circuitry corresponds to a boolean OR function applied to bipolar cell feature detectors. Our methods are statistically and computationally efficient, enabling us to rapidly learn hierarchical non-linear models as well as efficiently compute widely used descriptive statistics such as the spike triggered average (STA) and covariance (STC) for high dimensional stimuli. This general computational framework may aid in extracting principles of nonlinear hierarchical sensory processing across diverse modalities from limited data.
Computation in neural circuits arises from the cascaded processing of inputs through multiple cell layers. Each of these cell layers performs operations such as filtering and thresholding in order to shape a circuit’s output. It remains a challenge to describe both the computations and the mechanisms that mediate them given limited data recorded from a neural circuit. A standard approach to describing circuit computation involves building quantitative encoding models that predict the circuit response given its input, but these often fail to map in an interpretable way onto mechanisms within the circuit. In this work, we build two layer linear-nonlinear cascade models (LN-LN) in order to describe how the retinal output is shaped by nonlinear mechanisms in the inner retina. We find that these LN-LN models, fit to ganglion cell recordings alone, identify filters and nonlinearities that are readily mapped onto individual circuit components inside the retina, namely bipolar cells and the bipolar-to-ganglion cell synaptic threshold. This work demonstrates how combining simple prior knowledge of circuit properties with partial experimental recordings of a neural circuit’s output can yield interpretable models of the entire circuit computation, including parts of the circuit that are hidden or not directly observed in neural recordings.
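The spike-triggered average (STA) mentioned in this abstract can be sketched for a one-dimensional stimulus as below. This is a minimal illustration, not the authors' efficient implementation; the function name `spike_triggered_average`, the binned spike-count input, and the window convention (the `n_lags` samples ending at the spike bin) are assumptions.

```python
def spike_triggered_average(stimulus, spikes, n_lags):
    """Average the stimulus window that ends at each spike.

    stimulus: list of stimulus values, one per time bin.
    spikes: list of spike counts per time bin (same length as stimulus).
    n_lags: window length in bins; the window ends at the spike bin.
    """
    total = [0.0] * n_lags
    n_spikes = 0
    for t, count in enumerate(spikes):
        # Skip bins without spikes or without a full preceding window.
        if count and t >= n_lags - 1:
            window = stimulus[t - n_lags + 1 : t + 1]
            for i, value in enumerate(window):
                total[i] += count * value
            n_spikes += count
    if n_spikes == 0:
        raise ValueError("no spikes with a full stimulus window")
    return [x / n_spikes for x in total]


# One spike at bin 2 and two spikes at bin 5, with a two-bin window.
example_sta = spike_triggered_average([0, 1, 2, 3, 4, 5],
                                      [0, 0, 1, 0, 0, 2], n_lags=2)
```

For white-noise stimuli the STA is proportional to the linear filter of an LN model, which is why it serves as a common baseline for the cascaded models described above.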
Affiliation(s)
- Niru Maheswaranathan
- Neurosciences Graduate Program, Stanford University, Stanford, California, United States of America
- David B. Kastner
- Neurosciences Graduate Program, Stanford University, Stanford, California, United States of America
- Stephen A. Baccus
- Department of Neurobiology, Stanford University, Stanford, California, United States of America
- Surya Ganguli
- Department of Applied Physics, Stanford University, Stanford, California, United States of America
21
Wong DDE, Fuglsang SA, Hjortkjær J, Ceolini E, Slaney M, de Cheveigné A. A Comparison of Regularization Methods in Forward and Backward Models for Auditory Attention Decoding. Front Neurosci 2018; 12:531. [PMID: 30131670 PMCID: PMC6090837 DOI: 10.3389/fnins.2018.00531] [Citation(s) in RCA: 55] [Impact Index Per Article: 7.9] [Received: 03/16/2018] [Accepted: 07/16/2018] [Indexed: 11/17/2022] Open
Abstract
The decoding of selective auditory attention from noninvasive electroencephalogram (EEG) data is of interest in brain computer interface and auditory perception research. The current state-of-the-art approaches for decoding the attentional selection of listeners are based on linear mappings between features of sound streams and EEG responses (forward model), or vice versa (backward model). It has been shown that when the envelope of attended speech and EEG responses are used to derive such mapping functions, the model estimates can be used to discriminate between attended and unattended talkers. However, the predictive/reconstructive performance of the models is dependent on how the model parameters are estimated. There exist a number of model estimation methods that have been published, along with a variety of datasets. It is currently unclear if any of these methods perform better than others, as they have not yet been compared side by side on a single standardized dataset in a controlled fashion. Here, we present a comparative study of the ability of different estimation methods to classify attended speakers from multi-channel EEG data. The performance of the model estimation methods is evaluated using different performance metrics on a set of labeled EEG data from 18 subjects listening to mixtures of two speech streams. We find that when forward models predict the EEG from the attended audio, regularized models do not improve regression or classification accuracies. When backward models decode the attended speech from the EEG, regularization provides higher regression and classification accuracies.
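To make the role of regularization in a backward model concrete, the sketch below fits ridge-regularized linear weights that map neural features to a stimulus envelope by solving the regularized normal equations. Ridge regression is used here only as one familiar example of a regularized estimator; the abstract does not list which methods the study compared. The function name `ridge_fit` and the toy data are hypothetical.

```python
def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y by Gauss-Jordan elimination.

    X: list of rows, each a list of features (e.g. EEG channels, possibly lagged).
    y: list of targets (e.g. the attended speech envelope), one per row.
    lam: non-negative ridge penalty; lam = 0 gives ordinary least squares.
    """
    n_feat = len(X[0])
    # Augmented matrix [X^T X + lam * I | X^T y].
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(n_feat)]
         + [sum(row[i] * target for row, target in zip(X, y))]
         for i in range(n_feat)]
    for col in range(n_feat):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n_feat), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n_feat):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * p for a, p in zip(A[r], A[col])]
    return [A[i][n_feat] / A[i][i] for i in range(n_feat)]


X_demo = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y_demo = [2.0, 3.0, 5.0]
w_unregularized = ridge_fit(X_demo, y_demo, lam=0.0)
w_regularized = ridge_fit(X_demo, y_demo, lam=10.0)
```

Increasing `lam` shrinks the weights toward zero, trading some fit on the training data for lower variance. That trade-off is most relevant when the inputs are numerous and correlated, as in a backward model that takes many EEG channels and lags as inputs.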
Affiliation(s)
- Daniel D. E. Wong
- Laboratoire des Systèmes Perceptifs, CNRS, UMR 8248, Paris, France
- Département d'Études Cognitives, École Normale Supérieure, PSL Research University, Paris, France
- Søren A. Fuglsang
- Department of Electrical Engineering, Danmarks Tekniske Universitet, Kongens Lyngby, Denmark
- Jens Hjortkjær
- Department of Electrical Engineering, Danmarks Tekniske Universitet, Kongens Lyngby, Denmark
- Danish Research Centre for Magnetic Resonance, Copenhagen University Hospital Hvidovre, Hvidovre, Denmark
- Enea Ceolini
- Institute of Neuroinformatics, University of Zürich, Zurich, Switzerland
- Malcolm Slaney
- AI Machine Perception, Google, Mountain View, CA, United States
- Alain de Cheveigné
- Laboratoire des Systèmes Perceptifs, CNRS, UMR 8248, Paris, France
- Département d'Études Cognitives, École Normale Supérieure, PSL Research University, Paris, France
- Ear Institute, University College London, London, United Kingdom
22
Schwartz ZP, David SV. Focal Suppression of Distractor Sounds by Selective Attention in Auditory Cortex. Cereb Cortex 2018; 28:323-339. [PMID: 29136104 PMCID: PMC6057511 DOI: 10.1093/cercor/bhx288] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 05/11/2017] [Indexed: 11/15/2022] Open
Abstract
Auditory selective attention is required for parsing crowded acoustic environments, but cortical systems mediating the influence of behavioral state on auditory perception are not well characterized. Previous neurophysiological studies suggest that attention produces a general enhancement of neural responses to important target sounds versus irrelevant distractors. However, behavioral studies suggest that in the presence of masking noise, attention provides a focal suppression of distractors that compete with targets. Here, we compared effects of attention on cortical responses to masking versus non-masking distractors, controlling for effects of listening effort and general task engagement. We recorded single-unit activity from primary auditory cortex (A1) of ferrets during behavior and found that selective attention decreased responses to distractors masking targets in the same spectral band, compared with spectrally distinct distractors. This suppression enhanced neural target detection thresholds, suggesting that limited attention resources serve to focally suppress responses to distractors that interfere with target detection. Changing effort by manipulating target salience consistently modulated spontaneous but not evoked activity. Task engagement and changing effort tended to affect the same neurons, while attention affected an independent population, suggesting that distinct feedback circuits mediate effects of attention and effort in A1.
Affiliation(s)
- Zachary P Schwartz
- Neuroscience Graduate Program, Oregon Health and Science University, OR, USA
- Stephen V David
- Oregon Hearing Research Center, Oregon Health and Science University, OR, USA
- Address Correspondence to Stephen V. David, Oregon Hearing Research Center, Oregon Health and Science University, 3181 SW Sam Jackson Park Road, MC L335A, Portland, OR 97239, USA.
23
Heelan C, Nurmikko AV, Truccolo W. FPGA implementation of deep-learning recurrent neural networks with sub-millisecond real-time latency for BCI-decoding of large-scale neural sensors (10^4 nodes). Annu Int Conf IEEE Eng Med Biol Soc 2018; 2018:1070-1073. [PMID: 30440576 DOI: 10.1109/embc.2018.8512415] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 11/08/2022]
Abstract
Advances in neurotechnology are expected to provide access to thousands of neural channel recordings including neuronal spiking, multiunit activity and local field potentials. In addition, recent studies have shown that deep learning, in particular recurrent neural networks (RNNs), provides promising approaches for decoding of large-scale neural data. These approaches involve computationally intensive algorithms with millions of parameters. In this context, an important challenge in the application of neural decoding to next generation brain-computer interfaces for complex human tasks is the development of low-latency real-time implementations. We demonstrate a Field-Programmable Gate Array (FPGA) implementation of Long Short-Term Memory (LSTM) RNNs for decoding 10,000 channels of neural data on a mobile low-power embedded system platform called "NeuroCoder". We provide a proof of concept in the context of decoding a 20-dimensional spectrotemporal representation of spoken words from 10,000 simulated neural channels. In this particular case, the LSTM model included 4,042,420 parameters. In addition to providing multiple communication interfaces for the BCI system, the NeuroCoder platform can achieve sub-millisecond real-time latencies.
24
See JZ, Atencio CA, Sohal VS, Schreiner CE. Coordinated neuronal ensembles in primary auditory cortical columns. eLife 2018; 7:e35587. [PMID: 29869986 PMCID: PMC6017807 DOI: 10.7554/elife.35587] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 02/01/2018] [Accepted: 06/03/2018] [Indexed: 12/15/2022] Open
Abstract
The synchronous activity of groups of neurons is increasingly thought to be important in cortical information processing and transmission. However, most studies of processing in the primary auditory cortex (AI) have viewed neurons as independent filters; little is known about how coordinated AI neuronal activity is expressed throughout cortical columns and how it might enhance the processing of auditory information. To address this, we recorded from populations of neurons in AI cortical columns of anesthetized rats and, using dimensionality reduction techniques, identified multiple coordinated neuronal ensembles (cNEs), which are groups of neurons with reliable synchronous activity. We show that cNEs reflect local network configurations with enhanced information encoding properties that cannot be accounted for by stimulus-driven synchronization alone. Furthermore, similar cNEs were identified in both spontaneous and evoked activity, indicating that columnar cNEs are stable functional constructs that may represent principal units of information processing in AI.
Affiliation(s)
- Jermyn Z See
- UCSF Center for Integrative Neuroscience, University of California, San Francisco, San Francisco, United States
- Coleman Memorial Laboratory, University of California, San Francisco, San Francisco, United States
- Department of Otolaryngology – Head and Neck Surgery, University of California, San Francisco, San Francisco, United States
- Department of Psychiatry, University of California, San Francisco, United States
- Craig A Atencio
- UCSF Center for Integrative Neuroscience, University of California, San Francisco, San Francisco, United States
- Coleman Memorial Laboratory, University of California, San Francisco, San Francisco, United States
- Department of Otolaryngology – Head and Neck Surgery, University of California, San Francisco, San Francisco, United States
- Vikaas S Sohal
- UCSF Center for Integrative Neuroscience, University of California, San Francisco, San Francisco, United States
- Department of Psychiatry, University of California, San Francisco, United States
- Christoph E Schreiner
- UCSF Center for Integrative Neuroscience, University of California, San Francisco, San Francisco, United States
- Coleman Memorial Laboratory, University of California, San Francisco, San Francisco, United States
- Department of Otolaryngology – Head and Neck Surgery, University of California, San Francisco, San Francisco, United States
25
McFarland DJ. How neuroscience can inform the study of individual differences in cognitive abilities. Rev Neurosci 2018; 28:343-362. [PMID: 28195556 DOI: 10.1515/revneuro-2016-0073] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 11/01/2016] [Accepted: 12/17/2016] [Indexed: 02/06/2023]
Abstract
Theories of human mental abilities should be consistent with what is known in neuroscience. Currently, tests of human mental abilities are modeled by cognitive constructs such as attention, working memory, and speed of information processing. These constructs are in turn related to a single general ability. However, brains are very complex systems and whether most of the variability between the operations of different brains can be ascribed to a single factor is questionable. Research in neuroscience suggests that psychological processes such as perception, attention, decision, and executive control are emergent properties of interacting distributed networks. The modules that make up these networks use similar computational processes that involve multiple forms of neural plasticity, each having different time constants. Accordingly, these networks might best be characterized in terms of the information they process rather than in terms of abstract psychological processes such as working memory and executive control.
26
David SV. Incorporating behavioral and sensory context into spectro-temporal models of auditory encoding. Hear Res 2018; 360:107-123. [PMID: 29331232 PMCID: PMC6292525 DOI: 10.1016/j.heares.2017.12.021] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 07/27/2017] [Revised: 12/18/2017] [Accepted: 12/26/2017] [Indexed: 01/11/2023]
Abstract
For several decades, auditory neuroscientists have used spectro-temporal encoding models to understand how neurons in the auditory system represent sound. Derived from early applications of systems identification tools to the auditory periphery, the spectro-temporal receptive field (STRF) and more sophisticated variants have emerged as an efficient means of characterizing representation throughout the auditory system. Most of these encoding models describe neurons as static sensory filters. However, auditory neural coding is not static. Sensory context, reflecting the acoustic environment, and behavioral context, reflecting the internal state of the listener, can both influence sound-evoked activity, particularly in central auditory areas. This review explores recent efforts to integrate context into spectro-temporal encoding models. It begins with a brief tutorial on the basics of estimating and interpreting STRFs. Then it describes three recent studies that have characterized contextual effects on STRFs, emerging over a range of timescales, from many minutes to tens of milliseconds. An important theme of this work is not simply that context influences auditory coding, but also that contextual effects span a large continuum of internal states. The added complexity of these context-dependent models introduces new experimental and theoretical challenges that must be addressed in order to be used effectively. Several new methodological advances promise to address these limitations and allow the development of more comprehensive context-dependent models in the future.
Affiliation(s)
- Stephen V David
- Oregon Hearing Research Center, Oregon Health & Science University, 3181 SW Sam Jackson Park Rd, MC L335A, Portland, OR 97239, United States.
27
Holdgraf CR, Rieger JW, Micheli C, Martin S, Knight RT, Theunissen FE. Encoding and Decoding Models in Cognitive Electrophysiology. Front Syst Neurosci 2017; 11:61. [PMID: 29018336 PMCID: PMC5623038 DOI: 10.3389/fnsys.2017.00061] [Citation(s) in RCA: 76] [Impact Index Per Article: 9.5] [Received: 03/07/2017] [Accepted: 08/07/2017] [Indexed: 11/13/2022] Open
Abstract
Cognitive neuroscience has seen rapid growth in the size and complexity of data recorded from the human brain as well as in the computational tools available to analyze this data. This data explosion has resulted in an increased use of multivariate, model-based methods for asking neuroscience questions, allowing scientists to investigate multiple hypotheses with a single dataset, to use complex, time-varying stimuli, and to study the human brain under more naturalistic conditions. These tools come in the form of "Encoding" models, in which stimulus features are used to model brain activity, and "Decoding" models, in which neural features are used to generate a stimulus output. Here we review the current state of encoding and decoding models in cognitive electrophysiology and provide a practical guide toward conducting experiments and analyses in this emerging field. Our examples focus on using linear models in the study of human language and audition. We show how to calculate auditory receptive fields from natural sounds as well as how to decode neural recordings to predict speech. The paper aims to be a useful tutorial to these approaches, and a practical introduction to using machine learning and applied statistics to build models of neural activity. The data analytic approaches we discuss may also be applied to other sensory modalities, motor systems, and cognitive systems, and we cover some examples in these areas. In addition, a collection of Jupyter notebooks is publicly available as a complement to the material covered in this paper, providing code examples and tutorials for predictive modeling in Python. The aim is to provide a practical understanding of predictive modeling of human brain data and to propose best practices in conducting these analyses.
Affiliation(s)
- Christopher R. Holdgraf
- Department of Psychology, Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States
- Office of the Vice Chancellor for Research, Berkeley Institute for Data Science, University of California, Berkeley, Berkeley, CA, United States
- Jochem W. Rieger
- Department of Psychology, Carl-von-Ossietzky University, Oldenburg, Germany
- Cristiano Micheli
- Department of Psychology, Carl-von-Ossietzky University, Oldenburg, Germany
- Institut des Sciences Cognitives Marc Jeannerod, Lyon, France
- Stephanie Martin
- Department of Psychology, Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States
- Defitech Chair in Brain-Machine Interface, Center for Neuroprosthetics, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Robert T. Knight
- Department of Psychology, Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States
- Frederic E. Theunissen
- Department of Psychology, Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States
- Department of Psychology, University of California, Berkeley, Berkeley, CA, United States
28
Cluster-based analysis improves predictive validity of spike-triggered receptive field estimates. PLoS One 2017; 12:e0183914. [PMID: 28877194 PMCID: PMC5587334 DOI: 10.1371/journal.pone.0183914] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/20/2016] [Accepted: 08/14/2017] [Indexed: 11/19/2022] Open
Abstract
Spectrotemporal receptive field (STRF) characterization is a central goal of auditory physiology. STRFs are often approximated by the spike-triggered average (STA), which reflects the average stimulus preceding a spike. In many cases, the raw STA is subjected to a threshold defined by gain values expected by chance. However, such correction methods have not been universally adopted, and the consequences of specific gain-thresholding approaches have not been investigated systematically. Here, we evaluate two classes of statistical correction techniques, using the resulting STRF estimates to predict responses to a novel validation stimulus. The first, more traditional technique eliminated STRF pixels (time-frequency bins) with gain values expected by chance. This correction method yielded significant increases in prediction accuracy, including when the threshold setting was optimized for each unit. The second technique was a two-step thresholding procedure wherein clusters of contiguous pixels surviving an initial gain threshold were then subjected to a cluster mass threshold based on summed pixel values. This approach significantly improved upon even the best gain-thresholding techniques. Additional analyses suggested that allowing threshold settings to vary independently for excitatory and inhibitory subfields of the STRF resulted in only marginal additional gains, at best. In summary, augmenting reverse correlation techniques with principled statistical correction choices increased prediction accuracy by over 80% for multi-unit STRFs and by over 40% for single-unit STRFs, furthering the interpretational relevance of the recovered spectrotemporal filters for auditory systems analysis.
29
Harper NS, Schoppe O, Willmore BDB, Cui Z, Schnupp JWH, King AJ. Network Receptive Field Modeling Reveals Extensive Integration and Multi-feature Selectivity in Auditory Cortical Neurons. PLoS Comput Biol 2016; 12:e1005113. [PMID: 27835647 PMCID: PMC5105998 DOI: 10.1371/journal.pcbi.1005113] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 12/12/2015] [Accepted: 08/22/2016] [Indexed: 11/28/2022] Open
Abstract
Cortical sensory neurons are commonly characterized using the receptive field, the linear dependence of their response on the stimulus. In primary auditory cortex, neurons can be characterized by their spectrotemporal receptive fields, the spectral and temporal features of a sound that linearly drive a neuron. However, receptive fields do not capture the fact that the response of a cortical neuron results from the complex nonlinear network in which it is embedded. By fitting a nonlinear feedforward network model (a network receptive field) to cortical responses to natural sounds, we reveal that primary auditory cortical neurons are sensitive over a substantially larger spectrotemporal domain than is seen in their standard spectrotemporal receptive fields. Furthermore, the network receptive field, a parsimonious network consisting of 1-7 sub-receptive fields that interact nonlinearly, consistently better predicts neural responses to auditory stimuli than the standard receptive fields. The network receptive field reveals separate excitatory and inhibitory sub-fields with different nonlinear properties, and interaction of the sub-fields gives rise to important operations such as gain control and conjunctive feature detection. The conjunctive effects, where neurons respond only if several specific features are present together, enable increased selectivity for particular complex spectrotemporal structures, and may constitute an important stage in sound recognition. In conclusion, we demonstrate that fitting auditory cortical neural responses with feedforward network models expands on simple linear receptive field models in a manner that yields substantially improved predictive power and reveals key nonlinear aspects of cortical processing, while remaining easy to interpret in a physiological context.
Affiliation(s)
- Nicol S. Harper
- Dept. of Physiology, Anatomy and Genetics (DPAG), Sherrington Building, University of Oxford, United Kingdom
- Institute of Biomedical Engineering, Department of Engineering Science, Old Road Campus Research Building, University of Oxford, Headington, United Kingdom
- Oliver Schoppe
- Dept. of Physiology, Anatomy and Genetics (DPAG), Sherrington Building, University of Oxford, United Kingdom
- Bio-Inspired Information Processing, Technische Universität München, Germany
- Ben D. B. Willmore
- Dept. of Physiology, Anatomy and Genetics (DPAG), Sherrington Building, University of Oxford, United Kingdom
- Zhanfeng Cui
- Institute of Biomedical Engineering, Department of Engineering Science, Old Road Campus Research Building, University of Oxford, Headington, United Kingdom
- Jan W. H. Schnupp
- Dept. of Physiology, Anatomy and Genetics (DPAG), Sherrington Building, University of Oxford, United Kingdom
- Department of Biomedical Science, City University of Hong Kong, Kowloon Tong, Hong Kong
- Andrew J. King
- Dept. of Physiology, Anatomy and Genetics (DPAG), Sherrington Building, University of Oxford, United Kingdom