1. Contemori G, Guenot J, Cottereau BR, Trotter Y, Battaglini L, Bertamini M. Neural and Perceptual Adaptations in Bilateral Macular Degeneration: An Integrative Review. Neuropsychologia 2025:109165. PMID: 40345486. DOI: 10.1016/j.neuropsychologia.2025.109165.
Abstract
Bilateral age-related macular degeneration (AMD) results in central vision loss, affecting the fovea-associated cortical regions. This review examines neuroimaging and psychophysical evidence of spontaneous neural adaptation in acquired bilateral central scotoma. Early visual brain areas show reduced cortical thickness and axonal integrity due to postsynaptic (anterograde) degeneration. Contrary to animal models, evidence for spontaneous adaptation in the primary visual cortex (V1) is limited. Activity in the lesion projection zone (LPZ), previously seen as extensive cortical remapping, may result from non-retinotopic peripheral-to-foveal feedback, sharing substrates with healthy retinal feedforward processes. Preferred retinal loci (PRLs) are influenced more by location and task than by residual vision quality. Reduced lateral masking in the PRL may reflect decreased contrast sensitivity from retinal damage, rather than genuine adaptive mechanisms. Weakened crowding in the PRL is explained by transient adaptation in healthy subjects to artificial scotomas, not by long-term plasticity. Higher visual areas may show compensatory mechanisms enhancing complex tasks like symmetry, face, and motion discrimination. Leveraging spontaneous adaptation through perceptual learning-based treatments can preserve residual visual abilities. Because of limited evidence for spontaneous reorganization in AMD, behavioural training and emerging techniques are crucial for optimal treatment efficacy.
Affiliation(s)
- Giulio Contemori
- Department of General Psychology, University of Padova, Padova, Italy.
- Jade Guenot
- Smith-Kettlewell Eye Research Institute, San Francisco, CA, USA
- Benoit R Cottereau
- CerCo UMR 5549, CNRS - Université Toulouse III, Toulouse, France; IPAL, CNRS IRL 2955, Singapore, Singapore
- Yves Trotter
- CerCo UMR 5549, CNRS - Université Toulouse III, Toulouse, France
- Luca Battaglini
- Department of General Psychology, University of Padova, Padova, Italy; Centro di Ateneo dei Servizi Clinici Universitari Psicologici (SCUP), University of Padova, Padova, Italy; Neuro.Vis.U.S, University of Padova, Padova, Italy
- Marco Bertamini
- Department of General Psychology, University of Padova, Padova, Italy
2. Shahidi N, Rozenblit F, Khani MH, Schreyer HM, Mietsch M, Protti DA, Gollisch T. Filter-based models of suppression in retinal ganglion cells: Comparison and generalization across species and stimuli. PLoS Comput Biol 2025; 21:e1013031. PMID: 40315420. DOI: 10.1371/journal.pcbi.1013031.
Abstract
The dichotomy of excitation and suppression is one of the canonical mechanisms explaining the complexity of neural activity. Computational models of the interplay of excitation and suppression in single neurons aim at investigating how this interaction affects a neuron's spiking responses and shapes the encoding of sensory stimuli. Here, we compare the performance of three filter-based stimulus-encoding models for predicting retinal ganglion cell responses recorded from axolotl, mouse, and marmoset retina to different types of temporally varying visual stimuli. Suppression in these models is implemented via subtractive or divisive interactions of stimulus filters or by a response-driven feedback module. For the majority of ganglion cells, the subtractive and divisive models perform similarly and outperform the feedback model as well as a linear-nonlinear (LN) model with no suppression. Comparison between the subtractive and the divisive model depends on cell type, species, and stimulus components, with the divisive model generalizing best across temporal stimulus frequencies and visual contrast and the subtractive model capturing in particular responses for slow temporal stimulus dynamics and for slow axolotl cells. Overall, we conclude that the divisive and subtractive models are well suited for capturing interactions of excitation and suppression in ganglion cells and perform best for different temporal regimes of these interactions.
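The three model classes compared here can be caricatured in a few lines. A minimal sketch with toy filters and an arbitrary suppression weight (illustrative only, not the paper's fitted models):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy temporal filters for the excitatory and suppressive pathways
# (illustrative shapes, not the paper's fitted filters).
t = np.arange(40)
k_exc = np.exp(-t / 5.0) * np.sin(t / 3.0)   # biphasic excitatory filter
k_sup = np.exp(-t / 8.0)                     # slower suppressive filter
stim = rng.standard_normal(500)              # white-noise stimulus

g_exc = np.convolve(stim, k_exc, mode="valid")                   # excitatory drive
g_sup = np.maximum(np.convolve(stim, k_sup, mode="valid"), 0.0)  # rectified suppressive drive

w = 0.5  # suppression strength (arbitrary)
rate_ln = np.maximum(g_exc, 0.0)                       # LN model: no suppression
rate_sub = np.maximum(g_exc - w * g_sup, 0.0)          # subtractive suppression
rate_div = np.maximum(g_exc, 0.0) / (1.0 + w * g_sup)  # divisive suppression
```

Because the suppressive drive is rectified, both suppressed variants can only lower the LN response, which is what distinguishes them empirically across stimulus regimes.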
Affiliation(s)
- Neda Shahidi
- Department of Ophthalmology, University Medical Center Göttingen, Göttingen, Germany
- Bernstein Center for Computational Neuroscience Göttingen, Göttingen, Germany
- Georg-Elias-Müller-Institute for Psychology, Georg-August-Universität Göttingen, Göttingen, Germany
- Cognitive Neuroscience Lab, German Primate Center, Göttingen, Germany
- Fernando Rozenblit
- Department of Ophthalmology, University Medical Center Göttingen, Göttingen, Germany
- Bernstein Center for Computational Neuroscience Göttingen, Göttingen, Germany
- Mohammad H Khani
- Department of Ophthalmology, University Medical Center Göttingen, Göttingen, Germany
- Bernstein Center for Computational Neuroscience Göttingen, Göttingen, Germany
- Helene M Schreyer
- Department of Ophthalmology, University Medical Center Göttingen, Göttingen, Germany
- Bernstein Center for Computational Neuroscience Göttingen, Göttingen, Germany
- Matthias Mietsch
- Laboratory Animal Science Unit, German Primate Center, Göttingen, Germany
- German Center for Cardiovascular Research, Partner Site Göttingen, Göttingen, Germany
- Dario A Protti
- School of Medical Sciences (Neuroscience), The University of Sydney, Sydney, New South Wales, Australia
- Tim Gollisch
- Department of Ophthalmology, University Medical Center Göttingen, Göttingen, Germany
- Bernstein Center for Computational Neuroscience Göttingen, Göttingen, Germany
- Cluster of Excellence "Multiscale Bioimaging: from Molecular Machines to Networks of Excitable Cells" (MBExC), University of Göttingen, Göttingen, Germany
- Else Kröner Fresenius Center for Optogenetic Therapies, University Medical Center Göttingen, Göttingen, Germany
3. Hedjar L, Martinovic J, Andersen SK, Shapiro AG. Separation of luminance and contrast modulation in steady-state visual evoked potentials. Vision Res 2025; 230:108567. PMID: 40054086. DOI: 10.1016/j.visres.2025.108567.
Abstract
Neurons in the retina and early visual cortex respond primarily to local luminance contrast rather than overall luminance energy. The distinction between luminance and contrast processing is revealed in its most striking form by the contrast asynchrony paradigm: two discs with bright and dark surrounds modulate in luminance. When the discs modulate at 3-6 Hz, there is a percept of antiphase flicker even though the luminance modulation of the discs is in phase. To establish the neural basis of this perceptual phenomenon, we conducted a study using steady-state visual evoked potentials (SSVEPs) aiming to identify specific contrast and luminance signals. Deconstructing contrast asynchrony into its constituent elements, we displayed eight discs modulating sinusoidally from dark to bright on one of three backgrounds (bright, midgray, dark). In the first experiment, disc modulation and background luminances spanned a narrow range (30-34 cd/m²) to avoid VEP saturation (Weber contrast ≤15.5%) at two frequencies: 3 Hz, falling inside the contrast asynchrony temporal range, and 7.14 Hz, falling outside this range. In the second experiment, luminances and contrasts spanned a large range (0-64 cd/m²) at three frequencies (3, 5, 7.14 Hz) to evaluate the degree to which VEP response non-linearities would affect observed data patterns. With lower contrast modulation at 3 Hz, SSVEP amplitudes and phases corresponded to the temporal signatures of contrast - not luminance - modulation. However, at higher frequencies and/or contrasts, this orderly pattern was largely replaced by more complex patterns that no longer directly corresponded to the luminance or contrast of the stimulus.
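The Weber contrast quoted in the abstract has a one-line definition; the luminance values below are illustrative, chosen to fall within the narrow range described:

```python
def weber_contrast(l_target: float, l_background: float) -> float:
    """Weber contrast: (L_target - L_background) / L_background."""
    return (l_target - l_background) / l_background

# Illustrative values within the abstract's narrow range: a disc reaching
# 34 cd/m^2 on a 30 cd/m^2 dark background.
c = weber_contrast(34.0, 30.0)  # ~0.133, i.e. ~13.3%, under the 15.5% ceiling
```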
Affiliation(s)
- Laysa Hedjar
- Justus-Liebig-Universität Gießen, Gießen, Germany
- Søren K Andersen
- University of Southern Denmark, Denmark; University of Aberdeen, Scotland, United Kingdom
4. Luna R, Serrano-Pedraza I, Bertalmío M. Overcoming the limitations of motion sensor models by considering dendritic computations. Sci Rep 2025; 15:9213. PMID: 40097493. PMCID: PMC11914070. DOI: 10.1038/s41598-025-90095-z.
Abstract
The estimation of motion is an essential process for any sighted animal. Computational models of motion sensors have a long and successful history, but they still suffer from basic shortcomings: they disagree with physiological evidence, and each model is dedicated to a specific type of motion, which is controversial from a biological standpoint. In this work, we propose a new approach to modeling motion sensors that considers dendritic computations, a key aspect for predicting single-neuron responses that had previously been absent from motion models. We show how, by taking into account the dynamic and input-dependent nature of dendritic nonlinearities, our motion sensor model is able to overcome the fundamental limitations of standard approaches.
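For context, the classic baseline among the "standard approaches" mentioned above is the Hassenstein-Reichardt correlator, which compares each input with a delayed copy of its neighbor. A minimal sketch (the delay, phase lag, and sampling values are arbitrary):

```python
import numpy as np

def reichardt_output(s_left, s_right, delay):
    """Opponent Reichardt correlator on two periodic input signals."""
    d_left = np.roll(s_left, delay)    # delayed left input (circular shift;
    d_right = np.roll(s_right, delay)  # exact here because inputs are periodic)
    return d_left * s_right - s_left * d_right  # opponent subtraction

# A drifting sinusoid sampled at two nearby points: the right sensor
# sees a phase-lagged copy of the left sensor's signal.
t = np.linspace(0, 4 * np.pi, 400, endpoint=False)
phase_lag = 0.5
rightward = reichardt_output(np.sin(t), np.sin(t - phase_lag), delay=10)
leftward = reichardt_output(np.sin(t - phase_lag), np.sin(t), delay=10)
# The mean opponent output is positive for the preferred direction
# and negative for the opposite direction.
```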
Affiliation(s)
- Raúl Luna
- Department of Psychobiology and Methodology for Behavioural Sciences, Faculty of Psychology, Universidad Complutense de Madrid, Madrid, Spain.
- Institute of Optics, Spanish National Research Council (CSIC), Madrid, Spain.
- Ignacio Serrano-Pedraza
- Department of Experimental Psychology, Faculty of Psychology, Universidad Complutense de Madrid, Madrid, Spain
- Marcelo Bertalmío
- Institute of Optics, Spanish National Research Council (CSIC), Madrid, Spain.
5. Di Santo S, Dipoppa M, Keller A, Roth M, Scanziani M, Miller KD. Contextual modulation emerges by integrating feedforward and feedback processing in mouse visual cortex. Cell Rep 2025; 44:115088. PMID: 39709599. DOI: 10.1016/j.celrep.2024.115088.
Abstract
Sensory systems use context to infer meaning. Accordingly, context profoundly influences neural responses to sensory stimuli. However, a cohesive understanding of the circuit mechanisms governing contextual effects across different stimulus conditions is still lacking. Here we present a unified circuit model of mouse visual cortex that accounts for the main standard forms of contextual modulation. This data-driven and biologically realistic circuit, including three primary inhibitory cell types, sheds light on how bottom-up, top-down, and recurrent inputs are integrated across retinotopic space to generate contextual effects in layer 2/3. We establish causal relationships between neural responses, geometrical features of the inputs, and connectivity patterns. The model not only reveals how a single canonical cortical circuit differently modulates sensory responses depending on context but also generates multiple testable predictions, offering insights that apply to broader neural circuitry.
Affiliation(s)
- Serena Di Santo
- Center for Theoretical Neuroscience and Mortimer B Zuckerman Mind Brain Behavior Institute, Columbia University, New York City, NY 10027, USA; Departamento de Electromagnetismo y Física de la Materia and Instituto Carlos I de Física Teórica y Computacional, Universidad de Granada, 18071 Granada, Spain.
- Mario Dipoppa
- Center for Theoretical Neuroscience and Mortimer B Zuckerman Mind Brain Behavior Institute, Columbia University, New York City, NY 10027, USA; Department of Neurobiology, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, USA
- Andreas Keller
- Department of Biomedicine, University of Basel, 4056 Basel, Switzerland; Department of Physiology, University of California, San Francisco, San Francisco, CA 94143, USA; Howard Hughes Medical Institute, University of California, San Francisco, San Francisco, CA 94143, USA
- Morgane Roth
- Department of Biomedicine, University of Basel, 4056 Basel, Switzerland; Department of Physiology, University of California, San Francisco, San Francisco, CA 94143, USA; Howard Hughes Medical Institute, University of California, San Francisco, San Francisco, CA 94143, USA
- Massimo Scanziani
- Department of Physiology, University of California, San Francisco, San Francisco, CA 94143, USA; Howard Hughes Medical Institute, University of California, San Francisco, San Francisco, CA 94143, USA
- Kenneth D Miller
- Center for Theoretical Neuroscience and Mortimer B Zuckerman Mind Brain Behavior Institute, Columbia University, New York City, NY 10027, USA; Department of Neuroscience, Swartz Program in Theoretical Neuroscience, Kavli Institute for Brain Science, College of Physicians and Surgeons and Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York City, NY 10027, USA
6. Zheng J, Meister M. The unbearable slowness of being: Why do we live at 10 bits/s? Neuron 2025; 113:192-204. PMID: 39694032. PMCID: PMC11758279. DOI: 10.1016/j.neuron.2024.11.008.
Abstract
This article is about the neural conundrum behind the slowness of human behavior. The information throughput of a human being is about 10 bits/s. In comparison, our sensory systems gather data at ~10⁹ bits/s. The stark contrast between these numbers remains unexplained and touches on fundamental aspects of brain function: what neural substrate sets this speed limit on the pace of our existence? Why does the brain need billions of neurons to process 10 bits/s? Why can we only think about one thing at a time? The brain seems to operate in two distinct modes: the "outer" brain handles fast high-dimensional sensory and motor signals, whereas the "inner" brain processes the reduced few bits needed to control behavior. Plausible explanations exist for the large neuron numbers in the outer brain, but not for the inner brain, and we propose new research directions to remedy this.
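The ~10 bits/s figure can be sanity-checked with a back-of-envelope calculation; the numbers below are common illustrative estimates, not the paper's own derivation:

```python
# A fast typist at 120 words/min, ~5 characters/word, and ~1 bit of
# entropy per English character (Shannon's classic estimate) gives:
words_per_min = 120
chars_per_word = 5
bits_per_char = 1.0
throughput = words_per_min * chars_per_word * bits_per_char / 60  # bits/s -> 10.0

# Against the abstract's ~1e9 bits/s sensory intake, behavior discards
# roughly eight orders of magnitude of incoming information.
sensory_rate = 1e9
ratio = sensory_rate / throughput
```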
Affiliation(s)
- Jieyu Zheng
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA.
- Markus Meister
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA.
7. Zhou J, Chen Y, Whitmire M, Tan PK, Wu J, Veeraraghavan A, Robinson JT, Geisler W, Pieribone VA, Seidemann E. Fast neural population dynamics in primate V1 captured by a genetically-encoded voltage indicator. Research Square 2025 (preprint). PMID: 39975906. PMCID: PMC11838743. DOI: 10.21203/rs.3.rs-5851261/v1.
Abstract
Genetically encoded voltage indicators (GEVIs) can measure millisecond-scale subthreshold neural responses with cell-type specificity. Here, we successfully expressed, for the first time, a GEVI in excitatory V1 neurons in macaque monkeys. We then used widefield fluorescent imaging to measure V1 dynamics in response to visual stimuli with diverse temporal waveforms and contrasts, and compared these responses to signals measured using a genetically encoded calcium indicator (GECI) and a synthetic voltage-sensitive dye (VSD). Compared to the GECI, the GEVI captures faster response dynamics, tracks higher temporal frequencies, and responds to lower contrasts. To quantitatively characterize these three signals, we developed a simple nonlinear model that predicts the response dynamics to stimuli with arbitrary temporal waveforms and contrasts. Our results are consistent with the hypothesis that GEVI signals reflect the dynamics of locally summed membrane potentials, thus opening the door for a new class of experiments in behaving primates.
Affiliation(s)
- Jingyang Zhou
- Center for Computational Neuroscience, Flatiron Institute, New York, USA
- Center for Perceptual Systems, University of Texas Austin, Austin, USA
- Center for Theoretical and Computational Neuroscience, University of Texas Austin, Austin, USA
- Department of Psychology, University of Texas Austin, Austin, USA
- Department of Neuroscience, University of Texas Austin, Austin, USA
- Yuzhi Chen
- Center for Perceptual Systems, University of Texas Austin, Austin, USA
- Center for Theoretical and Computational Neuroscience, University of Texas Austin, Austin, USA
- Department of Psychology, University of Texas Austin, Austin, USA
- Department of Neuroscience, University of Texas Austin, Austin, USA
- Matt Whitmire
- Center for Perceptual Systems, University of Texas Austin, Austin, USA
- Center for Theoretical and Computational Neuroscience, University of Texas Austin, Austin, USA
- Department of Psychology, University of Texas Austin, Austin, USA
- Department of Neuroscience, University of Texas Austin, Austin, USA
- Pin Kwang Tan
- Center for Perceptual Systems, University of Texas Austin, Austin, USA
- Center for Theoretical and Computational Neuroscience, University of Texas Austin, Austin, USA
- Department of Psychology, University of Texas Austin, Austin, USA
- Department of Neuroscience, University of Texas Austin, Austin, USA
- Jimin Wu
- Department of Bioengineering, Rice University, Houston, Texas, USA
- Ashok Veeraraghavan
- Department of Electrical and Computer Engineering, Rice University, Houston, Texas, USA
- Jacob T. Robinson
- Department of Bioengineering, Rice University, Houston, Texas, USA
- Department of Electrical and Computer Engineering, Rice University, Houston, Texas, USA
- Wilson Geisler
- Center for Perceptual Systems, University of Texas Austin, Austin, USA
- Center for Theoretical and Computational Neuroscience, University of Texas Austin, Austin, USA
- Department of Psychology, University of Texas Austin, Austin, USA
- Vincent A. Pieribone
- The John B. Pierce Laboratory, New Haven, CT, USA
- Department of Cellular & Molecular Physiology, Yale University, New Haven, CT, USA
- Department of Neuroscience, Yale University, New Haven, CT, USA
- Lamont-Doherty Earth Observatory at Columbia University, Palisades, New York, USA
- Eyal Seidemann
- Center for Perceptual Systems, University of Texas Austin, Austin, USA
- Center for Theoretical and Computational Neuroscience, University of Texas Austin, Austin, USA
- Department of Psychology, University of Texas Austin, Austin, USA
- Department of Neuroscience, University of Texas Austin, Austin, USA
8. Baruzzi V, Indiveri G, Sabatini SP. Recurrent models of orientation selectivity enable robust early-vision processing in mixed-signal neuromorphic hardware. Nat Commun 2025; 16:243. PMID: 39747257. PMCID: PMC11696034. DOI: 10.1038/s41467-024-55749-y.
Abstract
Mixed-signal analog/digital neuromorphic circuits represent an ideal medium for reproducing bio-physically realistic dynamics of biological neural systems in real time. However, similar to their biological counterparts, these circuits have limited resolution and are affected by a high degree of variability. By developing a recurrent spiking neural network model of the retinocortical visual pathway, we show how such a noisy and heterogeneous computing substrate can produce linear receptive fields tuned to visual stimuli with specific orientations and spatial frequencies. Compared to strictly feed-forward schemes, the model generates highly structured Gabor-like receptive fields of any phase symmetry, making optimal use of the hardware resources available in terms of synaptic connections and neuron numbers. Experimental results validate the approach, demonstrating how principles of neural computation can lead to robust sensory-processing electronic systems, even when they are affected by a high degree of heterogeneity, e.g., due to the use of analog circuits or memristive devices.
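The "Gabor-like receptive fields of any phase symmetry" mentioned above are sinusoidal carriers under a Gaussian envelope; the carrier phase sets whether the field is even- (mirror-symmetric) or odd- (antisymmetric) phase. A minimal sketch with arbitrary parameter values:

```python
import numpy as np

def gabor(size, wavelength, theta, phase, sigma):
    """2-D Gabor receptive field: sinusoidal carrier under a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)           # rotated coordinate
    carrier = np.cos(2 * np.pi * xr / wavelength + phase)
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return carrier * envelope

rf_even = gabor(size=21, wavelength=8, theta=0.0, phase=0.0, sigma=4.0)
rf_odd = gabor(size=21, wavelength=8, theta=0.0, phase=np.pi / 2, sigma=4.0)
# rf_even is mirror-symmetric about its center; rf_odd is antisymmetric.
```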
Affiliation(s)
- Valentina Baruzzi
- Department of Informatics, Bioengineering, Robotics and Systems Engineering, University of Genoa, Via Opera Pia 13, I-16145, Genoa, Italy
- Giacomo Indiveri
- Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland
- Silvio P Sabatini
- Department of Informatics, Bioengineering, Robotics and Systems Engineering, University of Genoa, Via Opera Pia 13, I-16145, Genoa, Italy.
9. Bertalmío M, Durán Vizcaíno A, Malo J, Wichmann FA. Plaid masking explained with input-dependent dendritic nonlinearities. Sci Rep 2024; 14:24856. PMID: 39438555. PMCID: PMC11496684. DOI: 10.1038/s41598-024-75471-5.
Abstract
A serious obstacle for understanding early spatial vision comes from the failure of the so-called standard model (SM) to predict the perception of plaid masking. But the SM originated from a major oversimplification of single neuron computations, ignoring fundamental properties of dendrites. Here we show that a spatial vision model including computations mimicking the input-dependent nature of dendritic nonlinearities, i.e. including nonlinear neural summation, has the potential to explain plaid masking data.
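The flavor of "nonlinear neural summation" can be illustrated with a toy compressive nonlinearity (not the paper's fitted dendritic model): applied before summing across a plaid's component gratings, it makes the plaid response subadditive, which is the qualitative signature of masking:

```python
import numpy as np

def compressive(x, k=1.0):
    """Toy odd, compressive nonlinearity (saturates for large inputs)."""
    return x / (1.0 + k * np.abs(x))

x = np.linspace(0, 2 * np.pi, 256, endpoint=False)
grating_a = np.sin(3 * x)         # test grating
grating_b = np.sin(5 * x + 0.7)   # mask grating (different frequency/phase)
plaid = grating_a + grating_b

# Linear summation would make the plaid response equal to the sum of the
# component responses; compressive summation makes it strictly smaller.
r_components = np.abs(compressive(grating_a)).sum() + np.abs(compressive(grating_b)).sum()
r_plaid = np.abs(compressive(plaid)).sum()
```

The subadditivity follows from the concavity of the nonlinearity on positive inputs; it is a sketch of the general mechanism, not a reproduction of the paper's model.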
Affiliation(s)
- Jesús Malo
- Universitat de València, València, Spain
10. Muller L, Churchland PS, Sejnowski TJ. Transformers and cortical waves: encoders for pulling in context across time. Trends Neurosci 2024; 47:788-802. PMID: 39341729. PMCID: PMC11936488. DOI: 10.1016/j.tins.2024.08.006.
Abstract
The capabilities of transformer networks such as ChatGPT and other large language models (LLMs) have captured the world's attention. The crucial computational mechanism underlying their performance relies on transforming a complete input sequence - for example, all the words in a sentence - into a long 'encoding vector' that allows transformers to learn long-range temporal dependencies in naturalistic sequences. Specifically, 'self-attention' applied to this encoding vector enhances temporal context in transformers by computing associations between pairs of words in the input sequence. We suggest that waves of neural activity traveling across single cortical areas, or multiple regions on the whole-brain scale, could implement a similar encoding principle. By encapsulating recent input history into a single spatial pattern at each moment in time, cortical waves may enable a temporal context to be extracted from sequences of sensory inputs, the same computational principle as that used in transformers.
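The self-attention operation described above reduces, in its simplest form, to a softmax over pairwise dot products. A minimal sketch without the learned query/key/value projections of a real transformer:

```python
import numpy as np

def self_attention(x):
    """Single-head self-attention with identity projections (illustrative).

    Each position's output is a weighted average of all positions, with
    weights given by a softmax over pairwise dot products -- the
    'associations between pairs of words' described in the abstract.
    """
    scores = x @ x.T / np.sqrt(x.shape[1])          # pairwise similarities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # row-wise softmax
    return weights @ x                              # context-mixed encoding

rng = np.random.default_rng(1)
seq = rng.standard_normal((6, 8))   # 6 "words", 8-dimensional embeddings
ctx = self_attention(seq)
```

Each output row is a convex combination of the input rows, so the encoding at every position carries context from the whole sequence, which is the property the authors liken to a traveling cortical wave integrating recent input history.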
Affiliation(s)
- Lyle Muller
- Department of Mathematics, Western University, London, Ontario, Canada; Fields Laboratory for Network Science, Fields Institute, Toronto, Ontario, Canada.
- Patricia S Churchland
- Department of Philosophy, University of California at San Diego, San Diego, CA, USA.
- Terrence J Sejnowski
- Computational Neurobiology Laboratory, Salk Institute for Biological Studies, San Diego, CA, USA; Department of Neurobiology, University of California at San Diego, San Diego, CA, USA.
11. Cocuzza CV, Sanchez-Romero R, Ito T, Mill RD, Keane BP, Cole MW. Distributed network flows generate localized category selectivity in human visual cortex. PLoS Comput Biol 2024; 20:e1012507. PMID: 39436929. PMCID: PMC11530028. DOI: 10.1371/journal.pcbi.1012507.
Abstract
A central goal of neuroscience is to understand how function-relevant brain activations are generated. Here we test the hypothesis that function-relevant brain activations are generated primarily by distributed network flows. We focused on visual processing in human cortex, given the long-standing literature supporting the functional relevance of brain activations in visual cortex regions exhibiting visual category selectivity. We began by using fMRI data from N = 352 human participants to identify category-specific responses in visual cortex for images of faces, places, body parts, and tools. We then systematically tested the hypothesis that distributed network flows can generate these localized visual category selective responses. This was accomplished using a recently developed approach for simulating - in a highly empirically constrained manner - the generation of task-evoked brain activations by modeling activity flowing over intrinsic brain connections. We next tested refinements to our hypothesis, focusing on how stimulus-driven network interactions initialized in V1 generate downstream visual category selectivity. We found evidence that network flows directly from V1 were sufficient for generating visual category selectivity, but that additional, globally distributed (whole-cortex) network flows increased category selectivity further. Using null network architectures we also found that each region's unique intrinsic "connectivity fingerprint" was key to the generation of category selectivity. These results generalized across regions associated with all four visual categories tested (bodies, faces, places, and tools), and provide evidence that the human brain's intrinsic network organization plays a prominent role in the generation of functionally relevant, localized responses.
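The core of the activity-flow approach referenced above is a simple linear mapping: a held-out region's activation is predicted as the connectivity-weighted sum of all other regions' activations. A toy sketch with random data (not the empirical fMRI pipeline or its connectivity estimation):

```python
import numpy as np

rng = np.random.default_rng(2)
n_regions = 50
fc = rng.standard_normal((n_regions, n_regions))  # toy intrinsic connectivity
np.fill_diagonal(fc, 0.0)                         # exclude self-connections
activity = rng.standard_normal(n_regions)         # toy task-evoked activations

# Activity-flow prediction: each region's activation is the
# connectivity-weighted sum of every *other* region's activation.
predicted = fc @ activity
```

In the actual framework each region's unique "connectivity fingerprint" (its row of `fc`) is what shapes its predicted, category-selective response.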
Affiliation(s)
- Carrisa V. Cocuzza
- Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, New Jersey, United States of America
- Behavioral and Neural Sciences PhD Program, Rutgers University, Newark, New Jersey, United States of America
- Department of Psychology, Yale University, New Haven, Connecticut, United States of America
- Department of Psychiatry, Brain Health Institute, Rutgers University, Piscataway, New Jersey, United States of America
- Ruben Sanchez-Romero
- Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, New Jersey, United States of America
- Takuya Ito
- Department of Psychiatry, Yale University School of Medicine, New Haven, Connecticut, United States of America
- Ravi D. Mill
- Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, New Jersey, United States of America
- Brian P. Keane
- Department of Psychiatry and Neuroscience, University of Rochester Medical Center, Rochester, New York, United States of America
- Center for Visual Science, University of Rochester, Rochester, New York, United States of America
- Department of Brain and Cognitive Science, University of Rochester, Rochester, New York, United States of America
- Michael W. Cole
- Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, New Jersey, United States of America
12. Feng L, Zhao D, Zeng Y. Spiking generative adversarial network with attention scoring decoding. Neural Netw 2024; 178:106423. PMID: 38906053. DOI: 10.1016/j.neunet.2024.106423.
Abstract
Generative models based on neural networks present a substantial challenge within deep learning. As it stands, such models are primarily limited to the domain of artificial neural networks. Spiking neural networks, as the third generation of neural networks, offer a closer approximation to brain-like processing due to their rich spatiotemporal dynamics. However, generative models based on spiking neural networks are not well studied. In particular, previous work on generative adversarial networks based on spiking neural networks was conducted on simple datasets and did not perform well. In this work, we pioneer the construction of a spiking generative adversarial network capable of handling complex images with higher performance. Our first task is to identify the problems of out-of-domain inconsistency and temporal inconsistency inherent in spiking generative adversarial networks. We address these issues by incorporating the Earth-Mover distance and an attention-based weighted decoding method, significantly enhancing the performance of our algorithm across several datasets. Experimental results reveal that our approach outperforms existing methods on MNIST, FashionMNIST, CIFAR10, and CelebA. In addition to our examination of static datasets, this study marks our first investigation into event-based data, through which we achieved noteworthy results. Moreover, compared with hybrid spiking generative adversarial networks, in which the discriminator is an analog artificial neural network, our methodology demonstrates closer alignment with the information-processing patterns observed in the mouse. Our code can be found at https://github.com/Brain-Cog-Lab/sgad.
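The Earth-Mover distance the authors incorporate has a particularly simple closed form for one-dimensional histograms on a shared support: the L1 distance between cumulative distributions. A minimal sketch with toy spike-count histograms (illustrative, not the paper's training loss):

```python
import numpy as np

def emd_1d(p, q):
    """Earth-Mover (Wasserstein-1) distance between two 1-D histograms
    on the same equally spaced support: L1 distance between their CDFs."""
    p = np.asarray(p, dtype=float) / np.sum(p)
    q = np.asarray(q, dtype=float) / np.sum(q)
    return np.abs(np.cumsum(p - q)).sum()

# All mass at bin 1 vs. all mass at bin 3: it must move 2 bins.
d = emd_1d([0, 1, 0, 0], [0, 0, 0, 1])
```

Unlike bin-wise divergences, this distance grows with how *far* mass must move, which is what makes it a useful training signal for generators whose outputs start far from the data distribution.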
Affiliation(s)
- Linghao Feng
- Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Future Technology, University of Chinese Academy of Sciences, China.
- Dongcheng Zhao
- Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing, China; Center for Long-term Artificial Intelligence, China.
- Yi Zeng
- Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing, China; Center for Long-term Artificial Intelligence, China; Key Laboratory of Brain Cognition and Brain-inspired Intelligence Technology, CAS, China; School of Future Technology, University of Chinese Academy of Sciences, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, China.
Collapse
|
13
|
Liao L, Xu K, Wu H, Chen C, Sun W, Yan Q, Jay Kuo CC, Lin W. Blind Video Quality Prediction by Uncovering Human Video Perceptual Representation. IEEE Trans Image Process 2024;33:4998-5013. [PMID: 39236121] [DOI: 10.1109/tip.2024.3445738]
Abstract
Blind video quality assessment (VQA) has become an increasingly demanding problem in automatically assessing the quality of ever-growing in-the-wild videos. Although efforts have been made to measure temporal distortions, the core factor distinguishing VQA from image quality assessment (IQA), the lack of modeling of how the human visual system (HVS) relates to the temporal quality of videos hinders the precise mapping of predicted temporal scores to human perception. Inspired by the recent discovery of the temporal straightness law of natural videos in the HVS, this paper models the complex temporal distortions of in-the-wild videos in a simple and uniform representation by describing the geometric properties of videos in the visual perceptual domain. A novel videolet, a perceptual representation embedding of a few consecutive frames, is designed as the basic quality measurement unit to quantify temporal distortions by measuring the angular and linear displacements from the straightness law. By combining the predicted scores of all videolets, a perceptually temporal quality evaluator (PTQE) is formed to measure the temporal quality of the entire video. Experimental results demonstrate that the perceptual representation in the HVS is an efficient way of predicting subjective temporal quality. Moreover, when combined with spatial quality metrics, PTQE achieves top performance on popular in-the-wild video datasets. More importantly, PTQE requires no information beyond the video being assessed, making it applicable to any dataset without parameter tuning. Additionally, the generalizability of PTQE is evaluated on video frame interpolation tasks, demonstrating its potential to benefit temporal-related enhancement tasks.
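The "straightness" geometry this measure builds on can be illustrated by computing the angle between successive displacement vectors of a frame-embedding trajectory: zero angle everywhere means a perfectly straight perceptual path. The sketch below assumes only that frames are embedded as vectors; the function and toy trajectories are hypothetical, not the paper's PTQE implementation.

```python
import numpy as np

def trajectory_angles(frames):
    """Angles (radians) between successive displacement vectors of a
    frame-embedding trajectory; all zeros means a perfectly straight path."""
    x = np.asarray(frames, dtype=float)
    d = np.diff(x, axis=0)                                # frame-to-frame displacements
    d = d / np.linalg.norm(d, axis=1, keepdims=True)      # unit directions
    cosines = np.clip(np.sum(d[:-1] * d[1:], axis=1), -1.0, 1.0)
    return np.arccos(cosines)

straight = [[0, 0], [1, 0], [2, 0], [3, 0]]   # collinear embeddings
bent = [[0, 0], [1, 0], [1, 1]]               # a 90-degree turn
a_straight = trajectory_angles(straight)
a_bent = trajectory_angles(bent)
```

Angular deviations of this kind are one natural way to quantify how far a distorted video departs from the straightness law.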
14
Krüppel S, Khani MH, Schreyer HM, Sridhar S, Ramakrishna V, Zapp SJ, Mietsch M, Karamanlis D, Gollisch T. Applying Super-Resolution and Tomography Concepts to Identify Receptive Field Subunits in the Retina. PLoS Comput Biol 2024;20:e1012370. [PMID: 39226328] [PMCID: PMC11398665] [DOI: 10.1371/journal.pcbi.1012370]
Abstract
Spatially nonlinear stimulus integration by retinal ganglion cells lies at the heart of various computations performed by the retina. It arises from the nonlinear transmission of signals that ganglion cells receive from bipolar cells, which thereby constitute functional subunits within a ganglion cell's receptive field. Inferring these subunits from recorded ganglion cell activity promises a new avenue for studying the functional architecture of the retina. This calls for efficient methods, which leave sufficient experimental time to leverage the acquired knowledge for further investigating identified subunits. Here, we combine concepts from super-resolution microscopy and computed tomography and introduce super-resolved tomographic reconstruction (STR) as a technique to efficiently stimulate and locate receptive field subunits. Simulations demonstrate that this approach can reliably identify subunits across a wide range of model variations, and application in recordings of primate parasol ganglion cells validates the experimental feasibility. STR can potentially reveal comprehensive subunit layouts within only a few tens of minutes of recording time, making it ideal for online analysis and closed-loop investigations of receptive field substructure in retina recordings.
Affiliation(s)
- Steffen Krüppel
- University Medical Center Göttingen, Department of Ophthalmology, Göttingen, Germany
- Bernstein Center for Computational Neuroscience Göttingen, Göttingen, Germany
- Cluster of Excellence "Multiscale Bioimaging: from Molecular Machines to Networks of Excitable Cells" (MBExC), University of Göttingen, Göttingen, Germany
- Mohammad H Khani
- University Medical Center Göttingen, Department of Ophthalmology, Göttingen, Germany
- Bernstein Center for Computational Neuroscience Göttingen, Göttingen, Germany
- Helene M Schreyer
- University Medical Center Göttingen, Department of Ophthalmology, Göttingen, Germany
- Bernstein Center for Computational Neuroscience Göttingen, Göttingen, Germany
- Shashwat Sridhar
- University Medical Center Göttingen, Department of Ophthalmology, Göttingen, Germany
- Bernstein Center for Computational Neuroscience Göttingen, Göttingen, Germany
- Varsha Ramakrishna
- University Medical Center Göttingen, Department of Ophthalmology, Göttingen, Germany
- Bernstein Center for Computational Neuroscience Göttingen, Göttingen, Germany
- International Max Planck Research School for Neurosciences, Göttingen, Germany
- Sören J Zapp
- University Medical Center Göttingen, Department of Ophthalmology, Göttingen, Germany
- Bernstein Center for Computational Neuroscience Göttingen, Göttingen, Germany
- Matthias Mietsch
- German Primate Center, Laboratory Animal Science Unit, Göttingen, Germany
- German Center for Cardiovascular Research, Partner Site Göttingen, Göttingen, Germany
- Dimokratis Karamanlis
- University Medical Center Göttingen, Department of Ophthalmology, Göttingen, Germany
- Bernstein Center for Computational Neuroscience Göttingen, Göttingen, Germany
- Tim Gollisch
- University Medical Center Göttingen, Department of Ophthalmology, Göttingen, Germany
- Bernstein Center for Computational Neuroscience Göttingen, Göttingen, Germany
- Cluster of Excellence "Multiscale Bioimaging: from Molecular Machines to Networks of Excitable Cells" (MBExC), University of Göttingen, Göttingen, Germany
- Else Kröner Fresenius Center for Optogenetic Therapies, University Medical Center Göttingen, Göttingen, Germany
15
Duecker K, Idiart M, van Gerven M, Jensen O. Oscillations in an artificial neural network convert competing inputs into a temporal code. PLoS Comput Biol 2024;20:e1012429. [PMID: 39259769] [PMCID: PMC11419396] [DOI: 10.1371/journal.pcbi.1012429]
Abstract
The field of computer vision has long drawn inspiration from neuroscientific studies of the human and non-human primate visual system. The development of convolutional neural networks (CNNs), for example, was informed by the properties of simple and complex cells in early visual cortex. However, the computational relevance of oscillatory dynamics experimentally observed in the visual system is typically not considered in artificial neural networks (ANNs). Computational models of neocortical dynamics, on the other hand, rarely take inspiration from computer vision. Here, we combine methods from computational neuroscience and machine learning to implement multiplexing in a simple ANN using oscillatory dynamics. We first trained the network to classify individually presented letters. Post-training, we added temporal dynamics to the hidden layer, introducing refraction in the hidden units as well as pulsed inhibition mimicking neuronal alpha oscillations. Without these dynamics, the trained network correctly classified individual letters but produced a mixed output when presented with two letters simultaneously, indicating a bottleneck problem. When introducing refraction and oscillatory inhibition, the output nodes corresponding to the two stimuli activate sequentially, ordered along the phase of the inhibitory oscillations. Our model implements the idea that inhibitory oscillations segregate competing inputs in time. The results of our simulations pave the way for applications in deeper network architectures and more complicated machine learning problems.
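The core mechanism — pulsed inhibition that is released over a cycle so that more strongly driven units cross threshold earlier — can be caricatured in a few lines of NumPy. This is a toy illustration under assumed dynamics, not the paper's trained network; all names and parameters are hypothetical.

```python
import numpy as np

def activation_order(drives, t_steps=100):
    """Toy multiplexing by oscillatory inhibition: inhibition is released
    over a cycle, so more strongly driven units cross threshold earlier,
    converting input strength into a temporal (rank-order) code."""
    t = np.arange(t_steps)
    inhibition = 1.0 - t / t_steps          # falling phase of one inhibitory cycle
    times = {}
    for name, drive in drives.items():
        crossed = np.nonzero(drive > inhibition)[0]
        times[name] = int(crossed[0]) if crossed.size else t_steps
    return times

# Two simultaneously presented "stimuli": the stronger one activates first,
# so the competing inputs are segregated in time rather than mixed.
times = activation_order({"letter_A": 0.9, "letter_B": 0.5})
```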
Affiliation(s)
- Katharina Duecker
- Centre for Human Brain Health, School of Psychology, University of Birmingham, Birmingham, United Kingdom
- Department of Neuroscience, Brown University, Providence, Rhode Island, United States of America
- Marco Idiart
- Institute of Physics, Federal University of Rio Grande do Sul, Porto Alegre, Brazil
- Marcel van Gerven
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
- Ole Jensen
- Centre for Human Brain Health, School of Psychology, University of Birmingham, Birmingham, United Kingdom
16
Zhang J, Huang L, Ma Z, Zhou H. Predicting the temporal-dynamic trajectories of cortical neuronal responses in non-human primates based on deep spiking neural network. Cogn Neurodyn 2024;18:1977-1988. [PMID: 39104695] [PMCID: PMC11297849] [DOI: 10.1007/s11571-023-09989-1]
Abstract
Deep convolutional neural networks (CNNs) are commonly used as computational models of the primate ventral stream, whereas deep spiking neural networks (SNNs), which incorporate both temporal and spatial spiking information, remain largely uninvestigated. We compared the performance of an SNN and a CNN in predicting visual responses to naturalistic stimuli in area V4, inferior temporal cortex (IT), and orbitofrontal cortex (OFC). The prediction accuracies of the SNN were significantly higher than those of the CNN for both the temporal-dynamic trajectory and the averaged firing rate of visual responses in V4 and IT. The temporal dynamics were captured by the SNN for neurons with diverse temporal profiles and category selectivities, and were captured most sensitively around the time of peak responses in each brain region. Consistently, SNN activities showed significantly stronger correlations with IT, V4, and OFC responses. In the SNN, correlations with neural activities were stronger for later time-step features than for early time-step features. The temporal-dynamic prediction was also significantly improved by considering preceding neural activities during the prediction. Our study thus demonstrates that SNNs are a powerful temporal-dynamic model of cortical responses to complex naturalistic stimuli.
Affiliation(s)
- Jie Zhang
- The Brain Cognition and Brain Disease Institute, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055 China
- University of Chinese Academy of Sciences, Beijing, 100049 China
- Peng Cheng Laboratory, Shenzhen, 518000 China
- Liwei Huang
- Peng Cheng Laboratory, Shenzhen, 518000 China
- Peking University, Beijing, 100871 China
- Zhengyu Ma
- Peng Cheng Laboratory, Shenzhen, 518000 China
- Huihui Zhou
- The Brain Cognition and Brain Disease Institute, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055 China
- Peng Cheng Laboratory, Shenzhen, 518000 China
17
Chen Y, Beech P, Yin Z, Jia S, Zhang J, Yu Z, Liu JK. Decoding dynamic visual scenes across the brain hierarchy. PLoS Comput Biol 2024;20:e1012297. [PMID: 39093861] [PMCID: PMC11324145] [DOI: 10.1371/journal.pcbi.1012297]
Abstract
Understanding the computational mechanisms that underlie the encoding and decoding of environmental stimuli is a crucial investigation in neuroscience. Central to this pursuit is the exploration of how the brain represents visual information across its hierarchical architecture. A prominent challenge resides in discerning the neural underpinnings of the processing of dynamic natural visual scenes. Although considerable research efforts have been made to characterize individual components of the visual pathway, a systematic understanding of the distinctive neural coding associated with visual stimuli, as they traverse this hierarchical landscape, remains elusive. In this study, we leverage the comprehensive Allen Visual Coding-Neuropixels dataset and utilize the capabilities of deep learning neural network models to study neural coding in response to dynamic natural visual scenes across an expansive array of brain regions. Our study reveals that our decoding model adeptly deciphers visual scenes from neural spiking patterns exhibited within each distinct brain area. A compelling observation arises from the comparative analysis of decoding performances, which manifests as a notable encoding proficiency within the visual cortex and subcortical nuclei, in contrast to a relatively reduced encoding activity within hippocampal neurons. Strikingly, our results unveil a robust correlation between our decoding metrics and well-established anatomical and functional hierarchy indexes. These findings corroborate existing knowledge in visual coding related to artificial visual stimuli and illuminate the functional role of these deeper brain regions using dynamic stimuli. Consequently, our results suggest a novel perspective on the utility of decoding neural network models as a metric for quantifying the encoding quality of dynamic natural visual scenes represented by neural responses, thereby advancing our comprehension of visual coding within the complex hierarchy of the brain.
Affiliation(s)
- Ye Chen
- School of Computer Science, Peking University, Beijing, China
- Institute for Artificial Intelligence, Peking University, Beijing, China
- Peter Beech
- School of Computing, University of Leeds, Leeds, United Kingdom
- Ziwei Yin
- School of Computer Science, Centre for Human Brain Health, University of Birmingham, Birmingham, United Kingdom
- Shanshan Jia
- School of Computer Science, Peking University, Beijing, China
- Institute for Artificial Intelligence, Peking University, Beijing, China
- Jiayi Zhang
- Institutes of Brain Science, State Key Laboratory of Medical Neurobiology, MOE Frontiers Center for Brain Science and Institute for Medical and Engineering Innovation, Eye & ENT Hospital, Fudan University, Shanghai, China
- Zhaofei Yu
- School of Computer Science, Peking University, Beijing, China
- Institute for Artificial Intelligence, Peking University, Beijing, China
- Jian K. Liu
- School of Computing, University of Leeds, Leeds, United Kingdom
- School of Computer Science, Centre for Human Brain Health, University of Birmingham, Birmingham, United Kingdom
18
Lahner B, Dwivedi K, Iamshchinina P, Graumann M, Lascelles A, Roig G, Gifford AT, Pan B, Jin S, Ratan Murty NA, Kay K, Oliva A, Cichy R. Modeling short visual events through the BOLD moments video fMRI dataset and metadata. Nat Commun 2024;15:6241. [PMID: 39048577] [PMCID: PMC11269733] [DOI: 10.1038/s41467-024-50310-3]
Abstract
Studying the neural basis of human dynamic visual perception requires extensive experimental data to evaluate the large swathes of functionally diverse brain neural networks driven by perceiving visual events. Here, we introduce the BOLD Moments Dataset (BMD), a repository of whole-brain fMRI responses to over 1000 short (3 s) naturalistic video clips of visual events across ten human subjects. We use the videos' extensive metadata to show how the brain represents word- and sentence-level descriptions of visual events and identify correlates of video memorability scores extending into the parietal cortex. Furthermore, we reveal a match in hierarchical processing between cortical regions of interest and video-computable deep neural networks, and we showcase that BMD successfully captures temporal dynamics of visual events at second resolution. With its rich metadata, BMD offers new perspectives and accelerates research on the human brain basis of visual event perception.
Affiliation(s)
- Benjamin Lahner
- Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, USA.
- Kshitij Dwivedi
- Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany
- Department of Computer Science, Goethe University Frankfurt, Frankfurt am Main, Germany
- Polina Iamshchinina
- Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany
- Monika Graumann
- Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany
- Alex Lascelles
- Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, USA
- Gemma Roig
- Department of Computer Science, Goethe University Frankfurt, Frankfurt am Main, Germany
- The Hessian Center for AI (hessian.AI), Darmstadt, Germany
- Bowen Pan
- Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, USA
- SouYoung Jin
- Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, USA
- N Apurva Ratan Murty
- Department of Brain and Cognitive Science, MIT, Cambridge, MA, USA
- School of Psychology, Georgia Institute of Technology, Atlanta, GA, USA
- Kendrick Kay
- Center for Magnetic Resonance Research (CMRR), Department of Radiology, University of Minnesota, Minneapolis, MN, USA
- Aude Oliva
- Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, USA
- Radoslaw Cichy
- Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany
19
Quaia C, Krauzlis RJ. Object recognition in primates: what can early visual areas contribute? Front Behav Neurosci 2024;18:1425496. [PMID: 39070778] [PMCID: PMC11272660] [DOI: 10.3389/fnbeh.2024.1425496]
Abstract
Introduction: If neuroscientists were asked which brain area is responsible for object recognition in primates, most would probably answer infero-temporal (IT) cortex. While IT is likely responsible for fine discriminations, and it is accordingly dominated by foveal visual inputs, there is more to object recognition than fine discrimination. Importantly, foveation of an object of interest usually requires recognizing, with reasonable confidence, its presence in the periphery. Arguably, IT plays a secondary role in such peripheral recognition, and other visual areas might instead be more critical.
Methods: To investigate how signals carried by early visual processing areas (such as LGN and V1) could be used for object recognition in the periphery, we focused here on the task of distinguishing faces from non-faces. We tested how sensitive various models were to nuisance parameters, such as changes in scale and orientation of the image, and the type of image background.
Results: We found that a model of V1 simple or complex cells could provide quite reliable information, resulting in performance better than 80% in realistic scenarios. An LGN model performed considerably worse.
Discussion: Because peripheral recognition is both crucial to enable fine recognition (by bringing an object of interest on the fovea), and probably sufficient to account for a considerable fraction of our daily recognition-guided behavior, we think that the current focus on area IT and foveal processing is too narrow. We propose that rather than a hierarchical system with IT-like properties as its primary aim, object recognition should be seen as a parallel process, with high-accuracy foveal modules operating in parallel with lower-accuracy and faster modules that can operate across the visual field.
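V1 complex-cell models of the kind tested here are commonly implemented as an "energy model": the summed squares of a quadrature pair of Gabor filters, which makes the response selective for orientation and spatial frequency but invariant to stimulus phase. The sketch below is a minimal version under assumed parameters (patch size, frequency, envelope width), not the authors' code.

```python
import numpy as np

def gabor(size, freq, theta, phase):
    """2-D Gabor patch: an oriented sinusoid under a Gaussian envelope."""
    r = np.arange(size) - size // 2
    X, Y = np.meshgrid(r, r)
    xr = X * np.cos(theta) + Y * np.sin(theta)
    envelope = np.exp(-(X**2 + Y**2) / (2 * (size / 6.0) ** 2))
    return envelope * np.cos(2 * np.pi * freq * xr + phase)

def complex_cell(img, freq=0.1, theta=0.0):
    """Energy model: summed squares of a quadrature (cos/sin) Gabor pair."""
    even = np.sum(img * gabor(img.shape[0], freq, theta, 0.0))
    odd = np.sum(img * gabor(img.shape[0], freq, theta, np.pi / 2))
    return even**2 + odd**2

# Phase invariance: shifting the stimulus phase barely changes the energy,
# unlike a single (simple-cell-like) filter whose response would flip sign.
r0 = complex_cell(gabor(32, 0.1, 0.0, 0.0))
r1 = complex_cell(gabor(32, 0.1, 0.0, np.pi / 3))
```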
Affiliation(s)
- Christian Quaia
- Laboratory of Sensorimotor Research, National Eye Institute, NIH, Bethesda, MD, United States
20
Quaia C, Krauzlis RJ. Object recognition in primates: What can early visual areas contribute? arXiv 2024:arXiv:2407.04816v1. [PMID: 39398202] [PMCID: PMC11468158]
Abstract
If neuroscientists were asked which brain area is responsible for object recognition in primates, most would probably answer infero-temporal (IT) cortex. While IT is likely responsible for fine discriminations, and it is accordingly dominated by foveal visual inputs, there is more to object recognition than fine discrimination. Importantly, foveation of an object of interest usually requires recognizing, with reasonable confidence, its presence in the periphery. Arguably, IT plays a secondary role in such peripheral recognition, and other visual areas might instead be more critical. To investigate how signals carried by early visual processing areas (such as LGN and V1) could be used for object recognition in the periphery, we focused here on the task of distinguishing faces from non-faces. We tested how sensitive various models were to nuisance parameters, such as changes in scale and orientation of the image, and the type of image background. We found that a model of V1 simple or complex cells could provide quite reliable information, resulting in performance better than 80% in realistic scenarios. An LGN model performed considerably worse. Because peripheral recognition is both crucial to enable fine recognition (by bringing an object of interest on the fovea), and probably sufficient to account for a considerable fraction of our daily recognition-guided behavior, we think that the current focus on area IT and foveal processing is too narrow. We propose that rather than a hierarchical system with IT-like properties as its primary aim, object recognition should be seen as a parallel process, with high-accuracy foveal modules operating in parallel with lower-accuracy and faster modules that can operate across the visual field.
Affiliation(s)
- Christian Quaia
- Laboratory of Sensorimotor Research, National Eye Institute, NIH, Bethesda, MD, USA
- Richard J Krauzlis
- Laboratory of Sensorimotor Research, National Eye Institute, NIH, Bethesda, MD, USA
21
Miao HY, Tong F. Convolutional neural network models applied to neuronal responses in macaque V1 reveal limited nonlinear processing. J Vis 2024;24:1. [PMID: 38829629] [PMCID: PMC11156204] [DOI: 10.1167/jov.24.6.1]
Abstract
Computational models of the primary visual cortex (V1) have suggested that V1 neurons behave like Gabor filters followed by simple nonlinearities. However, recent work employing convolutional neural network (CNN) models has suggested that V1 relies on far more nonlinear computations than previously thought. Specifically, unit responses in an intermediate layer of VGG-19 were found to best predict macaque V1 responses to thousands of natural and synthetic images. Here, we evaluated the hypothesis that the poor performance of lower layer units in VGG-19 might be attributable to their small receptive field size rather than to their lack of complexity per se. We compared VGG-19 with AlexNet, which has much larger receptive fields in its lower layers. Whereas the best-performing layer of VGG-19 occurred after seven nonlinear steps, the first convolutional layer of AlexNet best predicted V1 responses. Although the predictive accuracy of VGG-19 was somewhat better than that of standard AlexNet, we found that a modified version of AlexNet could match the performance of VGG-19 after only a few nonlinear computations. Control analyses revealed that decreasing the size of the input images caused the best-performing layer of VGG-19 to shift to a lower layer, consistent with the hypothesis that the relationship between image size and receptive field size can strongly affect model performance. We conducted additional analyses using a Gabor pyramid model to test for nonlinear contributions of normalization and contrast saturation. Overall, our findings suggest that the feedforward responses of V1 neurons can be well explained by assuming only a few nonlinear processing stages.
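Evaluating how well a network layer "predicts" neural responses, as described above, typically amounts to fitting a regularized linear readout from unit activations to recorded firing rates and scoring it on held-out stimuli. The sketch below shows that mapping on synthetic data; the ridge solver and the toy "activations" are illustrative assumptions, not the paper's pipeline.

```python
import numpy as np

def ridge_fit_predict(X_train, y_train, X_test, lam=1.0):
    """Regularized linear readout from model-unit activations to responses."""
    d = X_train.shape[1]
    w = np.linalg.solve(X_train.T @ X_train + lam * np.eye(d), X_train.T @ y_train)
    return X_test @ w

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                 # stand-in "layer activations"
w_true = rng.normal(size=10)
y = X @ w_true + 0.1 * rng.normal(size=200)    # synthetic "V1 responses"
pred = ridge_fit_predict(X[:150], y[:150], X[150:])
r = float(np.corrcoef(pred, y[150:])[0, 1])    # held-out prediction accuracy
```

Comparing this held-out correlation across layers is what locates the "best-performing layer" of a network for a given brain area.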
Affiliation(s)
- Hui-Yuan Miao
- Department of Psychology, Vanderbilt University, Nashville, TN, USA
- Frank Tong
- Department of Psychology, Vanderbilt University, Nashville, TN, USA
- Vanderbilt Vision Research Center, Vanderbilt University, Nashville, TN, USA
22
Lewis CM, Wunderle T, Fries P. Top-down modulation of visual cortical stimulus encoding and gamma independent of firing rates. bioRxiv 2024:2024.04.11.589006. [PMID: 38645050] [PMCID: PMC11030389] [DOI: 10.1101/2024.04.11.589006]
Abstract
Neurons in primary visual cortex integrate sensory input with signals reflecting the animal's internal state to support flexible behavior. Internal variables, such as expectation, attention, or current goals, are imposed in a top-down manner via extensive feedback projections from higher-order areas. We optogenetically activated a high-order visual area, area 21a, in the lightly anesthetized cat (OptoTD), while recording from neuronal populations in V1. OptoTD induced strong, up to several fold, changes in gamma-band synchronization together with much smaller changes in firing rate, and the two effects showed no correlation. OptoTD effects showed specificity for the features of the simultaneously presented visual stimuli. OptoTD-induced changes in gamma synchronization, but not firing rates, were predictive of simultaneous changes in the amount of encoded stimulus information. Our findings suggest that one important role of top-down signals is to modulate synchronization and the information encoded by populations of sensory neurons.
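Gamma-band synchronization changes of the kind reported here are commonly quantified from the spectrum of the recorded signal. As a simple illustration, the sketch below computes the fraction of power in a 30-90 Hz band from a synthetic LFP-like trace; the band edges, sampling rate, and signal are assumptions for the example, not the study's actual analysis.

```python
import numpy as np

def band_power_fraction(x, fs, lo, hi):
    """Fraction of total signal power in [lo, hi] Hz, via the FFT periodogram."""
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    power = np.abs(np.fft.rfft(x)) ** 2
    return float(power[(freqs >= lo) & (freqs <= hi)].sum() / power.sum())

fs = 1000
t = np.arange(fs) / fs                                # one second of signal
lfp = np.sin(2 * np.pi * 60 * t) + 0.2 * np.sin(2 * np.pi * 8 * t)
gamma_frac = band_power_fraction(lfp, fs, 30, 90)     # gamma dominates this trace
```

Tracking such a band-limited measure alongside spike counts is what allows synchronization changes to be dissociated from firing-rate changes, as in the study above.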
Affiliation(s)
- Christopher M. Lewis
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, 60528 Frankfurt, Germany
- Brain Research Institute, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- Thomas Wunderle
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, 60528 Frankfurt, Germany
- Pascal Fries
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, 60528 Frankfurt, Germany
- Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, 6525 EN Nijmegen, Netherlands
23
Pang R, Baker C, Murthy M, Pillow J. Inferring neural dynamics of memory during naturalistic social communication. bioRxiv 2024:2024.01.26.577404. [PMID: 38328156] [PMCID: PMC10849655] [DOI: 10.1101/2024.01.26.577404]
Abstract
Memory processes in complex behaviors like social communication require forming representations of the past that grow with time. The neural mechanisms that support such continually growing memory remain unknown. We address this gap in the context of fly courtship, a natural social behavior involving the production and perception of long, complex song sequences. To study female memory for male song history in unrestrained courtship, we present 'Natural Continuation' (NC)-a general, simulation-based model comparison procedure to evaluate candidate neural codes for complex stimuli using naturalistic behavioral data. Applying NC to fly courtship revealed strong evidence for an adaptive population mechanism for how female auditory neural dynamics could convert long song histories into a rich mnemonic format. Song temporal patterning is continually transformed by heterogeneous nonlinear adaptation dynamics, then integrated into persistent activity, enabling common neural mechanisms to retain continuously unfolding information over long periods and yielding state-of-the-art predictions of female courtship behavior. At a population level this coding model produces multi-dimensional advection-diffusion-like responses that separate songs over a continuum of timescales and can be linearly transformed into flexible output signals, illustrating its potential to create a generic, scalable mnemonic format for extended input signals poised to drive complex behavioral responses. This work thus shows how naturalistic behavior can directly inform neural population coding models, revealing here a novel process for memory formation.
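The adapt-then-integrate scheme described above can be caricatured in a few lines: each channel adapts toward the input with its own time constant, and an integrator accumulates the adapted (transient-emphasizing) signal, so different input histories leave different persistent traces. This is a toy sketch with hypothetical dynamics and time constants, not the paper's fitted model.

```python
import numpy as np

def encode_history(history, taus=(2.0, 10.0, 50.0)):
    """Caricature of adapt-then-integrate memory: each channel adapts to the
    input with its own time constant, and an integrator accumulates the
    adapted (transient-emphasizing) signal into a persistent trace."""
    taus = np.asarray(taus)
    state = np.zeros(len(taus))     # per-channel adaptation state
    trace = np.zeros(len(taus))     # persistent integrated memory
    for x in history:
        state += (x - state) / taus         # heterogeneous nonlinear-ish adaptation
        trace += x - state                  # integrate the adapted signal
    return trace

# Two "songs" with the same total input but different temporal patterning
# leave distinct memory traces.
m1 = encode_history([1, 0, 0, 1, 1, 0] * 5)
m2 = encode_history([0, 1, 1, 0, 0, 1] * 5)
separation = float(np.linalg.norm(m1 - m2))
```

The heterogeneous time constants are what separate histories over a continuum of timescales, the property the abstract attributes to the population code.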
Affiliation(s)
- Rich Pang
- Princeton Neuroscience Institute, Princeton, NJ, USA
- Center for the Physics of Biological Function, Princeton, NJ and New York, NY, USA
- Christa Baker
- Princeton Neuroscience Institute, Princeton, NJ, USA
- Present address: Department of Biological Sciences, North Carolina State University, Raleigh, NC, USA
- Mala Murthy
- Princeton Neuroscience Institute, Princeton, NJ, USA
24
Weidler T, Goebel R, Senden M. AngoraPy: A Python toolkit for modeling anthropomorphic goal-driven sensorimotor systems. Front Neuroinform 2023;17:1223687. [PMID: 38204578] [PMCID: PMC10777840] [DOI: 10.3389/fninf.2023.1223687]
Abstract
Goal-driven deep learning increasingly supplements classical modeling approaches in computational neuroscience. The strength of deep neural networks as models of the brain lies in their ability to autonomously learn the connectivity required to solve complex and ecologically valid tasks, obviating the need for hand-engineered or hypothesis-driven connectivity patterns. Consequently, goal-driven models can generate hypotheses about the neurocomputations underlying cortical processing that are grounded in macro- and mesoscopic anatomical properties of the network's biological counterpart. Whereas goal-driven modeling is already becoming prevalent in the neuroscience of perception, its application to the sensorimotor domain is currently hampered by the complexity of the methods required to train models comprising the closed sensation-action loop. This paper describes AngoraPy, a Python library that mitigates this obstacle by providing researchers with the tools necessary to train complex recurrent convolutional neural networks that model the human sensorimotor system. To make the technical details of this toolkit more approachable, an illustrative example that trains a recurrent toy model on in-hand object manipulation accompanies the theoretical remarks. An extensive benchmark on various classical, 3D robotic, and anthropomorphic control tasks demonstrates AngoraPy's general applicability to a wide range of tasks. Together with its ability to adaptively handle custom architectures, the flexibility of this toolkit demonstrates its power for goal-driven sensorimotor modeling.
Affiliation(s)
- Tonio Weidler
  - Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
  - Maastricht Brain Imaging Centre, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
- Rainer Goebel
  - Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
  - Maastricht Brain Imaging Centre, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
- Mario Senden
  - Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
  - Maastricht Brain Imaging Centre, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands

25
Cowley BR, Stan PL, Pillow JW, Smith MA. Compact deep neural network models of visual cortex. bioRxiv 2023:2023.11.22.568315. [PMID: 38045255] [PMCID: PMC10690296] [DOI: 10.1101/2023.11.22.568315]
Abstract
A powerful approach to understanding the computations carried out in visual cortex is to develop models that predict neural responses to arbitrary images. Deep neural network (DNN) models have worked remarkably well at predicting neural responses [1, 2, 3], yet their underlying computations remain buried in millions of parameters. Have we simply replaced one complicated system in vivo with another in silico? Here, we train a data-driven deep ensemble model that predicts macaque V4 responses ~50% more accurately than currently used task-driven DNN models. We then compress this deep ensemble to identify compact models that have 5,000x fewer parameters yet accuracy equivalent to that of the deep ensemble. We verified that the stimulus preferences of the compact models matched those of the real V4 neurons by measuring V4 responses to both 'maximizing' and adversarial images generated using compact models. We then analyzed the inner workings of the compact models and discovered a common circuit motif: compact models share a similar set of filters in early stages of processing but then specialize by heavily consolidating this shared representation with a precise readout. This suggests that a V4 neuron's stimulus preference is determined entirely by its consolidation step. To demonstrate this, we investigated the compression step of a dot-detecting compact model and found a set of simple computations that may be carried out by dot-selective V4 neurons. Overall, our work demonstrates that the DNN models currently used in computational neuroscience are needlessly large; our approach provides a new way forward for obtaining explainable, high-accuracy models of visual cortical neurons.
Affiliation(s)
- Benjamin R. Cowley
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
  - Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Patricia L. Stan
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Center for the Neural Basis of Cognition, Pittsburgh, PA, USA
- Jonathan W. Pillow
  - Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Matthew A. Smith
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Center for the Neural Basis of Cognition, Pittsburgh, PA, USA

26
Malik G, Crowder D, Mingolla E. Extreme image transformations affect humans and machines differently. Biol Cybern 2023; 117:331-343. [PMID: 37310489] [PMCID: PMC10600046] [DOI: 10.1007/s00422-023-00968-7]
Abstract
Some recent artificial neural networks (ANNs) claim to model aspects of primate neural and human performance data. Their success in object recognition is, however, dependent on exploiting low-level features for solving visual tasks in a way that humans do not. As a result, out-of-distribution or adversarial input is often challenging for ANNs. Humans instead learn abstract patterns and are mostly unaffected by many extreme image distortions. We introduce a set of novel image transforms inspired by neurophysiological findings and evaluate humans and ANNs on an object recognition task. We show that machines perform better than humans for certain transforms and struggle to perform on par with humans on others that humans find easy. We quantify the differences in accuracy between humans and machines and derive a difficulty ranking of our transforms from the human data. We also suggest how certain characteristics of human visual processing can be adapted to improve the performance of ANNs on our difficult-for-machines transforms.
Affiliation(s)
- Girik Malik
  - Northeastern University, Boston, MA 02115, USA

27
Wang C, Yan H, Huang W, Sheng W, Wang Y, Fan YS, Liu T, Zou T, Li R, Chen H. Neural encoding with unsupervised spiking convolutional neural network. Commun Biol 2023; 6:880. [PMID: 37640808] [PMCID: PMC10462614] [DOI: 10.1038/s42003-023-05257-4]
Abstract
Accurately predicting the brain responses to various stimuli poses a significant challenge in neuroscience. Despite recent breakthroughs in neural encoding using convolutional neural networks (CNNs) in fMRI studies, there remain critical gaps between the computational rules of traditional artificial neurons and real biological neurons. To address this issue, a spiking CNN (SCNN)-based framework is presented in this study to achieve neural encoding in a more biologically plausible manner. The framework uses an unsupervised SCNN to extract visual features of image stimuli and employs a receptive field-based regression algorithm to predict fMRI responses from the SCNN features. Experimental results on handwritten characters, handwritten digits and natural images demonstrate that the proposed approach can achieve remarkably good encoding performance and can be utilized for "brain reading" tasks such as image reconstruction and identification. This work suggests that SNNs can serve as a promising tool for neural encoding.
Affiliation(s)
- Chong Wang
  - The Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, 611731, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
  - MOE Key Lab for Neuroinformation; High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Hongmei Yan
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
  - MOE Key Lab for Neuroinformation; High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Wei Huang
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
  - MOE Key Lab for Neuroinformation; High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Wei Sheng
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
  - MOE Key Lab for Neuroinformation; High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Yuting Wang
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
  - MOE Key Lab for Neuroinformation; High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Yun-Shuang Fan
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
  - MOE Key Lab for Neuroinformation; High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Tao Liu
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Ting Zou
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Rong Li
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
  - MOE Key Lab for Neuroinformation; High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Huafu Chen
  - The Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, 611731, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
  - MOE Key Lab for Neuroinformation; High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, China

28
Gong Z, Zhou M, Dai Y, Wen Y, Liu Y, Zhen Z. A large-scale fMRI dataset for the visual processing of naturalistic scenes. Sci Data 2023; 10:559. [PMID: 37612327] [PMCID: PMC10447576] [DOI: 10.1038/s41597-023-02471-x]
Abstract
One ultimate goal of visual neuroscience is to understand how the brain processes visual stimuli encountered in the natural environment. Achieving this goal requires records of brain responses under massive amounts of naturalistic stimuli. Although the scientific community has devoted substantial effort to collecting large-scale functional magnetic resonance imaging (fMRI) data under naturalistic stimuli, more naturalistic fMRI datasets are still urgently needed. We present here the Natural Object Dataset (NOD), a large-scale fMRI dataset containing responses to 57,120 naturalistic images from 30 participants. NOD strives to balance sampling variation across individuals against sampling variation across stimuli. This enables NOD to be utilized not only for determining whether an observation is generalizable across many individuals, but also for testing whether a response pattern generalizes to a variety of naturalistic stimuli. We anticipate that the NOD, together with existing naturalistic neuroimaging datasets, will serve as a new impetus for our understanding of the visual processing of naturalistic stimuli.
Affiliation(s)
- Zhengxin Gong
  - Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, 100875, China
- Ming Zhou
  - State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875, China
- Yuxuan Dai
  - Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, 100875, China
- Yushan Wen
  - Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, 100875, China
- Youyi Liu
  - State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875, China
- Zonglei Zhen
  - Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, 100875, China
  - State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875, China

29
Luna R, Zabaleta I, Bertalmío M. State-of-the-art image and video quality assessment with a metric based on an intrinsically non-linear neural summation model. Front Neurosci 2023; 17:1222815. [PMID: 37559700] [PMCID: PMC10408451] [DOI: 10.3389/fnins.2023.1222815]
Abstract
The development of automatic methods for image and video quality assessment that correlate well with the perception of human observers is a very challenging open problem in vision science, with numerous practical applications in disciplines such as image processing and computer vision, as well as in the media industry. In the past two decades, the goal of image quality research has been to improve upon classical metrics by developing models that emulate some aspects of the visual system. While progress has been considerable, state-of-the-art quality assessment methods still share a number of shortcomings: their performance drops considerably when they are tested on a database quite different from the one used to train them, and they have significant limitations in predicting observer scores for high-framerate videos. In this work we propose a novel objective method for image and video quality assessment that is based on the recently introduced Intrinsically Non-linear Receptive Field (INRF) formulation, a neural summation model that has been shown to be better at predicting neural activity and visual perception phenomena than the classical linear receptive field. We start by optimizing, on a classic image quality database, the four parameters of a very simple INRF-based metric, and then test this metric on three other databases, showing that its performance equals or surpasses that of state-of-the-art methods, some of which have millions of parameters. Next, we extend this INRF image quality metric to the temporal domain and test it on several popular video quality datasets; again, the results of our proposed INRF-based video quality metric are shown to be very competitive.
Affiliation(s)
- Raúl Luna
  - Institute of Optics, Spanish National Research Council (CSIC), Madrid, Spain
- Itziar Zabaleta
  - Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
- Marcelo Bertalmío
  - Institute of Optics, Spanish National Research Council (CSIC), Madrid, Spain

30
Penacchio O, Otazu X, Wilkins AJ, Haigh SM. A mechanistic account of visual discomfort. Front Neurosci 2023; 17:1200661. [PMID: 37547142] [PMCID: PMC10397803] [DOI: 10.3389/fnins.2023.1200661]
Abstract
Much of the neural machinery of the early visual cortex, from the extraction of local orientations to contextual modulations through lateral interactions, is thought to have developed to provide a sparse encoding of contour in natural scenes, allowing the brain to efficiently process most of the visual scenes we are exposed to. Certain visual stimuli, however, cause visual stress, a set of adverse effects ranging from simple discomfort to migraine attacks, and epileptic seizures in the extreme, all phenomena linked with an excessive metabolic demand. The theory of efficient coding suggests a link between excessive metabolic demand and images that deviate from natural statistics. Yet, the mechanisms linking energy demand and image spatial content in discomfort remain elusive. Here, we used theories of visual coding that link image spatial structure and brain activation to characterize the response to images observers reported as uncomfortable in a biologically based neurodynamic model of the early visual cortex that included excitatory and inhibitory layers to implement contextual influences. We found three clear markers of aversive images: a larger overall activation in the model, a less sparse response, and a more unbalanced distribution of activity across spatial orientations. When the ratio of excitation over inhibition was increased in the model, a phenomenon hypothesised to underlie interindividual differences in susceptibility to visual discomfort, the three markers of discomfort progressively shifted toward values typical of the response to uncomfortable stimuli. Overall, these findings offer a unifying mechanistic explanation for why there are differences between images and between observers, suggesting how visual input and idiosyncratic hyperexcitability give rise to abnormal brain responses that result in visual stress.
Affiliation(s)
- Olivier Penacchio
  - Department of Computer Science, Universitat Autònoma de Barcelona, Bellaterra, Spain
  - Computer Vision Center, Universitat Autònoma de Barcelona, Bellaterra, Spain
  - School of Psychology and Neuroscience, University of St Andrews, St Andrews, United Kingdom
- Xavier Otazu
  - Department of Computer Science, Universitat Autònoma de Barcelona, Bellaterra, Spain
  - Computer Vision Center, Universitat Autònoma de Barcelona, Bellaterra, Spain
- Arnold J. Wilkins
  - Department of Psychology, University of Essex, Colchester, United Kingdom
- Sarah M. Haigh
  - Department of Psychology, University of Nevada Reno, Reno, NV, United States
  - Institute for Neuroscience, University of Nevada Reno, Reno, NV, United States

31
Ahn J, Ryu J, Lee S, Lee C, Im CH, Lee SH. Transcranial direct current stimulation elevates the baseline activity while sharpening the spatial tuning of the human visual cortex. Brain Stimul 2023; 16:1154-1164. [PMID: 37517465] [DOI: 10.1016/j.brs.2023.07.052]
Affiliation(s)
- Jeongyeol Ahn
  - Department of Brain and Cognitive Sciences, Seoul National University, Seoul, Republic of Korea
- Juhyoung Ryu
  - Department of Brain and Cognitive Sciences, Seoul National University, Seoul, Republic of Korea
- Sangjun Lee
  - Department of Biomedical Engineering, University of Minnesota, Minneapolis, MN, USA
- Chany Lee
  - Department of Structure & Function of Neural Network, Korea Brain Research Institute, Daegu, Republic of Korea
- Chang-Hwan Im
  - Department of Biomedical Engineering, Hanyang University, Seoul, Republic of Korea
- Sang-Hun Lee
  - Department of Brain and Cognitive Sciences, Seoul National University, Seoul, Republic of Korea

32
Ladret HJ, Cortes N, Ikan L, Chavane F, Casanova C, Perrinet LU. Cortical recurrence supports resilience to sensory variance in the primary visual cortex. Commun Biol 2023; 6:667. [PMID: 37353519] [PMCID: PMC10290066] [DOI: 10.1038/s42003-023-05042-3]
Abstract
Our daily endeavors occur in a complex visual environment, whose intrinsic variability challenges the way we integrate information to make decisions. By processing myriad parallel sensory inputs, our brain is theoretically able to compute the variance of its environment, a cue known to guide our behavior. Yet, the neurobiological and computational bases of such variance computations are still poorly understood. Here, we quantify the dynamics of sensory variance modulations of cat primary visual cortex neurons. We report two archetypal neuronal responses, one of which is resilient to changes in variance and co-encodes the sensory feature and its variance, improving the population encoding of orientation. The existence of these variance-specific responses can be accounted for by a model of intracortical recurrent connectivity. We thus propose that local recurrent circuits process uncertainty as a generic computation, advancing our understanding of how the brain handles naturalistic inputs.
Affiliation(s)
- Hugo J Ladret
  - Institut de Neurosciences de la Timone, UMR 7289, CNRS and Aix-Marseille Université, Marseille, France
  - School of Optometry, Université de Montréal, Montréal, Canada
- Nelson Cortes
  - School of Optometry, Université de Montréal, Montréal, Canada
- Lamyae Ikan
  - School of Optometry, Université de Montréal, Montréal, Canada
- Frédéric Chavane
  - Institut de Neurosciences de la Timone, UMR 7289, CNRS and Aix-Marseille Université, Marseille, France
- Laurent U Perrinet
  - Institut de Neurosciences de la Timone, UMR 7289, CNRS and Aix-Marseille Université, Marseille, France

33
33
|
Marrazzo G, De Martino F, Lage-Castellanos A, Vaessen MJ, de Gelder B. Voxelwise encoding models of body stimuli reveal a representational gradient from low-level visual features to postural features in occipitotemporal cortex. Neuroimage 2023:120240. [PMID: 37348622] [DOI: 10.1016/j.neuroimage.2023.120240]
Abstract
Research on body representation in the brain has focused on category-specific representation, using fMRI to investigate response patterns to body stimuli in occipitotemporal cortex. So far, however, it has not addressed the specific computations involved in body-selective regions, which are defined only by higher-order category selectivity. This study used ultra-high field fMRI and banded ridge regression to investigate the coding of body images, comparing the performance of three encoding models in predicting brain activity in occipitotemporal cortex and specifically the extrastriate body area (EBA). Our results suggest that bodies are encoded in occipitotemporal cortex and in the EBA according to a combination of low-level visual features and postural features.
Affiliation(s)
- Giuseppe Marrazzo
  - Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Limburg 6200 MD, Maastricht, The Netherlands
- Federico De Martino
  - Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Limburg 6200 MD, Maastricht, The Netherlands
  - Center for Magnetic Resonance Research, Department of Radiology, University of Minnesota, Minneapolis, MN 55455, United States and Department of NeuroInformatics
- Agustin Lage-Castellanos
  - Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Limburg 6200 MD, Maastricht, The Netherlands
  - Cuban Center for Neuroscience, Street 190 e/25 and 27 Cubanacán Playa Havana, CP 11600, Cuba
- Maarten J Vaessen
  - Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Limburg 6200 MD, Maastricht, The Netherlands
- Beatrice de Gelder
  - Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Limburg 6200 MD, Maastricht, The Netherlands

34
Yates JL, Coop SH, Sarch GH, Wu RJ, Butts DA, Rucci M, Mitchell JF. Detailed characterization of neural selectivity in free viewing primates. Nat Commun 2023; 14:3656. [PMID: 37339973] [PMCID: PMC10282080] [DOI: 10.1038/s41467-023-38564-9]
Abstract
Fixation constraints in visual tasks are ubiquitous in visual and cognitive neuroscience. Despite its widespread use, fixation requires trained subjects, is limited by the accuracy of fixational eye movements, and ignores the role of eye movements in shaping visual input. To overcome these limitations, we developed a suite of hardware and software tools to study vision during natural behavior in untrained subjects. We measured visual receptive fields and tuning properties from multiple cortical areas of marmoset monkeys who freely viewed full-field noise stimuli. The resulting receptive fields and tuning curves from primary visual cortex (V1) and area MT match reported selectivity from the literature, measured using conventional approaches. We then combined free viewing with high-resolution eye tracking to make the first detailed 2D spatiotemporal measurements of foveal receptive fields in V1. These findings demonstrate the power of free viewing to characterize neural responses in untrained animals while simultaneously studying the dynamics of natural behavior.
Affiliation(s)
- Jacob L Yates
  - Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA
  - Center for Visual Science, University of Rochester, Rochester, NY, USA
  - Department of Biology and Program in Neuroscience and Cognitive Science, University of Maryland, College Park, MD, USA
  - Herbert Wertheim School of Optometry and Vision Science, UC Berkeley, Berkeley, CA, USA
- Shanna H Coop
  - Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA
  - Center for Visual Science, University of Rochester, Rochester, NY, USA
  - Neurobiology, Stanford University, Stanford, CA, USA
- Gabriel H Sarch
  - Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Ruei-Jr Wu
  - Center for Visual Science, University of Rochester, Rochester, NY, USA
  - Institute of Optics, University of Rochester, Rochester, NY, USA
- Daniel A Butts
  - Department of Biology and Program in Neuroscience and Cognitive Science, University of Maryland, College Park, MD, USA
- Michele Rucci
  - Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA
  - Center for Visual Science, University of Rochester, Rochester, NY, USA
- Jude F Mitchell
  - Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA
  - Center for Visual Science, University of Rochester, Rochester, NY, USA

35
Price BH, Jensen CM, Khoudary AA, Gavornik JP. Expectation violations produce error signals in mouse V1. Cereb Cortex 2023; 33:8803-8820. [PMID: 37183176] [PMCID: PMC10321125] [DOI: 10.1093/cercor/bhad163]
Abstract
Repeated exposure to visual sequences changes the form of evoked activity in the primary visual cortex (V1). Predictive coding theory provides a potential explanation for this, namely that plasticity shapes cortical circuits to encode spatiotemporal predictions and that subsequent responses are modulated by the degree to which actual inputs match these expectations. Here we use a recently developed statistical modeling technique called Model-Based Targeted Dimensionality Reduction (MbTDR) to study visually evoked dynamics in mouse V1 in the context of an experimental paradigm called "sequence learning." We report that evoked spiking activity changed significantly with training, in a manner generally consistent with the predictive coding framework. Neural responses to expected stimuli were suppressed in a late window (100-150 ms) after stimulus onset following training, whereas responses to novel stimuli were not. Substituting a novel stimulus for a familiar one led to increases in firing that persisted for at least 300 ms. Omitting predictable stimuli in trained animals also led to increased firing at the expected time of stimulus onset. Finally, we show that spiking data can be used to accurately decode time within the sequence. Our findings are consistent with the idea that plasticity in early visual circuits is involved in coding spatiotemporal information.
Affiliation(s)
- Byron H Price
  - Center for Systems Neuroscience, Department of Biology, Boston University, Boston, MA 02215, USA
  - Graduate Program in Neuroscience, Boston University, Boston, MA 02215, USA
- Cambria M Jensen
  - Center for Systems Neuroscience, Department of Biology, Boston University, Boston, MA 02215, USA
- Anthony A Khoudary
  - Center for Systems Neuroscience, Department of Biology, Boston University, Boston, MA 02215, USA
- Jeffrey P Gavornik
  - Center for Systems Neuroscience, Department of Biology, Boston University, Boston, MA 02215, USA
  - Graduate Program in Neuroscience, Boston University, Boston, MA 02215, USA

36
Kay K, Bonnen K, Denison RN, Arcaro MJ, Barack DL. Tasks and their role in visual neuroscience. Neuron 2023; 111:1697-1713. [PMID: 37040765] [DOI: 10.1016/j.neuron.2023.03.022]
Abstract
Vision is widely used as a model system to gain insights into how sensory inputs are processed and interpreted by the brain. Historically, careful quantification and control of visual stimuli have served as the backbone of visual neuroscience. There has been less emphasis, however, on how an observer's task influences the processing of sensory inputs. Motivated by diverse observations of task-dependent activity in the visual system, we propose a framework for thinking about tasks, their role in sensory processing, and how we might formally incorporate tasks into our models of vision.
Collapse
Affiliation(s)
- Kendrick Kay
- Center for Magnetic Resonance Research, Department of Radiology, University of Minnesota, Minneapolis, MN 55455, USA
- Kathryn Bonnen
- School of Optometry, Indiana University, Bloomington, IN 47405, USA
- Rachel N Denison
- Department of Psychological and Brain Sciences, Boston University, Boston, MA 02215, USA
- Mike J Arcaro
- Department of Psychology, University of Pennsylvania, Philadelphia, PA 19146, USA
- David L Barack
- Departments of Neuroscience and Philosophy, University of Pennsylvania, Philadelphia, PA 19146, USA
37
Henderson MM, Tarr MJ, Wehbe L. A Texture Statistics Encoding Model Reveals Hierarchical Feature Selectivity across Human Visual Cortex. J Neurosci 2023; 43:4144-4161. [PMID: 37127366 PMCID: PMC10255092 DOI: 10.1523/jneurosci.1822-22.2023] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2022] [Revised: 03/21/2023] [Accepted: 03/26/2023] [Indexed: 05/03/2023] Open
Abstract
Midlevel features, such as contour and texture, provide a computational link between low- and high-level visual representations. Although the nature of midlevel representations in the brain is not fully understood, past work has suggested that a texture statistics model, the P-S model (Portilla and Simoncelli, 2000), is a candidate for predicting neural responses in areas V1-V4 as well as human behavioral data. However, it is not currently known how well this model accounts for the responses of higher visual cortex to natural scene images. To examine this, we constructed single-voxel encoding models based on P-S statistics and fit the models to fMRI data from human subjects (both sexes) from the Natural Scenes Dataset (Allen et al., 2022). We demonstrate that the texture statistics encoding model can predict the held-out responses of individual voxels in early retinotopic areas and higher-level category-selective areas. The ability of the model to reliably predict signal in higher visual cortex suggests that the representation of texture statistics features is widespread throughout the brain. Furthermore, using variance partitioning analyses, we identify which features are most uniquely predictive of brain responses and show that the contributions of higher-order texture features increase from early areas to higher areas on the ventral and lateral surfaces. We also demonstrate that patterns of sensitivity to texture statistics can be used to recover broad organizational axes within visual cortex, including dimensions that capture semantic image content. These results provide a key step forward in characterizing how midlevel feature representations emerge hierarchically across the visual system.
SIGNIFICANCE STATEMENT Intermediate visual features, like texture, play an important role in cortical computations and may contribute to tasks like object and scene recognition. Here, we used a texture model proposed in past work to construct encoding models that predict the responses of neural populations in human visual cortex (measured with fMRI) to natural scene stimuli. We show that responses of neural populations at multiple levels of the visual system can be predicted by this model, and that the model is able to reveal an increase in the complexity of feature representations from early retinotopic cortex to higher areas of ventral and lateral visual cortex. These results support the idea that texture-like representations may play a broad underlying role in visual processing.
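The variance-partitioning analysis described in this abstract can be sketched in a few lines: fit nested linear models and take the difference in explained variance as the unique contribution of a feature set. The sketch below is our own toy reconstruction with invented feature matrices and a simulated voxel, not the authors' analysis code.

```python
import numpy as np

def r_squared(X, y):
    """Fraction of variance in y explained by a least-squares linear model on X."""
    X = np.column_stack([np.ones(len(y)), X])  # add intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

rng = np.random.default_rng(0)
n = 500
lower = rng.standard_normal((n, 4))   # stand-in "lower-order" texture features
higher = rng.standard_normal((n, 4))  # stand-in "higher-order" texture features
# Simulated voxel driven mostly by the higher-order features.
voxel = lower @ np.full(4, 0.2) + higher @ np.full(4, 1.0) + 0.5 * rng.standard_normal(n)

r2_full = r_squared(np.hstack([lower, higher]), voxel)
r2_lower = r_squared(lower, voxel)
# Variance uniquely explained by the higher-order features.
unique_higher = r2_full - r2_lower
```

For a voxel like this one, the unique contribution of the higher-order set dominates, which is the comparison the study runs across the visual hierarchy.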
Affiliation(s)
- Margaret M Henderson
- Neuroscience Institute, Department of Psychology, and Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
- Michael J Tarr
- Neuroscience Institute, Department of Psychology, and Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
- Leila Wehbe
- Neuroscience Institute, Department of Psychology, and Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
38
Yun M, Hwang JY, Jung MW. Septotemporal variations in hippocampal value and outcome processing. Cell Rep 2023; 42:112094. [PMID: 36763498 DOI: 10.1016/j.celrep.2023.112094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2022] [Revised: 11/11/2022] [Accepted: 01/26/2023] [Indexed: 02/11/2023] Open
Abstract
A large body of evidence indicates functional variations along the hippocampal longitudinal axis. To investigate whether and how value and outcome processing vary between the dorsal (DH) and the ventral hippocampus (VH), we examined neuronal activity and inactivation effects of the DH and VH in mice performing probabilistic classical conditioning tasks. Inactivation of either structure disrupts value-dependent anticipatory licking, and value-coding neurons are found in both structures, indicating their involvement in value processing. However, the DH neuronal population increases activity as a function of value, while the VH neuronal population is preferentially responsive to the highest-value sensory cue. Also, signals related to outcome-dependent value learning are stronger in the DH. VH neurons instead show rapid responses to punishment and strongly biased responses to negative prediction error. These findings suggest that the DH faithfully represents the external value landscape, whereas the VH preferentially represents behaviorally relevant, salient features of experienced events.
Affiliation(s)
- Miru Yun
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea; Center for Synaptic Brain Dysfunctions, Institute for Basic Science, Daejeon 34141, Korea
- Ji Young Hwang
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea; Center for Synaptic Brain Dysfunctions, Institute for Basic Science, Daejeon 34141, Korea
- Min Whan Jung
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea; Center for Synaptic Brain Dysfunctions, Institute for Basic Science, Daejeon 34141, Korea
39
Victor JD, Rizvi SM, Bush JW, Conte MM. Discrimination of textures with spatial correlations and multiple gray levels. J Opt Soc Am A Opt Image Sci Vis 2023; 40:237-258. [PMID: 36821194 PMCID: PMC9971653 DOI: 10.1364/josaa.472553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/12/2022] [Accepted: 12/05/2022] [Indexed: 06/18/2023]
Abstract
Analysis of visual texture is important for many key steps in early vision. We study visual sensitivity to image statistics in three families of textures that include multiple gray levels and correlations in two spatial dimensions. Sensitivities to positive and negative correlations are approximately independent of correlation sign, and signals from different kinds of correlations combine quadratically. We build a computational model, fully constrained by prior studies of sensitivity to uncorrelated textures and black-and-white textures with spatial correlations. The model accounts for many features of the new data, including sign-independence, quadratic combination, and the dependence on gray-level distribution.
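The quadratic combination rule reported here (signals from different correlation types combine quadratically, independent of correlation sign) amounts to an energy-like summation. A minimal sketch, with made-up component sensitivities:

```python
import math

def combined_sensitivity(components):
    """Quadratic summation: signals from different correlation types
    combine as the square root of the sum of squares."""
    return math.sqrt(sum(c * c for c in components))

# Sign-independence: the combined signal depends only on |c|, so flipping
# the sign of one correlation leaves the result unchanged.
s_pos = combined_sensitivity([3.0, 4.0])   # -> 5.0
s_neg = combined_sensitivity([-3.0, 4.0])  # -> 5.0
```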
Affiliation(s)
- Jonathan D. Victor
- Feil Family Brain and Mind Research Institute, Weill Cornell Medical College, 1300 York Avenue, New York, NY 10065, USA
- Syed M. Rizvi
- Feil Family Brain and Mind Research Institute, Weill Cornell Medical College, 1300 York Avenue, New York, NY 10065, USA
- Currently with Centerlight Healthcare, 136-65 37th Ave., Flushing, NY 11354, USA
- Jacob W. Bush
- Feil Family Brain and Mind Research Institute, Weill Cornell Medical College, 1300 York Avenue, New York, NY 10065, USA
- Currently with Shopify, 151 O’Connor St Ground floor, Ottawa, ON K2P 2L8, Canada
- Mary M. Conte
- Feil Family Brain and Mind Research Institute, Weill Cornell Medical College, 1300 York Avenue, New York, NY 10065, USA
40
Baranauskas G, Rysevaite-Kyguoliene K, Sabeckis I, Pauza DH. Saturation of visual responses explains size tuning in rat collicular neurons. Eur J Neurosci 2023; 57:285-309. [PMID: 36451583 DOI: 10.1111/ejn.15877] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2021] [Revised: 08/03/2022] [Accepted: 11/21/2022] [Indexed: 12/02/2022]
Abstract
The receptive field of many visual neurons is composed of a central responsive area, the classical receptive field, and a non-classical receptive field, also called the "suppressive surround." A visual stimulus placed in the suppressive surround does not induce any response but modulates visual responses to stimuli within the classical receptive field, usually by suppressing them. Therefore, visual responses become smaller when stimuli exceed the classical receptive field size. The stimulus size inducing the maximal response is called the preferred stimulus size. In cortex, there is good correspondence between the sizes of the classical receptive field and the preferred stimulus. In contrast, in the rodent superior colliculus, the preferred size is often severalfold smaller than the classical receptive field size. Here, we show that in the rat superior colliculus, the preferred stimulus size changes as the square root of the inverse of contrast, whereas the classical receptive field size is independent of contrast. In addition, responses to annuli were largely independent of the inner hole size. To explain these data, three models were tested: divisive modulation of the gain by the suppressive surround (the "normalization" model), the difference of Gaussians, and a divisive model that incorporates saturation to light flux. Despite having the same number of free parameters, the model incorporating saturation to light performed best. Thus, our data indicate that in rats, saturation to light can be a dominant phenomenon shaping visual responses in collicular neurons, even at relatively low illumination levels.
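The square-root scaling of preferred size with inverse contrast falls out of any model in which a saturating response to total light flux is combined with subtractive surround suppression, because the optimum then occurs at a fixed flux. The toy model below illustrates that relationship; its saturation constant and suppression weight are arbitrary, and it is not the authors' fitted model.

```python
import numpy as np

def response(size, contrast, sat=1.0, beta=0.1):
    """Saturating response to light flux, minus linear surround suppression.
    Flux grows with stimulus area: flux ~ contrast * size**2."""
    flux = contrast * size ** 2
    return flux / (flux + sat) - beta * flux

sizes = np.linspace(0.01, 5.0, 5000)

def preferred_size(contrast):
    """Stimulus size maximizing the response at a given contrast (grid search)."""
    return sizes[np.argmax(response(sizes, contrast))]

# The optimum sits at a fixed flux, so preferred size scales as 1/sqrt(contrast):
# quadrupling the contrast should halve the preferred size.
ratio = preferred_size(1.0) / preferred_size(4.0)
```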
Affiliation(s)
- Gytis Baranauskas
- Neurophysiology Laboratory, Neuroscience Institute, Lithuanian University of Health Sciences, Kaunas, Lithuania
- Ignas Sabeckis
- Anatomy Institute, Lithuanian University of Health Sciences, Kaunas, Lithuania
- Dainius H Pauza
- Anatomy Institute, Lithuanian University of Health Sciences, Kaunas, Lithuania
41
Parker PRL, Abe ETT, Leonard ESP, Martins DM, Niell CM. Joint coding of visual input and eye/head position in V1 of freely moving mice. Neuron 2022; 110:3897-3906.e5. [PMID: 36137549 PMCID: PMC9742335 DOI: 10.1016/j.neuron.2022.08.029] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2022] [Revised: 07/16/2022] [Accepted: 08/30/2022] [Indexed: 12/15/2022]
Abstract
Visual input during natural behavior is highly dependent on movements of the eyes and head, but how information about eye and head position is integrated with visual processing during free movement is unknown, as visual physiology is generally performed under head fixation. To address this, we performed single-unit electrophysiology in V1 of freely moving mice while simultaneously measuring the mouse's eye position, head orientation, and the visual scene from the mouse's perspective. From these measures, we mapped spatiotemporal receptive fields during free movement based on the gaze-corrected visual input. Furthermore, we found a significant fraction of neurons tuned for eye and head position, and these signals were integrated with visual responses through a multiplicative mechanism in the majority of modulated neurons. These results provide new insight into coding in the mouse V1 and, more generally, provide a paradigm for investigating visual physiology under natural conditions, including active sensing and ethological behavior.
Affiliation(s)
- Philip R L Parker
- Institute of Neuroscience and Department of Biology, University of Oregon, Eugene, OR, USA
- Elliott T T Abe
- Institute of Neuroscience and Department of Biology, University of Oregon, Eugene, OR, USA
- Emmalyn S P Leonard
- Institute of Neuroscience and Department of Biology, University of Oregon, Eugene, OR, USA
- Dylan M Martins
- Institute of Neuroscience and Department of Biology, University of Oregon, Eugene, OR, USA
- Cristopher M Niell
- Institute of Neuroscience and Department of Biology, University of Oregon, Eugene, OR, USA
42
Henry CA, Kohn A. Feature representation under crowding in macaque V1 and V4 neuronal populations. Curr Biol 2022; 32:5126-5137.e3. [PMID: 36379216 PMCID: PMC9729449 DOI: 10.1016/j.cub.2022.10.049] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2022] [Revised: 09/02/2022] [Accepted: 10/21/2022] [Indexed: 11/16/2022]
Abstract
Visual perception depends strongly on spatial context. A profound example is visual crowding, whereby the presence of nearby stimuli impairs the discriminability of object features. Despite extensive work on perceptual crowding and the spatial integrative properties of visual cortical neurons, the link between these two aspects of visual processing remains unclear. To understand better the neural basis of crowding, we recorded activity simultaneously from neuronal populations in V1 and V4 of fixating macaque monkeys. We assessed the information available from the measured responses about the orientation of a visual target both for targets presented in isolation and amid distractors. Both single neuron and population responses had less information about target orientation when distractors were present. Information loss was moderate in V1 and more substantial in V4. Information loss could be traced to systematic divisive and additive changes in neuronal tuning. Additive and multiplicative changes in tuning were more severe in V4; in addition, tuning exhibited other, non-affine transformations that were greater in V4, further restricting the ability of a fixed sensory readout strategy to extract accurate feature information across displays. Our results provide a direct test of crowding effects at different stages of the visual hierarchy. They reveal how crowded visual environments alter the spiking activity of cortical populations by which sensory stimuli are encoded and connect these changes to established mechanisms of neuronal spatial integration.
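A simple way to see how divisive and additive changes in tuning reduce orientation information is to compare discriminability under Poisson-like noise before and after such a change. The tuning curve and the particular gain and offset values below are invented for illustration, not taken from the recordings.

```python
import math

def rate(theta_deg, gain=1.0, offset=0.0, pref=0.0, kappa=2.0, rmax=30.0):
    """Von Mises-style orientation tuning, with divisive gain and additive offset."""
    drive = math.exp(kappa * (math.cos(2 * math.radians(theta_deg - pref)) - 1))
    return gain * rmax * drive + offset

def dprime(theta1, theta2, **kw):
    """Discriminability of two orientations assuming Poisson variance (var = mean)."""
    r1, r2 = rate(theta1, **kw), rate(theta2, **kw)
    return abs(r1 - r2) / math.sqrt(0.5 * (r1 + r2))

d_isolated = dprime(0.0, 10.0)
# Crowding modeled as a divisive gain reduction plus an additive baseline shift:
# the response difference shrinks while the offset keeps the noise floor up.
d_crowded = dprime(0.0, 10.0, gain=0.4, offset=5.0)
```

The additive term is what makes this worse than a pure rescaling: it raises the variance without adding any orientation-dependent signal.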
Affiliation(s)
- Christopher A Henry
- Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Adam Kohn
- Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY 10461, USA; Department of Ophthalmology and Visual Sciences, Albert Einstein College of Medicine, Bronx, NY 10461, USA; Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA
43
Gifford AT, Dwivedi K, Roig G, Cichy RM. A large and rich EEG dataset for modeling human visual object recognition. Neuroimage 2022; 264:119754. [PMID: 36400378 PMCID: PMC9771828 DOI: 10.1016/j.neuroimage.2022.119754] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/08/2022] [Revised: 09/14/2022] [Accepted: 11/14/2022] [Indexed: 11/16/2022] Open
Abstract
The human brain achieves visual object recognition through multiple stages of linear and nonlinear transformations operating at a millisecond scale. To predict and explain these rapid transformations, computational neuroscientists employ machine learning modeling techniques. However, state-of-the-art models require massive amounts of data to properly train, and to date there is a lack of large brain datasets that extensively sample the temporal dynamics of visual object recognition. Here we collected a large and rich dataset of high temporal resolution EEG responses to images of objects on a natural background. This dataset includes 10 participants, each with 82,160 trials spanning 16,740 image conditions. Through computational modeling we established the quality of this dataset in five ways. First, we trained linearizing encoding models that successfully synthesized the EEG responses to arbitrary images. Second, we correctly identified the recorded EEG data image conditions in a zero-shot fashion, using EEG synthesized responses to hundreds of thousands of candidate image conditions. Third, we show that both the high number of conditions and the trial repetitions of the EEG dataset contribute to the trained models' prediction accuracy. Fourth, we built encoding models whose predictions generalize well to novel participants. Fifth, we demonstrate full end-to-end training of randomly initialized DNNs that output EEG responses for arbitrary input images. We release this dataset as a tool to foster research in visual neuroscience and computer vision.
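The zero-shot identification procedure can be sketched as follows: fit a linearizing (least-squares) encoding model from image features to channels, synthesize responses for a set of candidate conditions, and assign each recorded response to the best-correlated synthesized one. All array shapes, the feature space, and the noise level below are invented for the sketch; this is not the released analysis code.

```python
import numpy as np

rng = np.random.default_rng(1)
n_train, n_test, n_feat, n_chan = 400, 50, 20, 17

# Simulated ground truth: a linear map from image features to channel responses.
W_true = rng.standard_normal((n_feat, n_chan))
feats_train = rng.standard_normal((n_train, n_feat))
feats_test = rng.standard_normal((n_test, n_feat))
eeg_train = feats_train @ W_true + 0.3 * rng.standard_normal((n_train, n_chan))
eeg_test = feats_test @ W_true + 0.3 * rng.standard_normal((n_test, n_chan))

# Fit the linearizing encoding model, then synthesize test-condition responses.
W_hat, *_ = np.linalg.lstsq(feats_train, eeg_train, rcond=None)
synth = feats_test @ W_hat

# Zero-shot identification: each recorded response is assigned to the candidate
# condition whose synthesized response correlates best with it across channels.
corr = np.corrcoef(eeg_test, synth)[:n_test, n_test:]
accuracy = np.mean(np.argmax(corr, axis=1) == np.arange(n_test))
```

Because the candidates were never used for fitting, the identification is "zero-shot" in the same sense as the abstract.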
Affiliation(s)
- Alessandro T Gifford
- Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany; Einstein Center for Neurosciences Berlin, Charité - Universitätsmedizin Berlin, Berlin, Germany; Bernstein Center for Computational Neuroscience Berlin, Berlin, Germany
- Kshitij Dwivedi
- Department of Computer Science, Goethe Universität, Frankfurt am Main, Germany
- Gemma Roig
- Department of Computer Science, Goethe Universität, Frankfurt am Main, Germany
- Radoslaw M Cichy
- Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany; Einstein Center for Neurosciences Berlin, Charité - Universitätsmedizin Berlin, Berlin, Germany; Bernstein Center for Computational Neuroscience Berlin, Berlin, Germany; Berlin School of Mind and Brain, Humboldt-Universität zu Berlin, Berlin, Germany
44
Freedland J, Rieke F. Systematic reduction of the dimensionality of natural scenes allows accurate predictions of retinal ganglion cell spike outputs. Proc Natl Acad Sci U S A 2022; 119:e2121744119. [PMID: 36343230 PMCID: PMC9674269 DOI: 10.1073/pnas.2121744119] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/07/2021] [Accepted: 09/23/2022] [Indexed: 11/09/2022] Open
Abstract
The mammalian retina engages a broad array of linear and nonlinear circuit mechanisms to convert natural scenes into retinal ganglion cell (RGC) spike outputs. Although many individual integration mechanisms are well understood, we know less about how multiple mechanisms interact to encode the complex spatial features present in natural inputs. Here, we identified key spatial features in natural scenes that shape encoding by primate parasol RGCs. Our approach identified simplifications in the spatial structure of natural scenes that minimally altered RGC spike responses. We observed that reducing natural movies into 16 linearly integrated regions described ∼80% of the structure of parasol RGC spike responses; this performance depended on the number of regions but not their precise spatial locations. We used simplified stimuli to design high-dimensional metamers that recapitulated responses to naturalistic movies. Finally, we modeled the retinal computations that convert flashed natural images into one-dimensional spike counts.
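The reduction of a stimulus to a small number of linearly integrated regions can be illustrated by replacing an image with its per-region means and asking how much variance survives. The 4 x 4 grid (16 regions) and the smooth test image below are arbitrary stand-ins for the sketch, not the study's stimuli.

```python
import numpy as np

def region_reduce(img, n=4):
    """Replace an image with the mean of each cell in an n x n grid
    (16 linearly integrated regions for n = 4)."""
    h, w = img.shape
    out = np.empty_like(img, dtype=float)
    for i in range(n):
        for j in range(n):
            block = (slice(i * h // n, (i + 1) * h // n),
                     slice(j * w // n, (j + 1) * w // n))
            out[block] = img[block].mean()
    return out

# A smooth luminance gradient: most of its variance survives region averaging.
y, x = np.mgrid[0:40, 0:40]
img = (x + y).astype(float)
reduced = region_reduce(img)
frac_var = 1.0 - np.var(img - reduced) / np.var(img)
```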
Affiliation(s)
- Julian Freedland
- Molecular Engineering & Sciences Institute, University of Washington, Seattle, WA 98195
- Fred Rieke
- Department of Physiology and Biophysics, University of Washington, Seattle, WA 98195
45
Spontaneous activity patterns in human motor cortex replay evoked activity patterns for hand movements. Sci Rep 2022; 12:16867. [PMID: 36207360 PMCID: PMC9546868 DOI: 10.1038/s41598-022-20866-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2022] [Accepted: 09/20/2022] [Indexed: 11/08/2022] Open
Abstract
Spontaneous brain activity, measured with resting state fMRI (R-fMRI), is correlated among regions that are co-activated by behavioral tasks. It is unclear, however, whether spatial patterns of spontaneous activity within a cortical region correspond to spatial patterns of activity evoked by specific stimuli, actions, or mental states. The current study investigated the hypothesis that spontaneous activity in motor cortex represents motor patterns commonly occurring in daily life. To test this hypothesis, 15 healthy participants were scanned while performing four different hand movements. Three movements (Grip, Extend, Pinch) were ecological, involving grip and grasp hand movements; one control movement (Shake), a rotation of the wrist, was neither ecological nor frequent. Participants were also scanned at rest before and after execution of the motor tasks (resting-state scans). Using the task data, we identified movement-specific patterns in the primary motor cortex. These task-defined patterns were compared to resting-state patterns in the same motor region. We also performed a control analysis within the primary visual cortex. We found that spontaneous activity patterns in the primary motor cortex were more similar to task patterns for ecological than for control movements. In contrast, there was no difference between ecological and control hand movements in the primary visual area. These findings provide evidence that spontaneous activity in human motor cortex forms fine-scale, patterned representations associated with behaviors that frequently occur in daily life.
46
Hofstetter S, Dumoulin SO. Assessing the ecological validity of numerosity-selective neuronal populations with real-world natural scenes. iScience 2022; 25:105267. [PMID: 36274951 PMCID: PMC9579010 DOI: 10.1016/j.isci.2022.105267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2022] [Revised: 07/18/2022] [Accepted: 09/26/2022] [Indexed: 11/21/2022] Open
Abstract
Animals and humans are able to quickly and effortlessly estimate the number of items in a set: their numerosity. Numerosity perception is thought to be critical to behavior, from feeding to escaping predators to human mathematical cognition. Virtually all scientific studies on numerosity mechanisms use well-controlled but artificial stimuli to isolate the numerosity dimension from other physical quantities. Here, we probed the ecological validity of these artificial stimuli and evaluated whether an important component of numerosity processing, the numerosity-selective neural populations, also responds to the numerosity of items in real-world natural scenes. Using 7T MRI and natural images from a wide range of categories, we provide evidence that the numerosity-tuned neuronal populations show numerosity-selective responses when viewing images from a real-world natural scene. Our findings strengthen the evidence for a role of numerosity-selective neurons in numerosity perception and provide an important link to their function in real-world settings.
Affiliation(s)
- Shir Hofstetter
- The Spinoza Centre for Neuroimaging, Amsterdam, the Netherlands; Department of Computational Cognitive Neuroscience and Neuroimaging, Netherlands Institute for Neuroscience, Amsterdam, the Netherlands. Corresponding author.
- Serge O. Dumoulin
- The Spinoza Centre for Neuroimaging, Amsterdam, the Netherlands; Department of Computational Cognitive Neuroscience and Neuroimaging, Netherlands Institute for Neuroscience, Amsterdam, the Netherlands; Department of Experimental Psychology, Helmholtz Institute, Utrecht University, Utrecht, the Netherlands; Department of Experimental and Applied Psychology, VU University, Amsterdam, the Netherlands. Corresponding author.
47
Bailey KM, Giordano BL, Kaas AL, Smith FW. Decoding sounds depicting hand-object interactions in primary somatosensory cortex. Cereb Cortex 2022; 33:3621-3635. [PMID: 36045002 DOI: 10.1093/cercor/bhac296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/10/2022] [Revised: 05/24/2022] [Accepted: 07/07/2022] [Indexed: 11/13/2022] Open
Abstract
Neurons, even in the earliest sensory regions of cortex, are subject to a great deal of contextual influence from both within- and across-modality connections. Recent work has shown that primary sensory areas can respond to and, in some cases, discriminate stimuli that are not of their target modality: for example, primary somatosensory cortex (SI) discriminates visual images of graspable objects. In the present work, we investigated whether SI would discriminate sounds depicting hand-object interactions (e.g. bouncing a ball). In a rapid event-related functional magnetic resonance imaging experiment, participants listened attentively to sounds from 3 categories: hand-object interactions, and control categories of pure tones and animal vocalizations, while performing a one-back repetition detection task. Multivoxel pattern analysis revealed significant decoding of hand-object interaction sounds within SI, but not for either control category. Crucially, in the hand-sensitive voxels defined from an independent tactile localizer, decoding accuracies were significantly higher for hand-object interactions compared to pure tones in left SI. Our findings indicate that simply hearing sounds depicting familiar hand-object interactions elicits different patterns of activity in SI, despite the complete absence of tactile stimulation. These results highlight the rich contextual information that can be transmitted across sensory modalities, even to primary sensory areas.
Affiliation(s)
- Kerri M Bailey
- School of Psychology, University of East Anglia, Norwich NR4 7TJ, United Kingdom
- Bruno L Giordano
- Institut des Neurosciences de la Timone, CNRS UMR 7289, Université Aix-Marseille, Marseille, France
- Amanda L Kaas
- Department of Cognitive Neuroscience, Maastricht University, Maastricht 6229 EV, The Netherlands
- Fraser W Smith
- School of Psychology, University of East Anglia, Norwich NR4 7TJ, United Kingdom
48
Nilsson DE, Smolka J, Bok M. The vertical light-gradient and its potential impact on animal distribution and behavior. Front Ecol Evol 2022. [DOI: 10.3389/fevo.2022.951328] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/10/2023] Open
Abstract
The visual environment provides vital cues allowing animals to assess habitat quality, gauge weather conditions, or measure time of day. Together with other sensory cues and physiological conditions, the visual environment sets behavioral states that make the animal more prone to engage in some behaviors and less in others. This master control of behavior serves a fundamental role in determining the distribution and behavior of all animals. Although it is obvious that visual information contains vital input for setting behavioral states, the precise nature of these visual cues remains unknown. Here we use a recently described method to quantify the distribution of light reaching animals’ eyes in different environments. The method records the vertical gradient (as a function of elevation angle) of intensity, spatial structure, and spectral balance. Comparison of measurements from different types of environments, weather conditions, times of day, and seasons reveals that these aspects can be readily discriminated from one another. The vertical gradients of radiance, spatial structure (contrast), and color are thus reliable indicators that are likely to have a strong impact on animal behavior and spatial distribution.
49
Abstract
Human vision relies on mechanisms that respond to luminance edges in space and time. Most edge models use orientation-selective mechanisms on multiple spatial scales and operate on static inputs assuming that edge processing occurs within a single fixational instance. Recent studies, however, demonstrate functionally relevant temporal modulations of the sensory input due to fixational eye movements. Here we propose a spatiotemporal model of human edge detection that combines elements of spatial and active vision. The model augments a spatial vision model by temporal filtering and shifts the input images over time, mimicking an active sampling scheme via fixational eye movements. The first model test was White's illusion, a lightness effect that has been shown to depend on edges. The model reproduced the spatial-frequency-specific interference with the edges by superimposing narrowband noise (1–5 cpd), similar to the psychophysical interference observed in White's effect. Second, we compare the model's edge detection performance in natural images in the presence and absence of Gaussian white noise with human-labeled contours for the same (noise-free) images. Notably, the model detects edges robustly against noise in both test cases without relying on orientation-selective processes. Eliminating model components, we demonstrate the relevance of multiscale spatiotemporal filtering and scale-specific normalization for edge detection. The proposed model facilitates efficient edge detection in (artificial) vision systems and challenges the notion that orientation-selective mechanisms are required for edge detection.
Affiliation(s)
- Lynn Schmittwilken
- Science of Intelligence and Computational Psychology, Faculty of Electrical Engineering and Computer Science, Technische Universität Berlin, Berlin, Germany
- Marianne Maertens
- Science of Intelligence and Computational Psychology, Faculty of Electrical Engineering and Computer Science, Technische Universität Berlin, Berlin, Germany
50
Price BH, Gavornik JP. Efficient Temporal Coding in the Early Visual System: Existing Evidence and Future Directions. Front Comput Neurosci 2022; 16:929348. [PMID: 35874317 PMCID: PMC9298461 DOI: 10.3389/fncom.2022.929348] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2022] [Accepted: 06/13/2022] [Indexed: 01/16/2023] Open
Abstract
While it is universally accepted that the brain makes predictions, there is little agreement about how this is accomplished and under which conditions. Accurate prediction requires neural circuits to learn and store spatiotemporal patterns observed in the natural environment, but it is not obvious how such information should be stored, or encoded. Information theory provides a mathematical formalism that can be used to measure the efficiency and utility of different coding schemes for data transfer and storage. This theory shows that codes become efficient when they remove predictable, redundant spatial and temporal information. Efficient coding has been used to understand retinal computations and may also be relevant to understanding more complicated temporal processing in visual cortex. However, the literature on efficient coding in cortex is varied and can be confusing since the same terms are used to mean different things in different experimental and theoretical contexts. In this work, we attempt to provide a clear summary of the theoretical relationship between efficient coding and temporal prediction, and review evidence that efficient coding principles explain computations in the retina. We then apply the same framework to computations occurring in early visuocortical areas, arguing that data from rodents is largely consistent with the predictions of this model. Finally, we review and respond to criticisms of efficient coding and suggest ways that this theory might be used to design future experiments, with particular focus on understanding the extent to which neural circuits make predictions from efficient representations of environmental statistics.
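The core efficient-coding claim summarized above, that codes become efficient by removing predictable, redundant temporal structure, can be illustrated by comparing the empirical entropy of a temporally correlated signal with the entropy of its prediction errors. The random-walk signal and the first-difference predictor below are our own toy choices, not an analysis from the review.

```python
import math
import random
from collections import Counter

def entropy_bits(symbols):
    """Empirical Shannon entropy (bits per symbol) of a discrete sequence."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A slowly drifting signal: each value is highly predictable from the last one.
random.seed(0)
signal, v = [], 0
for _ in range(5000):
    v += random.choice([-1, 0, 1])
    signal.append(v)

h_raw = entropy_bits(signal)  # raw values span a wide range: many bits/symbol
# Coding the prediction errors (first differences) leaves only 3 symbols,
# so the entropy cannot exceed log2(3) bits/symbol.
h_pred = entropy_bits([b - a for a, b in zip(signal, signal[1:])])
```

The same quantity of information is conveyed either way; the predictive code simply spends far fewer bits per symbol, which is the sense in which removing temporal redundancy makes a code efficient.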
Affiliation(s)
- Jeffrey P. Gavornik
- Center for Systems Neuroscience, Graduate Program in Neuroscience, Department of Biology, Boston University, Boston, MA, United States