1. Miao HY, Tong F. Convolutional neural network models applied to neuronal responses in macaque V1 reveal limited nonlinear processing. J Vis 2024; 24:1. PMID: 38829629. DOI: 10.1167/jov.24.6.1.
Abstract
Computational models of the primary visual cortex (V1) have suggested that V1 neurons behave like Gabor filters followed by simple nonlinearities. However, recent work employing convolutional neural network (CNN) models has suggested that V1 relies on far more nonlinear computations than previously thought. Specifically, unit responses in an intermediate layer of VGG-19 were found to best predict macaque V1 responses to thousands of natural and synthetic images. Here, we evaluated the hypothesis that the poor performance of lower layer units in VGG-19 might be attributable to their small receptive field size rather than to their lack of complexity per se. We compared VGG-19 with AlexNet, which has much larger receptive fields in its lower layers. Whereas the best-performing layer of VGG-19 occurred after seven nonlinear steps, the first convolutional layer of AlexNet best predicted V1 responses. Although the predictive accuracy of VGG-19 was somewhat better than that of standard AlexNet, we found that a modified version of AlexNet could match the performance of VGG-19 after only a few nonlinear computations. Control analyses revealed that decreasing the size of the input images caused the best-performing layer of VGG-19 to shift to a lower layer, consistent with the hypothesis that the relationship between image size and receptive field size can strongly affect model performance. We conducted additional analyses using a Gabor pyramid model to test for nonlinear contributions of normalization and contrast saturation. Overall, our findings suggest that the feedforward responses of V1 neurons can be well explained by assuming only a few nonlinear processing stages.
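The classical account referenced in this abstract, a V1 neuron modeled as a Gabor filter followed by a simple nonlinearity, can be sketched in a few lines. This is an illustrative toy model, not the authors' code; the filter parameters (size, wavelength, sigma) are arbitrary choices for the example.

```python
import numpy as np

def gabor_kernel(size=32, wavelength=8.0, theta=0.0, sigma=5.0, phase=0.0):
    """2D Gabor filter: a sinusoidal carrier windowed by a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half, -half:half]
    xr = x * np.cos(theta) + y * np.sin(theta)  # rotate into the filter's frame
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / wavelength + phase)
    return envelope * carrier

def simple_cell_response(image, kernel):
    """Linear filtering followed by half-wave rectification: one nonlinear step."""
    drive = float(np.sum(image * kernel))
    return max(drive, 0.0)

k = gabor_kernel()
preferred = gabor_kernel()                  # stimulus matched to the filter
orthogonal = gabor_kernel(theta=np.pi / 2)  # same grating rotated 90 degrees
# The model cell responds far more strongly at its preferred orientation.
print(simple_cell_response(preferred, k) > simple_cell_response(orthogonal, k))
```

The debate the paper addresses is how many such nonlinear stages (rectification, normalization, saturation) are needed beyond this single filter-plus-rectifier step.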
Affiliation(s)
- Hui-Yuan Miao
- Department of Psychology, Vanderbilt University, Nashville, TN, USA
- Frank Tong
- Department of Psychology, Vanderbilt University, Nashville, TN, USA
- Vanderbilt Vision Research Center, Vanderbilt University, Nashville, TN, USA

2. Lewis CM, Wunderle T, Fries P. Top-down modulation of visual cortical stimulus encoding and gamma independent of firing rates. bioRxiv 2024:2024.04.11.589006. PMID: 38645050. PMCID: PMC11030389. DOI: 10.1101/2024.04.11.589006.
Abstract
Neurons in primary visual cortex integrate sensory input with signals reflecting the animal's internal state to support flexible behavior. Internal variables, such as expectation, attention, or current goals, are imposed in a top-down manner via extensive feedback projections from higher-order areas. We optogenetically activated a higher-order visual area, area 21a, in the lightly anesthetized cat (OptoTD), while recording from neuronal populations in V1. OptoTD induced strong changes in gamma-band synchronization, up to severalfold, together with much smaller changes in firing rate, and the two effects were uncorrelated. OptoTD effects showed specificity for the features of the simultaneously presented visual stimuli. OptoTD-induced changes in gamma synchronization, but not firing rates, were predictive of simultaneous changes in the amount of encoded stimulus information. Our findings suggest that one important role of top-down signals is to modulate synchronization and the information encoded by populations of sensory neurons.

3. Pang R, Baker C, Murthy M, Pillow J. Inferring neural dynamics of memory during naturalistic social communication. bioRxiv 2024:2024.01.26.577404. PMID: 38328156. PMCID: PMC10849655. DOI: 10.1101/2024.01.26.577404.
Abstract
Memory processes in complex behaviors like social communication require forming representations of the past that grow with time. The neural mechanisms that support such continually growing memory remain unknown. We address this gap in the context of fly courtship, a natural social behavior involving the production and perception of long, complex song sequences. To study female memory for male song history in unrestrained courtship, we present 'Natural Continuation' (NC)-a general, simulation-based model comparison procedure to evaluate candidate neural codes for complex stimuli using naturalistic behavioral data. Applying NC to fly courtship revealed strong evidence for an adaptive population mechanism for how female auditory neural dynamics could convert long song histories into a rich mnemonic format. Song temporal patterning is continually transformed by heterogeneous nonlinear adaptation dynamics, then integrated into persistent activity, enabling common neural mechanisms to retain continuously unfolding information over long periods and yielding state-of-the-art predictions of female courtship behavior. At a population level this coding model produces multi-dimensional advection-diffusion-like responses that separate songs over a continuum of timescales and can be linearly transformed into flexible output signals, illustrating its potential to create a generic, scalable mnemonic format for extended input signals poised to drive complex behavioral responses. This work thus shows how naturalistic behavior can directly inform neural population coding models, revealing here a novel process for memory formation.
Affiliation(s)
- Rich Pang
- Princeton Neuroscience Institute, Princeton, NJ, USA
- Center for the Physics of Biological Function, Princeton, NJ and New York, NY, USA
- Christa Baker
- Princeton Neuroscience Institute, Princeton, NJ, USA
- Present address: Department of Biological Sciences, North Carolina State University, Raleigh, NC, USA
- Mala Murthy
- Princeton Neuroscience Institute, Princeton, NJ, USA

4. Weidler T, Goebel R, Senden M. AngoraPy: A Python toolkit for modeling anthropomorphic goal-driven sensorimotor systems. Front Neuroinform 2023; 17:1223687. PMID: 38204578. PMCID: PMC10777840. DOI: 10.3389/fninf.2023.1223687.
Abstract
Goal-driven deep learning increasingly supplements classical modeling approaches in computational neuroscience. The strength of deep neural networks as models of the brain lies in their ability to autonomously learn the connectivity required to solve complex and ecologically valid tasks, obviating the need for hand-engineered or hypothesis-driven connectivity patterns. Consequently, goal-driven models can generate hypotheses about the neurocomputations underlying cortical processing that are grounded in macro- and mesoscopic anatomical properties of the network's biological counterpart. Whereas goal-driven modeling is already becoming prevalent in the neuroscience of perception, its application to the sensorimotor domain is currently hampered by the complexity of the methods required to train models comprising the closed sensation-action loop. This paper describes AngoraPy, a Python library that mitigates this obstacle by providing researchers with the tools necessary to train complex recurrent convolutional neural networks that model the human sensorimotor system. To make the technical details of this toolkit more approachable, an illustrative example that trains a recurrent toy model on in-hand object manipulation accompanies the theoretical remarks. An extensive benchmark on various classical, 3D robotic, and anthropomorphic control tasks demonstrates AngoraPy's general applicability to a wide range of tasks. Together with its ability to adaptively handle custom architectures, the flexibility of this toolkit demonstrates its power for goal-driven sensorimotor modeling.
Affiliation(s)
- Tonio Weidler
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
- Maastricht Brain Imaging Centre, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
- Rainer Goebel
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
- Maastricht Brain Imaging Centre, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
- Mario Senden
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
- Maastricht Brain Imaging Centre, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands

5. Cowley BR, Stan PL, Pillow JW, Smith MA. Compact deep neural network models of visual cortex. bioRxiv 2023:2023.11.22.568315. PMID: 38045255. PMCID: PMC10690296. DOI: 10.1101/2023.11.22.568315.
Abstract
A powerful approach to understanding the computations carried out in visual cortex is to develop models that predict neural responses to arbitrary images. Deep neural network (DNN) models have worked remarkably well at predicting neural responses [1, 2, 3], yet their underlying computations remain buried in millions of parameters. Have we simply replaced one complicated system in vivo with another in silico? Here, we train a data-driven deep ensemble model that predicts macaque V4 responses ∼50% more accurately than currently used task-driven DNN models. We then compress this deep ensemble to identify compact models that have 5,000x fewer parameters yet accuracy equivalent to that of the deep ensemble. We verified that the stimulus preferences of the compact models matched those of the real V4 neurons by measuring V4 responses to both 'maximizing' and adversarial images generated using the compact models. We then analyzed the inner workings of the compact models and discovered a common circuit motif: compact models share a similar set of filters in early stages of processing but then specialize by heavily consolidating this shared representation with a precise readout. This suggests that a V4 neuron's stimulus preference is determined entirely by its consolidation step. To demonstrate this, we investigated the compression step of a dot-detecting compact model and found a set of simple computations that may be carried out by dot-selective V4 neurons. Overall, our work demonstrates that the DNN models currently used in computational neuroscience are needlessly large; our approach provides a new way forward for obtaining explainable, high-accuracy models of visual cortical neurons.
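The core compression idea, replacing a heavily over-parameterized response model with a far smaller one of equivalent predictive accuracy, can be illustrated generically with a low-rank factorization. This sketch uses a random linear "teacher" and truncated SVD; it is a stand-in for the paper's nonlinear ensemble compression, and all sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical over-parameterized linear response model: 64 output "neurons"
# read out 256 features, but the weight matrix is effectively rank 4 plus noise.
n_out, n_in, rank = 64, 256, 4
W = rng.normal(size=(n_out, rank)) @ rng.normal(size=(rank, n_in))
W = W + 0.01 * rng.normal(size=(n_out, n_in))

# Compress with a truncated SVD: keep only the top-k singular components.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
k = 4
W_compact = (U[:, :k] * s[:k]) @ Vt[:k]

full_params = n_out * n_in               # dense weight count
compact_params = k * (n_out + n_in + 1)  # factored form: >10x fewer parameters

# Responses of the compact model to random stimuli match the original's.
X = rng.normal(size=(100, n_in))
rel_err = np.linalg.norm(X @ (W - W_compact).T) / np.linalg.norm(X @ W.T)
print(full_params, compact_params, rel_err)
```

The same logic, find a small model whose responses match a big one's, underlies the paper's distillation of a deep ensemble into compact per-neuron models.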

6. Malik G, Crowder D, Mingolla E. Extreme image transformations affect humans and machines differently. Biol Cybern 2023; 117:331-343. PMID: 37310489. PMCID: PMC10600046. DOI: 10.1007/s00422-023-00968-7.
Abstract
Some recent artificial neural networks (ANNs) claim to model aspects of primate neural and human performance data. Their success in object recognition, however, depends on exploiting low-level features for solving visual tasks in a way that humans do not. As a result, out-of-distribution or adversarial input is often challenging for ANNs. Humans instead learn abstract patterns and are mostly unaffected by many extreme image distortions. We introduce a set of novel image transforms inspired by neurophysiological findings and evaluate humans and ANNs on an object recognition task. We show that machines perform better than humans for certain transforms and struggle to match human performance on others that humans find easy. We quantify the accuracy differences between humans and machines and derive a difficulty ranking of our transforms from the human data. We also suggest how certain characteristics of human visual processing can be adapted to improve the performance of ANNs on our difficult-for-machines transforms.
Affiliation(s)
- Girik Malik
- Northeastern University, Boston, MA 02115 USA

7. Wang C, Yan H, Huang W, Sheng W, Wang Y, Fan YS, Liu T, Zou T, Li R, Chen H. Neural encoding with unsupervised spiking convolutional neural network. Commun Biol 2023; 6:880. PMID: 37640808. PMCID: PMC10462614. DOI: 10.1038/s42003-023-05257-4.
Abstract
Accurately predicting the brain responses to various stimuli poses a significant challenge in neuroscience. Despite recent breakthroughs in neural encoding using convolutional neural networks (CNNs) in fMRI studies, there remain critical gaps between the computational rules of traditional artificial neurons and real biological neurons. To address this issue, a spiking CNN (SCNN)-based framework is presented in this study to achieve neural encoding in a more biologically plausible manner. The framework utilizes an unsupervised SCNN to extract visual features of image stimuli and employs a receptive field-based regression algorithm to predict fMRI responses from the SCNN features. Experimental results on handwritten characters, handwritten digits, and natural images demonstrate that the proposed approach achieves remarkably good encoding performance and can be utilized for "brain reading" tasks such as image reconstruction and identification. This work suggests that spiking neural networks can serve as a promising tool for neural encoding.
Affiliation(s)
- Chong Wang
- The Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, 611731, China
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
- MOE Key Lab for Neuroinformation; High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Hongmei Yan
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
- MOE Key Lab for Neuroinformation; High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Wei Huang
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
- MOE Key Lab for Neuroinformation; High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Wei Sheng
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
- MOE Key Lab for Neuroinformation; High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Yuting Wang
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
- MOE Key Lab for Neuroinformation; High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Yun-Shuang Fan
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
- MOE Key Lab for Neuroinformation; High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Tao Liu
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Ting Zou
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Rong Li
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
- MOE Key Lab for Neuroinformation; High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Huafu Chen
- The Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, 611731, China
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
- MOE Key Lab for Neuroinformation; High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, China

8. Gong Z, Zhou M, Dai Y, Wen Y, Liu Y, Zhen Z. A large-scale fMRI dataset for the visual processing of naturalistic scenes. Sci Data 2023; 10:559. PMID: 37612327. PMCID: PMC10447576. DOI: 10.1038/s41597-023-02471-x.
Abstract
One ultimate goal of visual neuroscience is to understand how the brain processes visual stimuli encountered in the natural environment. Achieving this goal requires records of brain responses under massive amounts of naturalistic stimuli. Although the scientific community has invested considerable effort in collecting large-scale functional magnetic resonance imaging (fMRI) data under naturalistic stimuli, more naturalistic fMRI datasets are still urgently needed. We present here the Natural Object Dataset (NOD), a large-scale fMRI dataset containing responses to 57,120 naturalistic images from 30 participants. NOD strives to balance sampling variation across individuals against sampling variation across stimuli. This enables NOD to be utilized not only for determining whether an observation is generalizable across many individuals, but also for testing whether a response pattern generalizes to a variety of naturalistic stimuli. We anticipate that NOD, together with existing naturalistic neuroimaging datasets, will serve as a new impetus for our understanding of the visual processing of naturalistic stimuli.
Affiliation(s)
- Zhengxin Gong
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, 100875, China
- Ming Zhou
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875, China
- Yuxuan Dai
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, 100875, China
- Yushan Wen
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, 100875, China
- Youyi Liu
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875, China
- Zonglei Zhen
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, 100875, China
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875, China

9. Luna R, Zabaleta I, Bertalmío M. State-of-the-art image and video quality assessment with a metric based on an intrinsically non-linear neural summation model. Front Neurosci 2023; 17:1222815. PMID: 37559700. PMCID: PMC10408451. DOI: 10.3389/fnins.2023.1222815.
Abstract
The development of automatic methods for image and video quality assessment that correlate well with the perception of human observers is a very challenging open problem in vision science, with numerous practical applications in disciplines such as image processing and computer vision, as well as in the media industry. In the past two decades, the goal of image quality research has been to improve upon classical metrics by developing models that emulate some aspects of the visual system. While the progress has been considerable, state-of-the-art quality assessment methods still share a number of shortcomings: their performance drops considerably when they are tested on a database quite different from the one used to train them, and they are significantly limited in predicting observer scores for high-framerate videos. In this work we propose a novel objective method for image and video quality assessment based on the recently introduced Intrinsically Non-linear Receptive Field (INRF) formulation, a neural summation model that has been shown to be better at predicting neural activity and visual perception phenomena than the classical linear receptive field. We start by optimizing, on a classic image quality database, the four parameters of a very simple INRF-based metric, and proceed to test this metric on three other databases, showing that its performance equals or surpasses that of state-of-the-art methods, some of which have millions of parameters. Next, we extend this INRF image quality metric to the temporal domain and test it on several popular video quality datasets; again, our proposed INRF-based video quality metric proves highly competitive.
Affiliation(s)
- Raúl Luna
- Institute of Optics, Spanish National Research Council (CSIC), Madrid, Spain
- Itziar Zabaleta
- Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
- Marcelo Bertalmío
- Institute of Optics, Spanish National Research Council (CSIC), Madrid, Spain

10. Penacchio O, Otazu X, Wilkins AJ, Haigh SM. A mechanistic account of visual discomfort. Front Neurosci 2023; 17:1200661. PMID: 37547142. PMCID: PMC10397803. DOI: 10.3389/fnins.2023.1200661.
Abstract
Much of the neural machinery of the early visual cortex, from the extraction of local orientations to contextual modulations through lateral interactions, is thought to have developed to provide a sparse encoding of contour in natural scenes, allowing the brain to process efficiently most of the visual scenes we are exposed to. Certain visual stimuli, however, cause visual stress: a set of adverse effects ranging from simple discomfort to migraine attacks and, in the extreme, epileptic seizures, all phenomena linked with an excessive metabolic demand. The theory of efficient coding suggests a link between excessive metabolic demand and images that deviate from natural statistics. Yet the mechanisms linking energy demand and image spatial content in discomfort remain elusive. Here, we used theories of visual coding that link image spatial structure to brain activation in order to characterize the response to images that observers reported as uncomfortable, using a biologically based neurodynamic model of the early visual cortex that includes excitatory and inhibitory layers to implement contextual influences. We found three clear markers of aversive images: a larger overall activation in the model, a less sparse response, and a more unbalanced distribution of activity across spatial orientations. When the ratio of excitation over inhibition was increased in the model, a manipulation hypothesized to underlie interindividual differences in susceptibility to visual discomfort, the three markers progressively shifted toward values typical of the response to uncomfortable stimuli. Overall, these findings propose a unifying mechanistic explanation for why there are differences between images and between observers, suggesting how visual input and idiosyncratic hyperexcitability give rise to abnormal brain responses that result in visual stress.
Affiliation(s)
- Olivier Penacchio
- Department of Computer Science, Universitat Autònoma de Barcelona, Bellaterra, Spain
- Computer Vision Center, Universitat Autònoma de Barcelona, Bellaterra, Spain
- School of Psychology and Neuroscience, University of St Andrews, St Andrews, United Kingdom
- Xavier Otazu
- Department of Computer Science, Universitat Autònoma de Barcelona, Bellaterra, Spain
- Computer Vision Center, Universitat Autònoma de Barcelona, Bellaterra, Spain
- Arnold J. Wilkins
- Department of Psychology, University of Essex, Colchester, United Kingdom
- Sarah M. Haigh
- Department of Psychology, University of Nevada Reno, Reno, NV, United States
- Institute for Neuroscience, University of Nevada Reno, Reno, NV, United States

11. Ahn J, Ryu J, Lee S, Lee C, Im CH, Lee SH. Transcranial direct current stimulation elevates the baseline activity while sharpening the spatial tuning of the human visual cortex. Brain Stimul 2023; 16:1154-1164. PMID: 37517465. DOI: 10.1016/j.brs.2023.07.052.
Affiliation(s)
- Jeongyeol Ahn
- Department of Brain and Cognitive Sciences, Seoul National University, Seoul, Republic of Korea
- Juhyoung Ryu
- Department of Brain and Cognitive Sciences, Seoul National University, Seoul, Republic of Korea
- Sangjun Lee
- Department of Biomedical Engineering, University of Minnesota, Minneapolis, MN, USA
- Chany Lee
- Department of Structure & Function of Neural Network, Korea Brain Research Institute, Daegu, Republic of Korea
- Chang-Hwan Im
- Department of Biomedical Engineering, Hanyang University, Seoul, Republic of Korea
- Sang-Hun Lee
- Department of Brain and Cognitive Sciences, Seoul National University, Seoul, Republic of Korea

12. Ladret HJ, Cortes N, Ikan L, Chavane F, Casanova C, Perrinet LU. Cortical recurrence supports resilience to sensory variance in the primary visual cortex. Commun Biol 2023; 6:667. PMID: 37353519. PMCID: PMC10290066. DOI: 10.1038/s42003-023-05042-3.
Abstract
Our daily endeavors occur in a complex visual environment, whose intrinsic variability challenges the way we integrate information to make decisions. By processing myriads of parallel sensory inputs, our brain is theoretically able to compute the variance of its environment, a cue known to guide our behavior. Yet, the neurobiological and computational bases of such variance computations are still poorly understood. Here, we quantify the dynamics of sensory variance modulations of cat primary visual cortex neurons. We report two archetypal neuronal responses, one of which is resilient to changes in variance and co-encodes the sensory feature and its variance, improving the population encoding of orientation. The existence of these variance-specific responses can be accounted for by a model of intracortical recurrent connectivity. We thus propose that local recurrent circuits process uncertainty as a generic computation, advancing our understanding of how the brain handles naturalistic inputs.
Affiliation(s)
- Hugo J Ladret
- Institut de Neurosciences de la Timone, UMR 7289, CNRS and Aix-Marseille Université, Marseille, France
- School of Optometry, Université de Montréal, Montréal, Canada
- Nelson Cortes
- School of Optometry, Université de Montréal, Montréal, Canada
- Lamyae Ikan
- School of Optometry, Université de Montréal, Montréal, Canada
- Frédéric Chavane
- Institut de Neurosciences de la Timone, UMR 7289, CNRS and Aix-Marseille Université, Marseille, France
- Laurent U Perrinet
- Institut de Neurosciences de la Timone, UMR 7289, CNRS and Aix-Marseille Université, Marseille, France

13. Marrazzo G, De Martino F, Lage-Castellanos A, Vaessen MJ, de Gelder B. Voxelwise encoding models of body stimuli reveal a representational gradient from low-level visual features to postural features in occipitotemporal cortex. Neuroimage 2023:120240. PMID: 37348622. DOI: 10.1016/j.neuroimage.2023.120240.
Abstract
Research on body representation in the brain has focused on category-specific representation, using fMRI to investigate the response pattern to body stimuli in occipitotemporal cortex, without so far addressing the specific computations performed in body-selective regions, which are defined only by higher-order category selectivity. This study used ultra-high-field fMRI and banded ridge regression to investigate the coding of body images by comparing the performance of three encoding models in predicting brain activity in occipitotemporal cortex, and specifically in the extrastriate body area (EBA). Our results suggest that bodies are encoded in occipitotemporal cortex and in the EBA according to a combination of low-level visual features and postural features.
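At its core, voxelwise encoding of the kind described here is regularized linear regression from stimulus features to each voxel's response, evaluated on held-out stimuli. Below is a minimal single-band ridge sketch on simulated data (banded ridge regression additionally fits a separate regularization strength per feature space); all sizes and the noise level are arbitrary illustration choices, not the study's parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated experiment: 200 stimuli described by 50 features (standing in for
# one feature space, e.g. low-level visual features), and one voxel's
# responses generated as a noisy linear combination of those features.
n_stim, n_feat = 200, 50
X = rng.normal(size=(n_stim, n_feat))
w_true = rng.normal(size=n_feat)
y = X @ w_true + 0.5 * rng.normal(size=n_stim)

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression: w = (X'X + alpha*I)^-1 X'y."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

# Fit on the first 150 stimuli, then score prediction on the held-out 50.
w_hat = ridge_fit(X[:150], y[:150], alpha=10.0)
r = np.corrcoef(X[150:] @ w_hat, y[150:])[0, 1]
print(round(r, 3))  # held-out prediction accuracy (correlation)
```

Comparing such held-out correlations across competing feature spaces (low-level versus postural features, in this study's case) is what lets an encoding analysis say which representation best explains a region's activity.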
Affiliation(s)
- Giuseppe Marrazzo
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Limburg 6200 MD, Maastricht, The Netherlands
- Federico De Martino
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Limburg 6200 MD, Maastricht, The Netherlands; Center for Magnetic Resonance Research, Department of Radiology, University of Minnesota, Minneapolis, MN 55455, United States and Department of NeuroInformatics
- Agustin Lage-Castellanos
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Limburg 6200 MD, Maastricht, The Netherlands; Cuban Center for Neuroscience, Street 190 e/25 and 27 Cubanacán Playa Havana, CP 11600, Cuba
- Maarten J Vaessen
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Limburg 6200 MD, Maastricht, The Netherlands
- Beatrice de Gelder
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Limburg 6200 MD, Maastricht, The Netherlands
| |
Collapse
14
Yates JL, Coop SH, Sarch GH, Wu RJ, Butts DA, Rucci M, Mitchell JF. Detailed characterization of neural selectivity in free viewing primates. Nat Commun 2023; 14:3656. [PMID: 37339973] [PMCID: PMC10282080] [DOI: 10.1038/s41467-023-38564-9]
Abstract
Fixation constraints are ubiquitous in visual and cognitive neuroscience. Despite their widespread use, fixation requires trained subjects, is limited by the accuracy of fixational eye movements, and ignores the role of eye movements in shaping visual input. To overcome these limitations, we developed a suite of hardware and software tools to study vision during natural behavior in untrained subjects. We measured visual receptive fields and tuning properties from multiple cortical areas of marmoset monkeys that freely viewed full-field noise stimuli. The resulting receptive fields and tuning curves from primary visual cortex (V1) and area MT match the selectivity reported in the literature using conventional approaches. We then combined free viewing with high-resolution eye tracking to make the first detailed 2D spatiotemporal measurements of foveal receptive fields in V1. These findings demonstrate the power of free viewing to characterize neural responses in untrained animals while simultaneously studying the dynamics of natural behavior.
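A toy sketch of the core analysis step this approach enables: once each noise frame is re-expressed in retinal (gaze-corrected) coordinates, a receptive field can be mapped by spike-triggered averaging. Everything below is synthetic and illustrative, not the authors' pipeline.

```python
import numpy as np

rng = np.random.default_rng(1)
n_frames, h, w = 5000, 12, 12
stim = rng.standard_normal((n_frames, h, w))           # full-field noise frames
true_rf = np.zeros((h, w)); true_rf[5:8, 5:8] = 1.0    # ground-truth RF (toy)

# Gaze correction: shift each frame so it is expressed in retinal coordinates.
gaze = rng.integers(-2, 3, size=(n_frames, 2))
retinal = np.stack([np.roll(f, tuple(-g), axis=(0, 1)) for f, g in zip(stim, gaze)])

# Poisson-like spiking driven by the RF applied to the retinal stimulus.
drive = (retinal * true_rf).sum(axis=(1, 2))
spikes = rng.poisson(np.clip(drive, 0, None))

# Spike-triggered average recovers the RF (up to scale) from free viewing.
sta = (spikes[:, None, None] * retinal).sum(axis=0) / max(spikes.sum(), 1)
```

Without the gaze correction step the STA would be smeared by eye movements, which is why high-resolution eye tracking is central to the method.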
Affiliation(s)
- Jacob L Yates
- Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA
- Center for Visual Science, University of Rochester, Rochester, NY, USA
- Department of Biology and Program in Neuroscience and Cognitive Science, University of Maryland, College Park, MD, USA
- Herbert Wertheim School of Optometry and Vision Science, UC Berkeley, Berkeley, CA, USA
- Shanna H Coop
- Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA
- Center for Visual Science, University of Rochester, Rochester, NY, USA
- Neurobiology, Stanford University, Stanford, CA, USA
- Gabriel H Sarch
- Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Ruei-Jr Wu
- Center for Visual Science, University of Rochester, Rochester, NY, USA
- Institute of Optics, University of Rochester, Rochester, NY, USA
- Daniel A Butts
- Department of Biology and Program in Neuroscience and Cognitive Science, University of Maryland, College Park, MD, USA
- Michele Rucci
- Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA
- Center for Visual Science, University of Rochester, Rochester, NY, USA
- Jude F Mitchell
- Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA
- Center for Visual Science, University of Rochester, Rochester, NY, USA
15
Price BH, Jensen CM, Khoudary AA, Gavornik JP. Expectation violations produce error signals in mouse V1. Cereb Cortex 2023; 33:8803-8820. [PMID: 37183176] [PMCID: PMC10321125] [DOI: 10.1093/cercor/bhad163]
Abstract
Repeated exposure to visual sequences changes the form of evoked activity in the primary visual cortex (V1). Predictive coding theory provides a potential explanation: plasticity shapes cortical circuits to encode spatiotemporal predictions, and subsequent responses are modulated by the degree to which actual inputs match these expectations. Here we use a recently developed statistical modeling technique called Model-Based Targeted Dimensionality Reduction (MbTDR) to study visually evoked dynamics in mouse V1 in the context of an experimental paradigm called "sequence learning." We report that evoked spiking activity changed significantly with training, in a manner generally consistent with the predictive coding framework. Following training, neural responses to expected stimuli were suppressed in a late window (100-150 ms after stimulus onset), whereas responses to novel stimuli were not. Substituting a novel stimulus for a familiar one led to increases in firing that persisted for at least 300 ms. Omitting predictable stimuli in trained animals also led to increased firing at the expected time of stimulus onset. Finally, we show that spiking data can be used to accurately decode time within the sequence. Our findings are consistent with the idea that plasticity in early visual circuits is involved in coding spatiotemporal information.
Affiliation(s)
- Byron H Price
- Center for Systems Neuroscience, Department of Biology, Boston University, Boston, MA 02215, USA
- Graduate Program in Neuroscience, Boston University, Boston, MA 02215, USA
- Cambria M Jensen
- Center for Systems Neuroscience, Department of Biology, Boston University, Boston, MA 02215, USA
- Anthony A Khoudary
- Center for Systems Neuroscience, Department of Biology, Boston University, Boston, MA 02215, USA
- Jeffrey P Gavornik
- Center for Systems Neuroscience, Department of Biology, Boston University, Boston, MA 02215, USA
- Graduate Program in Neuroscience, Boston University, Boston, MA 02215, USA
16
Kay K, Bonnen K, Denison RN, Arcaro MJ, Barack DL. Tasks and their role in visual neuroscience. Neuron 2023; 111:1697-1713. [PMID: 37040765] [DOI: 10.1016/j.neuron.2023.03.022]
Abstract
Vision is widely used as a model system to gain insights into how sensory inputs are processed and interpreted by the brain. Historically, careful quantification and control of visual stimuli have served as the backbone of visual neuroscience. There has been less emphasis, however, on how an observer's task influences the processing of sensory inputs. Motivated by diverse observations of task-dependent activity in the visual system, we propose a framework for thinking about tasks, their role in sensory processing, and how we might formally incorporate tasks into our models of vision.
Affiliation(s)
- Kendrick Kay
- Center for Magnetic Resonance Research, Department of Radiology, University of Minnesota, Minneapolis, MN 55455, USA
- Kathryn Bonnen
- School of Optometry, Indiana University, Bloomington, IN 47405, USA
- Rachel N Denison
- Department of Psychological and Brain Sciences, Boston University, Boston, MA 02215, USA
- Mike J Arcaro
- Department of Psychology, University of Pennsylvania, Philadelphia, PA 19146, USA
- David L Barack
- Departments of Neuroscience and Philosophy, University of Pennsylvania, Philadelphia, PA 19146, USA
17
Henderson MM, Tarr MJ, Wehbe L. A Texture Statistics Encoding Model Reveals Hierarchical Feature Selectivity across Human Visual Cortex. J Neurosci 2023; 43:4144-4161. [PMID: 37127366] [PMCID: PMC10255092] [DOI: 10.1523/jneurosci.1822-22.2023]
Abstract
Midlevel features, such as contour and texture, provide a computational link between low- and high-level visual representations. Although the nature of midlevel representations in the brain is not fully understood, past work has suggested that a texture statistics model, the P-S model (Portilla and Simoncelli, 2000), is a candidate for predicting neural responses in areas V1-V4 as well as human behavioral data. However, it is not currently known how well this model accounts for the responses of higher visual cortex to natural scene images. To examine this, we constructed single-voxel encoding models based on P-S statistics and fit the models to fMRI data from human subjects (both sexes) from the Natural Scenes Dataset (Allen et al., 2022). We demonstrate that the texture statistics encoding model can predict the held-out responses of individual voxels in early retinotopic areas and higher-level category-selective areas. The ability of the model to reliably predict signal in higher visual cortex suggests that the representation of texture statistics features is widespread throughout the brain. Furthermore, using variance partitioning analyses, we identify which features are most uniquely predictive of brain responses and show that the contributions of higher-order texture features increase from early areas to higher areas on the ventral and lateral surfaces. We also demonstrate that patterns of sensitivity to texture statistics can be used to recover broad organizational axes within visual cortex, including dimensions that capture semantic image content. These results provide a key step forward in characterizing how midlevel feature representations emerge hierarchically across the visual system.

SIGNIFICANCE STATEMENT Intermediate visual features, like texture, play an important role in cortical computations and may contribute to tasks like object and scene recognition. Here, we used a texture model proposed in past work to construct encoding models that predict the responses of neural populations in human visual cortex (measured with fMRI) to natural scene stimuli. We show that responses of neural populations at multiple levels of the visual system can be predicted by this model, and that the model is able to reveal an increase in the complexity of feature representations from early retinotopic cortex to higher areas of ventral and lateral visual cortex. These results support the idea that texture-like representations may play a broad underlying role in visual processing.
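A minimal sketch of the variance partitioning analysis mentioned above (synthetic data, not the authors' code): fit a voxel's response with feature space A alone, B alone, and A+B jointly; the unique and shared contributions follow from the three held-out R² values.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
shared = rng.standard_normal(n)                         # signal both spaces carry
A = np.column_stack([shared, rng.standard_normal(n)])   # e.g., lower-order features
B = np.column_stack([shared, rng.standard_normal(n)])   # e.g., higher-order features
y = A @ [1.0, 1.0] + B[:, 1] * 1.0 + rng.standard_normal(n) * 0.1

def cv_r2(X, y, n_train=200):
    """Held-out R^2 of an ordinary least-squares fit with intercept."""
    X = np.column_stack([np.ones(len(X)), X])
    w = np.linalg.lstsq(X[:n_train], y[:n_train], rcond=None)[0]
    resid = y[n_train:] - X[n_train:] @ w
    return 1 - resid.var() / y[n_train:].var()

r2_a, r2_b = cv_r2(A, y), cv_r2(B, y)
r2_ab = cv_r2(np.hstack([A, B]), y)
unique_a = r2_ab - r2_b        # variance only A explains
unique_b = r2_ab - r2_a        # variance only B explains
shared_ab = r2_a + r2_b - r2_ab
```

Tracking how `unique_b` (the higher-order term) grows across areas is the logic behind the hierarchy result described above.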
Affiliation(s)
- Margaret M Henderson
- Neuroscience Institute, Department of Psychology, and Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
- Michael J Tarr
- Neuroscience Institute, Department of Psychology, and Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
- Leila Wehbe
- Neuroscience Institute, Department of Psychology, and Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
18
Yun M, Hwang JY, Jung MW. Septotemporal variations in hippocampal value and outcome processing. Cell Rep 2023; 42:112094. [PMID: 36763498] [DOI: 10.1016/j.celrep.2023.112094]
Abstract
A large body of evidence indicates functional variations along the hippocampal longitudinal axis. To investigate whether and how value and outcome processing vary between the dorsal (DH) and the ventral hippocampus (VH), we examined neuronal activity and inactivation effects of the DH and VH in mice performing probabilistic classical conditioning tasks. Inactivation of either structure disrupts value-dependent anticipatory licking, and value-coding neurons are found in both structures, indicating their involvement in value processing. However, the DH neuronal population increases activity as a function of value, while the VH neuronal population is preferentially responsive to the highest-value sensory cue. Also, signals related to outcome-dependent value learning are stronger in the DH. VH neurons instead show rapid responses to punishment and strongly biased responses to negative prediction error. These findings suggest that the DH faithfully represents the external value landscape, whereas the VH preferentially represents behaviorally relevant, salient features of experienced events.
Affiliation(s)
- Miru Yun
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea; Center for Synaptic Brain Dysfunctions, Institute for Basic Science, Daejeon 34141, Korea
- Ji Young Hwang
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea; Center for Synaptic Brain Dysfunctions, Institute for Basic Science, Daejeon 34141, Korea
- Min Whan Jung
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea; Center for Synaptic Brain Dysfunctions, Institute for Basic Science, Daejeon 34141, Korea
19
Victor JD, Rizvi SM, Bush JW, Conte MM. Discrimination of textures with spatial correlations and multiple gray levels. J Opt Soc Am A Opt Image Sci Vis 2023; 40:237-258. [PMID: 36821194] [PMCID: PMC9971653] [DOI: 10.1364/josaa.472553]
Abstract
Analysis of visual texture is important for many key steps in early vision. We study visual sensitivity to image statistics in three families of textures that include multiple gray levels and correlations in two spatial dimensions. Sensitivities to positive and negative correlations are approximately independent of correlation sign, and signals from different kinds of correlations combine quadratically. We build a computational model, fully constrained by prior studies of sensitivity to uncorrelated textures and black-and-white textures with spatial correlations. The model accounts for many features of the new data, including sign-independence, quadratic combination, and the dependence on gray-level distribution.
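A minimal numeric illustration of the two regularities described above, sign-independence and quadratic combination: signals from different kinds of correlations combine as a root-sum-of-squares, so flipping a correlation's sign leaves the predicted sensitivity unchanged. The function name is illustrative.

```python
import math

def combined_sensitivity(s1, s2):
    """Quadratic (Euclidean) combination of two sensitivity signals."""
    return math.hypot(s1, s2)

# Sign-independence: negating one correlation signal changes nothing.
same = combined_sensitivity(3.0, -4.0) == combined_sensitivity(3.0, 4.0)
value = combined_sensitivity(3.0, 4.0)
```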
Affiliation(s)
- Jonathan D. Victor
- Feil Family Brain and Mind Research Institute, Weill Cornell Medical College, 1300 York Avenue, New York, NY 10065, USA
- Syed M. Rizvi
- Feil Family Brain and Mind Research Institute, Weill Cornell Medical College, 1300 York Avenue, New York, NY 10065, USA
- Currently with Centerlight Healthcare, 136-65 37th Ave., Flushing, NY 11354, USA
- Jacob W. Bush
- Feil Family Brain and Mind Research Institute, Weill Cornell Medical College, 1300 York Avenue, New York, NY 10065, USA
- Currently with Shopify, 151 O’Connor St Ground floor, Ottawa, ON K2P 2L8, Canada
- Mary M. Conte
- Feil Family Brain and Mind Research Institute, Weill Cornell Medical College, 1300 York Avenue, New York, NY 10065, USA
20
Baranauskas G, Rysevaite-Kyguoliene K, Sabeckis I, Pauza DH. Saturation of visual responses explains size tuning in rat collicular neurons. Eur J Neurosci 2023; 57:285-309. [PMID: 36451583] [DOI: 10.1111/ejn.15877]
Abstract
The receptive field of many visual neurons is composed of a central responsive area, the classical receptive field, and a non-classical receptive field, also called the "suppressive surround." A visual stimulus placed in the suppressive surround does not induce any response but modulates visual responses to stimuli within the classical receptive field, usually by suppressing them. Therefore, visual responses become smaller when stimuli exceed the classical receptive field size. The stimulus size inducing the maximal response is called the preferred stimulus size. In cortex, there is good correspondence between the sizes of the classical receptive field and the preferred stimulus. In contrast, in the rodent superior colliculus, the preferred size is often severalfold smaller than the classical receptive field size. Here, we show that in the rat superior colliculus, the preferred stimulus size changes as the square root of the inverse of contrast, whereas the classical receptive field size is independent of contrast. In addition, responses to annuli were largely independent of the inner hole size. To explain these data, we tested three models: divisive modulation of the gain by the suppressive surround (the "normalization" model), the difference of Gaussians, and a divisive model that incorporates saturation to light flux. Despite having the same number of free parameters, the model incorporating saturation to light performed best. Thus, our data indicate that in rats, saturation to light can be a dominant phenomenon defining visual responses of collicular neurons, even at relatively low illumination levels.
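A minimal sketch of one of the candidate models named above, the difference of Gaussians: the response to a centered disc of diameter d is the integral of a narrow excitatory Gaussian minus that of a broader suppressive one, producing a preferred size smaller than the largest stimulus. The parameter values are illustrative, not fitted values from the paper.

```python
import numpy as np
from math import erf

def dog_response(d, ke=1.0, se=1.0, ki=0.6, si=3.0):
    """Difference-of-Gaussians response to a disc of diameter d."""
    exc = ke * erf(d / (2 * se))   # excitatory mechanism, narrow
    inh = ki * erf(d / (2 * si))   # suppressive mechanism, broad
    return exc - inh

sizes = np.linspace(0.1, 12, 200)
resp = np.array([dog_response(d) for d in sizes])
preferred = sizes[np.argmax(resp)]              # peaks well below the largest size
surround_suppression = resp[-1] < resp.max()    # large stimuli are suppressed
```

The size-tuning curve this produces (a peak followed by suppression) is the behavior all three candidate models must reproduce; they differ in how the peak shifts with contrast.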
Affiliation(s)
- Gytis Baranauskas
- Neurophysiology Laboratory, Neuroscience Institute, Lithuanian University of Health Sciences, Kaunas, Lithuania
- Ignas Sabeckis
- Anatomy Institute, Lithuanian University of Health Sciences, Kaunas, Lithuania
- Dainius H Pauza
- Anatomy Institute, Lithuanian University of Health Sciences, Kaunas, Lithuania
21
Parker PRL, Abe ETT, Leonard ESP, Martins DM, Niell CM. Joint coding of visual input and eye/head position in V1 of freely moving mice. Neuron 2022; 110:3897-3906.e5. [PMID: 36137549] [PMCID: PMC9742335] [DOI: 10.1016/j.neuron.2022.08.029]
Abstract
Visual input during natural behavior is highly dependent on movements of the eyes and head, but how information about eye and head position is integrated with visual processing during free movement is unknown, as visual physiology is generally performed under head fixation. To address this, we performed single-unit electrophysiology in V1 of freely moving mice while simultaneously measuring the mouse's eye position, head orientation, and the visual scene from the mouse's perspective. From these measures, we mapped spatiotemporal receptive fields during free movement based on the gaze-corrected visual input. Furthermore, we found a significant fraction of neurons tuned for eye and head position, and these signals were integrated with visual responses through a multiplicative mechanism in the majority of modulated neurons. These results provide new insight into coding in mouse V1 and, more generally, provide a paradigm for investigating visual physiology under natural conditions, including active sensing and ethological behavior.
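A toy sketch of the multiplicative mechanism described above (synthetic numbers, not the authors' model): an eye-position signal scales the gain of the visual response rather than adding to it, so the position effect grows with the size of the visual drive.

```python
import numpy as np

rng = np.random.default_rng(3)
visual_drive = rng.uniform(0, 10, size=1000)      # stimulus-driven rate
eye_position = rng.uniform(-20, 20, size=1000)    # degrees, illustrative
position_gain = 1.0 + 0.02 * eye_position         # linear gain field
rate = visual_drive * np.clip(position_gain, 0, None)   # multiplicative coding

# Signature of multiplicative (not additive) integration: eye position
# covaries with firing more strongly when the visual response is large.
strong = visual_drive > 7
weak = visual_drive < 3
effect_strong = np.cov(rate[strong], eye_position[strong])[0, 1]
effect_weak = np.cov(rate[weak], eye_position[weak])[0, 1]
```

Under a purely additive model the two covariances would be equal, which is the contrast such analyses exploit.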
Affiliation(s)
- Philip R L Parker
- Institute of Neuroscience and Department of Biology, University of Oregon, Eugene, OR, USA
- Elliott T T Abe
- Institute of Neuroscience and Department of Biology, University of Oregon, Eugene, OR, USA
- Emmalyn S P Leonard
- Institute of Neuroscience and Department of Biology, University of Oregon, Eugene, OR, USA
- Dylan M Martins
- Institute of Neuroscience and Department of Biology, University of Oregon, Eugene, OR, USA
- Cristopher M Niell
- Institute of Neuroscience and Department of Biology, University of Oregon, Eugene, OR, USA
22
Henry CA, Kohn A. Feature representation under crowding in macaque V1 and V4 neuronal populations. Curr Biol 2022; 32:5126-5137.e3. [PMID: 36379216] [PMCID: PMC9729449] [DOI: 10.1016/j.cub.2022.10.049]
Abstract
Visual perception depends strongly on spatial context. A profound example is visual crowding, whereby the presence of nearby stimuli impairs the discriminability of object features. Despite extensive work on perceptual crowding and the spatial integrative properties of visual cortical neurons, the link between these two aspects of visual processing remains unclear. To understand better the neural basis of crowding, we recorded activity simultaneously from neuronal populations in V1 and V4 of fixating macaque monkeys. We assessed the information available from the measured responses about the orientation of a visual target both for targets presented in isolation and amid distractors. Both single neuron and population responses had less information about target orientation when distractors were present. Information loss was moderate in V1 and more substantial in V4. Information loss could be traced to systematic divisive and additive changes in neuronal tuning. Additive and multiplicative changes in tuning were more severe in V4; in addition, tuning exhibited other, non-affine transformations that were greater in V4, further restricting the ability of a fixed sensory readout strategy to extract accurate feature information across displays. Our results provide a direct test of crowding effects at different stages of the visual hierarchy. They reveal how crowded visual environments alter the spiking activity of cortical populations by which sensory stimuli are encoded and connect these changes to established mechanisms of neuronal spatial integration.
Affiliation(s)
- Christopher A Henry
- Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Adam Kohn
- Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY 10461, USA; Department of Ophthalmology and Visual Sciences, Albert Einstein College of Medicine, Bronx, NY 10461, USA; Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA
23
Gifford AT, Dwivedi K, Roig G, Cichy RM. A large and rich EEG dataset for modeling human visual object recognition. Neuroimage 2022; 264:119754. [PMID: 36400378] [PMCID: PMC9771828] [DOI: 10.1016/j.neuroimage.2022.119754]
Abstract
The human brain achieves visual object recognition through multiple stages of linear and nonlinear transformations operating at a millisecond scale. To predict and explain these rapid transformations, computational neuroscientists employ machine learning modeling techniques. However, state-of-the-art models require massive amounts of data to train properly, and there is a lack of large brain datasets that extensively sample the temporal dynamics of visual object recognition. Here we collected a large and rich dataset of high-temporal-resolution EEG responses to images of objects on a natural background. This dataset includes 10 participants, each with 82,160 trials spanning 16,740 image conditions. Through computational modeling we established the quality of this dataset in five ways. First, we trained linearizing encoding models that successfully synthesized the EEG responses to arbitrary images. Second, we correctly identified the recorded EEG data image conditions in a zero-shot fashion, using EEG responses synthesized for hundreds of thousands of candidate image conditions. Third, we show that both the high number of conditions and the trial repetitions of the EEG dataset contribute to the trained models' prediction accuracy. Fourth, we built encoding models whose predictions generalize well to novel participants. Fifth, we demonstrate full end-to-end training of randomly initialized DNNs that output EEG responses for arbitrary input images. We release this dataset as a tool to foster research in visual neuroscience and computer vision.
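A minimal sketch of the zero-shot identification logic described above, on synthetic data: a linearizing encoding model maps image features to sensor responses, and a held-out recording is identified as the candidate image whose synthesized response it correlates with best. Dimensions and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n_train, n_candidates, n_feat, n_chan = 400, 50, 20, 17
W = rng.standard_normal((n_feat, n_chan))               # ground-truth mapping (toy)
feats = rng.standard_normal((n_train, n_feat))          # training image features
eeg = feats @ W + rng.standard_normal((n_train, n_chan)) * 0.5

# Fit the linear encoding model on the training images.
W_hat = np.linalg.lstsq(feats, eeg, rcond=None)[0]

# Zero-shot step: synthesize responses for unseen candidate images, then
# identify a recorded response by correlation against the synthesized set.
cand_feats = rng.standard_normal((n_candidates, n_feat))
synth = cand_feats @ W_hat
true_idx = 7
recorded = cand_feats[true_idx] @ W + rng.standard_normal(n_chan) * 0.5
corrs = [np.corrcoef(recorded, s)[0, 1] for s in synth]
identified = int(np.argmax(corrs))
```

Because identification uses only synthesized responses, the candidate set can be arbitrarily large, which is what makes the procedure "zero-shot."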
Affiliation(s)
- Alessandro T. Gifford
- Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany; Einstein Center for Neurosciences Berlin, Charité - Universitätsmedizin Berlin, Berlin, Germany; Bernstein Center for Computational Neuroscience Berlin, Berlin, Germany
- Kshitij Dwivedi
- Department of Computer Science, Goethe Universität, Frankfurt am Main, Germany
- Gemma Roig
- Department of Computer Science, Goethe Universität, Frankfurt am Main, Germany
- Radoslaw M. Cichy
- Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany; Einstein Center for Neurosciences Berlin, Charité - Universitätsmedizin Berlin, Berlin, Germany; Bernstein Center for Computational Neuroscience Berlin, Berlin, Germany; Berlin School of Mind and Brain, Humboldt-Universität zu Berlin, Berlin, Germany
24
Freedland J, Rieke F. Systematic reduction of the dimensionality of natural scenes allows accurate predictions of retinal ganglion cell spike outputs. Proc Natl Acad Sci U S A 2022; 119:e2121744119. [PMID: 36343230] [PMCID: PMC9674269] [DOI: 10.1073/pnas.2121744119]
Abstract
The mammalian retina engages a broad array of linear and nonlinear circuit mechanisms to convert natural scenes into retinal ganglion cell (RGC) spike outputs. Although many individual integration mechanisms are well understood, we know less about how multiple mechanisms interact to encode the complex spatial features present in natural inputs. Here, we identified key spatial features in natural scenes that shape encoding by primate parasol RGCs. Our approach identified simplifications in the spatial structure of natural scenes that minimally altered RGC spike responses. We observed that reducing natural movies into 16 linearly integrated regions described ∼80% of the structure of parasol RGC spike responses; this performance depended on the number of regions but not their precise spatial locations. We used simplified stimuli to design high-dimensional metamers that recapitulated responses to naturalistic movies. Finally, we modeled the retinal computations that convert flashed natural images into one-dimensional spike counts.
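A toy sketch of the reduction idea described above: replace each frame of a movie with the mean luminance of a small number of regions (here a 4x4 grid, i.e., 16 linearly integrated regions) and drive a simple linear-nonlinear response model with the reduced input. The data and model are synthetic and illustrative, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(5)
frames = rng.uniform(0, 1, size=(100, 16, 16))      # toy "natural" movie

def reduce_to_regions(frame, grid=4):
    """Partition the frame into grid x grid regions and linearly integrate each."""
    h = frame.shape[0] // grid
    return frame.reshape(grid, h, grid, h).mean(axis=(1, 3))

reduced = np.stack([reduce_to_regions(f) for f in frames])      # (100, 4, 4)
weights = rng.standard_normal((4, 4))                           # region weights
response = np.maximum((reduced * weights).sum(axis=(1, 2)), 0)  # rectified LN output
```

The paper's test is whether such region-reduced stimuli evoke nearly the same spike responses as the originals; here the reduction simply preserves each frame's mean luminance exactly.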
Affiliation(s)
- Julian Freedland
- Molecular Engineering & Sciences Institute, University of Washington, Seattle, WA 98195
- Fred Rieke
- Department of Physiology and Biophysics, University of Washington, Seattle, WA 98195
25
Spontaneous activity patterns in human motor cortex replay evoked activity patterns for hand movements. Sci Rep 2022; 12:16867. [PMID: 36207360] [PMCID: PMC9546868] [DOI: 10.1038/s41598-022-20866-5]
Abstract
Spontaneous brain activity, measured with resting-state fMRI (R-fMRI), is correlated among regions that are co-activated by behavioral tasks. It is unclear, however, whether spatial patterns of spontaneous activity within a cortical region correspond to spatial patterns of activity evoked by specific stimuli, actions, or mental states. The current study investigated the hypothesis that spontaneous activity in motor cortex represents motor patterns commonly occurring in daily life. To test this hypothesis, 15 healthy participants were scanned while performing four different hand movements. Three movements (Grip, Extend, Pinch) were ecological, involving grip and grasp hand movements; one control movement, involving rotation of the wrist, was not ecological and infrequent (Shake). Participants were also scanned at rest before and after the execution of the motor tasks (resting-state scans). Using the task data, we identified movement-specific patterns in the primary motor cortex. These task-defined patterns were compared to resting-state patterns in the same motor region. We also performed a control analysis within the primary visual cortex. We found that spontaneous activity patterns in the primary motor cortex were more similar to task patterns for ecological than for control movements. In contrast, there was no difference between ecological and control hand movements in the primary visual area. These findings provide evidence that spontaneous activity in human motor cortex forms fine-scale, patterned representations associated with behaviors that frequently occur in daily life.
26
Hofstetter S, Dumoulin SO. Assessing the ecological validity of numerosity-selective neuronal populations with real-world natural scenes. iScience 2022; 25:105267. [PMID: 36274951] [PMCID: PMC9579010] [DOI: 10.1016/j.isci.2022.105267]
Abstract
Animals and humans are able to quickly and effortlessly estimate the number of items in a set: their numerosity. Numerosity perception is thought to be critical to behavior, from feeding to escaping predators to human mathematical cognition. Virtually all scientific studies on numerosity mechanisms use well-controlled but artificial stimuli to isolate the numerosity dimension from other physical quantities. Here, we probed the ecological validity of these artificial stimuli and evaluated whether an important component in numerosity processing, the numerosity-selective neural populations, also responds to the numerosity of items in real-world natural scenes. Using 7T MRI and natural images from a wide range of categories, we provide evidence that numerosity-tuned neuronal populations show numerosity-selective responses when viewing images of real-world natural scenes. Our findings strengthen the role of numerosity-selective neurons in numerosity perception and provide an important link to their function in real-world settings.
Affiliation(s)
- Shir Hofstetter
- The Spinoza Centre for Neuroimaging, Amsterdam, the Netherlands; Department of Computational Cognitive Neuroscience and Neuroimaging, Netherlands Institute for Neuroscience, Amsterdam, the Netherlands
- Serge O. Dumoulin
- The Spinoza Centre for Neuroimaging, Amsterdam, the Netherlands; Department of Computational Cognitive Neuroscience and Neuroimaging, Netherlands Institute for Neuroscience, Amsterdam, the Netherlands; Department of Experimental Psychology, Helmholtz Institute, Utrecht University, Utrecht, the Netherlands; Department of Experimental and Applied Psychology, VU University, Amsterdam, the Netherlands
|
27
|
Bailey KM, Giordano BL, Kaas AL, Smith FW. Decoding sounds depicting hand-object interactions in primary somatosensory cortex. Cereb Cortex 2022; 33:3621-3635. [PMID: 36045002 DOI: 10.1093/cercor/bhac296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/10/2022] [Revised: 05/24/2022] [Accepted: 07/07/2022] [Indexed: 11/13/2022] Open
Abstract
Neurons, even in the earliest sensory regions of cortex, are subject to a great deal of contextual influence from both within-modality and cross-modality connections. Recent work has shown that primary sensory areas can respond to and, in some cases, discriminate stimuli that are not of their target modality: for example, primary somatosensory cortex (SI) discriminates visual images of graspable objects. In the present work, we investigated whether SI would discriminate sounds depicting hand-object interactions (e.g., bouncing a ball). In a rapid event-related functional magnetic resonance imaging experiment, participants listened attentively to sounds from three categories: hand-object interactions, and the control categories of pure tones and animal vocalizations, while performing a one-back repetition detection task. Multivoxel pattern analysis revealed significant decoding of hand-object interaction sounds within SI, but not of either control category. Crucially, in hand-sensitive voxels defined from an independent tactile localizer, decoding accuracies were significantly higher for hand-object interactions than for pure tones in left SI. Our findings indicate that simply hearing sounds depicting familiar hand-object interactions elicits different patterns of activity in SI, despite the complete absence of tactile stimulation. These results highlight the rich contextual information that can be transmitted across sensory modalities, even to primary sensory areas.
Affiliation(s)
- Kerri M Bailey
- School of Psychology, University of East Anglia, Norwich NR4 7TJ, United Kingdom
- Bruno L Giordano
- Institut des Neurosciences de La Timone, CNRS UMR 7289, Université Aix-Marseille, Marseille, France
- Amanda L Kaas
- Department of Cognitive Neuroscience, Maastricht University, Maastricht 6229 EV, The Netherlands
- Fraser W Smith
- School of Psychology, University of East Anglia, Norwich NR4 7TJ, United Kingdom
|
28
|
Nilsson DE, Smolka J, Bok M. The vertical light-gradient and its potential impact on animal distribution and behavior. Front Ecol Evol 2022. [DOI: 10.3389/fevo.2022.951328] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/10/2023] Open
Abstract
The visual environment provides vital cues allowing animals to assess habitat quality and weather conditions, or to measure the time of day. Together with other sensory cues and physiological conditions, the visual environment sets behavioral states that make the animal more prone to engage in some behaviors and less in others. This master control of behavior serves a fundamental and essential role in determining the distribution and behavior of all animals. Although it is obvious that visual information contains vital input for setting behavioral states, the precise nature of these visual cues remains unknown. Here we use a recently described method to quantify the distribution of light reaching animals’ eyes in different environments. The method records the vertical gradient (as a function of elevation angle) of intensity, spatial structure, and spectral balance. Comparison of measurements from different types of environments, weather conditions, times of day, and seasons reveals that these aspects can be readily discriminated from one another. The vertical gradients of radiance, spatial structure (contrast), and color are thus reliable indicators that are likely to have a strong impact on animal behavior and spatial distribution.
|
29
|
Schmittwilken L, Maertens M. Fixational eye movements enable robust edge detection. J Vis 2022; 22:5. [PMID: 35834376 PMCID: PMC9290315 DOI: 10.1167/jov.22.8.5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
Abstract
Human vision relies on mechanisms that respond to luminance edges in space and time. Most edge models use orientation-selective mechanisms on multiple spatial scales and operate on static inputs assuming that edge processing occurs within a single fixational instance. Recent studies, however, demonstrate functionally relevant temporal modulations of the sensory input due to fixational eye movements. Here we propose a spatiotemporal model of human edge detection that combines elements of spatial and active vision. The model augments a spatial vision model by temporal filtering and shifts the input images over time, mimicking an active sampling scheme via fixational eye movements. The first model test was White's illusion, a lightness effect that has been shown to depend on edges. The model reproduced the spatial-frequency-specific interference with the edges by superimposing narrowband noise (1–5 cpd), similar to the psychophysical interference observed in White's effect. Second, we compare the model's edge detection performance in natural images in the presence and absence of Gaussian white noise with human-labeled contours for the same (noise-free) images. Notably, the model detects edges robustly against noise in both test cases without relying on orientation-selective processes. Eliminating model components, we demonstrate the relevance of multiscale spatiotemporal filtering and scale-specific normalization for edge detection. The proposed model facilitates efficient edge detection in (artificial) vision systems and challenges the notion that orientation-selective mechanisms are required for edge detection.
Affiliation(s)
- Lynn Schmittwilken
- Science of Intelligence and Computational Psychology, Faculty of Electrical Engineering and Computer Science, Technische Universität Berlin, Berlin, Germany
- Marianne Maertens
- Science of Intelligence and Computational Psychology, Faculty of Electrical Engineering and Computer Science, Technische Universität Berlin, Berlin, Germany
|
30
|
Price BH, Gavornik JP. Efficient Temporal Coding in the Early Visual System: Existing Evidence and Future Directions. Front Comput Neurosci 2022; 16:929348. [PMID: 35874317 PMCID: PMC9298461 DOI: 10.3389/fncom.2022.929348] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2022] [Accepted: 06/13/2022] [Indexed: 01/16/2023] Open
Abstract
While it is universally accepted that the brain makes predictions, there is little agreement about how this is accomplished and under which conditions. Accurate prediction requires neural circuits to learn and store spatiotemporal patterns observed in the natural environment, but it is not obvious how such information should be stored, or encoded. Information theory provides a mathematical formalism that can be used to measure the efficiency and utility of different coding schemes for data transfer and storage. This theory shows that codes become efficient when they remove predictable, redundant spatial and temporal information. Efficient coding has been used to understand retinal computations and may also be relevant to understanding more complicated temporal processing in visual cortex. However, the literature on efficient coding in cortex is varied and can be confusing since the same terms are used to mean different things in different experimental and theoretical contexts. In this work, we attempt to provide a clear summary of the theoretical relationship between efficient coding and temporal prediction, and review evidence that efficient coding principles explain computations in the retina. We then apply the same framework to computations occurring in early visuocortical areas, arguing that data from rodents is largely consistent with the predictions of this model. Finally, we review and respond to criticisms of efficient coding and suggest ways that this theory might be used to design future experiments, with particular focus on understanding the extent to which neural circuits make predictions from efficient representations of environmental statistics.
|
31
|
Zhang Y, Bu T, Zhang J, Tang S, Yu Z, Liu JK, Huang T. Decoding Pixel-Level Image Features from Two-Photon Calcium Signals of Macaque Visual Cortex. Neural Comput 2022; 34:1369-1397. [PMID: 35534008 DOI: 10.1162/neco_a_01498] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2021] [Accepted: 12/20/2021] [Indexed: 11/04/2022]
Abstract
Images of visual scenes comprise essential features important for the brain's visual cognition. The complexity of visual features lies at different levels, from simple artificial patterns to natural images of different scenes. Much work has focused on using stimulus images to predict neural responses; however, it remains unclear how to extract features from neuronal responses. Here we address this question by leveraging two-photon calcium neural data recorded from the visual cortex of awake macaque monkeys. With stimuli including various categories of artificial patterns and diverse natural images, we employed a deep neural network decoder inspired by image segmentation techniques. Consistent with the notion of sparse coding for natural images, a few neurons with stronger responses dominated the decoding performance, whereas decoding of artificial patterns required a large number of neurons. When natural images were decoded using the model pretrained on artificial patterns, salient features of natural scenes could be extracted, as well as the conventional category information. Altogether, our results give a new perspective on studying neural encoding principles using reverse-engineering decoding strategies.
Affiliation(s)
- Yijun Zhang
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, P.R.C.; Department of Computer Science and Technology, Peking University, Beijing 100871, P.R.C.
- Tong Bu
- Department of Computer Science and Technology, Peking University, Beijing 100871, P.R.C.
- Jiyuan Zhang
- Department of Computer Science and Technology, Peking University, Beijing 100871, P.R.C.
- Shiming Tang
- School of Life Sciences and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, P.R.C.
- Zhaofei Yu
- Department of Computer Science and Technology and Institute for Artificial Intelligence, Peking University, Beijing 100871, P.R.C.
- Jian K Liu
- School of Computing, University of Leeds, Leeds LS2 9JT, U.K.
- Tiejun Huang
- Department of Computer Science and Technology and Institute for Artificial Intelligence, Peking University, Beijing 100871, P.R.C.; Beijing Academy of Artificial Intelligence, Beijing 100190, P.R.C.
|
32
|
Microsaccades, Drifts, Hopf Bundle and Neurogeometry. J Imaging 2022; 8:jimaging8030076. [PMID: 35324631 PMCID: PMC8953095 DOI: 10.3390/jimaging8030076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2022] [Revised: 03/03/2022] [Accepted: 03/04/2022] [Indexed: 02/01/2023] Open
Abstract
The first part of the paper contains a short review of image processing in early vision, both in statics, when the eyes and the stimulus are stable, and in dynamics, when the eyes engage in fixational eye movements. In the second part, we give an interpretation of Donders’ and Listing’s laws in terms of the Hopf fibration of the 3-sphere over the 2-sphere. In particular, it is shown that the configuration space of the eyeball (when the head is fixed) is the 2-dimensional hemisphere SL+, called the Listing hemisphere, and saccades are described as geodesic segments of SL+ with respect to the standard round metric. We study fixational eye movements (drift and microsaccades) in terms of this model and discuss their role in vision. A model of fixational eye movements is proposed that explains the presaccadic shift of receptive fields.
|
33
|
Abstract
The THINGS database is a freely available stimulus set that has the potential to facilitate the generation of theory that bridges multiple areas within cognitive neuroscience. The database consists of 26,107 high-quality digital photos that are sorted into 1,854 concepts. While a valuable resource, relatively few technical details relevant to the design of studies in cognitive neuroscience have been described. We present an analysis of two key low-level properties of THINGS images, luminance and luminance contrast. These image statistics are known to influence common physiological and neural correlates of perceptual and cognitive processes. In general, we found that the distributions of luminance and contrast are in close agreement with the statistics of natural images reported previously. However, we found that image concepts are separable in their luminance and contrast: we show that luminance and contrast alone are sufficient to classify images into their concepts with above-chance accuracy. We describe how these factors may confound studies using the THINGS images, and suggest simple controls that can be implemented a priori or post hoc. We discuss the importance of using such natural images as stimuli in psychological research.
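The two image statistics at issue are straightforward to compute. A minimal sketch in Python follows; the normalization convention (pixel values in [0, 1], RMS contrast relative to mean luminance) is one common choice chosen for illustration, not necessarily the authors' exact pipeline:

```python
import numpy as np

def image_stats(img):
    """Mean luminance and RMS contrast of a grayscale image with pixel
    values in [0, 1]. These are the two low-level statistics analyzed
    for the THINGS images; the authors' normalization may differ."""
    mean_lum = float(img.mean())
    # RMS contrast: standard deviation of luminance relative to the mean
    rms_contrast = float(img.std() / mean_lum) if mean_lum > 0 else 0.0
    return mean_lum, rms_contrast
```

By this definition a uniform gray image has zero contrast, while a high-contrast texture at the same mean luminance does not, so the two statistics can vary independently across image concepts.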
Affiliation(s)
- William J Harrison
- Queensland Brain Institute and School of Psychology, The University of Queensland
|
34
|
Barbieri D. Reconstructing Group Wavelet Transform From Feature Maps With a Reproducing Kernel Iteration. Front Comput Neurosci 2022; 16:775241. [PMID: 35370587 PMCID: PMC8965351 DOI: 10.3389/fncom.2022.775241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2021] [Accepted: 01/28/2022] [Indexed: 11/13/2022] Open
Abstract
In this article, we consider the problem of reconstructing an image that is downsampled in the space of its SE(2) wavelet transform, which is motivated by classical models of simple cell receptive fields and feature preference maps in the primary visual cortex. We prove that, whenever the problem is solvable, the reconstruction can be obtained by an elementary project-and-replace iterative scheme based on the reproducing kernel arising from the group structure, and we show numerical results on real images.
Affiliation(s)
- Davide Barbieri
- Departamento de Matemáticas, Universidad Autónoma de Madrid, Madrid, Spain
|
35
|
Broderick WF, Simoncelli EP, Winawer J. Mapping spatial frequency preferences across human primary visual cortex. J Vis 2022; 22:3. [PMID: 35266962 PMCID: PMC8934567 DOI: 10.1167/jov.22.4.3] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/13/2023] Open
Abstract
Neurons in primate visual cortex (area V1) are tuned for spatial frequency, in a manner that depends on their position in the visual field. Several studies have examined this dependency using functional magnetic resonance imaging (fMRI), reporting preferred spatial frequencies (tuning curve peaks) of V1 voxels as a function of eccentricity, but their results differ by as much as two octaves, presumably owing to differences in stimuli, measurements, and analysis methodology. Here, we characterize spatial frequency tuning at a millimeter resolution within the human primary visual cortex, across stimulus orientation and visual field locations. We measured fMRI responses to a novel set of stimuli, constructed as sinusoidal gratings in log-polar coordinates, which include circular, radial, and spiral geometries. For each individual stimulus, the local spatial frequency varies inversely with eccentricity, and for any given location in the visual field, the full set of stimuli span a broad range of spatial frequencies and orientations. Over the measured range of eccentricities, the preferred spatial frequency is well-fit by a function that varies as the inverse of the eccentricity plus a small constant. We also find small but systematic effects of local stimulus orientation, defined in both absolute coordinates and relative to visual field location. Specifically, peak spatial frequency is higher for pinwheel than annular stimuli and for horizontal than vertical stimuli.
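The eccentricity dependence described here, preferred spatial frequency falling off as the inverse of eccentricity plus a small constant, can be sketched as follows. The parameter values are illustrative placeholders, not the fitted values from the paper:

```python
import numpy as np

def preferred_sf(eccentricity_deg, a=0.12, b=0.35):
    """Preferred spatial frequency (cycles/deg) modeled as the inverse of
    a linear function of eccentricity, the functional form reported in the
    abstract. Parameters a and b are hypothetical illustration values."""
    return 1.0 / (a * np.asarray(eccentricity_deg) + b)
```

With any positive a and b, foveal locations prefer higher spatial frequencies than peripheral ones, matching the qualitative trend in the data.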
Affiliation(s)
- William F. Broderick
- Center for Neural Science, New York University, New York, NY, USA (https://wfbroderick.com/)
- Eero P. Simoncelli
- Center for Neural Science and Courant Institute of Mathematical Sciences, New York University, New York, NY, USA; Flatiron Institute, Simons Foundation, USA
- Jonathan Winawer
- Department of Psychology, New York University, New York, NY, USA
|
36
|
Liu JK, Karamanlis D, Gollisch T. Simple model for encoding natural images by retinal ganglion cells with nonlinear spatial integration. PLoS Comput Biol 2022; 18:e1009925. [PMID: 35259159 PMCID: PMC8932571 DOI: 10.1371/journal.pcbi.1009925] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2021] [Revised: 03/18/2022] [Accepted: 02/14/2022] [Indexed: 01/05/2023] Open
Abstract
A central goal in sensory neuroscience is to understand the neuronal signal processing involved in the encoding of natural stimuli. A critical step towards this goal is the development of successful computational encoding models. For ganglion cells in the vertebrate retina, the development of satisfactory models for responses to natural visual scenes is an ongoing challenge. Standard models typically apply linear integration of visual stimuli over space, yet many ganglion cells are known to show nonlinear spatial integration, in particular when stimulated with contrast-reversing gratings. Here we study the influence of spatial nonlinearities in the encoding of natural images by ganglion cells, using multielectrode-array recordings from isolated salamander and mouse retinas. We assess how responses to natural images depend on first- and second-order statistics of spatial patterns inside the receptive field. This leads us to a simple extension of current standard ganglion cell models. We show that taking into account not only the weighted average of light intensity inside the receptive field but also its variance over space can partly account for nonlinear integration and substantially improve predictions of responses to novel images. For salamander ganglion cells, we find that response predictions for cell classes with large receptive fields profit most from including spatial contrast information. Finally, we demonstrate how this model framework can be used to assess the spatial scale of nonlinear integration. Our results underscore that nonlinear spatial stimulus integration translates to stimulation with natural images. Furthermore, the introduced model framework provides a simple, yet powerful extension of standard models and may serve as a benchmark for the development of more detailed models of the nonlinear structure of receptive fields.
For understanding how sensory systems operate in the natural environment, an important goal is to develop models that capture neuronal responses to natural stimuli. For retinal ganglion cells, which connect the eye to the brain, current standard models often fail to capture responses to natural visual scenes. This shortcoming is at least partly rooted in the fact that ganglion cells may combine visual signals over space in a nonlinear fashion. We here show that a simple model, which not only considers the average light intensity inside a cell’s receptive field but also the variance of light intensity over space, can partly account for these nonlinearities and thereby improve current standard models. This provides an easy-to-obtain benchmark for modeling ganglion cell responses to natural images.
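The model extension described above, adding the spatial variance of light intensity inside the receptive field to the usual weighted mean, can be sketched as follows. This is a minimal illustration under assumed combination weights and output nonlinearity, not the authors' fitted model:

```python
import numpy as np

def rf_response(patch, rf_weights, w_mean=1.0, w_var=0.5):
    """LN-style ganglion cell model extended with spatial contrast:
    the drive combines the RF-weighted mean light intensity with the
    RF-weighted variance of intensity over space, followed by rectification.
    The combination weights w_mean and w_var are illustrative placeholders."""
    norm = rf_weights.sum()
    mean_intensity = (rf_weights * patch).sum() / norm
    spatial_var = (rf_weights * (patch - mean_intensity) ** 2).sum() / norm
    drive = w_mean * mean_intensity + w_var * spatial_var
    return max(drive, 0.0)  # rectifying output nonlinearity
```

The variance term is what lets the model respond to a contrast-reversing grating whose weighted mean intensity is zero, the signature of nonlinear spatial integration discussed in the abstract.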
Affiliation(s)
- Jian K. Liu
- University Medical Center Göttingen, Department of Ophthalmology, Göttingen, Germany
- Bernstein Center for Computational Neuroscience Göttingen, Göttingen, Germany
- School of Computing, University of Leeds, Leeds, United Kingdom
- Dimokratis Karamanlis
- University Medical Center Göttingen, Department of Ophthalmology, Göttingen, Germany
- Bernstein Center for Computational Neuroscience Göttingen, Göttingen, Germany
- International Max Planck Research School for Neurosciences, Göttingen, Germany
- Tim Gollisch
- University Medical Center Göttingen, Department of Ophthalmology, Göttingen, Germany
- Bernstein Center for Computational Neuroscience Göttingen, Göttingen, Germany
- Cluster of Excellence “Multiscale Bioimaging: from Molecular Machines to Networks of Excitable Cells” (MBExC), University of Göttingen, Göttingen, Germany
|
37
|
Feedforward and feedback interactions between visual cortical areas use different population activity patterns. Nat Commun 2022; 13:1099. [PMID: 35232956 PMCID: PMC8888615 DOI: 10.1038/s41467-022-28552-w] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2020] [Accepted: 01/19/2022] [Indexed: 12/19/2022] Open
Abstract
Brain function relies on the coordination of activity across multiple, recurrently connected brain areas. For instance, sensory information encoded in early sensory areas is relayed to, and further processed by, higher cortical areas and then fed back. However, the way in which feedforward and feedback signaling interact with one another is incompletely understood. Here we investigate this question by leveraging simultaneous neuronal population recordings in early and midlevel visual areas (V1-V2 and V1-V4). Using a dimensionality reduction approach, we find that population interactions are feedforward-dominated shortly after stimulus onset and feedback-dominated during spontaneous activity. The population activity patterns most correlated across areas were distinct during feedforward- and feedback-dominated periods. These results suggest that feedforward and feedback signaling rely on separate "channels", which allows feedback signals to not directly affect activity that is fed forward.
|
38
|
Gao S, Liu X. Explaining Orientation Adaptation in V1 by Updating the State of a Spatial Model. Front Comput Neurosci 2022; 15:759254. [PMID: 35250523 PMCID: PMC8895385 DOI: 10.3389/fncom.2021.759254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2021] [Accepted: 12/06/2021] [Indexed: 11/17/2022] Open
Abstract
In this work, we extend an influential statistical model based on spatial classical receptive field (CRF) and non-classical receptive field (nCRF) interactions (Coen-Cagli et al., 2012) to explain the typical orientation adaptation effects observed in V1. If we assume that temporal adaptation modifies the “state” of the model, the spatial statistical model can explain all of the orientation adaptation effects on neuronal output with small and large gratings observed in neurophysiological experiments in V1. The “state” of the model comprises internal parameters, namely the prior and the covariance trained on a mixed dataset, that fully determine the response of the model. These two parameters reflect, respectively, the probability of the orientation component and the connectivity among neurons between the CRF and nCRF. Specifically, we have two key findings. First, adapted neural responses to a small grating that just covers the CRF can be predicted by a change of the model's prior. Second, a change of the prior can also predict most of the observed results using a large grating that covers both the CRF and nCRF of a neuron. However, predicting the novel attractive adaptation with a large grating covering both the CRF and nCRF additionally requires a change in the connectivity of the center-surround RFs. In addition, our paper contributes a new prior-based winner-take-all (WTA) working mechanism, derived from the statistical model, that explains why and how all of these orientation adaptation effects can be predicted by this spatial model without modifying its structure, a novel application of the spatial model. The results show that adaptation may link time and space by changing the “state” of the neural system according to a specific adaptor. Furthermore, different forms of adapting stimulus can cause different adaptation effects, such as a prior shift or a connectivity change, depending on the stimulus size.
Affiliation(s)
- Shaobing Gao
- College of Computer Science, Sichuan University, Chengdu, China
- Xiao Liu
- Tomorrow Advancing Life Education Group (TAL), Beijing, China
|
39
|
De A, Horwitz GD. Coding of chromatic spatial contrast by macaque V1 neurons. eLife 2022; 11:68133. [PMID: 35147497 PMCID: PMC8920507 DOI: 10.7554/elife.68133] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/05/2021] [Accepted: 02/01/2022] [Indexed: 11/13/2022] Open
Abstract
Color perception relies on comparisons between adjacent lights, but how the brain performs these comparisons is poorly understood. To elucidate the underlying neural mechanisms, we recorded spiking responses of individual V1 neurons in macaque monkeys to pairs of stimuli within the classical receptive field (RF). We estimated the spatial-chromatic RF of each neuron and then presented customized colored edges using a novel closed-loop technique. We found that many double-opponent (DO) cells, which have spatially and chromatically opponent RFs, responded to chromatic contrast as a weighted sum, akin to how other V1 cells responded to luminance contrast. Yet other neurons integrated chromatic signals non-linearly, confirming that linear signal integration is not an obligate property of V1 neurons. The functional similarity of cone-opponent DO cells and cone non-opponent simple cells suggests that these two groups may share a common underlying neural circuitry, promotes the construction of image-computable models for full-color image representation, and sheds new light on V1 complex cells.
Affiliation(s)
- Abhishek De
- Department of Physiology and Biophysics, University of Washington, Seattle, United States
- Gregory D Horwitz
- Department of Physiology and Biophysics, University of Washington, Seattle, United States
|
40
|
Rideaux R, West RK, Wallis TSA, Bex PJ, Mattingley JB, Harrison WJ. Spatial structure, phase, and the contrast of natural images. J Vis 2022; 22:4. [PMID: 35006237 PMCID: PMC8762697 DOI: 10.1167/jov.22.1.4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2021] [Accepted: 11/25/2021] [Indexed: 11/24/2022] Open
Abstract
The sensitivity of the human visual system is thought to be shaped by environmental statistics. A major endeavor in vision science, therefore, is to uncover the image statistics that predict perceptual and cognitive function. When searching for targets in natural images, for example, it has recently been proposed that target detection is inversely related to the spatial similarity of the target to its local background. We tested this hypothesis by measuring observers' sensitivity to targets that were blended with natural image backgrounds. Targets were designed to have a spatial structure that was either similar or dissimilar to the background. Contrary to masking from similarity, we found that observers were most sensitive to targets that were most similar to their backgrounds. We hypothesized that a coincidence of phase alignment between target and background results in a local contrast signal that facilitates detection when target-background similarity is high. We confirmed this prediction in a second experiment. Indeed, we show that, by solely manipulating the phase of a target relative to its background, the target can be rendered easily visible or undetectable. Our study thus reveals that, in addition to its structural similarity, the phase of the target relative to the background must be considered when predicting detection sensitivity in natural images.
Affiliation(s)
- Reuben Rideaux
- Queensland Brain Institute, University of Queensland, St. Lucia, Queensland, Australia
- Rebecca K West
- School of Psychology, University of Queensland, St. Lucia, Queensland, Australia
- Thomas S A Wallis
- Institut für Psychologie & Centre for Cognitive Science, Technische Universität Darmstadt, Darmstadt, Germany
- Peter J Bex
- Department of Psychology, Northeastern University, Boston, MA, USA
- Jason B Mattingley
- Queensland Brain Institute, University of Queensland, St. Lucia, Queensland, Australia
- School of Psychology, University of Queensland, St. Lucia, Queensland, Australia
- William J Harrison
- Queensland Brain Institute, University of Queensland, St. Lucia, Queensland, Australia
- School of Psychology, University of Queensland, St. Lucia, Queensland, Australia
|
41
|
Ezra-Tsur E, Amsalem O, Ankri L, Patil P, Segev I, Rivlin-Etzion M. Realistic retinal modeling unravels the differential role of excitation and inhibition to starburst amacrine cells in direction selectivity. PLoS Comput Biol 2021; 17:e1009754. [PMID: 34968385 PMCID: PMC8754344 DOI: 10.1371/journal.pcbi.1009754] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2021] [Revised: 01/12/2022] [Accepted: 12/14/2021] [Indexed: 11/19/2022] Open
Abstract
Retinal direction-selectivity originates in starburst amacrine cells (SACs), which display a centrifugal preference, responding with greater depolarization to a stimulus expanding from soma to dendrites than to a collapsing stimulus. Various mechanisms were hypothesized to underlie SAC centrifugal preference, but dissociating them is experimentally challenging and the mechanisms remain debatable. To address this issue, we developed the Retinal Stimulation Modeling Environment (RSME), a multifaceted data-driven retinal model that encompasses detailed neuronal morphology and biophysical properties, retina-tailored connectivity scheme and visual input. Using a genetic algorithm, we demonstrated that spatiotemporally diverse excitatory inputs–sustained in the proximal and transient in the distal processes–are sufficient to generate experimentally validated centrifugal preference in a single SAC. Reversing these input kinetics did not produce any centrifugal-preferring SAC. We then explored the contribution of SAC-SAC inhibitory connections in establishing the centrifugal preference. SAC inhibitory network enhanced the centrifugal preference, but failed to generate it in its absence. Embedding a direction selective ganglion cell (DSGC) in a SAC network showed that the known SAC-DSGC asymmetric connectivity by itself produces direction selectivity. Still, this selectivity is sharpened in a centrifugal-preferring SAC network. Finally, we use RSME to demonstrate the contribution of SAC-SAC inhibitory connections in mediating direction selectivity and recapitulate recent experimental findings. Thus, using RSME, we obtained a mechanistic understanding of SACs’ centrifugal preference and its contribution to direction selectivity. Retinal direction selectivity is a canonical example for a computation undertaken by the retina. 
Starburst amacrine cells (SACs), interneurons in the retina, mediate direction selectivity via two mechanisms: they form asymmetric inhibitory connections with direction-selective ganglion cells (DSGCs), and their processes are themselves direction selective, displaying a centrifugal preference. Various hypotheses have been raised to account for this centrifugal preference, including the arrangement of SAC excitatory inputs, their kinetics, and reciprocal inhibition between SACs. To address this, we developed the Retinal Stimulation Modeling Environment (RSME), a modeling environment for highly detailed, biologically plausible simulations, tailored to the exploration of neuronal dynamics and visual processing in retinal circuits. We started by exploring the excitation to a single SAC and found that a precise organization of the input kinetics along SAC processes can generate a centrifugal preference that matched our experimental recordings. We then generated a network of SACs and found that reciprocal inhibition between SACs further enhances the centrifugal preference. Finally, we embedded a DSGC in the network and dissected the contributions of SAC-DSGC asymmetric connections and of the SAC centrifugal preference to direction selectivity in the DSGC.
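The kinetic-gradient mechanism described above lends itself to a compact numerical illustration. Below is a toy sketch (not RSME, which is a detailed biophysical simulator): synaptic sites along one SAC process are given sustained (slow) kinetics proximally and transient (fast) kinetics distally, and the summed input is compared for soma-to-dendrite versus dendrite-to-soma activation order. All parameter values are illustrative assumptions.

```python
import numpy as np

def synaptic_kernel(t, tau):
    """Simple exponential synaptic conductance kernel (0 for t < 0)."""
    return np.where(t >= 0, np.exp(-t / tau), 0.0)

def peak_response(activation_times, taus, dt=0.1, t_max=100.0):
    """Peak of the linearly summed synaptic input along the process."""
    t = np.arange(0.0, t_max, dt)
    total = np.zeros_like(t)
    for t0, tau in zip(activation_times, taus):
        total += synaptic_kernel(t - t0, tau)
    return total.max()

n_sites, speed = 10, 5.0           # synaptic sites along one SAC process
positions = np.arange(n_sites)     # 0 = proximal (soma), 9 = distal tip
# Hypothetical kinetics gradient: sustained (slow) proximal, transient (fast) distal
taus = np.linspace(40.0, 4.0, n_sites)

outward_times = positions * speed        # soma-to-dendrite (centrifugal) sweep
inward_times = positions[::-1] * speed   # dendrite-to-soma (centripetal) sweep

r_out = peak_response(outward_times, taus)
r_in = peak_response(inward_times, taus)
dsi = (r_out - r_in) / (r_out + r_in)    # centrifugal preference index
print(f"outward {r_out:.2f}, inward {r_in:.2f}, DSI {dsi:.3f}")
```

With this gradient, sustained proximal responses persist long enough to summate with later distal inputs during outward motion, yielding a positive preference index; reversing the gradient reverses the effect.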
Affiliation(s)
- Elishai Ezra-Tsur
- Department of Brain Sciences, Weizmann Institute of Science, Rehovot, Israel
- Department of Mathematics and Computer Science, The Open University of Israel, Ra’anana, Israel
- Oren Amsalem
- Department of Neurobiology, Hebrew University of Jerusalem, Jerusalem, Israel
- Lea Ankri
- Department of Brain Sciences, Weizmann Institute of Science, Rehovot, Israel
- Pritish Patil
- Department of Brain Sciences, Weizmann Institute of Science, Rehovot, Israel
- Idan Segev
- Department of Neurobiology, Hebrew University of Jerusalem, Jerusalem, Israel
- Edmond and Lily Safra Center for Brain Sciences, Hebrew University of Jerusalem, Jerusalem, Israel
- Michal Rivlin-Etzion
- Department of Brain Sciences, Weizmann Institute of Science, Rehovot, Israel

42
Crosse MJ, Zuk NJ, Di Liberto GM, Nidiffer AR, Molholm S, Lalor EC. Linear Modeling of Neurophysiological Responses to Speech and Other Continuous Stimuli: Methodological Considerations for Applied Research. Front Neurosci 2021; 15:705621. [PMID: 34880719 PMCID: PMC8648261 DOI: 10.3389/fnins.2021.705621] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2021] [Accepted: 09/21/2021] [Indexed: 01/01/2023] Open
Abstract
Cognitive neuroscience, in particular research on speech and language, has seen an increase in the use of linear modeling techniques for studying the processing of natural, environmental stimuli. The availability of such computational tools has prompted similar investigations in many clinical domains, facilitating the study of cognitive and sensory deficits under more naturalistic conditions. However, studying clinical (and often highly heterogeneous) cohorts introduces an added layer of complexity to such modeling procedures, potentially destabilizing these techniques and, as a result, producing inconsistent findings. Here, we outline some key methodological considerations for applied research, referring to a hypothetical clinical experiment involving speech processing and worked examples of simulated electroencephalographic (EEG) data. In particular, we focus on experimental design, data preprocessing, stimulus feature extraction, model design, model training and evaluation, and interpretation of model weights. Throughout the paper, we demonstrate the implementation of each step in MATLAB using the mTRF-Toolbox and discuss how to address issues that could arise in applied research. In doing so, we hope to provide better intuition on these more technical points and to provide a resource for applied and clinical researchers investigating sensory and cognitive processing using ecologically rich stimuli.
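The core of such linear modeling, a temporal response function (TRF) estimated by ridge regression on a time-lagged copy of the stimulus, can be sketched in a few lines. The paper's worked examples use MATLAB and the mTRF-Toolbox; the NumPy sketch below is an independent illustration of the same computation on simulated data, not the toolbox API.

```python
import numpy as np

def lag_matrix(stimulus, lags):
    """Build a time-lagged design matrix X[t, i] = stimulus[t - lags[i]]."""
    n = len(stimulus)
    X = np.zeros((n, len(lags)))
    for i, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, i] = stimulus[:n - lag]
        else:
            X[:n + lag, i] = stimulus[-lag:]
    return X

def fit_trf(stimulus, response, lags, lam=1.0):
    """Ridge-regularized temporal response function for one channel."""
    X = lag_matrix(stimulus, lags)
    XtX = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ response)

# Simulated data: the "EEG" is the stimulus convolved with a known kernel plus noise
rng = np.random.default_rng(0)
stim = rng.standard_normal(5000)
true_trf = np.array([0.0, 0.5, 1.0, 0.5, 0.0])   # peak at a 2-sample lag
eeg = np.convolve(stim, true_trf, mode="full")[:5000] + 0.1 * rng.standard_normal(5000)

lags = np.arange(0, 5)
w = fit_trf(stim, eeg, lags, lam=1.0)
print(np.round(w, 2))   # approximately recovers true_trf
```

Regularization strength (here a single fixed `lam`) is exactly the kind of hyperparameter the paper recommends tuning by cross-validation in heterogeneous clinical cohorts.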
Affiliation(s)
- Michael J Crosse
- Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland; X, The Moonshot Factory, Mountain View, CA, United States; Department of Pediatrics, Albert Einstein College of Medicine, New York, NY, United States; Department of Neuroscience, Albert Einstein College of Medicine, New York, NY, United States
- Nathaniel J Zuk
- Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland; Department of Biomedical Engineering, University of Rochester, Rochester, NY, United States; Department of Neuroscience, University of Rochester, Rochester, NY, United States
- Giovanni M Di Liberto
- Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland; Centre for Biomedical Engineering, School of Electrical and Electronic Engineering, University College Dublin, Dublin, Ireland; School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
- Aaron R Nidiffer
- Department of Biomedical Engineering, University of Rochester, Rochester, NY, United States; Department of Neuroscience, University of Rochester, Rochester, NY, United States
- Sophie Molholm
- Department of Pediatrics, Albert Einstein College of Medicine, New York, NY, United States; Department of Neuroscience, Albert Einstein College of Medicine, New York, NY, United States
- Edmund C Lalor
- Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland; Department of Biomedical Engineering, University of Rochester, Rochester, NY, United States; Department of Neuroscience, University of Rochester, Rochester, NY, United States

43
Yedutenko M, Howlett MHC, Kamermans M. Enhancing the dark side: asymmetric gain of cone photoreceptors underpins their discrimination of visual scenes based on skewness. J Physiol 2021; 600:123-142. [PMID: 34783026 PMCID: PMC9300210 DOI: 10.1113/jp282152] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2021] [Accepted: 11/11/2021] [Indexed: 11/08/2022] Open
Abstract
Psychophysical data indicate that humans can discriminate visual scenes based on their skewness, i.e. the relative prevalence of dark and bright patches within a visual scene. It has also been shown that, at a phenomenological level, this skew discrimination is described by the so-called blackshot mechanism, which accentuates strong negative contrasts within a scene. Here, we present a set of observations suggesting that the underlying computation might start as early as the cone phototransduction cascade, whose gain is higher for strong negative contrasts than for strong positive contrasts. We recorded from goldfish cone photoreceptors and found that the asymmetry in phototransduction gain leads to responses with larger amplitudes for negatively than for positively skewed light stimuli. This asymmetry in amplitude was present in the cone photocurrent, voltage response, and synaptic output. Given that the properties of the phototransduction cascade are universal across vertebrates, it is possible that the mechanism shown here gives rise to a general ability to discriminate between scenes based only on their skewness, which psychophysical studies have shown humans can do. Thus, our data suggest that the nonlinearity of the early photoreceptor is important for perception. Additionally, we found that stimulus skewness leads to a subtle change in photoreceptor kinetics: for negatively skewed stimuli, the impulse response functions of the cone peak later than for positively skewed stimuli. However, stimulus skewness does not affect the overall integration time of the cone. KEY POINTS: Humans can discriminate visual scenes based on skewness, i.e. the relative prevalence of bright and dark patches within a scene. Here, we show that negatively skewed time-series stimuli induce larger responses in goldfish cone photoreceptors than comparable positively skewed stimuli. This response asymmetry originates within the phototransduction cascade, where gain is higher for strong negative contrasts (dark patches) than for strong positive contrasts (bright patches). Contrary to an assumption implicit in many models of downstream visual neurons, our data show that cone photoreceptors do not simply relay linearly filtered versions of visual stimuli to downstream circuitry; they also emphasize specific stimulus features. Given that phototransduction cascade properties are largely universal across vertebrate retinas, our data imply that the skew discrimination by human subjects reported in psychophysical studies might stem from the asymmetric gain function of the phototransduction cascade.
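The proposed mechanism can be caricatured with a static asymmetric gain applied to skewed contrast time series. The sketch below is a toy illustration of the reasoning, not a model of the actual phototransduction cascade; the gain values are arbitrary assumptions.

```python
import numpy as np

def asymmetric_gain(contrast, g_neg=1.5, g_pos=1.0):
    """Toy static nonlinearity: higher gain for negative (dark) contrasts."""
    return np.where(contrast < 0, g_neg * contrast, g_pos * contrast)

def skewness(x):
    x = x - x.mean()
    return (x ** 3).mean() / (x ** 2).mean() ** 1.5

rng = np.random.default_rng(1)
# Exponentially distributed excursions give skewed, zero-mean contrast traces
pos_skew = rng.exponential(1.0, 100_000) - 1.0   # skewness about +2
neg_skew = -pos_skew                             # mirrored: skewness about -2

# Response amplitude (std) is larger for the negatively skewed stimulus,
# because its large excursions fall on the high-gain (dark) side
resp_amp_neg = asymmetric_gain(neg_skew).std()
resp_amp_pos = asymmetric_gain(pos_skew).std()
print(skewness(neg_skew), resp_amp_neg, resp_amp_pos)
```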
Affiliation(s)
- Matthew Yedutenko
- Retinal Signal Processing Laboratory, Netherlands Institute for Neuroscience, Amsterdam, The Netherlands
- Marcus H C Howlett
- Retinal Signal Processing Laboratory, Netherlands Institute for Neuroscience, Amsterdam, The Netherlands
- Maarten Kamermans
- Retinal Signal Processing Laboratory, Netherlands Institute for Neuroscience, Amsterdam, The Netherlands; Department of Biomedical Physics and Biomedical Optics, Amsterdam University Medical Center, University of Amsterdam, Amsterdam, The Netherlands

44
Meikle SJ, Wong YT. Neurophysiological considerations for visual implants. Brain Struct Funct 2021; 227:1523-1543. [PMID: 34773502 DOI: 10.1007/s00429-021-02417-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2021] [Accepted: 10/17/2021] [Indexed: 11/26/2022]
Abstract
Neural implants have the potential to restore visual capabilities in blind individuals by electrically stimulating the neurons of the visual system. This stimulation can produce visual percepts known as phosphenes. The ideal location of electrical stimulation for achieving vision restoration is widely debated and depends on the physiological properties of the targeted tissue. Here, the neurophysiology of several potential target structures within the visual system is explored with regard to their benefits and drawbacks for producing phosphenes. These regions include the lateral geniculate nucleus, primary visual cortex, visual area 2, visual area 3, visual area 4, and the middle temporal area. Given the existing engineering limitations of neural prostheses, we anticipate that electrical stimulation of any single brain region will be incapable of achieving high-resolution naturalistic perception encompassing color, texture, shape, and motion. As improvements in visual acuity facilitate improvements in quality of life, emulating naturalistic vision should be one of the ultimate goals of visual prostheses. To achieve this goal, we propose that multiple brain areas will need to be targeted in unison, enabling different aspects of vision to be recreated.
Affiliation(s)
- Sabrina J Meikle
- Department of Electrical and Computer Systems Engineering, Monash University, 14 Alliance Lane, Clayton, Vic, 3800, Australia
- Department of Physiology and Biomedicine Discovery Institute, Monash University, 14 Alliance Lane, Clayton, Vic, 3800, Australia
- Monash Vision Group, Monash University, 14 Alliance Lane, Clayton, Vic, 3800, Australia
- Yan T Wong
- Department of Electrical and Computer Systems Engineering, Monash University, 14 Alliance Lane, Clayton, Vic, 3800, Australia
- Department of Physiology and Biomedicine Discovery Institute, Monash University, 14 Alliance Lane, Clayton, Vic, 3800, Australia
- Monash Vision Group, Monash University, 14 Alliance Lane, Clayton, Vic, 3800, Australia

45
Primary visual cortex straightens natural video trajectories. Nat Commun 2021; 12:5982. [PMID: 34645787 PMCID: PMC8514453 DOI: 10.1038/s41467-021-25939-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2020] [Accepted: 09/08/2021] [Indexed: 11/08/2022] Open
Abstract
Many sensory-driven behaviors rely on predictions about future states of the environment. Visual input typically evolves along complex temporal trajectories that are difficult to extrapolate. We test the hypothesis that spatial processing mechanisms in the early visual system facilitate prediction by constructing neural representations that follow straighter temporal trajectories. We recorded V1 population activity in anesthetized macaques while presenting static frames taken from brief video clips, and developed a procedure to measure the curvature of the associated neural population trajectory. We found that V1 populations straighten naturally occurring image sequences, but entangle artificial sequences that contain unnatural temporal transformations. We show that these effects arise in part from computational mechanisms that underlie the stimulus selectivity of V1 cells. Together, our findings reveal that the early visual system uses a set of specialized computations to build representations that can support prediction in the natural environment. Many behaviours depend on predictions about the environment. Here, the authors find that neural populations in primary visual cortex straighten the temporal trajectories of natural video clips, facilitating extrapolation from past observations.
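The curvature measure at the heart of this approach can be sketched as the mean angle between successive steps of a trajectory, computed either in pixel space or in neural state space. The implementation below is a minimal illustration of that idea, not the authors' full estimation procedure.

```python
import numpy as np

def mean_curvature(traj):
    """Mean angle (degrees) between successive steps of a trajectory.

    traj: array of shape (n_frames, n_dims), one point per video frame
    (pixel intensities or neural population responses).
    """
    diffs = np.diff(traj, axis=0)
    diffs /= np.linalg.norm(diffs, axis=1, keepdims=True)
    cos = np.clip(np.sum(diffs[:-1] * diffs[1:], axis=1), -1.0, 1.0)
    return np.degrees(np.arccos(cos)).mean()

# A perfectly straight trajectory has zero curvature ...
line = np.outer(np.arange(10), np.ones(50))
# ... while a random walk in high dimensions turns by roughly 90 degrees per step
rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal((10, 50)), axis=0)
print(mean_curvature(line), mean_curvature(walk))
```

"Straightening" in this framework means the curvature of a natural clip's trajectory is lower in the V1 representation than in the pixel representation.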
46
Wu Z, Rockwell H, Zhang Y, Tang S, Lee TS. Complexity and diversity in sparse code priors improve receptive field characterization of Macaque V1 neurons. PLoS Comput Biol 2021; 17:e1009528. [PMID: 34695120 PMCID: PMC8589190 DOI: 10.1371/journal.pcbi.1009528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2020] [Revised: 11/12/2021] [Accepted: 10/05/2021] [Indexed: 11/18/2022] Open
Abstract
System identification techniques, such as projection pursuit regression models (PPRs) and convolutional neural networks (CNNs), provide state-of-the-art performance in predicting visual cortical neurons' responses to arbitrary input stimuli. However, the constituent kernels recovered by these methods are often noisy and lack coherent structure, making it difficult to understand the underlying component features of a neuron's receptive field. In this paper, we show that using, as the front-end for PPRs and CNNs, a dictionary of diverse kernels with complex shapes learned from natural scenes on the basis of efficient coding theory can improve their performance in neuronal response prediction as well as their data efficiency and convergence speed. Extensive experimental results also indicate that these sparse-code kernels provide important information on the component features of a neuron's receptive field. In addition, we find that models with the complex-shaped sparse-code front-end are significantly better than models with a standard orientation-selective Gabor filter front-end at modeling V1 neurons that have been found to exhibit complex pattern selectivity. We show that the relative performance difference between these two front-ends can be used to produce a sensitive metric for detecting complex selectivity in V1 neurons.
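The detection metric described in the last sentence can be illustrated with a toy simulation: fit the same rectified front-end plus linear readout with two different fixed dictionaries and compare held-out prediction correlations. Everything below (the simulated neuron, the random stand-in for a Gabor bank, the dictionary sizes) is a hypothetical construction for illustration, not the paper's PPR/CNN pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_train, n_test = 64, 2000, 500   # flattened 8x8 image patches

def front_end_model(D, X_tr, y_tr, X_te, y_te):
    """Fixed dictionary front-end + rectification + linear readout."""
    F_tr = np.maximum(D @ X_tr.T, 0).T            # rectified dictionary responses
    F_te = np.maximum(D @ X_te.T, 0).T
    w, *_ = np.linalg.lstsq(F_tr, y_tr, rcond=None)
    return np.corrcoef(F_te @ w, y_te)[0, 1]      # held-out prediction accuracy

X_tr = rng.standard_normal((n_train, d))
X_te = rng.standard_normal((n_test, d))

# Hypothetical "complex-shaped" kernel the simulated neuron actually uses
complex_kernel = rng.standard_normal(d)
y_tr = np.maximum(X_tr @ complex_kernel, 0) + 0.1 * rng.standard_normal(n_train)
y_te = np.maximum(X_te @ complex_kernel, 0) + 0.1 * rng.standard_normal(n_test)

D_complex = np.vstack([complex_kernel, rng.standard_normal((19, d))])
D_generic = rng.standard_normal((20, d))          # stand-in for a generic filter bank

delta_r = (front_end_model(D_complex, X_tr, y_tr, X_te, y_te)
           - front_end_model(D_generic, X_tr, y_tr, X_te, y_te))
print(f"complex-selectivity metric delta_r = {delta_r:.3f}")
```

A large positive `delta_r` flags a neuron whose receptive field is better captured by the complex-shaped dictionary, which is the logic behind the proposed metric.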
Affiliation(s)
- Ziniu Wu
- Center for the Neural Basis of Cognition and Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Department of Mathematics, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Harold Rockwell
- Center for the Neural Basis of Cognition and Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Yimeng Zhang
- Center for the Neural Basis of Cognition and Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Computer Science Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Shiming Tang
- Center for Life Sciences, Peking University, Beijing, China
- Tai Sing Lee
- Center for the Neural Basis of Cognition and Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Computer Science Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America

47
Hansen BC, Greene MR, Field DJ. Dynamic Electrode-to-Image (DETI) mapping reveals the human brain's spatiotemporal code of visual information. PLoS Comput Biol 2021; 17:e1009456. [PMID: 34570753 PMCID: PMC8496831 DOI: 10.1371/journal.pcbi.1009456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2021] [Revised: 10/07/2021] [Accepted: 09/16/2021] [Indexed: 11/18/2022] Open
Abstract
A number of neuroimaging techniques have been employed to understand how visual information is transformed along the visual pathway. Although each technique has spatial and temporal limitations, each can provide important insights into the visual code. While the BOLD signal of fMRI can be quite informative, the visual code is not static, and its dynamics are obscured by fMRI’s poor temporal resolution. In this study, we leveraged the high temporal resolution of EEG to develop an encoding technique based on the distribution of responses generated by a population of real-world scenes. This approach maps neural signals to each pixel within a given image and reveals location-specific transformations of the visual code, providing a spatiotemporal signature for the image at each electrode. Our analyses of the mapping results revealed that scenes undergo a series of nonuniform transformations that prioritize different spatial frequencies at different regions of scenes over time. This mapping technique offers a potential avenue for future studies to explore how dynamic feedforward and recurrent processes inform and refine high-level representations of our visual world. The visual information that we sample from our environment undergoes a series of neural modifications, with each modification state (or visual code) consisting of a unique distribution of responses across neurons along the visual pathway. However, current noninvasive neuroimaging techniques provide an account of that code that is coarse with respect to time or space. Here, we present dynamic electrode-to-image (DETI) mapping, an analysis technique that capitalizes on the high temporal resolution of EEG to map neural signals to each pixel within a given image and thereby reveal location-specific modifications of the visual code. The DETI technique reveals maps of the features that are associated with the neural signal at each pixel and at each time point.
DETI mapping shows that real-world scenes undergo a series of nonuniform modifications over both space and time. Specifically, we find that the visual code varies in a location-specific manner, likely reflecting that neural processing prioritizes different features at different image locations over time. DETI mapping therefore offers a potential avenue for future studies to explore how each modification state informs and refines the conceptual meaning of our visual world.
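At its core, an electrode-to-image map assigns each pixel a regression coefficient relating that pixel's value across scenes to the neural signal at a given electrode and time point. The sketch below illustrates this per-pixel mapping on simulated data; the published technique involves additional feature-decomposition steps (e.g. spatial frequency), so this is a simplified caricature with invented dimensions.

```python
import numpy as np

rng = np.random.default_rng(2)
n_scenes, h, w, n_times = 1000, 8, 8, 5
images = rng.standard_normal((n_scenes, h, w))

# Simulated electrode: at each time point it weights a different image region,
# mimicking a visual code that shifts across the image over time
true_maps = np.zeros((n_times, h, w))
for t in range(n_times):
    true_maps[t, :, t:t + 3] = 1.0
eeg = np.einsum('shw,thw->st', images, true_maps)
eeg += 0.1 * rng.standard_normal((n_scenes, n_times))

def electrode_to_image_map(images, signal):
    """Per-pixel regression slope of one electrode's signal across scenes."""
    X = images - images.mean(axis=0)
    y = signal - signal.mean()
    return np.einsum('shw,s->hw', X, y) / (X ** 2).sum(axis=0)

# One (h, w) map per time point: the spatiotemporal signature of the electrode
maps = np.stack([electrode_to_image_map(images, eeg[:, t]) for t in range(n_times)])
print(maps.shape)
```

The sequence of recovered maps tracks the shifting region weighting over time, which is the sense in which the mapping reveals location-specific modifications of the code.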
Affiliation(s)
- Bruce C. Hansen
- Colgate University, Department of Psychological & Brain Sciences, Neuroscience Program, Hamilton, New York, United States of America
- Michelle R. Greene
- Bates College, Neuroscience Program, Lewiston, Maine, United States of America
- David J. Field
- Cornell University, Department of Psychology, Ithaca, New York, United States of America

48
A Visual Encoding Model Based on Contrastive Self-Supervised Learning for Human Brain Activity along the Ventral Visual Stream. Brain Sci 2021; 11:brainsci11081004. [PMID: 34439623 PMCID: PMC8391143 DOI: 10.3390/brainsci11081004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2021] [Revised: 07/23/2021] [Accepted: 07/26/2021] [Indexed: 11/30/2022] Open
Abstract
Visual encoding models are important computational models for understanding how information is processed along the visual stream. Many improved visual encoding models have been developed from the perspective of model architecture and learning objective, but these have been limited to supervised learning. Taking an unsupervised-learning perspective instead, this paper used a pre-trained neural network to construct a visual encoding model based on contrastive self-supervised learning for the ventral visual stream measured by functional magnetic resonance imaging (fMRI). We first extracted features using a ResNet50 model pre-trained with contrastive self-supervised learning (the ResNet50-CSL model), trained a linear regression model for each voxel, and finally calculated the prediction accuracy for different voxels. Compared with a ResNet50 model pre-trained on a supervised classification task, the ResNet50-CSL model achieved equal or even relatively better encoding performance in multiple visual cortical areas. Moreover, the ResNet50-CSL model forms hierarchical representations of input visual stimuli, similar to the hierarchical information processing of the human visual cortex. Our experimental results suggest that an encoding model based on contrastive self-supervised learning is a strong computational model that competes with supervised models, and that contrastive self-supervised learning is an effective method for extracting human brain-like representations.
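The encoding pipeline, fixed features in, one linear regression per voxel, prediction accuracy out, can be sketched without the network itself. Below, random matrices stand in for the pre-trained features and the voxel responses; only the fitting and evaluation logic is meant to be illustrative, and the regularization constant is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(3)
n_train, n_test, n_feat, n_vox = 1000, 200, 50, 30

# Stand-ins for features from a pre-trained network and for fMRI voxel responses
F_tr = rng.standard_normal((n_train, n_feat))
F_te = rng.standard_normal((n_test, n_feat))
W_true = rng.standard_normal((n_feat, n_vox)) * (rng.random((n_feat, n_vox)) < 0.2)
Y_tr = F_tr @ W_true + 0.5 * rng.standard_normal((n_train, n_vox))
Y_te = F_te @ W_true + 0.5 * rng.standard_normal((n_test, n_vox))

def voxelwise_encoding(F_tr, Y_tr, F_te, Y_te, lam=1.0):
    """Ridge regression per voxel; accuracy = Pearson r on held-out data."""
    W = np.linalg.solve(F_tr.T @ F_tr + lam * np.eye(F_tr.shape[1]), F_tr.T @ Y_tr)
    pred = F_te @ W
    pz = (pred - pred.mean(0)) / pred.std(0)
    yz = (Y_te - Y_te.mean(0)) / Y_te.std(0)
    return (pz * yz).mean(axis=0)    # one correlation per voxel

acc = voxelwise_encoding(F_tr, Y_tr, F_te, Y_te)
print(acc.shape, acc.mean())
```

Comparing two feature extractors (e.g. supervised versus contrastive pre-training) then amounts to running this evaluation twice with different feature matrices and comparing the per-voxel accuracies.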
49
Burg MF, Cadena SA, Denfield GH, Walker EY, Tolias AS, Bethge M, Ecker AS. Learning divisive normalization in primary visual cortex. PLoS Comput Biol 2021; 17:e1009028. [PMID: 34097695 PMCID: PMC8211272 DOI: 10.1371/journal.pcbi.1009028] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2020] [Revised: 06/17/2021] [Accepted: 04/30/2021] [Indexed: 11/18/2022] Open
Abstract
Divisive normalization (DN) is a prominent computational building block in the brain that has been proposed as a canonical cortical operation. Numerous experimental studies have verified its importance for capturing nonlinear neural response properties to simple, artificial stimuli, and computational studies suggest that DN is also an important component for processing natural stimuli. However, we lack quantitative models of DN that are directly informed by measurements of spiking responses in the brain and applicable to arbitrary stimuli. Here, we propose a DN model that is applicable to arbitrary input images. We test its ability to predict how neurons in macaque primary visual cortex (V1) respond to natural images, with a focus on nonlinear response properties within the classical receptive field. Our model consists of one layer of subunits followed by learned orientation-specific DN. It outperforms linear-nonlinear and wavelet-based feature representations and makes a significant step towards the performance of state-of-the-art convolutional neural network (CNN) models. Unlike deep CNNs, our compact DN model offers a direct interpretation of the nature of normalization. By inspecting the learned normalization pool of our model, we gained insights into a long-standing question about the tuning properties of DN that update the current textbook description: we found that, within the receptive field, oriented features are normalized preferentially by features with similar orientation, rather than non-specifically as currently assumed.
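A generic form of the divisive normalization computation, with an orientation-specific normalization pool, can be written as r_i = γ d_i^n / (σ^n + Σ_j W_ij d_j^n). The sketch below is a hand-built illustration of this standard equation, not the learned model from the paper; the weight matrix and constants are invented for demonstration.

```python
import numpy as np

def divisive_normalization(drive, W, sigma=1.0, n=2.0, gamma=10.0):
    """r_i = gamma * drive_i**n / (sigma**n + sum_j W[i, j] * drive_j**n)"""
    d = drive ** n
    return gamma * d / (sigma ** n + W @ d)

# Hypothetical normalization pool over 4 orientation channels (0, 45, 90, 135 deg);
# orientation-specific weights: similar orientations normalize each other more
W = np.array([[1.0, 0.3, 0.1, 0.3],
              [0.3, 1.0, 0.3, 0.1],
              [0.1, 0.3, 1.0, 0.3],
              [0.3, 0.1, 0.3, 1.0]])

# Contrast saturation: the response grows with drive but saturates below gamma
for c in [0.1, 1.0, 10.0]:
    drive = c * np.array([1.0, 0.0, 0.0, 0.0])   # stimulus at preferred orientation
    print(c, divisive_normalization(drive, W)[0])
```

With this weight structure, a superimposed mask at a similar orientation suppresses the preferred-channel response more than an orthogonal mask of equal strength, which is the orientation-specific pooling the paper reports.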
Affiliation(s)
- Max F. Burg
- Institute for Theoretical Physics and Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
- Bernstein Center for Computational Neuroscience, Tübingen, Germany
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Göttingen, Germany
- Santiago A. Cadena
- Institute for Theoretical Physics and Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
- Bernstein Center for Computational Neuroscience, Tübingen, Germany
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, United States of America
- George H. Denfield
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Neuroscience, Baylor College of Medicine, Houston, Texas, United States of America
- Edgar Y. Walker
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Neuroscience, Baylor College of Medicine, Houston, Texas, United States of America
- Andreas S. Tolias
- Bernstein Center for Computational Neuroscience, Tübingen, Germany
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Neuroscience, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Electrical and Computer Engineering, Rice University, Houston, Texas, United States of America
- Matthias Bethge
- Institute for Theoretical Physics and Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
- Bernstein Center for Computational Neuroscience, Tübingen, Germany
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, United States of America
- Alexander S. Ecker
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Göttingen, Germany
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany

50
Cui Y, Zhang C, Wang L, Yan B, Tong L. Dense-GWP: an improved primary visual encoding model based on dense Gabor features. J Mech Med Biol 2021. [DOI: 10.1142/s0219519421400170] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
Brain visual encoding models based on functional magnetic resonance imaging are growing increasingly popular. The Gabor wavelet pyramid (GWP) model is a classic example, exhibiting good prediction performance for the primary visual cortex (V1, V2, and V3). However, local variations in visual stimulation are highly complex in terms of spatial frequency, orientation, and position, posing a challenge for visual encoding models. Whether the GWP model can thoroughly extract informative and effective features from visual stimuli remains unclear. To this end, this paper proposes a dense GWP visual encoding model that refines the composition of the Gabor wavelet basis along three dimensions: spatial frequency, orientation, and position. The improved model, named Dense-GWP, can extract denser features from the image stimulus. A regularization optimization algorithm was used to select informative and effective features, which are crucial for predicting voxel activity in the region of interest. Extensive experimental results showed that the Dense-GWP model exhibits improved prediction performance and can therefore help further our understanding of the human visual perception mechanism.
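A Gabor-wavelet feature front-end of the kind the GWP family uses can be sketched directly: project the image onto a bank of Gabor kernels spanning several spatial frequencies and orientations. The sketch below uses a deliberately small bank at a single position and only illustrates the idea; a "dense" variant would tile more frequencies, orientations, and positions, and the parameter choices here are arbitrary.

```python
import numpy as np

def gabor_kernel(size, freq, theta, phase=0.0, sigma=None):
    """2-D Gabor kernel: a sinusoidal grating under a Gaussian envelope."""
    sigma = sigma if sigma is not None else 0.5 / freq
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * freq * xr + phase)

def gwp_features(image, freqs=(0.1, 0.2, 0.4), n_orient=4):
    """GWP-style feature vector: dot products with a small Gabor bank.

    The bank is applied at the image centre only; a denser basis (more
    frequencies, orientations, and positions) gives the 'dense' variant.
    """
    feats = []
    for f in freqs:
        for k in range(n_orient):
            g = gabor_kernel(image.shape[0], f, np.pi * k / n_orient)
            feats.append(float((image * g).sum()))
    return np.array(feats)

# A vertical grating at 0.2 cycles/pixel drives the matching Gabor most strongly
x = np.arange(33) - 16
img = np.ones((33, 1)) * np.cos(2 * np.pi * 0.2 * x)[None, :]
feats = gwp_features(img)
print(feats.round(2))
```

The resulting feature vector would then feed a regularized linear regression onto voxel responses, with regularization performing the feature selection the abstract describes.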
Affiliation(s)
- Yibo Cui
- Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force, Information Engineering University, Zhengzhou 450001, P. R. China
- Chi Zhang
- Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force, Information Engineering University, Zhengzhou 450001, P. R. China
- Linyuan Wang
- Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force, Information Engineering University, Zhengzhou 450001, P. R. China
- Bin Yan
- Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force, Information Engineering University, Zhengzhou 450001, P. R. China
- Li Tong
- Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force, Information Engineering University, Zhengzhou 450001, P. R. China