201
Han S, Vasconcelos N. Object recognition with hierarchical discriminant saliency networks. Front Comput Neurosci 2014; 8:109. [PMID: 25249971] [PMCID: PMC4158795] [DOI: 10.3389/fncom.2014.00109]
Abstract
The benefits of integrating attention and object recognition are investigated. While attention is frequently modeled as a pre-processor for recognition, we investigate the hypothesis that attention is an intrinsic component of recognition and vice-versa. This hypothesis is tested with a recognition model, the hierarchical discriminant saliency network (HDSN), whose layers are top-down saliency detectors, tuned for a visual class according to the principles of discriminant saliency. As a model of neural computation, the HDSN has two possible implementations. In a biologically plausible implementation, all layers comply with the standard neurophysiological model of visual cortex, with sub-layers of simple and complex units that implement a combination of filtering, divisive normalization, pooling, and non-linearities. In a convolutional neural network implementation, all layers are convolutional and implement a combination of filtering, rectification, and pooling. The rectification is performed with a parametric extension of the now popular rectified linear units (ReLUs), whose parameters can be tuned for the detection of target object classes. This enables a number of functional enhancements over neural network models that lack a connection to saliency, including optimal feature denoising mechanisms for recognition, modulation of saliency responses by the discriminant power of the underlying features, and the ability to detect both feature presence and absence. In either implementation, each layer has a precise statistical interpretation, and all parameters are tuned by statistical learning. Each saliency detection layer learns more discriminant saliency templates than its predecessors and higher layers have larger pooling fields. This enables the HDSN to simultaneously achieve high selectivity to target object classes and invariance. The performance of the network in saliency and object recognition tasks is compared to those of models from the biological and computer vision literatures. This demonstrates benefits for all the functional enhancements of the HDSN, the class tuning inherent to discriminant saliency, and saliency layers based on templates of increasing target selectivity and invariance. Altogether, these experiments suggest that there are non-trivial benefits in integrating attention and recognition.
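As a rough illustration of the class-tuned rectification described above, here is a minimal sketch of a parametric ReLU whose threshold and slopes are free parameters. This is a generic stand-in, not the HDSN's actual saliency-derived nonlinearity, and all values are made up.

```python
import numpy as np

def parametric_relu(x, threshold=0.0, slope_pos=1.0, slope_neg=0.0):
    """Generic parametric rectifier: threshold and slopes are tunable.

    slope_neg = 0 gives a shifted ReLU; a nonzero slope_neg also lets the
    unit respond to strongly negative inputs, i.e., signal feature absence.
    """
    x = np.asarray(x, dtype=float)
    return np.where(x >= threshold,
                    slope_pos * (x - threshold),
                    slope_neg * (x - threshold))

# Hypothetical filter responses passed through a class-tuned rectifier.
responses = np.linspace(-3.0, 3.0, 7)
print(parametric_relu(responses, threshold=0.5, slope_pos=1.0, slope_neg=-0.2))
```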
Affiliation(s)
- Sunhyoung Han
- Analytics Department, ID Analytics, San Diego, CA, USA
- Nuno Vasconcelos
- Statistical and Visual Computing Lab, Electrical and Computer Engineering, University of California San Diego, La Jolla, CA, USA
202
High-resolution eye tracking using V1 neuron activity. Nat Commun 2014; 5:4605. [PMID: 25197783] [PMCID: PMC4159777] [DOI: 10.1038/ncomms5605]
Abstract
Studies of high-acuity visual cortical processing have been limited by the inability to track eye position with sufficient accuracy to precisely reconstruct the visual stimulus on the retina. As a result, studies of primary visual cortex (V1) have been performed almost entirely on neurons outside the high-resolution central portion of the visual field (the fovea). Here we describe a procedure for inferring eye position using multi-electrode array recordings from V1 coupled with nonlinear stimulus processing models. We show that this method can be used to infer eye position with 1 arc-min accuracy--significantly better than conventional techniques. This allows for analysis of foveal stimulus processing, and provides a means to correct for eye movement-induced biases present even outside the fovea. This method could thus reveal critical insights into the role of eye movements in cortical coding, as well as their contribution to measures of cortical variability.
203
An X, Gong H, Yin J, Wang X, Pan Y, Zhang X, Lu Y, Yang Y, Toth Z, Schiessl I, McLoughlin N, Wang W. Orientation-cue invariant population responses to contrast-modulated and phase-reversed contour stimuli in macaque V1 and V2. PLoS One 2014; 9:e106753. [PMID: 25188576] [PMCID: PMC4154761] [DOI: 10.1371/journal.pone.0106753]
Abstract
Visual scenes can be readily decomposed into a variety of oriented components, the processing of which is vital for object segregation and recognition. In primate V1 and V2, most neurons have small spatio-temporal receptive fields responding selectively to oriented luminance contours (first order), while only a subgroup of neurons signal non-luminance defined contours (second order). So how is the orientation of second-order contours represented at the population level in macaque V1 and V2? Here we compared the population responses in macaque V1 and V2 to two types of second-order contour stimuli generated either by modulation of contrast or phase reversal with those to first-order contour stimuli. Using intrinsic signal optical imaging, we found that the orientation of second-order contour stimuli was represented invariantly in the orientation columns of both macaque V1 and V2. A physiologically constrained spatio-temporal energy model of V1 and V2 neuronal populations could reproduce all the recorded population responses. These findings suggest that, at the population level, the primate early visual system processes the orientation of second-order contours initially through a linear spatio-temporal filter mechanism. Our results of population responses to different second-order contour stimuli support the idea that the orientation maps in primate V1 and V2 can be described as a spatial-temporal energy map.
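The spatio-temporal energy account can be illustrated, in a reduced spatial-only form, with a quadrature pair of Gabor filters whose squared outputs are summed into a phase-invariant orientation energy. This is a generic textbook energy model, not the physiologically constrained model used in the paper, and the filter parameters below are arbitrary.

```python
import numpy as np

def gabor_pair(size=33, wavelength=8.0, sigma=4.0, theta=0.0):
    """Even/odd (quadrature) Gabor filters at orientation theta (radians)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return (envelope * np.cos(2 * np.pi * xr / wavelength),
            envelope * np.sin(2 * np.pi * xr / wavelength))

def orientation_energy(patch, theta):
    """Phase-invariant energy of one patch at one orientation."""
    even, odd = gabor_pair(size=patch.shape[0], theta=theta)
    return np.sum(patch * even) ** 2 + np.sum(patch * odd) ** 2

patch = np.random.randn(33, 33)                      # stand-in for a stimulus patch
thetas = np.linspace(0, np.pi, 8, endpoint=False)
energies = [orientation_energy(patch, t) for t in thetas]
print("preferred orientation (deg):", np.degrees(thetas[int(np.argmax(energies))]))
```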
Affiliation(s)
- Xu An
- Institute of Neuroscience, State Key Laboratory of Neuroscience and Key Laboratory of Primate Neurobiology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P. R. China
- Key Laboratory of Brain Function and Diseases, School of Life Sciences, University of Science and Technology of China, Hefei, P. R. China
- Hongliang Gong
- Institute of Neuroscience, State Key Laboratory of Neuroscience and Key Laboratory of Primate Neurobiology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P. R. China
- Jiapeng Yin
- Institute of Neuroscience, State Key Laboratory of Neuroscience and Key Laboratory of Primate Neurobiology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P. R. China
- Xiaochun Wang
- Institute of Neuroscience, State Key Laboratory of Neuroscience and Key Laboratory of Primate Neurobiology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P. R. China
- Yanxia Pan
- Institute of Neuroscience, State Key Laboratory of Neuroscience and Key Laboratory of Primate Neurobiology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P. R. China
- Xian Zhang
- Institute of Neuroscience, State Key Laboratory of Neuroscience and Key Laboratory of Primate Neurobiology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P. R. China
- Key Laboratory of Brain Function and Diseases, School of Life Sciences, University of Science and Technology of China, Hefei, P. R. China
- Yiliang Lu
- Institute of Neuroscience, State Key Laboratory of Neuroscience and Key Laboratory of Primate Neurobiology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P. R. China
- Yupeng Yang
- Key Laboratory of Brain Function and Diseases, School of Life Sciences, University of Science and Technology of China, Hefei, P. R. China
- Zoltan Toth
- Faculty of Life Science, University of Manchester, Manchester, United Kingdom
- Ingo Schiessl
- Faculty of Life Science, University of Manchester, Manchester, United Kingdom
- Niall McLoughlin
- Faculty of Life Science, University of Manchester, Manchester, United Kingdom
- Wei Wang
- Institute of Neuroscience, State Key Laboratory of Neuroscience and Key Laboratory of Primate Neurobiology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P. R. China
204
Barrett JM, Berlinguer-Palmini R, Degenaar P. Optogenetic approaches to retinal prosthesis. Vis Neurosci 2014; 31:345-54. [PMID: 25100257] [PMCID: PMC4161214] [DOI: 10.1017/s0952523814000212]
Abstract
The concept of visual restoration via retinal prosthesis arguably started in 1992 with the discovery that some retinal cells remain intact in people with retinitis pigmentosa. Two decades later, the first commercially available devices allow users to identify basic shapes. Such devices are still very far from restoring vision beyond the threshold of legal blindness. Thus, there is considerable continued development of electrode materials, structures, and electronic control mechanisms to increase both resolution and contrast. In parallel, the field of optogenetics, the genetic photosensitization of neural tissue, holds particular promise for new approaches. Given that the eye is transparent, photosensitizing the remaining neural layers of the eye and illuminating them from the outside could prove to be less invasive, cheaper, and more effective than present approaches. As we move toward human trials in the coming years, this review explores the core technological and biological challenges related to gene therapy and the requirement for high-radiance optical stimulation.
Affiliation(s)
- John Martin Barrett
- Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, United Kingdom
- Patrick Degenaar
- School of EEE, Newcastle University, Newcastle upon Tyne, United Kingdom
205
Ghodrati M, Farzmahdi A, Rajaei K, Ebrahimpour R, Khaligh-Razavi SM. Feedforward object-vision models only tolerate small image variations compared to human. Front Comput Neurosci 2014; 8:74. [PMID: 25100986] [PMCID: PMC4103258] [DOI: 10.3389/fncom.2014.00074]
Abstract
Invariant object recognition is a remarkable ability of the primate visual system whose underlying mechanism has constantly been under intense investigation. Computational modeling is a valuable tool toward understanding the processes involved in invariant object recognition. Although recent computational models have shown outstanding performance on challenging image databases, they fail to perform well at image categorization under more complex image variations. Studies have shown that building sparse representations of objects by extracting more informative visual features through a feedforward sweep can lead to higher recognition performance. Here, however, we show that when the complexity of image variations is high, even this approach results in poor performance compared to humans. To assess the performance of models and humans in invariant object recognition tasks, we built a parametrically controlled image database consisting of several object categories varied along different dimensions and levels, rendered from 3D planes. Comparing the performance of several object recognition models with human observers shows that the models perform similarly to humans in categorization tasks only under low-level image variations. Furthermore, the results of our behavioral experiments demonstrate that, even under difficult experimental conditions (i.e., briefly presented masked stimuli with complex image variations), human observers performed outstandingly well, suggesting that the models are still far from resembling humans in invariant object recognition. Taken together, we suggest that learning sparse informative visual features, although desirable, is not a complete solution for future progress in object-vision modeling. We show that this approach is not of significant help in solving the computational crux of object recognition (i.e., invariant object recognition) when the identity-preserving image variations become more complex.
Affiliation(s)
- Masoud Ghodrati
- Brain and Intelligent Systems Research Laboratory, Department of Electrical and Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran; School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran; Department of Physiology, Monash University, Melbourne, VIC, Australia
- Amirhossein Farzmahdi
- Brain and Intelligent Systems Research Laboratory, Department of Electrical and Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran; School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran; Department of Electrical Engineering, Amirkabir University of Technology, Tehran, Iran
- Karim Rajaei
- Brain and Intelligent Systems Research Laboratory, Department of Electrical and Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran; School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Reza Ebrahimpour
- Brain and Intelligent Systems Research Laboratory, Department of Electrical and Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran; School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
206
Elliott T. Sparseness, antisparseness and anything in between: the operating point of a neuron determines its computational repertoire. Neural Comput 2014; 26:1924-72. [PMID: 24922502] [DOI: 10.1162/neco_a_00630]
Abstract
A recent model of intrinsic plasticity coupled to Hebbian synaptic plasticity proposes that adaptation of a neuron's threshold and gain in a sigmoidal response function to achieve a sparse, exponential output firing rate distribution facilitates the discovery of heavy-tailed or supergaussian sources in the neuron's inputs. We show that the exponential output distribution is irrelevant to these dynamics and that, furthermore, while sparseness is sufficient, it is not necessary. The intrinsic plasticity mechanism drives the neuron's threshold large and positive, and we prove that in such a regime, the neuron will find supergaussian sources; equally, however, if the threshold is large and negative (an antisparse regime), it will also find supergaussian sources. Away from such extremes, the neuron can also discover subgaussian sources. By examining a neuron with a fixed sigmoidal nonlinearity and considering the synaptic strength fixed-point structure in the two-dimensional parameter space defined by the neuron's threshold and gain, we show that this space is carved up into sub- and supergaussian-input-finding regimes, possibly with regimes of simultaneous stability of sub- and supergaussian sources or regimes of instability of all sources; a single gaussian source may also be stabilized by the presence of a nongaussian source. A neuron's operating point (essentially its threshold and gain coupled with its input statistics) therefore critically determines its computational repertoire. Intrinsic plasticity mechanisms induce trajectories in this parameter space but do not fundamentally modify it. Unless the trajectories cross critical boundaries in this space, intrinsic plasticity is irrelevant and the neuron's nonlinearity may be frozen with identical receptive field refinement dynamics.
Affiliation(s)
- Terry Elliott
- Department of Electronics and Computer Science, University of Southampton, Highfield, Southampton, SO17 1BJ, U.K.
207
Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci U S A 2014; 111:8619-24. [PMID: 24812127] [DOI: 10.1073/pnas.1403112111]
Abstract
The ventral visual stream underlies key human visual object recognition abilities. However, neural encoding in the higher areas of the ventral stream remains poorly understood. Here, we describe a modeling approach that yields a quantitatively accurate model of inferior temporal (IT) cortex, the highest ventral cortical area. Using high-throughput computational techniques, we discovered that, within a class of biologically plausible hierarchical neural network models, there is a strong correlation between a model's categorization performance and its ability to predict individual IT neural unit response data. To pursue this idea, we then identified a high-performing neural network that matches human performance on a range of recognition tasks. Critically, even though we did not constrain this model to match neural data, its top output layer turns out to be highly predictive of IT spiking responses to complex naturalistic images at both the single site and population levels. Moreover, the model's intermediate layers are highly predictive of neural responses in the V4 cortex, a midlevel visual area that provides the dominant cortical input to IT. These results show that performance optimization--applied in a biologically appropriate model class--can be used to build quantitative predictive models of neural processing.
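The layer-to-neuron predictivity analysis is commonly implemented as regularized linear regression from model features to measured responses. The sketch below uses synthetic data and scikit-learn's RidgeCV as a stand-in, so the dimensions and the R^2 scoring are illustrative rather than the paper's exact procedure.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_images, n_features = 200, 512
features = rng.standard_normal((n_images, n_features))   # model-layer activations per image
weights = rng.standard_normal(n_features) * 0.1          # hypothetical mapping to one IT site
responses = features @ weights + rng.standard_normal(n_images)

# Cross-validated predictivity of the neural site from the model features.
ridge = RidgeCV(alphas=np.logspace(-2, 4, 13))
scores = cross_val_score(ridge, features, responses, cv=5, scoring="r2")
print("cross-validated R^2:", round(scores.mean(), 3))
```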
208
Dähne S, Wilbert N, Wiskott L. Slow feature analysis on retinal waves leads to V1 complex cells. PLoS Comput Biol 2014; 10:e1003564. [PMID: 24810948] [PMCID: PMC4014395] [DOI: 10.1371/journal.pcbi.1003564]
Abstract
The developing visual system of many mammalian species is partially structured and organized even before the onset of vision. Spontaneous neural activity, which spreads in waves across the retina, has been suggested to play a major role in these prenatal structuring processes. Recently, it has been shown that when employing an efficient coding strategy, such as sparse coding, these retinal activity patterns lead to basis functions that resemble optimal stimuli of simple cells in primary visual cortex (V1). Here we present the results of applying a coding strategy that optimizes for temporal slowness, namely Slow Feature Analysis (SFA), to a biologically plausible model of retinal waves. Previously, SFA has been successfully applied to model parts of the visual system, most notably in reproducing a rich set of complex-cell features by training SFA with quasi-natural image sequences. In the present work, we obtain SFA units that share a number of properties with cortical complex-cells by training on simulated retinal waves. The emergence of two distinct properties of the SFA units (phase invariance and orientation tuning) is thoroughly investigated via control experiments and mathematical analysis of the input-output functions found by SFA. The results support the idea that retinal waves share relevant temporal and spatial properties with natural visual input. Hence, retinal waves seem suitable training stimuli to learn invariances and thereby shape the developing early visual system such that it is best prepared for coding input from the natural world.
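A minimal linear version of Slow Feature Analysis (whiten the signal, then take the directions with the smallest temporal-derivative variance) conveys the optimization principle used here. The actual study trains an expanded (quadratic) SFA on simulated retinal waves, so the toy signals below are only a sanity check of the algorithm.

```python
import numpy as np

def linear_sfa(x, n_components=1):
    """Linear SFA: x has shape (time, dims); returns the slowest projections."""
    x = x - x.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(x, rowvar=False))
    z = x @ (evecs / np.sqrt(evals))                         # whitened signal
    dvals, dvecs = np.linalg.eigh(np.cov(np.diff(z, axis=0), rowvar=False))
    return z @ dvecs[:, :n_components]                       # smallest derivative variance = slowest

t = np.linspace(0, 20 * np.pi, 5000)
slow, fast = np.sin(0.1 * t), np.sin(5.0 * t)
mixed = np.column_stack([slow + 0.5 * fast, fast - 0.3 * slow])  # observed mixtures
recovered = linear_sfa(mixed)[:, 0]
print("|correlation with slow source|:", round(abs(np.corrcoef(recovered, slow)[0, 1]), 3))
```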
Affiliation(s)
- Sven Dähne
- Machine Learning Group, Department of Computer Science, Berlin Institute of Technology, Berlin, Germany
- Institute for Theoretical Biology, Humboldt-University, Berlin, Germany
- Bernstein Center for Computational Neuroscience, Berlin, Germany
- Niko Wilbert
- Institute for Theoretical Biology, Humboldt-University, Berlin, Germany
- Bernstein Center for Computational Neuroscience, Berlin, Germany
- Laurenz Wiskott
- Institute for Theoretical Biology, Humboldt-University, Berlin, Germany
- Bernstein Center for Computational Neuroscience, Berlin, Germany
- Institute for Neural Computation, Ruhr-University Bochum, Bochum, Germany
209
Salmi J, Glerean E, Jääskeläinen IP, Lahnakoski JM, Kettunen J, Lampinen J, Tikka P, Sams M. Posterior parietal cortex activity reflects the significance of others' actions during natural viewing. Hum Brain Mapp 2014; 35:4767-76. [PMID: 24706557] [DOI: 10.1002/hbm.22510]
Abstract
The posterior parietal cortex (PPC) has been associated with multiple stimulus-driven (e.g., processing stimulus movements, providing visual signals for the motor system), goal-directed (e.g., directing visual attention to a target, processing behavioral priority of intentions), and action-related functions in previous studies with non-naturalistic paradigms. Here, we examined how these functions reflect PPC activity during natural viewing. Fourteen healthy volunteers watched a re-edited movie during functional magnetic resonance imaging (fMRI). Participants separately annotated behavioral priority (accounting for percepts, thoughts, and emotions) they had experienced during movie episodes. Movements in the movie were quantified with computer vision and eye movements were recorded from a separate group of subjects. Our results show that while overlapping dorsomedial PPC areas respond to episodes with multiple types of stimulus content, ventrolateral PPC areas exhibit enhanced activity when viewing goal-directed human hand actions. Furthermore, PPC activity related to viewing goal-directed human hand actions was more accurately explained by behavioral priority than by movements of the stimulus or eye movements. Taken together, our results suggest that PPC participates in perception of goal-directed human hand actions, supporting the view that PPC has a special role in providing visual signals for the motor system ("how"), in addition to processing visual spatial movements ("where").
Affiliation(s)
- Juha Salmi
- Department of Biomedical Engineering and Computational Science, Brain and Mind Laboratory, Aalto University School of Science, Finland; Advanced Magnetic Imaging Centre, Aalto University School of Science, Finland
210
Abstract
In the early stages of image analysis, visual cortex represents scenes as spatially organized maps of locally defined features (e.g., edge orientation). As image reconstruction unfolds and features are assembled into larger constructs, cortex attempts to recover semantic content for object recognition. It is conceivable that higher level representations may feed back onto early processes and retune their properties to align with the semantic structure projected by the scene; however, there is no clear evidence to either support or discard the applicability of this notion to the human visual system. Obtaining such evidence is challenging because low and higher level processes must be probed simultaneously within the same experimental paradigm. We developed a methodology that targets both levels of analysis by embedding low-level probes within natural scenes. Human observers were required to discriminate probe orientation while semantic interpretation of the scene was selectively disrupted via stimulus inversion or reversed playback. We characterized the orientation tuning properties of the perceptual process supporting probe discrimination; tuning was substantially reshaped by semantic manipulation, demonstrating that low-level feature detectors operate under partial control from higher level modules. The manner in which such control was exerted may be interpreted as a top-down predictive strategy whereby global semantic content guides and refines local image reconstruction. We exploit the novel information gained from data to develop mechanistic accounts of unexplained phenomena such as the classic face inversion effect.
211
An X, Gong H, McLoughlin N, Yang Y, Wang W. The mechanism for processing random-dot motion at various speeds in early visual cortices. PLoS One 2014; 9:e93115. [PMID: 24682033] [PMCID: PMC3969330] [DOI: 10.1371/journal.pone.0093115]
Abstract
All moving objects generate sequential retinotopic activations representing a series of discrete locations in space and time (motion trajectory). How direction-selective neurons in mammalian early visual cortices process motion trajectory remains to be clarified. Using single-cell recording and optical imaging of intrinsic signals along with mathematical simulation, we studied response properties of cat visual areas 17 and 18 to random dots moving at various speeds. We found that the motion trajectory at low speed was encoded primarily as a direction signal by groups of neurons preferring that motion direction. Above certain transition speeds, the motion trajectory is perceived as a spatial orientation representing the motion axis of the moving dots. In both areas studied, above these speeds, other groups of direction-selective neurons with perpendicular direction preferences were activated to encode the motion trajectory as motion-axis information. This applied to both simple and complex neurons. The average transition speed for switching between encoding motion direction and axis was about 31°/s in area 18 and 15°/s in area 17. A spatio-temporal energy model predicted the transition speeds accurately in both areas, but not the direction-selective indexes to random-dot stimuli in area 18. In addition, above transition speeds, the change of direction preferences of population responses recorded by optical imaging could be revealed using the vector maximum but not the vector summation method. Together, this combined processing of motion direction and axis by neurons with orthogonal direction preferences associated with speed may serve as a common principle of early visual motion processing.
Affiliation(s)
- Xu An
- CAS Key Laboratory of Brain Function and Diseases, School of Life Sciences, University of Science and Technology of China, Hefei, P. R. China; Institute of Neuroscience and State Key Laboratory of Neuroscience, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P. R. China
- Hongliang Gong
- Institute of Neuroscience and State Key Laboratory of Neuroscience, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P. R. China
- Niall McLoughlin
- Faculty of Life Sciences, University of Manchester, Manchester, United Kingdom
- Yupeng Yang
- CAS Key Laboratory of Brain Function and Diseases, School of Life Sciences, University of Science and Technology of China, Hefei, P. R. China
- Wei Wang
- Institute of Neuroscience and State Key Laboratory of Neuroscience, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P. R. China
212
Liu K, Yao H. Contrast-dependent OFF-dominance in cat primary visual cortex facilitates discrimination of stimuli with natural contrast statistics. Eur J Neurosci 2014; 39:2060-70. [DOI: 10.1111/ejn.12567]
Affiliation(s)
- Kefei Liu
- Institute of Neuroscience and State Key Laboratory of Neuroscience, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Shanghai, China
- Haishan Yao
- Institute of Neuroscience and State Key Laboratory of Neuroscience, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
213
Torreão JRA, Victer SMC, Amaral MS. Signal-tuned Gabor functions as models for stimulus-dependent cortical receptive fields. Neural Comput 2014; 26:920-52. [PMID: 24555452] [DOI: 10.1162/neco_a_00581]
Abstract
We propose and analyze a model, based on signal-tuned Gabor functions, for the receptive fields and responses of V1 cells. Signal-tuned Gabor functions are gaussian-modulated sinusoids whose parameters are obtained from a given spatial or spectral "tuning" signal. These functions can be proven to yield exact representations of their tuning signals and have recently been proposed as the kernels of a variant Gabor transform, the signal-tuned Gabor transform (STGT), which allows the accurate detection of spatial and spectral events. Here we show that by modeling the receptive fields of simple and complex cells as signal-tuned Gabor functions and expressing their responses as STGTs, we are able to replicate the properties of these cells when tested with standard grating and slit inputs, at the same time emulating their stimulus-dependent character as revealed by recent neurophysiological studies.
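For readers unfamiliar with the building block, a Gabor function is just a Gaussian-modulated sinusoid. The sketch below builds a 1-D example and computes a linear "simple cell" response to a grating; the specific parameters are arbitrary, and in the signal-tuned scheme they would instead be read off a tuning signal, a step not reproduced here.

```python
import numpy as np

def gabor_1d(x, center=0.0, sigma=1.0, freq=1.0, phase=0.0):
    """Gaussian-modulated sinusoid (1-D Gabor receptive-field profile)."""
    envelope = np.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * freq * (x - center) + phase)

x = np.linspace(-5, 5, 1001)
rf = gabor_1d(x, center=0.5, sigma=1.2, freq=0.8, phase=np.pi / 4)

# Linear response of the modeled simple cell to a grating at its preferred frequency.
grating = np.cos(2 * np.pi * 0.8 * x)
print("response:", round(np.dot(rf, grating), 2))
```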
Affiliation(s)
- José R A Torreão
- Instituto de Computação, Universidade Federal Fluminense, 24210-240 Niterói RJ, Brazil
214
Zhang YY, Wang RB, Pan XC, Gong HQ, Liang PJ. Visual pattern discrimination by population retinal ganglion cells' activities during natural movie stimulation. Cogn Neurodyn 2014; 8:27-35. [PMID: 24465283] [PMCID: PMC3890090] [DOI: 10.1007/s11571-013-9266-9]
Abstract
In the visual system, neurons often fire in synchrony, and it is believed that synchronous activities of group neurons are more efficient than single cell response in transmitting neural signals to down-stream neurons. However, whether dynamic natural stimuli are encoded by dynamic spatiotemporal firing patterns of synchronous group neurons still needs to be investigated. In this paper we recorded the activities of population ganglion cells in bullfrog retina in response to time-varying natural images (natural scene movie) using multi-electrode arrays. In response to some different brief section pairs of the movie, synchronous groups of retinal ganglion cells (RGCs) fired with similar but different spike events. We attempted to discriminate the movie sections based on temporal firing patterns of single cells and spatiotemporal firing patterns of the synchronous groups of RGCs characterized by a measurement of subsequence distribution discrepancy. The discrimination performance was assessed by a classification method based on Support Vector Machines. Our results show that different movie sections of the natural movie elicited reliable dynamic spatiotemporal activity patterns of the synchronous RGCs, which are more efficient in discriminating different movie sections than the temporal patterns of the single cells' spike events. These results suggest that, during natural vision, the down-stream neurons may decode the visual information from the dynamic spatiotemporal patterns of the synchronous group of RGCs' activities.
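The decoding step can be pictured with a much simpler feature than the subsequence-distribution measure used in the paper: binned population spike counts classified with a linear SVM. Everything below (cell counts, rates, bin sizes) is synthetic and only meant to show the cross-validated classification workflow.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n_trials, n_cells, n_bins = 40, 20, 25                     # hypothetical recording dimensions
rate_a = rng.uniform(1.0, 5.0, size=(n_cells, n_bins))     # firing-rate pattern, movie section A
rate_b = np.clip(rate_a + rng.normal(0, 1.0, rate_a.shape), 0.1, None)  # perturbed pattern, section B

def simulate_trials(rate):
    """Poisson spike counts; each trial becomes one flattened feature vector."""
    return rng.poisson(rate, size=(n_trials, n_cells, n_bins)).reshape(n_trials, -1)

X = np.vstack([simulate_trials(rate_a), simulate_trials(rate_b)])
y = np.repeat([0, 1], n_trials)

accuracy = cross_val_score(SVC(kernel="linear"), X, y, cv=5)
print("cross-validated decoding accuracy:", round(accuracy.mean(), 3))
```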
Affiliation(s)
- Ying-Ying Zhang
- Institute for Cognitive Neurodynamics, East China University of Science and Technology, Shanghai 200237, China
- Ru-Bin Wang
- Institute for Cognitive Neurodynamics, East China University of Science and Technology, Shanghai 200237, China
- Xiao-Chuan Pan
- Institute for Cognitive Neurodynamics, East China University of Science and Technology, Shanghai 200237, China
- Hai-Qing Gong
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Pei-Ji Liang
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
215
Kording KP. Bayesian statistics: relevant for the brain? Curr Opin Neurobiol 2014; 25:130-3. [PMID: 24463330] [DOI: 10.1016/j.conb.2014.01.003]
Abstract
Analyzing data from experiments involves variables that we neuroscientists are uncertain about. Efficiently calculating with such variables usually requires Bayesian statistics. As it is crucial when analyzing complex data, it seems natural that the brain would "use" such statistics to analyze data from the world. And indeed, recent studies in the areas of perception, action, and cognition suggest that Bayesian behavior is widespread, in many modalities and species. Consequently, many models have suggested that the brain is built on simple Bayesian principles. While the brain's code is probably not actually simple, I believe that Bayesian principles will facilitate the construction of faithful models of the brain.
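The kind of computation the review has in mind can be made concrete with the standard Gaussian case: a Gaussian prior combined with a Gaussian likelihood gives a posterior whose precision is the sum of the two precisions and whose mean is the precision-weighted average. The numbers below are arbitrary.

```python
# Gaussian prior x Gaussian likelihood -> Gaussian posterior (toy numbers).
prior_mean, prior_var = 0.0, 4.0          # e.g., expected location of a stimulus
obs_mean, obs_var = 2.0, 1.0              # noisy sensory measurement

posterior_precision = 1.0 / prior_var + 1.0 / obs_var
posterior_var = 1.0 / posterior_precision
posterior_mean = posterior_var * (prior_mean / prior_var + obs_mean / obs_var)

print(posterior_mean, posterior_var)      # 1.6 0.8: the estimate is pulled toward the reliable cue
```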
216
Vlachos I, Zaytsev YV, Spreizer S, Aertsen A, Kumar A. Neural system prediction and identification challenge. Front Neuroinform 2014; 7:43. [PMID: 24399966] [PMCID: PMC3872335] [DOI: 10.3389/fninf.2013.00043]
Abstract
Can we infer the function of a biological neural network (BNN) if we know the connectivity and activity of all its constituent neurons? This question is at the core of neuroscience and, accordingly, various methods have been developed to record the activity and connectivity of as many neurons as possible. Surprisingly, there is no theoretical or computational demonstration that neuronal activity and connectivity are indeed sufficient to infer the function of a BNN. Therefore, we pose the Neural Systems Identification and Prediction Challenge (nuSPIC). We provide the connectivity and activity of all neurons and invite participants (1) to infer the functions implemented (hard-wired) in spiking neural networks (SNNs) by stimulating and recording the activity of neurons and (2) to implement predefined mathematical/biological functions using SNNs. The nuSPICs can be accessed via a web-interface to the NEST simulator and the user is not required to know any specific programming language. Furthermore, the nuSPICs can be used as a teaching tool. Finally, nuSPICs use the crowd-sourcing model to address scientific issues. With this computational approach we aim to identify which functions can be inferred by systematic recordings of neuronal activity and connectivity. In addition, nuSPICs will help the design and application of new experimental paradigms based on the structure of the SNN and the presumed function which is to be discovered.
Affiliation(s)
- Ioannis Vlachos
- Faculty of Biology, Bernstein Center Freiburg, University of Freiburg, Freiburg im Breisgau, Germany
- Yury V Zaytsev
- Faculty of Biology, Bernstein Center Freiburg, University of Freiburg, Freiburg im Breisgau, Germany; Simulation Laboratory Neuroscience - Bernstein Facility for Simulation and Database Technology, Institute for Advanced Simulation, Jülich Aachen Research Alliance, Jülich Research Center, Jülich, Germany
- Sebastian Spreizer
- Faculty of Biology, Bernstein Center Freiburg, University of Freiburg, Freiburg im Breisgau, Germany
- Ad Aertsen
- Faculty of Biology, Bernstein Center Freiburg, University of Freiburg, Freiburg im Breisgau, Germany
- Arvind Kumar
- Faculty of Biology, Bernstein Center Freiburg, University of Freiburg, Freiburg im Breisgau, Germany
217
Maravall M, Alenda A, Bale MR, Petersen RS. Transformation of adaptation and gain rescaling along the whisker sensory pathway. PLoS One 2013; 8:e82418. [PMID: 24349279] [PMCID: PMC3859573] [DOI: 10.1371/journal.pone.0082418]
Abstract
Neurons in all sensory systems have a remarkable ability to adapt their sensitivity to the statistical structure of the sensory signals to which they are tuned. In the barrel cortex, firing rate adapts to the variance of a whisker stimulus and neuronal sensitivity (gain) adjusts in inverse proportion to the stimulus standard deviation. To determine how adaptation might be transformed across the ascending lemniscal pathway, we measured the responses of single units in the first and last subcortical stages, the trigeminal ganglion (TRG) and ventral posterior medial thalamic nucleus (VPM), to controlled whisker stimulation in urethane-anesthetized rats. We probed adaptation using a filtered white noise stimulus that switched between low- and high-variance epochs. We found that the firing rate of both TRG and VPM neurons adapted to stimulus variance. By fitting the responses of each unit to a Linear-Nonlinear-Poisson model, we tested whether adaptation changed feature selectivity and/or sensitivity. We found that, whereas feature selectivity was unaffected by stimulus variance, units often exhibited a marked change in sensitivity. The extent of these sensitivity changes increased systematically along the pathway from TRG to barrel cortex. However, there was marked variability across units, especially in VPM. In sum, in the whisker system, the adaptation properties of subcortical neurons are surprisingly diverse. The significance of this diversity may be that it contributes to a rich population representation of whisker dynamics.
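A toy Linear-Nonlinear-Poisson simulation makes the two quantities being compared explicit: the spike-triggered average recovers the same filter (feature selectivity) in low- and high-variance epochs, while a gain that scales inversely with the stimulus SD keeps the output rate roughly constant. The filter shape, rates, and bin size below are invented for the sketch and are not the paper's values.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, dt = 200_000, 20, 0.01
filt = np.exp(-np.arange(k) / 4.0) * np.sin(np.arange(k) / 2.0)   # toy "whisker" filter
filt /= np.linalg.norm(filt)

def simulate_lnp(stim_sd):
    """LNP neuron whose gain rescales as 1/SD (idealized adaptation)."""
    stim = rng.normal(0.0, stim_sd, n)
    drive = np.convolve(stim, filt[::-1], mode="valid")            # filtered stimulus
    rate = 20.0 * np.maximum(drive / stim_sd, 0.0)                 # gain proportional to 1/SD
    return stim, rng.poisson(rate * dt)

for sd in (1.0, 3.0):
    stim, spikes = simulate_lnp(sd)
    # Spike-triggered average over the k samples that feed each bin's drive.
    sta = np.array([np.dot(spikes, stim[i:i + len(spikes)]) for i in range(k)]) / spikes.sum()
    print(f"SD={sd}: filter correlation {np.corrcoef(sta, filt)[0, 1]:.2f}, "
          f"mean rate {spikes.mean() / dt:.1f} spikes/s")
```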
Affiliation(s)
- Miguel Maravall
- Instituto de Neurociencias de Alicante, Consejo Superior de Investigaciones Científicas-Universidad Miguel Hernández, Sant Joan d'Alacant, Alicante, Spain
- Andrea Alenda
- Instituto de Neurociencias de Alicante, Consejo Superior de Investigaciones Científicas-Universidad Miguel Hernández, Sant Joan d'Alacant, Alicante, Spain
- Michael R. Bale
- Faculty of Life Sciences, University of Manchester, Manchester, United Kingdom
- Rasmus S. Petersen
- Faculty of Life Sciences, University of Manchester, Manchester, United Kingdom
218
Lindeberg T. A computational theory of visual receptive fields. Biol Cybern 2013; 107:589-635. [PMID: 24197240] [PMCID: PMC3840297] [DOI: 10.1007/s00422-013-0569-z]
Abstract
A receptive field constitutes a region in the visual field where a visual cell or a visual operator responds to visual stimuli. This paper presents a theory for what types of receptive field profiles can be regarded as natural for an idealized vision system, given a set of structural requirements on the first stages of visual processing that reflect symmetry properties of the surrounding world. These symmetry properties include (i) covariance properties under scale changes, affine image deformations, and Galilean transformations of space-time as occur for real-world image data as well as specific requirements of (ii) temporal causality implying that the future cannot be accessed and (iii) a time-recursive updating mechanism of a limited temporal buffer of the past as is necessary for a genuine real-time system. Fundamental structural requirements are also imposed to ensure (iv) mutual consistency and a proper handling of internal representations at different spatial and temporal scales. It is shown how a set of families of idealized receptive field profiles can be derived by necessity regarding spatial, spatio-chromatic, and spatio-temporal receptive fields in terms of Gaussian kernels, Gaussian derivatives, or closely related operators. Such image filters have been successfully used as a basis for expressing a large number of visual operations in computer vision, regarding feature detection, feature classification, motion estimation, object recognition, spatio-temporal recognition, and shape estimation. Hence, the associated so-called scale-space theory constitutes a both theoretically well-founded and general framework for expressing visual operations. There are very close similarities between receptive field profiles predicted from this scale-space theory and receptive field profiles found by cell recordings in biological vision. Among the family of receptive field profiles derived by necessity from the assumptions, idealized models with very good qualitative agreement are obtained for (i) spatial on-center/off-surround and off-center/on-surround receptive fields in the fovea and the LGN, (ii) simple cells with spatial directional preference in V1, (iii) spatio-chromatic double-opponent neurons in V1, (iv) space-time separable spatio-temporal receptive fields in the LGN and V1, and (v) non-separable space-time tilted receptive fields in V1, all within the same unified theory. In addition, the paper presents a more general framework for relating and interpreting these receptive fields conceptually and possibly predicting new receptive field profiles as well as for pre-wiring covariance under scaling, affine, and Galilean transformations into the representations of visual stimuli. This paper describes the basic structure of the necessity results concerning receptive field profiles regarding the mathematical foundation of the theory and outlines how the proposed theory could be used in further studies and modelling of biological vision. It is also shown how receptive field responses can be interpreted physically, as the superposition of relative variations of surface structure and illumination variations, given a logarithmic brightness scale, and how receptive field measurements will be invariant under multiplicative illumination variations and exposure control mechanisms.
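The idealized receptive-field families referred to here are, in the simplest spatial case, Gaussian-derivative kernels. The sketch below generates the 1-D Gaussian and its first two derivatives at one scale, which is only a small slice of the full spatio-chromatic and spatio-temporal theory.

```python
import numpy as np

def gaussian_derivative_kernels(sigma=2.0):
    """1-D Gaussian smoothing kernel and its first/second derivatives at scale sigma."""
    radius = int(4 * sigma)
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)
    g1 = -x / sigma**2 * g                     # odd, edge-detecting profile
    g2 = (x**2 - sigma**2) / sigma**4 * g      # even, bar / center-surround-like profile
    return x, g, g1, g2

x, g, g1, g2 = gaussian_derivative_kernels(sigma=3.0)
signal = np.sin(x / 3.0)                       # an odd test signal centered on the kernel
print("g1 response:", round(np.dot(g1, signal), 3), " g2 response:", round(np.dot(g2, signal), 3))
```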
Affiliation(s)
- Tony Lindeberg
- Department of Computational Biology, School of Computer Science and Communication, KTH Royal Institute of Technology, 100 44 Stockholm, Sweden
219
Benjamini Y, Yu B. The shuffle estimator for explainable variance in fMRI experiments. Ann Appl Stat 2013; 7. [DOI: 10.1214/13-aoas681]
220
Theis L, Chagas AM, Arnstein D, Schwarz C, Bethge M. Beyond GLMs: a generative mixture modeling approach to neural system identification. PLoS Comput Biol 2013; 9:e1003356. [PMID: 24278006] [PMCID: PMC3836720] [DOI: 10.1371/journal.pcbi.1003356]
Abstract
Generalized linear models (GLMs) represent a popular choice for the probabilistic characterization of neural spike responses. While GLMs are attractive for their computational tractability, they also impose strong assumptions and thus only allow for a limited range of stimulus-response relationships to be discovered. Alternative approaches exist that make only very weak assumptions but scale poorly to high-dimensional stimulus spaces. Here we seek an approach which can gracefully interpolate between the two extremes. We extend two frequently used special cases of the GLM—a linear and a quadratic model—by assuming that the spike-triggered and non-spike-triggered distributions can be adequately represented using Gaussian mixtures. Because we derive the model from a generative perspective, its components are easy to interpret as they correspond to, for example, the spike-triggered distribution and the interspike interval distribution. The model is able to capture complex dependencies on high-dimensional stimuli with far fewer parameters than other approaches such as histogram-based methods. The added flexibility comes at the cost of a non-concave log-likelihood. We show that in practice this does not have to be an issue and the mixture-based model is able to outperform generalized linear and quadratic models.

An essential goal of sensory systems neuroscience is to characterize the functional relationship between neural responses and external stimuli. Of particular interest are the nonlinear response properties of single cells. Inherently linear approaches such as generalized linear modeling can nevertheless be used to fit nonlinear behavior by choosing an appropriate feature space for the stimulus. This requires, however, that one has already obtained a good understanding of a cell's nonlinear properties, whereas more flexible approaches are necessary for the characterization of unexpected nonlinear behavior. In this work, we present a generalization of some frequently used generalized linear models which enables us to automatically extract complex stimulus-response relationships from recorded data. We show that our model can lead to substantial quantitative and qualitative improvements over generalized linear and quadratic models, which we illustrate on the example of primary afferents of the rat whisker system.
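The generative idea (model the spike-triggered and non-spike-triggered stimulus ensembles as Gaussian mixtures and apply Bayes' rule to get a spike probability) can be sketched with scikit-learn's GaussianMixture on synthetic data. The component counts, stimulus dimensionality, and ground-truth nonlinearity below are arbitrary choices, not the paper's.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
dim, n = 5, 50_000
stim = rng.standard_normal((n, dim))
# Toy ground truth: spike probability depends quadratically on one stimulus axis.
p_true = 1.0 / (1.0 + np.exp(-(stim[:, 0] ** 2 - 1.0)))
spiked = rng.random(n) < p_true

gmm_spike = GaussianMixture(n_components=3, random_state=0).fit(stim[spiked])
gmm_nospike = GaussianMixture(n_components=3, random_state=0).fit(stim[~spiked])
prior = spiked.mean()

def p_spike_given_stimulus(x):
    """Bayes' rule with mixture densities for the two ensembles."""
    log_ratio = gmm_spike.score_samples(x) - gmm_nospike.score_samples(x)
    odds = prior / (1.0 - prior) * np.exp(log_ratio)
    return odds / (1.0 + odds)

probe = np.zeros((3, dim))
probe[:, 0] = [0.0, 1.0, 2.0]
print(p_spike_given_stimulus(probe).round(2))   # should increase with |x_0|
```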
Affiliation(s)
- Lucas Theis
- Werner Reichardt Centre for Integrative Neuroscience, Tübingen, Germany
- Graduate School of Neural Information Processing, University of Tübingen, Tübingen, Germany
- Andrè Maia Chagas
- Werner Reichardt Centre for Integrative Neuroscience, Tübingen, Germany
- Hertie Institute for Clinical Brain Research, Tübingen, Germany
- Graduate School of Neural and Behavioural Sciences, University of Tübingen, Tübingen, Germany
- Daniel Arnstein
- Hertie Institute for Clinical Brain Research, Tübingen, Germany
- Graduate School of Neural and Behavioural Sciences, University of Tübingen, Tübingen, Germany
- Cornelius Schwarz
- Werner Reichardt Centre for Integrative Neuroscience, Tübingen, Germany
- Hertie Institute for Clinical Brain Research, Tübingen, Germany
- Bernstein Center for Computational Neuroscience, Tübingen, Germany
- Matthias Bethge
- Werner Reichardt Centre for Integrative Neuroscience, Tübingen, Germany
- Bernstein Center for Computational Neuroscience, Tübingen, Germany
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany
221
Saleem AB, Ayaz A, Jeffery KJ, Harris KD, Carandini M. Integration of visual motion and locomotion in mouse visual cortex. Nat Neurosci 2013; 16:1864-9. [PMID: 24185423] [PMCID: PMC3926520] [DOI: 10.1038/nn.3567]
Abstract
Successful navigation through the world requires accurate estimation of one's own speed. To derive this estimate, animals integrate visual speed gauged from optic flow and run speed gauged from proprioceptive and locomotor systems. The primary visual cortex (V1) carries signals related to visual speed, and its responses are also affected by run speed. To study how V1 combines these signals during navigation, we recorded from mice that traversed a virtual environment. Nearly half of the V1 neurons were reliably driven by combinations of visual speed and run speed. These neurons performed a weighted sum of the two speeds. The weights were diverse across neurons, and typically positive. As a population, V1 neurons predicted a linear combination of visual and run speeds better than either visual or run speeds alone. These data indicate that V1 in the mouse participates in a multimodal processing system that integrates visual motion and locomotion during navigation.
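The weighted-sum description corresponds to an ordinary linear regression of firing rate on the two speed signals. The sketch below recovers the weights from simulated data and compares the joint fit against single-speed fits; speed distributions, weights, and noise are invented.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(4)
n = 2000
visual_speed = rng.gamma(2.0, 5.0, n)                         # toy optic-flow speed
run_speed = 0.7 * visual_speed + rng.gamma(2.0, 2.0, n)       # correlated locomotion speed
rate = 0.15 * visual_speed + 0.30 * run_speed + rng.normal(0, 1.0, n)  # toy neuron

X = np.column_stack([visual_speed, run_speed])
both = LinearRegression().fit(X, rate)
print("weights:", both.coef_.round(2), " R^2 (both):", round(both.score(X, rate), 3))
for name, col in [("visual only", 0), ("run only", 1)]:
    single = LinearRegression().fit(X[:, [col]], rate)
    print(f"R^2 ({name}):", round(single.score(X[:, [col]], rate), 3))
```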
Affiliation(s)
- Aman B Saleem
- UCL Institute of Ophthalmology, University College London, London, UK; Department of Cognitive, Perceptual and Brain Sciences, University College London, London, UK
222
Zhu X, Yang Z. Multi-scale spatial concatenations of local features in natural scenes and scene classification. PLoS One 2013; 8:e76393. [PMID: 24098789] [PMCID: PMC3787016] [DOI: 10.1371/journal.pone.0076393]
Abstract
How does the visual system encode natural scenes? What are the basic structures of natural scenes? In current models of scene perception, there are two broad feature representations, global and local representations. Both representations are useful and have some successes; however, many observations on human scene perception seem to point to an intermediate-level representation. In this paper, we proposed natural scene structures, i.e., multi-scale spatial concatenations of local features, as an intermediate-level representation of natural scenes. To compile the natural scene structures, we first sampled a large number of multi-scale circular scene patches in a hexagonal configuration. We then performed independent component analysis on the patches and classified the independent components into a set of clusters using the K-means method. Finally, we obtained a set of natural scene structures, each of which is characterized by a set of dominant clusters of independent components. We examined a range of statistics of the natural scene structures, compiled from two widely used datasets of natural scenes, and modeled their spatial arrangements at larger spatial scales using adjacency matrices. We found that the natural scene structures include a full range of concatenations of visual features in natural scenes, and can be used to encode spatial information at various scales. We then selected a set of natural scene structures with high information, and used the occurring frequencies and the eigenvalues of the adjacency matrices to classify scenes in the datasets. We found that the performance of this model is comparable to or better than the state-of-the-art models on the two datasets. These results suggest that the natural scene structures are a useful intermediate-level representation of visual scenes for our understanding of natural scene perception.
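The patch-coding pipeline (sample patches, run ICA, cluster the resulting components with K-means) can be sketched with scikit-learn. The sketch uses square single-scale patches of smoothed noise in place of the hexagonal multi-scale sampling of real natural images, so it shows the workflow rather than reproducing the paper's structures.

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
image = np.cumsum(np.cumsum(rng.standard_normal((256, 256)), axis=0), axis=1)  # crude 1/f-like stand-in

patch, n_patches = 8, 5000
rows = rng.integers(0, image.shape[0] - patch, n_patches)
cols = rng.integers(0, image.shape[1] - patch, n_patches)
patches = np.stack([image[r:r + patch, c:c + patch].ravel() for r, c in zip(rows, cols)])
patches -= patches.mean(axis=1, keepdims=True)        # remove the DC component of each patch

ica = FastICA(n_components=16, random_state=0, max_iter=1000)
ica.fit(patches)
labels = KMeans(n_clusters=4, random_state=0, n_init=10).fit_predict(ica.components_)
print("components per cluster:", np.bincount(labels))
```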
Affiliation(s)
- Xiaoyuan Zhu
- Brain and Behavior Discovery Institute, Georgia Regents University, Augusta, Georgia, United States of America
223
LeDue EE, King JL, Stover KR, Crowder NA. Spatiotemporal specificity of contrast adaptation in mouse primary visual cortex. Front Neural Circuits 2013; 7:154. [PMID: 24106461] [PMCID: PMC3789212] [DOI: 10.3389/fncir.2013.00154]
Abstract
Prolonged viewing of high contrast gratings alters perceived stimulus contrast, and produces characteristic changes in the contrast response functions of neurons in the primary visual cortex (V1). This is referred to as contrast adaptation. Although contrast adaptation has been well-studied, its underlying neural mechanisms are not well-understood. Therefore, we investigated contrast adaptation in mouse V1 with the goal of establishing a quantitative description of this phenomenon in a genetically manipulable animal model. One interesting aspect of contrast adaptation that has been observed both perceptually and in single unit studies is its specificity for the spatial and temporal characteristics of the stimulus. Therefore, in the present work we determined if the magnitude of contrast adaptation in mouse V1 neurons was dependent on the spatial frequency and temporal frequency of the adapting grating. We used protocols that were readily comparable with previous studies in cats and primates, and also a novel contrast ramp stimulus that characterized the spatial and temporal specificity of contrast adaptation simultaneously. Similar to previous work in higher mammals, we found that contrast adaptation was strongest when the spatial frequency and temporal frequency of the adapting grating matched the test stimulus. This suggests similar mechanisms underlying contrast adaptation across animal models and indicates that the rapidly advancing genetic tools available in mice could be used to provide insights into this phenomenon.
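Contrast response functions of the sort being fitted here are usually described by the hyperbolic-ratio (Naka-Rushton) equation. The sketch below shows how a rightward shift of the semi-saturation contrast, one common signature of contrast adaptation, changes the predicted responses; parameter values are illustrative, not fits to the paper's data.

```python
import numpy as np

def contrast_response(c, r_max=30.0, c50=0.2, n=2.5):
    """Hyperbolic-ratio (Naka-Rushton) contrast response function."""
    c = np.asarray(c, dtype=float)
    return r_max * c**n / (c**n + c50**n)

contrasts = np.array([0.03, 0.06, 0.12, 0.25, 0.5, 1.0])
pre_adapt = contrast_response(contrasts, c50=0.15)
post_adapt = contrast_response(contrasts, c50=0.35)   # adaptation modeled as a c50 increase
print("pre :", np.round(pre_adapt, 1))
print("post:", np.round(post_adapt, 1))
```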
Affiliation(s)
- Emily E LeDue
- Department of Psychology and Neuroscience, Dalhousie University, Halifax, NS, Canada
224
Linking minds and brains. Vis Neurosci 2013; 30:207-17. [PMID: 23962654] [DOI: 10.1017/s0952523813000291]
Abstract
When I first came across William James' dictum that " … this sense of sameness is the very keel and backbone of our thinking," I thought he had foreseen the importance of cross-correlation in the brain, and told myself to find out how he had reached this conclusion. When I finally did this a year or two ago, I slowly came to realize that I had completely misunderstood him; from the full quote it is absolutely clear that his dictum cannot be referring to the process by which a cortical simple cell responds selectively to the orientation of features in a visual image, as I had originally supposed. If one translates the original dictum into two more prosaic modern versions, his version would say: "Our minds could not think at all without neural circuits in our brains that compute auto-correlations," but in my mistaken interpretation the last word would be "cross-correlations." Others may have made the same mistake, but the difference is profound, and finding what he really meant has been a revelation to me. This essay explains the revelation, describes how to determine experimentally whether the brain does auto- or cross-correlation, and gives the result of preliminary experiments showing clearly that it does both. A revised view of the visual cortex as autocorrelator as well as cross-correlator claims to tell us what complex cells in the visual cortex do, and it assigns a role to its columnar structure that is as important to fulfilling that role as the concept of the receptive field has been to understanding the simple cells' fulfillment of theirs. The new view has compelling features, broad implications, and suggests a plausible model of how neural circuits in the cortex achieve thought, but it needs further testing.
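The distinction the essay turns on is easy to state numerically: the autocorrelation of a signal peaks at zero lag, whereas the cross-correlation with a delayed copy peaks at the delay. The toy signals and the 7-sample delay below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(6)
t = np.arange(500)
a = np.sin(2 * np.pi * t / 50) + 0.3 * rng.standard_normal(t.size)
b = np.roll(a, 7) + 0.3 * rng.standard_normal(t.size)   # delayed, noisier copy of a

def correlation_by_lag(x, y, max_lag=20):
    """Normalized correlation of x(t) with y(t + lag); y = x gives the autocorrelation."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    lags = np.arange(-max_lag, max_lag + 1)
    core = slice(max_lag, len(x) - max_lag)
    values = [np.mean(x[core] * y[max_lag + k:len(y) - max_lag + k]) for k in lags]
    return lags, np.array(values)

lags, auto = correlation_by_lag(a, a)
lags, cross = correlation_by_lag(a, b)
print("autocorrelation peak lag:", lags[np.argmax(auto)])      # 0
print("cross-correlation peak lag:", lags[np.argmax(cross)])   # 7 (the imposed delay)
```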
225
Abstract
When faced with ambiguous sensory inputs, subjective perception alternates between the different interpretations in a stochastic manner. Such multistable perception phenomena have intrigued scientists and laymen alike for over a century. Despite rigorous investigations, the underlying mechanisms of multistable perception remain elusive. Recent studies using multivariate pattern analysis revealed that activity patterns in posterior visual areas correlate with fluctuating percepts. However, increasing evidence suggests that vision--and perception at large--is an active inferential process involving hierarchical brain systems. We applied searchlight multivariate pattern analysis to functional magnetic resonance imaging signals across the human brain to decode perceptual content during bistable perception and simple unambiguous perception. Although perceptually reflective activity patterns during simple perception localized predominantly to posterior visual regions, bistable perception involved additionally many higher-order frontoparietal and temporal regions. Moreover, compared with simple perception, both top-down and bottom-up influences were dramatically enhanced during bistable perception. We further studied the intermittent presentation of ambiguous images--a condition that is known to elicit perceptual memory. Compared with continuous presentation, intermittent presentation recruited even more higher-order regions and was accompanied by further strengthened top-down influences but relatively weakened bottom-up influences. Taken together, these results strongly support an active top-down inferential process in perception.
226
Lindeberg T. Invariance of visual operations at the level of receptive fields. PLoS One 2013; 8:e66990. [PMID: 23894283 PMCID: PMC3716821 DOI: 10.1371/journal.pone.0066990] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2012] [Accepted: 05/14/2013] [Indexed: 11/29/2022] Open
Abstract
The brain is able to maintain a stable perception although the visual stimuli vary substantially on the retina due to geometric transformations and lighting variations in the environment. This paper presents a theory for achieving basic invariance properties already at the level of receptive fields. Specifically, the presented framework comprises (i) local scaling transformations caused by objects of different size and at different distances to the observer, (ii) locally linearized image deformations caused by variations in the viewing direction in relation to the object, (iii) locally linearized relative motions between the object and the observer and (iv) local multiplicative intensity transformations caused by illumination variations. The receptive field model can be derived by necessity from symmetry properties of the environment and leads to predictions about receptive field profiles in good agreement with receptive field profiles measured by cell recordings in mammalian vision. Indeed, the receptive field profiles in the retina, LGN and V1 are close to ideal to what is motivated by the idealized requirements. By complementing receptive field measurements with selection mechanisms over the parameters in the receptive field families, it is shown how true invariance of receptive field responses can be obtained under scaling transformations, affine transformations and Galilean transformations. Thereby, the framework provides a mathematically well-founded and biologically plausible model for how basic invariance properties can be achieved already at the level of receptive fields and support invariant recognition of objects and events under variations in viewpoint, retinal size, object motion and illumination. The theory can explain the different shapes of receptive field profiles found in biological vision, which are tuned to different sizes and orientations in the image domain as well as to different image velocities in space-time, from a requirement that the visual system should be invariant to the natural types of image transformations that occur in its environment.
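The receptive-field families in this framework are built from scale-normalized Gaussian derivatives; the sketch below constructs a one-dimensional first-order kernel and applies it over a small scale family, with parameter choices (sigma values, normalization power gamma) that are ours, not the paper's.

```python
# Hedged sketch: a 1-D scale-normalized Gaussian derivative kernel, the kind of
# idealized receptive-field profile a scale-space framework is built from.
import numpy as np

def gaussian_derivative_kernel(sigma, gamma=1.0, radius=None):
    """First-order derivative of a Gaussian, scale-normalized by sigma**gamma."""
    if radius is None:
        radius = int(4 * sigma)
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    dg = -x / sigma**2 * g                  # derivative of the Gaussian
    return sigma**gamma * dg                # scale normalization

signal = np.random.default_rng(1).standard_normal(512)
responses = {s: np.convolve(signal, gaussian_derivative_kernel(s), mode="same")
             for s in (1.0, 2.0, 4.0)}      # responses across a small scale family
print({s: float(np.abs(r).max()) for s, r in responses.items()})
```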
Affiliation(s)
- Tony Lindeberg
- Department of Computational Biology, School of Computer Science and Communication, KTH Royal Institute of Technology, Stockholm, Sweden.
227
Kriegeskorte N, Kievit RA. Representational geometry: integrating cognition, computation, and the brain. Trends Cogn Sci 2013; 17:401-12. [PMID: 23876494 PMCID: PMC3730178 DOI: 10.1016/j.tics.2013.06.007] [Citation(s) in RCA: 490] [Impact Index Per Article: 40.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2013] [Revised: 06/06/2013] [Accepted: 06/12/2013] [Indexed: 01/08/2023]
Abstract
Representational geometry is a framework that enables us to relate brain, computation, and cognition. Representations in brains and models can be characterized by representational distance matrices. Distance matrices can be readily compared to test computational models. We review recent insights into perception, cognition, memory, and action and discuss current challenges.
The cognitive concept of representation plays a key role in theories of brain information processing. However, linking neuronal activity to representational content and cognitive theory remains challenging. Recent studies have characterized the representational geometry of neural population codes by means of representational distance matrices, enabling researchers to compare representations across stages of processing and to test cognitive and computational theories. Representational geometry provides a useful intermediate level of description, capturing both the information represented in a neuronal population code and the format in which it is represented. We review recent insights gained with this approach in perception, memory, cognition, and action. Analyses of representational geometry can compare representations between models and the brain, and promise to explain brain computation as transformation of representational similarity structure.
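A representational distance (dissimilarity) matrix of the kind described above is simple to compute and to compare across systems; a minimal sketch follows, assuming made-up response patterns and a rank-correlation comparison, which is one common choice rather than a prescribed one.

```python
# Hedged sketch: build representational dissimilarity matrices (RDMs) for two
# "representations" (e.g., a brain region and a model layer) and compare them.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
brain_patterns = rng.standard_normal((20, 100))   # 20 conditions x 100 response channels
model_patterns = rng.standard_normal((20, 50))    # same conditions in a model feature space

rdm_brain = squareform(pdist(brain_patterns, metric="correlation"))
rdm_model = squareform(pdist(model_patterns, metric="correlation"))

# Compare the two geometries on the off-diagonal entries (rank correlation is common).
iu = np.triu_indices(20, k=1)
rho, _ = spearmanr(rdm_brain[iu], rdm_model[iu])
print("RDM similarity (Spearman rho):", rho)
```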
228
McFarland JM, Cui Y, Butts DA. Inferring nonlinear neuronal computation based on physiologically plausible inputs. PLoS Comput Biol 2013; 9:e1003143. [PMID: 23874185 PMCID: PMC3715434 DOI: 10.1371/journal.pcbi.1003143] [Citation(s) in RCA: 93] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2012] [Accepted: 06/01/2013] [Indexed: 12/03/2022] Open
Abstract
The computation represented by a sensory neuron's response to stimuli is constructed from an array of physiological processes both belonging to that neuron and inherited from its inputs. Although many of these physiological processes are known to be nonlinear, linear approximations are commonly used to describe the stimulus selectivity of sensory neurons (i.e., linear receptive fields). Here we present an approach for modeling sensory processing, termed the Nonlinear Input Model (NIM), which is based on the hypothesis that the dominant nonlinearities imposed by physiological mechanisms arise from rectification of a neuron's inputs. Incorporating such 'upstream nonlinearities' within the standard linear-nonlinear (LN) cascade modeling structure implicitly allows for the identification of multiple stimulus features driving a neuron's response, which become directly interpretable as either excitatory or inhibitory. Because its form is analogous to an integrate-and-fire neuron receiving excitatory and inhibitory inputs, model fitting can be guided by prior knowledge about the inputs to a given neuron, and elements of the resulting model can often result in specific physiological predictions. Furthermore, by providing an explicit probabilistic model with a relatively simple nonlinear structure, its parameters can be efficiently optimized and appropriately regularized. Parameter estimation is robust and efficient even with large numbers of model components and in the context of high-dimensional stimuli with complex statistical structure (e.g. natural stimuli). We describe detailed methods for estimating the model parameters, and illustrate the advantages of the NIM using a range of example sensory neurons in the visual and auditory systems. We thus present a modeling framework that can capture a broad range of nonlinear response functions while providing physiologically interpretable descriptions of neural computation.
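The forward structure described above, rectified upstream subunits with excitatory or suppressive signs feeding a spiking nonlinearity, can be sketched directly; the example below uses random placeholder filters and a softplus spiking nonlinearity, and omits the fitting and regularization machinery that the paper develops.

```python
# Hedged sketch of the forward pass of an NIM-style model: rectified "upstream"
# subunits with excitatory (+1) or suppressive (-1) weights feeding a spiking
# nonlinearity. Filters here are random placeholders, not fitted parameters.
import numpy as np

rng = np.random.default_rng(0)
T, D = 1000, 40                               # time bins x stimulus dimensions
stimulus = rng.standard_normal((T, D))

filters = rng.standard_normal((3, D)) * 0.3   # three subunit filters
signs = np.array([+1.0, +1.0, -1.0])          # two excitatory, one suppressive

def relu(x):
    return np.maximum(x, 0.0)

generator = (signs * relu(stimulus @ filters.T)).sum(axis=1)   # summed subunit drive
rate = np.log1p(np.exp(generator - 1.0))                       # soft-rectifying spiking nonlinearity
spikes = rng.poisson(rate * 0.05)                              # Poisson spike counts per bin
print(rate[:5], spikes[:10])
```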
Affiliation(s)
- James M McFarland
- Department of Biology and Program in Neuroscience and Cognitive Science, University of Maryland, College Park, Maryland, USA.
229
Functional heterogeneity in neighboring neurons of cat primary visual cortex in response to both artificial and natural stimuli. J Neurosci 2013; 33:7325-44. [PMID: 23616540 DOI: 10.1523/jneurosci.4071-12.2013] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
Neurons in primary visual cortex of many mammals are clustered according to their preference to stimulus parameters such as orientation and spatial frequency. Nevertheless, responses to complex visual stimuli are highly heterogeneous between adjacent neurons. To investigate the relation between these observations, we recorded from pairs of neighboring neurons in area 17 of anesthetized cats in response to stimuli of differing complexity: sinusoidal drifting gratings, binary dense noise, and natural movies. Comparisons of the tuning curves revealed similar orientation and direction preferences for neighboring neurons, but large differences in preferred phase, direction selectivity, and tuning width of spatial frequency. No pair was similar across all tuning properties. The neurons' firing rates averaged across multiple stimulus repetitions (the "signal") were also compared. Binned between 10 and 200 ms, the correlation between these signals was close to zero in the median across all pairs for all stimulus classes. Signal correlations agreed poorly with differences in tuning properties, except for receptive field offset and relative modulation (i.e., the strength of phase modulation). Nonetheless, signal correlations for different stimulus classes were well correlated with each other, even for gratings and movies. Conversely, trial-to-trial fluctuations (termed "noise") were poorly correlated between neighboring neurons, suggesting low degrees of common input. In response to gratings and visual noise, signal and noise correlations were well correlated with each other, but less so for responses to movies. These findings have relevance for our understanding of the processing of natural stimuli in a functionally heterogeneous cortical network.
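The signal and noise correlations analyzed above follow from a simple decomposition: average over repeats to obtain each neuron's signal, then correlate the residuals to obtain the noise. A minimal sketch with placeholder spike counts:

```python
# Hedged sketch: signal and noise correlations for a pair of simultaneously
# recorded neurons, given binned responses to repeated presentations of a stimulus.
import numpy as np

rng = np.random.default_rng(0)
n_repeats, n_bins = 20, 300
resp_a = rng.poisson(3.0, size=(n_repeats, n_bins)).astype(float)   # placeholder data
resp_b = rng.poisson(3.0, size=(n_repeats, n_bins)).astype(float)

signal_a, signal_b = resp_a.mean(axis=0), resp_b.mean(axis=0)        # trial-averaged "signal"
signal_corr = np.corrcoef(signal_a, signal_b)[0, 1]

noise_a, noise_b = resp_a - signal_a, resp_b - signal_b              # trial-to-trial residuals
noise_corr = np.corrcoef(noise_a.ravel(), noise_b.ravel())[0, 1]

print(f"signal correlation: {signal_corr:.3f}, noise correlation: {noise_corr:.3f}")
```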
230
Kay KN, Winawer J, Rokem A, Mezer A, Wandell BA. A two-stage cascade model of BOLD responses in human visual cortex. PLoS Comput Biol 2013; 9:e1003079. [PMID: 23737741 PMCID: PMC3667759 DOI: 10.1371/journal.pcbi.1003079] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2012] [Accepted: 04/18/2013] [Indexed: 12/03/2022] Open
Abstract
Visual neuroscientists have discovered fundamental properties of neural representation through careful analysis of responses to controlled stimuli. Typically, different properties are studied and modeled separately. To integrate our knowledge, it is necessary to build general models that begin with an input image and predict responses to a wide range of stimuli. In this study, we develop a model that accepts an arbitrary band-pass grayscale image as input and predicts blood oxygenation level dependent (BOLD) responses in early visual cortex as output. The model has a cascade architecture, consisting of two stages of linear and nonlinear operations. The first stage involves well-established computations—local oriented filters and divisive normalization—whereas the second stage involves novel computations—compressive spatial summation (a form of normalization) and a variance-like nonlinearity that generates selectivity for second-order contrast. The parameters of the model, which are estimated from BOLD data, vary systematically across visual field maps: compared to primary visual cortex, extrastriate maps generally have larger receptive field size, stronger levels of normalization, and increased selectivity for second-order contrast. Our results provide insight into how stimuli are encoded and transformed in successive stages of visual processing. Much has been learned about how stimuli are represented in the visual system from measuring responses to carefully designed stimuli. Typically, different studies focus on different types of stimuli. Making sense of the large array of findings requires integrated models that explain responses to a wide range of stimuli. In this study, we measure functional magnetic resonance imaging (fMRI) responses in early visual cortex to a wide range of band-pass filtered images, and construct a computational model that takes the stimuli as input and predicts the fMRI responses as output. The model has a cascade architecture, consisting of two stages of linear and nonlinear operations. A novel component of the model is a nonlinear operation that generates selectivity for second-order contrast, that is, variations in contrast-energy across the visual field. We find that this nonlinearity is stronger in extrastriate areas V2 and V3 than in primary visual cortex V1. Our results provide insight into how stimuli are encoded and transformed in the visual system.
Affiliation(s)
- Kendrick N Kay
- Department of Psychology, Stanford University, Stanford, California, USA.
231
Distinguishing theory from implementation in predictive coding accounts of brain function. Behav Brain Sci 2013; 36:231-2. [PMID: 23663497 DOI: 10.1017/s0140525x12002178] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
Abstract
It is often helpful to distinguish between a theory (Marr's computational level) and a specific implementation of that theory (Marr's physical level). However, in the target article, a single implementation of predictive coding is presented as if this were the theory of predictive coding itself. Other implementations of predictive coding have been formulated which can explain additional neurobiological phenomena.
232
Network interactions: non-geniculate input to V1. Curr Opin Neurobiol 2013; 23:195-201. [DOI: 10.1016/j.conb.2013.01.020] [Citation(s) in RCA: 97] [Impact Index Per Article: 8.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2012] [Revised: 01/15/2013] [Accepted: 01/15/2013] [Indexed: 11/22/2022]
233
Lochmann T, Blanche TJ, Butts DA. Construction of direction selectivity through local energy computations in primary visual cortex. PLoS One 2013; 8:e58666. [PMID: 23554913 PMCID: PMC3598900 DOI: 10.1371/journal.pone.0058666] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2013] [Accepted: 02/07/2013] [Indexed: 11/18/2022] Open
Abstract
Despite detailed knowledge about the anatomy and physiology of neurons in primary visual cortex (V1), the large numbers of inputs onto a given V1 neuron make it difficult to relate them to the neuron's functional properties. For example, models of direction selectivity (DS), such as the Energy Model, can successfully describe the computation of phase-invariant DS at a conceptual level, while leaving it unclear how such computations are implemented by cortical circuits. Here, we use statistical modeling to derive a description of DS computation for both simple and complex cells, based on physiologically plausible operations on their inputs. We present a new method that infers the selectivity of a neuron's inputs using extracellular recordings in macaque in the context of random bar stimuli and natural movies in cat. Our results suggest that DS is initially constructed in V1 simple cells through summation and thresholding of non-DS inputs with appropriate spatiotemporal relationships. However, this de novo construction of DS is rare, and a majority of DS simple cells, and all complex cells, appear to receive both excitatory and suppressive inputs that are already DS. For complex cells, these numerous DS inputs typically span a fraction of their overall receptive fields and have similar spatiotemporal tuning but different phase and spatial positions, suggesting an elaboration to the Energy Model that incorporates spatially localized computation. Furthermore, we demonstrate how these computations might be constructed from biologically realizable components, and describe a statistical model consistent with the feed-forward framework suggested by Hubel and Wiesel.
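The Energy Model referenced above combines the squared outputs of a quadrature pair of space-time oriented filters; the sketch below builds a generic one-dimensional-space version with arbitrary Gabor parameters, purely to illustrate how phase-invariant direction selectivity arises (it is not the paper's fitted model).

```python
# Hedged sketch of the classical motion-energy computation: a quadrature pair of
# spatiotemporally oriented filters, squared and summed, for each direction; the
# difference between directions gives a phase-invariant direction-selective signal.
import numpy as np
from scipy.signal import fftconvolve

dx, dt = 0.05, 0.005                      # spatial (deg) and temporal (s) sample spacing
fx, ft = 2.0, 6.0                         # preferred spatial (c/deg) and temporal (Hz) frequency

xf = np.arange(-0.75, 0.75 + dx / 2, dx)[None, :]
tf = np.arange(0.0, 0.2 + dt / 2, dt)[:, None]

def st_gabor(direction, phase):
    """Space-time Gabor tilted for rightward (+1) or leftward (-1) motion."""
    envelope = np.exp(-xf**2 / (2 * 0.25**2)) * np.exp(-(tf - 0.1)**2 / (2 * 0.03**2))
    return envelope * np.cos(2 * np.pi * (fx * xf - direction * ft * tf) + phase)

# A rightward-drifting grating sampled on the same grid spacing.
xs = np.arange(-3.0, 3.0, dx)[None, :]
ts = np.arange(0.0, 1.0, dt)[:, None]
stimulus = np.cos(2 * np.pi * (fx * xs - ft * ts))

def motion_energy(direction):
    even = fftconvolve(stimulus, st_gabor(direction, 0.0), mode="valid")
    odd = fftconvolve(stimulus, st_gabor(direction, np.pi / 2), mode="valid")
    return (even**2 + odd**2).mean()      # phase-invariant energy for this direction

print("rightward - leftward energy:", motion_energy(+1) - motion_energy(-1))
```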
Affiliation(s)
- Timm Lochmann
- Department of Biology and Program in Neuroscience and Cognitive Science, University of Maryland, College Park, Maryland, USA.
234
Abstract
A focal stimulus triggers neural activity that spreads to cortical regions far beyond the stimulation site, creating a so-called "cortical point spread" (CPS). Animal studies found that V1 neurons possess lateral connections with neighboring neurons that prefer similar orientations and to neurons representing visuotopic regions that are constrained by their preferred orientation axis. Although various roles in visual processing are proposed for this anatomical anisotropy of lateral connections, evidence for a corresponding "functional" anisotropy in CPS is lacking or inconsistent in animal studies and absent in humans. To explore functional anisotropy, we inspected axial constraints on CPS in human visual cortex using functional magnetic resonance imaging. We defined receptive fields (RFs) of unit gray matter volumes and delineated the spatial extents of CPS in visuotopic space. The CPS triggered by foveal stimuli exhibited coaxial anisotropy with larger spatial extents along the axis of stimulus orientation. Furthermore, the spatial extents of CPS along the coaxial direction increased with an increasing similarity of local sites to the CPS-inducing stimulus in orientation preference. From CPS driven by multifocal stimuli, the coaxially biased spread was also found in cortical regions in the periphery, albeit reduced in degree, and was invariant to a varying degree of radial relationship between stimuli and RF positions of local sites, rejecting radial bias as an origin of coaxial anisotropy. Our findings provide a bridge between the anatomical anisotropy seen in animal visual cortex and a possible network property supporting spatial contextual effects in human visual perception.
235
Hou F, Huang CB, Liang J, Zhou Y, Lu ZL. Contrast gain-control in stereo depth and cyclopean contrast perception. J Vis 2013; 13:13.8.3. [PMID: 23820024 DOI: 10.1167/13.8.3] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
Abstract
Although human observers can perceive depth from stereograms with considerable contrast difference between the images presented to the two eyes (Legge & Gu, 1989), how contrast gain control functions in stereo depth perception has not been systematically investigated. Recently, we developed a multipathway contrast gain-control model (MCM) for binocular phase and contrast perception (Huang, Zhou, Lu, & Zhou, 2011; Huang, Zhou, Zhou, & Lu, 2010) based on a contrast gain-control model of binocular phase combination (Ding & Sperling, 2006). To extend the MCM to simultaneously account for stereo depth and cyclopean contrast perception, we manipulated the contrasts (ranging from 0.08 to 0.4) of the dynamic random dot stereograms (RDS) presented to the left and right eyes independently and measured both disparity thresholds for depth perception and perceived contrasts of the cyclopean images. We found that both disparity threshold and perceived contrast depended strongly on the signal contrasts in the two eyes, exhibiting characteristic binocular contrast gain-control properties. The results were well accounted for by an extended MCM model, in which each eye exerts gain control on the other eye's signal in proportion to its own signal contrast energy and also gain control over the other eye's gain control; stereo strength is proportional to the product of the signal strengths in the two eyes after contrast gain control, and perceived contrast is computed by combining contrast energy from the two eyes. The new model provided an excellent account of our data (r(2) = 0.945), as well as some challenging results in the literature.
Affiliation(s)
- Fang Hou
- Laboratory of Brain Processes, Department of Psychology, The Ohio State University, Columbus, OH, USA.
236
Vaingankar V, Soto-Sanchez C, Wang X, Sommer FT, Hirsch JA. Neurons in the thalamic reticular nucleus are selective for diverse and complex visual features. Front Integr Neurosci 2012; 6:118. [PMID: 23269915 PMCID: PMC3529363 DOI: 10.3389/fnint.2012.00118] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2012] [Accepted: 11/29/2012] [Indexed: 11/13/2022] Open
Abstract
All visual signals the cortex receives are influenced by the perigeniculate sector (PGN) of the thalamic reticular nucleus, which receives input from relay cells in the lateral geniculate and provides feedback inhibition in return. Relay cells have been studied in quantitative depth; they behave in a roughly linear fashion and have receptive fields with a stereotyped center-surround structure. We know far less about reticular neurons. Qualitative studies indicate they simply pool ascending input to generate non-selective gain control. Yet the perigeniculate is complicated; local cells are densely interconnected and fire lengthy bursts. Thus, we employed quantitative methods to explore the perigeniculate using relay cells as controls. By adapting methods of spike-triggered averaging and covariance analysis for bursts, we identified both first and second order features that build reticular receptive fields. The shapes of these spatiotemporal subunits varied widely; no stereotyped pattern emerged. Companion experiments showed that the shape of the first but not second order features could be explained by the overlap of On and Off inputs to a given cell. Moreover, we assessed the predictive power of the receptive field and how much information each component subunit conveyed. Linear-non-linear (LN) models including multiple subunits performed better than those made with just one; further each subunit encoded different visual information. Model performance for reticular cells was always lesser than for relay cells, however, indicating that reticular cells process inputs non-linearly. All told, our results suggest that the perigeniculate encodes diverse visual features to selectively modulate activity transmitted downstream.
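Spike-triggered average and covariance analysis, which this study adapts for bursts, can be outlined compactly; the sketch below computes an STA and an STC difference matrix from placeholder white-noise data and ignores the burst-specific adaptations.

```python
# Hedged sketch: spike-triggered average (STA) and spike-triggered covariance
# (STC) from a white-noise stimulus and a spike train. Burst handling is omitted.
import numpy as np

rng = np.random.default_rng(0)
T, D, lags = 20000, 16, 10
stim = rng.standard_normal((T, D))                     # white-noise stimulus frames
spikes = rng.poisson(0.05, size=T)                     # placeholder spike counts

# Stimulus history (T x D*lags) preceding each time bin.
X = np.zeros((T, D * lags))
for lag in range(lags):
    X[lag:, lag * D:(lag + 1) * D] = stim[:T - lag]

n_spk = spikes.sum()
sta = (spikes @ X) / n_spk                             # first-order feature
resid = X - sta                                        # subtract the STA before covariance
stc = (resid * spikes[:, None]).T @ resid / n_spk - np.cov(X.T, bias=True)

eigvals, eigvecs = np.linalg.eigh(stc)                 # extreme eigenvectors = second-order features
print("STA shape:", sta.shape, "largest |eigenvalue|:", float(np.abs(eigvals).max()))
```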
Affiliation(s)
- Vishal Vaingankar
- Department of Biological Sciences and Neuroscience Graduate Program, University of Southern California, Los Angeles, CA, USA
- Cristina Soto-Sanchez
- Department of Biological Sciences and Neuroscience Graduate Program, University of Southern California, Los Angeles, CA, USA
- Xin Wang
- Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Friedrich T. Sommer
- Redwood Center for Theoretical Neuroscience, University of California, Berkeley, CA, USA
- Judith A. Hirsch
- Department of Biological Sciences and Neuroscience Graduate Program, University of Southern California, Los Angeles, CA, USA
237
Distinct functional organizations for processing different motion signals in V1, V2, and V4 of macaque. J Neurosci 2012; 32:13363-79. [PMID: 23015427 DOI: 10.1523/jneurosci.1900-12.2012] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
Motion perception is qualitatively invariant across different objects and forms, namely, the same motion information can be conveyed by many different physical carriers, and it requires the processing of motion signals consisting of direction, speed, and axis or trajectory of motion defined by a moving object. Compared with the representation of orientation, the cortical processing of these different motion signals within the early ventral visual pathway of the primate remains poorly understood. Using drifting full-field noise stimuli and intrinsic optical imaging, along with cytochrome-oxidase staining, we found that the orientation domains in macaque V1, V2, and V4 that processed orientation signals also served to process motion signals associated with the axis and speed of motion. In contrast, direction domains within the thick stripes of V2 demonstrated preferences that were independent of motion speed. The population responses encoding the orientation and motion axis could be precisely reproduced by a spatiotemporal energy model. Thus, our observation of orientation domains with dual functions in V1, V2, and V4 directly support the notion that the linear representation of the temporal series of retinotopic activations may serve as another motion processing strategy in primate ventral visual pathway, contributing directly to fine form and motion analysis. Our findings further reveal that different types of motion information are differentially processed in parallel and segregated compartments within primate early visual cortices, before these motion features are fully combined in high-tier visual areas.
238
Lin IC, Xing D, Shapley R. Integrate-and-fire vs Poisson models of LGN input to V1 cortex: noisier inputs reduce orientation selectivity. J Comput Neurosci 2012; 33:559-72. [PMID: 22684587 PMCID: PMC4104821 DOI: 10.1007/s10827-012-0401-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2012] [Revised: 05/22/2012] [Accepted: 05/23/2012] [Indexed: 11/27/2022]
Abstract
One of the reasons the visual cortex has attracted the interest of computational neuroscience is that it has well-defined inputs. The lateral geniculate nucleus (LGN) of the thalamus is the source of visual signals to the primary visual cortex (V1). Most large-scale cortical network models approximate the spike trains of LGN neurons as simple Poisson point processes. However, many studies have shown that neurons in the early visual pathway are capable of spiking with high temporal precision and their discharges are not Poisson-like. To gain an understanding of how response variability in the LGN influences the behavior of V1, we study response properties of model V1 neurons that receive purely feedforward inputs from LGN cells modeled either as noisy leaky integrate-and-fire (NLIF) neurons or as inhomogeneous Poisson processes. We first demonstrate that the NLIF model is capable of reproducing many experimentally observed statistical properties of LGN neurons. Then we show that a V1 model in which the LGN input to a V1 neuron is modeled as a group of NLIF neurons produces higher orientation selectivity than the one with Poisson LGN input. The second result implies that statistical characteristics of LGN spike trains are important for V1's function. We conclude that physiologically motivated models of V1 need to include more realistic LGN spike trains that are less noisy than inhomogeneous Poisson processes.
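A noisy leaky integrate-and-fire unit of the general kind used here for LGN inputs, together with a rate-matched inhomogeneous Poisson control, can be simulated in a few lines; all parameter values below are illustrative rather than taken from the paper.

```python
# Hedged sketch: a noisy leaky integrate-and-fire (NLIF) neuron driven by a
# time-varying input, versus an inhomogeneous Poisson process with a matched rate.
import numpy as np

rng = np.random.default_rng(0)
dt, T = 0.001, 2.0                                   # 1 ms steps, 2 s simulation
steps = int(T / dt)
tau, v_rest, v_thresh, v_reset = 0.02, -70e-3, -50e-3, -65e-3
drive = 25e-3 * (1 + np.sin(2 * np.pi * 4 * np.arange(steps) * dt))   # oscillating input (V)

v = v_rest
nlif_spikes = np.zeros(steps, dtype=int)
for i in range(steps):
    noise = 3e-3 * rng.standard_normal() * np.sqrt(dt / tau)
    v += dt / tau * (v_rest - v + drive[i]) + noise
    if v >= v_thresh:
        nlif_spikes[i] = 1
        v = v_reset

rate = nlif_spikes.mean() / dt                        # overall firing rate (Hz)
poisson_spikes = rng.poisson(rate * dt * drive / drive.mean())   # rate-matched Poisson control
print("NLIF rate:", rate, "Hz; Fano factors:",
      nlif_spikes.var() / max(nlif_spikes.mean(), 1e-9),
      poisson_spikes.var() / max(poisson_spikes.mean(), 1e-9))
```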
Affiliation(s)
- I-Chun Lin
- Center for Neural Science, New York University, New York, NY 10003, USA.
239
Kelly SD, Hansen BC, Clark DT. "Slight" of hand: the processing of visually degraded gestures with speech. PLoS One 2012; 7:e42620. [PMID: 22912715 PMCID: PMC3415388 DOI: 10.1371/journal.pone.0042620] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2012] [Accepted: 07/10/2012] [Indexed: 11/18/2022] Open
Abstract
Co-speech hand gestures influence language comprehension. The present experiment explored what part of the visual processing system is optimized for processing these gestures. Participants viewed short video clips of speech and gestures (e.g., a person saying “chop” or “twist” while making a chopping gesture) and had to determine whether the two modalities were congruent or incongruent. Gesture videos were designed to stimulate the parvocellular or magnocellular visual pathways by filtering out low or high spatial frequencies (HSF versus LSF) at two levels of degradation severity (moderate and severe). Participants were less accurate and slower at processing gesture and speech at severe versus moderate levels of degradation. In addition, they were slower for LSF versus HSF stimuli, and this difference was most pronounced in the severely degraded condition. However, exploratory item analyses showed that the HSF advantage was modulated by the range of motion and amount of motion energy in each video. The results suggest that hand gestures exploit a wide range of spatial frequencies, and depending on what frequencies carry the most motion energy, parvocellular or magnocellular visual pathways are maximized to quickly and optimally extract meaning.
Affiliation(s)
- Spencer D Kelly
- Department of Psychology and Neuroscience Program, Colgate University, Hamilton, New York, United States of America.
240
Laminar analysis of visually evoked activity in the primary visual cortex. Proc Natl Acad Sci U S A 2012; 109:13871-6. [PMID: 22872866 DOI: 10.1073/pnas.1201478109] [Citation(s) in RCA: 136] [Impact Index Per Article: 10.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Studying the laminar pattern of neural activity is crucial for understanding the processing of neural signals in the cerebral cortex. We measured neural population activity [multiunit spike activity (MUA) and local field potential, LFP] in Macaque primary visual cortex (V1) in response to drifting grating stimuli. Sustained visually driven MUA was at an approximately constant level across cortical depth in V1. However, sustained, visually driven, local field potential power, which was concentrated in the γ-band (20-60 Hz), was greatest at the cortical depth corresponding to cortico-cortical output layers 2, 3, and 4B. γ-band power also tends to be more sustained in the output layers. Overall, cortico-cortical output layers accounted for 67% of total γ-band activity in V1, whereas 56% of total spikes evoked by drifting gratings were from layers 2, 3, and 4B. The high-resolution layer specificity of γ-band power, the laminar distribution of MUA and γ-band activity, and their dynamics imply that neural activity in V1 is generated by laminar-specific mechanisms. In particular, visual responses of MUA and γ-band activity in cortico-cortical output layers 2, 3, and 4B seem to be strongly influenced by laminar-specific recurrent circuitry and/or feedback.
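Band-limited γ (20-60 Hz) power of the sort reported above is typically obtained from a spectral estimate of the LFP; a short sketch using a Welch periodogram on a toy signal follows (sampling rate and band edges are assumptions).

```python
# Hedged sketch: estimate gamma-band (20-60 Hz) power of an LFP trace with a
# Welch periodogram, the kind of band-limited measure summarized in this study.
import numpy as np
from scipy.signal import welch

fs = 1000.0                                           # sampling rate (Hz)
t = np.arange(0, 5, 1 / fs)
rng = np.random.default_rng(0)
lfp = 0.5 * np.sin(2 * np.pi * 40 * t) + rng.standard_normal(t.size)   # toy LFP

freqs, psd = welch(lfp, fs=fs, nperseg=1024)
gamma = (freqs >= 20) & (freqs <= 60)
df = freqs[1] - freqs[0]
gamma_power = psd[gamma].sum() * df                   # integrate PSD over the gamma band
total_power = psd.sum() * df
print("gamma fraction of total power:", gamma_power / total_power)
```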
241
Wagemans J, Elder JH, Kubovy M, Palmer SE, Peterson MA, Singh M, von der Heydt R. A century of Gestalt psychology in visual perception: I. Perceptual grouping and figure-ground organization. Psychol Bull 2012; 138:1172-217. [PMID: 22845751 DOI: 10.1037/a0029333] [Citation(s) in RCA: 558] [Impact Index Per Article: 42.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
In 1912, Max Wertheimer published his paper on phi motion, widely recognized as the start of Gestalt psychology. Because of its continued relevance in modern psychology, this centennial anniversary is an excellent opportunity to take stock of what Gestalt psychology has offered and how it has changed since its inception. We first introduce the key findings and ideas in the Berlin school of Gestalt psychology, and then briefly sketch its development, rise, and fall. Next, we discuss its empirical and conceptual problems, and indicate how they are addressed in contemporary research on perceptual grouping and figure-ground organization. In particular, we review the principles of grouping, both classical (e.g., proximity, similarity, common fate, good continuation, closure, symmetry, parallelism) and new (e.g., synchrony, common region, element and uniform connectedness), and their role in contour integration and completion. We then review classic and new image-based principles of figure-ground organization, how it is influenced by past experience and attention, and how it relates to shape and depth perception. After an integrated review of the neural mechanisms involved in contour grouping, border ownership, and figure-ground perception, we conclude by evaluating what modern vision science has offered compared to traditional Gestalt psychology, whether we can speak of a Gestalt revival, and where the remaining limitations and challenges lie. A better integration of this research tradition with the rest of vision science requires further progress regarding the conceptual and theoretical foundations of the Gestalt approach, which is the focus of a second review article.
Affiliation(s)
- Johan Wagemans
- University of Leuven (KU Leuven), Laboratory of Experimental Psychology, Tiensestraat 102, Box 3711, BE-3000 Leuven, Belgium.
242
Haslinger R, Pipa G, Lima B, Singer W, Brown EN, Neuenschwander S. Context matters: the illusive simplicity of macaque V1 receptive fields. PLoS One 2012; 7:e39699. [PMID: 22802940 PMCID: PMC3389039 DOI: 10.1371/journal.pone.0039699] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/17/2012] [Accepted: 05/24/2012] [Indexed: 11/21/2022] Open
Abstract
Even in V1, where neurons have well characterized classical receptive fields (CRFs), it has been difficult to deduce which features of natural scenes stimuli they actually respond to. Forward models based upon CRF stimuli have had limited success in predicting the response of V1 neurons to natural scenes. As natural scenes exhibit complex spatial and temporal correlations, this could be due to surround effects that modulate the sensitivity of the CRF. Here, instead of attempting a forward model, we quantify the importance of the natural scenes surround for awake macaque monkeys by modeling it non-parametrically. We also quantify the influence of two forms of trial to trial variability. The first is related to the neuron’s own spike history. The second is related to ongoing mean field population activity reflected by the local field potential (LFP). We find that the surround produces strong temporal modulations in the firing rate that can be both suppressive and facilitative. Further, the LFP is found to induce a precise timing in spikes, which tend to be temporally localized on sharp LFP transients in the gamma frequency range. Using the pseudo R2 as a measure of model fit, we find that during natural scene viewing the CRF dominates, accounting for 60% of the fit, but that taken collectively the surround, spike history and LFP are almost as important, accounting for 40%. However, overall only a small proportion of V1 spiking statistics could be explained (R2∼5%), even when the full stimulus, spike history and LFP were taken into account. This suggests that under natural scene conditions, the dominant influence on V1 neurons is not the stimulus, nor the mean field dynamics of the LFP, but the complex, incoherent dynamics of the network in which neurons are embedded.
Affiliation(s)
- Robert Haslinger
- Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, Massachusetts, United States of America.
243
Murphy JW, Kelly SP, Foxe JJ, Lalor EC. Isolating early cortical generators of visual-evoked activity: a systems identification approach. Exp Brain Res 2012; 220:191-9. [PMID: 22644236 DOI: 10.1007/s00221-012-3129-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/02/2011] [Accepted: 05/09/2012] [Indexed: 11/25/2022]
Abstract
The VESPA (visual-evoked spread spectrum analysis) method estimates the impulse response of the visual system using a continuously varying stimulus. It has been used recently to address both basic cognitive and neurophysiologic questions as well as those surrounding clinical populations. Although the components of the average VESPA response are highly reminiscent of the early components of the visual-evoked potential (VEP) when measured over midline occipital locations, the two responses are acquired in different ways and, thus, they cannot be regarded as being equivalent. To further characterize the relationship between the VESPA and the VEP and the generative mechanisms underlying them, we recorded EEG from 31 subjects in response to checkerboard-based VEP and VESPA stimuli. We found that, across subjects, the amplitudes of the VEP C1 component and the VESPA C1 component were highly correlated, whereas the VEP P1 and the VESPA P1 bore no statistical relationship. Furthermore, we found that C1 and P1 amplitudes were significantly correlated in the VESPA but not in the VEP. We believe these findings point to the presence of common generators underlying the VESPA C1 and the VEP C1. We argue further that the VESPA P1, in light of its strong relationship to the VESPA C1, likely reflects further activation of the same cortical generators. Given the lack of correlation between the VEP P1 and each of these three other components, it is likely that the underlying generators of this particular component are more varied and widespread, as suggested previously. We discuss the implications of these relationships for basic and clinical research using the VESPA and for the assessment of additive-evoked versus phase-reset contributions to the VEP.
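At its core, the VESPA is a linear impulse response estimated by regressing the EEG onto time-lagged copies of a continuously modulated stimulus; the sketch below shows a ridge-regularized version of that estimation on synthetic data, which approximates rather than reproduces the published method.

```python
# Hedged sketch: estimate a visual impulse response (VESPA-like) by ridge
# regression of EEG on time-lagged copies of a continuously varying stimulus.
import numpy as np

rng = np.random.default_rng(0)
fs, dur, n_lags = 500, 60, 150                      # 500 Hz, 60 s of data, 300 ms of lags
n = fs * dur
stimulus = rng.uniform(-1, 1, n)                    # continuously varying contrast signal
true_irf = np.exp(-np.arange(n_lags) / 30.0) * np.sin(np.arange(n_lags) / 10.0)
eeg = np.convolve(stimulus, true_irf)[:n] + rng.standard_normal(n)   # synthetic EEG

# Lagged design matrix: each column is the stimulus delayed by one sample.
X = np.zeros((n, n_lags))
for lag in range(n_lags):
    X[lag:, lag] = stimulus[:n - lag]

lam = 1e2                                           # ridge penalty (arbitrary here)
irf_hat = np.linalg.solve(X.T @ X + lam * np.eye(n_lags), X.T @ eeg)
print("correlation with true kernel:", np.corrcoef(irf_hat, true_irf)[0, 1])
```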
Affiliation(s)
- Jeremy W Murphy
- Program in Cognitive Neuroscience, Department of Psychology, City College of the City University of New York, New York, NY 10031, USA.
244
Abstract
Sensory receptive fields (RFs) vary as a function of stimulus properties and measurement methods. Previous stimuli or surrounding stimuli facilitate, suppress, or change the selectivity of sensory neurons' responses. Here, we propose that these spatiotemporal contextual dependencies are signatures of efficient perceptual inference and can be explained by a single neural mechanism, input targeted divisive inhibition. To respond both selectively and reliably, sensory neurons should behave as active predictors rather than passive filters. In particular, they should remove input they can predict ("explain away") from the synaptic inputs to all other neurons. This implies that RFs are constantly and dynamically reshaped by the spatial and temporal context, while the true selectivity of sensory neurons resides in their "predictive field." This approach motivates a reinvestigation of sensory representations and particularly the role and specificity of surround suppression and adaptation in sensory areas.
245
Teichmann M, Wiltschut J, Hamker F. Learning Invariance from Natural Images Inspired by Observations in the Primary Visual Cortex. Neural Comput 2012; 24:1271-96. [DOI: 10.1162/neco_a_00268] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
The human visual system has the remarkable ability to recognize objects largely invariant of their position, rotation, and scale. Interpreting the relevant neurobiological findings calls for a computational model that simulates signal processing in the visual cortex, and such invariance is likely achieved step by step from early to late areas of visual perception. While several algorithms have been proposed for learning feature detectors, only a few studies address biologically plausible learning of such invariance. In this study, a set of Hebbian learning rules based on calcium dynamics and homeostatic regulation of single neurons is proposed. Their performance is verified within a simple model of the primary visual cortex that learns so-called complex cells from a sequence of static images. As a result, the learned complex-cell responses are largely invariant to phase and position.
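The learning rules proposed here are calcium-based and homeostatically regulated; as a stand-in, the sketch below implements a generic rate-based Hebbian update with a sliding (BCM-like) homeostatic threshold, with all constants chosen arbitrarily.

```python
# Hedged sketch: a rate-based Hebbian update gated by a sliding homeostatic
# threshold (BCM-like), standing in for the calcium-based, homeostatically
# regulated rules the paper develops.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_steps, eta = 50, 5000, 1e-3
w = rng.uniform(0, 0.1, n_in)                        # feedforward weights
theta = 1.0                                          # homeostatic threshold on activity

for _ in range(n_steps):
    x = np.abs(rng.standard_normal(n_in))            # placeholder input features
    y = max(w @ x, 0.0)                              # rectified postsynaptic rate
    w += eta * x * y * (y - theta)                   # Hebbian term gated by homeostasis
    w = np.clip(w, 0.0, None)                        # keep weights non-negative
    theta += 0.01 * (y**2 - theta)                   # slowly sliding threshold

print("final weight norm:", float(np.linalg.norm(w)), "threshold:", float(theta))
```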
Affiliation(s)
- Jan Wiltschut
- Chemnitz University of Technology, Chemnitz 01907, Germany, and Westfälische Wilhelms-Universität Münster, 48149 Münster, Germany
- Fred Hamker
- Chemnitz University of Technology, 09107 Chemnitz, Germany
246
Abstract
Stimulus visibility can be reduced by other stimuli that overlap the same region of visual space, a process known as masking. Here we studied the neural mechanisms of masking in humans using source-imaged steady-state visual evoked potentials and frequency-domain analysis over a wide range of relative stimulus strengths of test and mask stimuli. Test and mask stimuli were tagged with distinct temporal frequencies and we quantified spectral response components associated with the individual stimuli (self terms) and responses due to interaction between stimuli (intermodulation terms). In early visual cortex, masking alters the self terms in a manner consistent with a reduction of input contrast. We also identify a novel signature of masking: a robust intermodulation term that peaks when the test and mask stimuli have equal contrast and disappears when they are widely different. We fit all of our data simultaneously with a family of divisive gain-control models that differed only in their dynamics. Models with either very short or very long temporal integration constants for the gain pool performed worse than a model with an integration time of ∼30 ms. Finally, the absolute magnitudes of the response were controlled by the ratio of the stimulus contrasts, not their absolute values. This contrast-contrast invariance suggests that many neurons in early visual cortex code relative rather than absolute contrast. Together, these results provide a more complete description of masking within the normalization framework of contrast gain control and suggest that contrast normalization accomplishes multiple functional goals.
247
Joo SJ, Boynton GM, Murray SO. Long-range, pattern-dependent contextual effects in early human visual cortex. Curr Biol 2012; 22:781-6. [PMID: 22503498 DOI: 10.1016/j.cub.2012.02.067] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2011] [Revised: 02/10/2012] [Accepted: 02/28/2012] [Indexed: 11/25/2022]
Abstract
The standard view of neurons in early visual cortex is that they behave like localized feature detectors. Here we demonstrate that processing in early visual areas goes beyond feature detection by showing that neural responses are greater when a feature deviates from its context compared to when it does not deviate from its context. Using psychophysics, fMRI, and electroencephalography methodologies, we measured neural responses to an oriented Gabor ("target") embedded in various visual patterns as defined by the relative orientation of flanking stimuli. We first show using psychophysical contrast adaptation and fMRI that a target that differs from its context results in more neural activity compared to a target that is contained within an alternating sequence, suggesting that neurons in early visual cortex are sensitive to large-scale orientation patterns. Next, we use event-related potentials to show that orientation deviations affect the earliest sensory components of the target response. Finally, we use forced-choice classification of "noise" stimuli to show that we are more likely to "see" orientations that deviate from the context. Our results suggest that early visual cortex is sensitive to global patterns in images in a way that is markedly different from the predictions of standard models of cortical visual processing.
Affiliation(s)
- Sung Jun Joo
- Department of Psychology, University of Washington, Seattle, WA 98195, USA.
248
Hansen BC, Hess RF. On the effectiveness of noise masks: naturalistic vs. un-naturalistic image statistics. Vision Res 2012; 60:101-13. [PMID: 22484251 DOI: 10.1016/j.visres.2012.03.017] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2011] [Revised: 01/10/2012] [Accepted: 03/23/2012] [Indexed: 11/28/2022]
Abstract
It has been argued that the human visual system is optimized for identification of broadband objects embedded in stimuli possessing orientation-averaged power spectrum fall-offs that obey the 1/f^β relationship typically observed in natural scene imagery (i.e., β=2.0 on logarithmic axes). Here, we were interested in whether individual spatial channels leading to recognition are functionally optimized for narrowband targets when masked by noise possessing naturalistic image statistics (β=2.0). The current study therefore explores the impact of variable β noise masks on the identification of narrowband target stimuli ranging in spatial complexity, while simultaneously controlling for physical or perceived differences between the masks. The results show that β=2.0 noise masks produce the largest identification thresholds regardless of target complexity, and thus do not seem to yield functionally optimized channel processing. The differential masking effects are discussed in the context of contrast gain control.
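Variable-β noise masks of this kind are usually generated by shaping the amplitude spectrum of random-phase noise; a minimal sketch follows (image size and β values are placeholders, and because the power law applies to power, the amplitude exponent is β/2).

```python
# Hedged sketch: generate noise images whose orientation-averaged power spectra
# fall off as 1/f^beta, the family of masks manipulated in this study.
import numpy as np

def noise_1_over_f(size=256, beta=2.0, rng=None):
    rng = rng or np.random.default_rng(0)
    fy = np.fft.fftfreq(size)[:, None]
    fx = np.fft.fftfreq(size)[None, :]
    f = np.sqrt(fx**2 + fy**2)
    f[0, 0] = f[0, 1]                                  # avoid division by zero at DC
    amplitude = 1.0 / f**(beta / 2.0)                  # power ~ 1/f^beta => amplitude ~ 1/f^(beta/2)
    phase = rng.uniform(0, 2 * np.pi, (size, size))
    img = np.real(np.fft.ifft2(amplitude * np.exp(1j * phase)))
    return (img - img.mean()) / img.std()              # normalize contrast

masks = {b: noise_1_over_f(beta=b) for b in (0.0, 1.0, 2.0, 3.0)}   # beta = 2 ~ natural scenes
print({b: m.shape for b, m in masks.items()})
```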
Affiliation(s)
- Bruce C Hansen
- Department of Psychology & Neuroscience Program, Colgate University, Hamilton, NY 13346, USA.
249
Abstract
Neuroscience seeks to understand how neural circuits lead to behavior. However, the gap between circuits and behavior is too wide. An intermediate level is one of neural computations, which occur in individual neurons and populations of neurons. Some computations seem to be canonical: repeated and combined in different ways across the brain. To understand neural computations, we must record from a myriad of neurons in multiple brain regions. Understanding computation guides research in the underlying circuits and provides a language for theories of behavior.
250
Natural versus synthetic stimuli for estimating receptive field models: a comparison of predictive robustness. J Neurosci 2012; 32:1560-76. [PMID: 22302799 DOI: 10.1523/jneurosci.4661-12.2012] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
An ultimate goal of visual neuroscience is to understand the neural encoding of complex, everyday scenes. Yet most of our knowledge of neuronal receptive fields has come from studies using simple artificial stimuli (e.g., bars, gratings) that may fail to reveal the full nature of a neuron's actual response properties. Our goal was to compare the utility of artificial and natural stimuli for estimating receptive field (RF) models. Using extracellular recordings from simple type cells in cat A18, we acquired responses to three types of broadband stimulus ensembles: two widely used artificial patterns (white noise and short bars), and natural images. We used a primary dataset to estimate the spatiotemporal receptive field (STRF) with two hold-back datasets for regularization and validation. STRFs were estimated using an iterative regression algorithm with regularization and subsequently fit with a zero-memory nonlinearity. Each RF model (STRF and zero-memory nonlinearity) was then used in simulations to predict responses to the same stimulus type used to estimate it, as well as to other broadband stimuli and sinewave gratings. White noise stimuli often elicited poor responses leading to noisy RF estimates, while short bars and natural image stimuli were more successful in driving A18 neurons and producing clear RF estimates with strong predictive ability. Natural image-derived RF models were the most robust at predicting responses to other broadband stimulus ensembles that were not used in their estimation and also provided good predictions of tuning curves for sinewave gratings.
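The estimation pipeline described above, regularized regression on a lagged stimulus representation with separate regularization and validation data, can be approximated by a simple ridge fit; the sketch below uses synthetic data and a basic penalty search in place of the paper's iterative algorithm and nonlinearity fit.

```python
# Hedged sketch: spatiotemporal receptive field (STRF) estimation by regularized
# regression, with a hold-back set used to choose the regularization strength,
# loosely following the estimation/regularization/validation split described here.
import numpy as np

rng = np.random.default_rng(0)
T, D, lags = 15000, 12, 8
stim = rng.standard_normal((T, D))
true_strf = rng.standard_normal((lags, D)) * np.exp(-np.arange(lags))[:, None]

# Lagged design matrix and a synthetic response.
X = np.zeros((T, lags * D))
for lag in range(lags):
    X[lag:, lag * D:(lag + 1) * D] = stim[:T - lag]
y = X @ true_strf.ravel() + 2.0 * rng.standard_normal(T)

fit, val = slice(0, 10000), slice(10000, 12500)       # estimation and regularization sets
best = None
for lam in (1e0, 1e1, 1e2, 1e3):
    w = np.linalg.solve(X[fit].T @ X[fit] + lam * np.eye(lags * D), X[fit].T @ y[fit])
    r = np.corrcoef(X[val] @ w, y[val])[0, 1]
    if best is None or r > best[0]:
        best = (r, lam, w)

print("chosen lambda:", best[1], "validation r:", round(best[0], 3))
```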