1
Perrett D. Charlie Gross: An inspiration. Prog Neurobiol 2020; 195:101928. [PMID: 33075448 DOI: 10.1016/j.pneurobio.2020.101928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Affiliation(s)
- David Perrett
- University of Saint Andrews, School of Psychology & Neuroscience, St Mary's Quadrangle, St Andrews, Fife, KY16 9JP, United Kingdom.
2
Barwich AS. The Value of Failure in Science: The Story of Grandmother Cells in Neuroscience. Front Neurosci 2019; 13:1121. [PMID: 31708726 PMCID: PMC6822296 DOI: 10.3389/fnins.2019.01121] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 07/06/2019] [Accepted: 10/04/2019] [Indexed: 11/13/2022] Open
Abstract
The annals of science are filled with successes. Only in footnotes do we hear about the failures, the cul-de-sacs, and the forgotten ideas. Failure is how research advances. Yet it hardly features in theoretical perspectives on science. That is a mistake. Failures, whether clear-cut or ambiguous, are heuristically fruitful in their own right. Thinking about failure questions our measures of success, including the conceptual foundations of current practice, that can only be transient in an experimental context. This article advances the heuristics of failure analysis, meaning the explicit treatment of certain ideas or models as failures. The value of failures qua being a failure is illustrated with the example of grandmother cells; the contested idea of a hypothetical neuron that encodes a highly specific but complex stimulus, such as the image of one's grandmother. Repeatedly evoked in popular science and maintained in textbooks, there is sufficient reason to critically review the theoretical and empirical background of this idea.
Affiliation(s)
- Ann-Sophie Barwich
- Department of History and Philosophy of Science and Medicine, Cognitive Science Program, Indiana University Bloomington, Bloomington, IN, United States
3
Abstract
Does the sense of smell involve the perception of odor objects? General discussion of perceptual objecthood centers on three criteria: stimulus representation, perceptual constancy, and figure-ground segregation. These criteria, derived from theories of vision, have been applied to olfaction in recent philosophical debates about psychology. An inherent problem with such framing of olfactory objecthood is that philosophers explicitly ignore the constitutive factors of the sensory systems that underpin the implementation of these criteria. The biological basis of odor coding is fundamentally different from the coding principles of the visual system. This article analyzes the three measures of perceptual objecthood against the biological background of the olfactory system. It contrasts the coding principles in olfaction with the visual system to show why these criteria of objecthood fail to be instantiated in odor perception. The argument demonstrates that olfaction affords perceptual categorization without the need to form odor objects.
Affiliation(s)
- Ann-Sophie Barwich
- Cognitive Science Program, Indiana University, Bloomington, IN, United States
4
Spratling MW. A Hierarchical Predictive Coding Model of Object Recognition in Natural Images. Cognit Comput 2016; 9:151-167. [PMID: 28413566 PMCID: PMC5371651 DOI: 10.1007/s12559-016-9445-1] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 09/23/2016] [Accepted: 12/09/2016] [Indexed: 11/02/2022]
Abstract
Predictive coding has been proposed as a model of the hierarchical perceptual inference process performed in the cortex. However, results demonstrating that predictive coding is capable of performing the complex inference required to recognise objects in natural images have not previously been presented. This article proposes a hierarchical neural network based on predictive coding for performing visual object recognition. This network is applied to the tasks of categorising hand-written digits, identifying faces, and locating cars in images of street scenes. It is shown that image recognition can be performed with tolerance to position, illumination, size, partial occlusion, and within-category variation. The current results, therefore, provide the first practical demonstration that predictive coding (at least the particular implementation of predictive coding used here; the PC/BC-DIM algorithm) is capable of performing accurate visual object recognition.
Affiliation(s)
- M. W. Spratling
- Department of Informatics, King’s College London, Strand, London, WC2R 2LS UK
5
A Required Paradigm Shift in Today’s Vision Research. Künstliche Intelligenz 2015. [DOI: 10.1007/s13218-014-0347-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
6
Pooresmaeili A, Roelfsema PR. A growth-cone model for the spread of object-based attention during contour grouping. Curr Biol 2014; 24:2869-77. [PMID: 25456446 DOI: 10.1016/j.cub.2014.10.007] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Received: 04/18/2014] [Revised: 08/15/2014] [Accepted: 10/06/2014] [Indexed: 11/28/2022]
Abstract
BACKGROUND: Object-based attention can group image elements of spatially extended objects into coherent representations, but its mechanisms have remained unclear. The mechanisms for object-based attention may include shape-selective neurons in higher visual cortical areas that feed back to lower areas to simultaneously enhance the representation of all image elements of a relevant shape. Another, nonexclusive mechanism is the spread of attention in early visual cortex according to Gestalt rules, which could successively add new elements to a growing object representation. RESULTS: We investigated the dynamics of object-based attention in the primary visual cortex of monkeys trained to perform a contour-grouping task. The animals mentally traced a target curve through the visual field and ignored a distracting curve. Neuronal responses elicited by the target curve were enhanced relative to those elicited by the distracting curve. Remarkably, the response enhancement was delayed for neurons with receptive fields farther from the start of the tracing process. We could therefore measure propagation speed and found that it was low if curves were nearby and that it increased if curves were far apart. The results are well explained by an "attentional growth-cone" model, which holds that the response enhancement can spread in multiple visual cortical areas with different receptive field sizes at a speed of approximately 50 ms per receptive field. CONCLUSIONS: Our findings support an active role for early visual areas in object-based attention because neurons in these areas gradually spread enhanced activity over the representation of relevant objects.
Affiliation(s)
- Arezoo Pooresmaeili
- The Netherlands Institute for Neuroscience, Royal Netherlands Academy of Arts and Sciences, Meibergdreef 47, 1105 BA Amsterdam, the Netherlands
- Pieter R Roelfsema
- The Netherlands Institute for Neuroscience, Royal Netherlands Academy of Arts and Sciences, Meibergdreef 47, 1105 BA Amsterdam, the Netherlands; Department of Integrative Neurophysiology, Centre for Neurogenomics and Cognitive Research, VU University, De Boelelaan 1085, 1081 HV Amsterdam, the Netherlands; Psychiatry Department, Academic Medical Center, Meibergdreef 5, 1105 AZ Amsterdam, the Netherlands.
7
Cortical gamma oscillations: the functional key is activation, not cognition. Neurosci Biobehav Rev 2013; 37:401-17. [PMID: 23333264 DOI: 10.1016/j.neubiorev.2013.01.013] [Citation(s) in RCA: 80] [Impact Index Per Article: 6.7] [Received: 07/29/2012] [Revised: 12/28/2012] [Accepted: 01/07/2013] [Indexed: 12/19/2022]
Abstract
Cortical oscillatory synchrony in the gamma range has been attracting increasing attention in cognitive neuroscience ever since being proposed as a solution to the so-called binding problem. This growing literature is critically reviewed in both its basic neuroscience and cognitive aspects. A physiological "default assumption" regarding these oscillations is introduced, according to which they signal a state of physiological activation of cortical tissue, and the associated need to balance excitation with inhibition in particular. As such these oscillations would belong among a variety of generic neural control operations that enable neural tissue to perform its systems level functions, without implementing those functions themselves. Regional control of cerebral blood flow provides an analogy in this regard, and gamma oscillations are tightly correlated with this even more elementary control operation. As correlates of neural activation they will also covary with cognitive activity, and this typically suffices to account for the covariation between gamma activity and cognitive task variables. A number of specific cases of gamma synchrony are examined in this light, including the original impetus for attributing cognitive significance to gamma activity, namely the experiments interpreted as evidence for "binding by synchrony". This examination finds no compelling reasons to assign functional roles to oscillatory synchrony in the gamma range beyond its generic functions at the level of infrastructural neural control.
8
Wi NTN, Loo CK, Chockalingam L. Biologically inspired face recognition: toward pose-invariance. Int J Neural Syst 2012. [PMID: 23186278 DOI: 10.1142/s0129065712500293] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022]
Abstract
A small change in an image will cause a dramatic change in signals. The visual system is required to be able to ignore these changes, yet be specific enough to perform recognition. This work intends to provide biologically backed insights into 2D translation and scaling invariance and 3D pose-invariance without imposing strain on memory and with biological justification. The model can be divided into lower and higher visual stages. The lower visual stage models the visual pathway from the retina to the striate cortex (V1), whereas the modeling of the higher visual stage is mainly based on current psychophysical evidence.
Affiliation(s)
- Noel Tay Nuo Wi
- Centre of Diploma Programmes, Multimedia University, Jalan Ayer Keroh Lama, Melaka, Malaysia.
9
Sebanz N, Knoblich G, Stumpf L, Prinz W. Far from action-blind: Representation of others' actions in individuals with Autism. Cogn Neuropsychol 2012; 22:433-54. [PMID: 21038260 DOI: 10.1080/02643290442000121] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Indexed: 12/28/2022]
Abstract
It has been suggested that theory of mind may rely on several precursors including gaze processing, joint attention, the ability to distinguish between actions of oneself and others, and the ability to represent goal-directed actions. Some of these processes have been shown to be impaired in individuals with autism, who experience difficulties in theory of mind. However, little is known about action representation in autism. Using two variants of a spatial compatibility reaction time (RT) task, we addressed the question of whether high-functioning individuals with autism have difficulties in controlling their own actions and in representing those of others. Participants with autism showed automatic response activation and had no difficulties with response inhibition. When two action alternatives were distributed among pairs of participants, participants with autism represented a co-actor's task, showing the same pattern of results as the matched control group. We discuss the possibility that in high-functioning individuals with autism, the system matching observed actions onto representations of one's own actions is intact, whereas difficulties in higher-level processing of social information persist.
Affiliation(s)
- Natalie Sebanz
- Max Planck Institute for Psychological Research, Munich, Germany
10
Pugeault N, Wörgötter F, Krüger N. Visual Primitives: Local, Condensed, Semantically Rich Visual Descriptors and Their Applications in Robotics. Int J Hum Robot 2011. [DOI: 10.1142/s0219843610002209] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Indexed: 11/18/2022]
Abstract
We present a novel representation of visual information, based on local symbolic descriptors, that we call visual primitives. These primitives: (1) combine different visual modalities, (2) associate semantic to local scene information, and (3) reduce the bandwidth while increasing the predictability of the information exchanged across the system. This representation leads to the concept of early cognitive vision that we define as an intermediate level between dense, signal-based early vision and high-level cognitive vision. The framework's potential is demonstrated in several applications, in particular in the area of robotics and humanoid robotics, which are briefly outlined.
Affiliation(s)
- Nicolas Pugeault
- Centre for Vision, Speech and Signal Processing, University of Surrey, GU2 7XH Surrey, UK
- Florentin Wörgötter
- Computational Neurosciences Group, Georg-August-Universität Göttingen, 37083 Göttingen, Germany
- Norbert Krüger
- The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, DK-5230 Odense M, Denmark
11
Caggiano V, Fogassi L, Rizzolatti G, Pomper JK, Thier P, Giese MA, Casile A. View-Based Encoding of Actions in Mirror Neurons of Area F5 in Macaque Premotor Cortex. Curr Biol 2011; 21:144-8. [DOI: 10.1016/j.cub.2010.12.022] [Citation(s) in RCA: 153] [Impact Index Per Article: 10.9] [Received: 08/24/2010] [Revised: 11/02/2010] [Accepted: 12/10/2010] [Indexed: 11/28/2022]
12
Pugeault N, Wörgötter F, Krüger N. Disambiguating multi-modal scene representations using perceptual grouping constraints. PLoS One 2010; 5:e10663. [PMID: 20544006 PMCID: PMC2882939 DOI: 10.1371/journal.pone.0010663] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 11/29/2009] [Accepted: 04/20/2010] [Indexed: 12/02/2022] Open
Abstract
In its early stages, the visual system suffers from a lot of ambiguity and noise that severely limits the performance of early vision algorithms. This article presents feedback mechanisms between early visual processes, such as perceptual grouping, stereopsis and depth reconstruction, that allow the system to reduce this ambiguity and improve early representation of visual information. In the first part, the article proposes a local perceptual grouping algorithm that - in addition to commonly used geometric information - makes use of a novel multi-modal measure between local edge/line features. The grouping information is then used to: 1) disambiguate stereopsis by enforcing that stereo matches preserve groups; and 2) correct the reconstruction error due to the image pixel sampling using a linear interpolation over the groups. The integration of mutual feedback between early vision processes is shown to reduce considerably ambiguity and noise without the need for global constraints.
Affiliation(s)
- Nicolas Pugeault
- Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, UK.
13
Further evidence for the spread of attention during contour grouping: A reply to Crundall, Dewhurst, and Underwood (2008). Atten Percept Psychophys 2010; 72:849-62. [DOI: 10.3758/app.72.3.849] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/08/2022]
14
Westphal G, Würtz RP. Combining feature- and correspondence-based methods for visual object recognition. Neural Comput 2009; 21:1952-89. [PMID: 19292649 DOI: 10.1162/neco.2009.12-07-675] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/04/2022]
Abstract
We present an object recognition system built on a combination of feature- and correspondence-based pattern recognizers. The feature-based part, called preselection network, is a single-layer feedforward network weighted with the amount of information contributed by each feature to the decision at hand. For processing arbitrary objects, we employ small, regular graphs whose nodes are attributed with Gabor amplitudes, termed parquet graphs. The preselection network can quickly rule out most irrelevant matches and leaves only the ambiguous cases, so-called model candidates, to be verified by a rudimentary version of elastic graph matching, a standard correspondence-based technique for face and object recognition. According to the model, graphs are constructed that describe the object in the input image well. We report the results of experiments on standard databases for object recognition. The method achieved high recognition rates on identity and pose. Unlike many other models, it can also cope with varying background, multiple objects, and partial occlusion.
Affiliation(s)
- Günter Westphal
- Mobile Vision Systems, Blücherstrasse 19, D-46397 Bocholt, Germany
15
Wolfrum P, von der Malsburg C. What Is the Optimal Architecture for Visual Information Routing? Neural Comput 2007; 19:3293-309. [DOI: 10.1162/neco.2007.19.12.3293] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/04/2022]
Abstract
Analyzing the design of networks for visual information routing is an underconstrained problem due to insufficient anatomical and physiological data. We propose here optimality criteria for the design of routing networks. For a very general architecture, we derive the number of routing layers and the fanout that minimize the required neural circuitry. The optimal fanout l is independent of network size, while the number k of layers scales logarithmically (with a prefactor below 1), with the number n of visual resolution units to be routed independently. The results are found to agree with data of the primate visual system.
Affiliation(s)
- Philipp Wolfrum
- Frankfurt Institute for Advanced Studies, D-60438 Frankfurt am Main, Germany
- Christoph von der Malsburg
- Frankfurt Institute for Advanced Studies, D-60438 Frankfurt am Main, Germany, and Computer Science Department, University of Southern California, Los Angeles, CA 90089, U.S.A
16
Masquelier T, Thorpe SJ. Unsupervised learning of visual features through spike timing dependent plasticity. PLoS Comput Biol 2007; 3:e31. [PMID: 17305422 PMCID: PMC1797822 DOI: 10.1371/journal.pcbi.0030031] [Citation(s) in RCA: 205] [Impact Index Per Article: 11.4] [Received: 11/10/2006] [Accepted: 01/02/2007] [Indexed: 11/18/2022] Open
Abstract
Spike timing dependent plasticity (STDP) is a learning rule that modifies synaptic strength as a function of the relative timing of pre- and postsynaptic spikes. When a neuron is repeatedly presented with similar inputs, STDP is known to have the effect of concentrating high synaptic weights on afferents that systematically fire early, while postsynaptic spike latencies decrease. Here we use this learning rule in an asynchronous feedforward spiking neural network that mimics the ventral visual pathway and show that when the network is presented with natural images, selectivity to intermediate-complexity visual features emerges. Those features, which correspond to prototypical patterns that are both salient and consistently present in the images, are highly informative and enable robust object recognition, as demonstrated on various classification tasks. Taken together, these results show that temporal codes may be a key to understanding the phenomenal processing speed achieved by the visual system and that STDP can lead to fast and selective responses.
Affiliation(s)
- Timothée Masquelier
- Centre de Recherche Cerveau et Cognition, Centre National de la Recherche Scientifique, Université Paul Sabatier, Faculté de Médecine de Rangueil, Toulouse, France.
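The STDP rule summarized in the Masquelier and Thorpe abstract above can be illustrated with a minimal sketch. It assumes the textbook pair-based exponential window, which is not necessarily the exact (simplified) rule used in the paper, and every name and parameter value below (`A_PLUS`, `TAU_PLUS`, `stdp_delta`, `train`, and so on) is hypothetical and chosen only for illustration. The postsynaptic spike time is held fixed, so the decrease in postsynaptic latency mentioned in the abstract is not modeled here.

```python
import math

# Hypothetical parameters, chosen only for illustration.
A_PLUS = 0.03      # potentiation amplitude
A_MINUS = 0.025    # depression amplitude
TAU_PLUS = 17.0    # potentiation time constant (ms)
TAU_MINUS = 34.0   # depression time constant (ms)


def stdp_delta(t_pre, t_post):
    """Weight change for one pre/post spike pair (times in ms)."""
    dt = t_post - t_pre
    if dt >= 0:
        # Presynaptic spike came first: strengthen the synapse.
        return A_PLUS * math.exp(-dt / TAU_PLUS)
    # Presynaptic spike came after the postsynaptic one: weaken it.
    return -A_MINUS * math.exp(dt / TAU_MINUS)


def train(weights, pre_times, t_post, repeats):
    """Apply the rule repeatedly to the same input pattern, clipping to [0, 1]."""
    w = list(weights)
    for _ in range(repeats):
        for i, t_pre in enumerate(pre_times):
            w[i] = min(1.0, max(0.0, w[i] + stdp_delta(t_pre, t_post)))
    return w


# Three afferents: two fire before the postsynaptic spike at 10 ms, one after.
final = train([0.5, 0.5, 0.5], pre_times=[2.0, 8.0, 15.0], t_post=10.0, repeats=50)
print(final)
```

With repeated presentation of the same pattern, the weights of the two afferents that fire before the postsynaptic spike saturate at the upper bound, while the weight of the afferent that fires afterwards decays to zero. This is the weight-concentration effect the abstract describes, under the simplifying assumptions stated above.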
17
Abstract
A fundamental task of vision is to group the image elements that belong to one object and to segregate them from other objects and the background. This review provides a conceptual framework of how perceptual grouping may be implemented in the visual cortex. According to this framework, two mechanisms are responsible for perceptual grouping: base-grouping and incremental grouping. Base-groupings are coded by single neurons tuned to multiple features, like the combination of a color and an orientation. They are computed rapidly because they reflect the selectivity of feedforward connections. However, not all conceivable feature combinations are coded by dedicated neurons. Therefore, a second, flexible form of grouping is required called incremental grouping. Incremental grouping enhances the responses of neurons coding features that are bound in perception, but it takes more time than does base-grouping because it relies also on horizontal and feedback connections. The modulation of neuronal response strength during incremental grouping has a correlate in psychology because attention is directed to those features that are labeled by the enhanced neuronal response.
Affiliation(s)
- Pieter R Roelfsema
- The Netherlands Ophthalmic Research Institute, Meibergdreef 47, 1105 BA Amsterdam, The Netherlands.
18
de Kamps M, van der Velde F. Neural blackboard architectures: the realization of compositionality and systematicity in neural networks. J Neural Eng 2006; 3:R1-12. [PMID: 16510935 DOI: 10.1088/1741-2560/3/1/r01] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/12/2022]
Abstract
In this paper, we will first introduce the notions of systematicity and combinatorial productivity and we will argue that these notions are essential for human cognition and probably for every agent that needs to be able to deal with novel, unexpected situations in a complex environment. Agents that use compositional representations are faced with the so-called binding problem and the question of how to create neural network architectures that can deal with it is essential for understanding higher level cognition. Moreover, an architecture that can solve this problem is likely to scale better with problem size than other neural network architectures. Then, we will discuss object-based attention. The influence of spatial attention is well known, but there is solid evidence for object-based attention as well. We will discuss experiments that demonstrate object-based attention and will discuss a model that can explain the data of these experiments very well. The model strongly suggests that this mode of attention provides a neural basis for parallel search. Next, we will show a model for binding in visual cortex. This model is based on a so-called neural blackboard architecture, where higher cortical areas act as processors, specialized for specific features of a visual stimulus, and lower visual areas act as a blackboard for communication between these processors. This implies that lower visual areas are involved in more than bottom-up visual processing, something which already was apparent from the large number of recurrent connections from higher to lower visual areas. This model identifies a specific role for these feedback connections. Finally, we will discuss the experimental evidence that exists for this architecture.
Affiliation(s)
- Marc de Kamps
- Institut für Informatik, Technische Universität München, Boltzmannstrasse 3, D-85748 Garching bei München, Germany.
19
Perception and Synthesis of Biologically Plausible Motion: From Human Physiology to Virtual Reality. Lecture Notes in Computer Science 2006. [DOI: 10.1007/11678816_1] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 01/11/2023]
20
A Convolutional Neural Network Tolerant of Synaptic Faults for Low-Power Analog Hardware. Artificial Neural Networks in Pattern Recognition 2006. [DOI: 10.1007/11829898_11] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 12/30/2022]
21
Oram MW. Integrating neuronal coding into cognitive models: predicting reaction time distributions. Network (Bristol, England) 2005; 16:377-400. [PMID: 16611591 DOI: 10.1080/09548980500445039] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 05/08/2023]
Abstract
Neurophysiological studies have examined many aspects of neuronal activity in terms of neuronal codes and postulated roles for these codes in brain processing. There has been relatively little work, however, examining the relationship between different neuronal codes and the behavioural phenomena associated with cognitive processes. Here, predictions about reaction time distributions derived from an accumulator model incorporating known neurophysiological data in temporal lobe visual areas of the macaque are examined. Results from human experimental studies examining the effects of changing stimulus orientation, size and contrast are consistent with the model, including qualitatively different changes in reaction time distributions with different stimulus manipulations. The different changes in reaction time distributions depend on whether the image manipulation changes neuronal response latency or magnitude and can be related to parallel or serial cognitive processes respectively. The results indicate that neuronal coding can be productively incorporated into computational models to provide mechanistic accounts of behavioural results related to cognitive phenomena.
Affiliation(s)
- Mike W Oram
- School of Psychology, University of St Andrews, Fife, KY16 9JP, UK.
22
Patel AD, Iversen JR, Chen Y, Repp BH. The influence of metricality and modality on synchronization with a beat. Exp Brain Res 2005; 163:226-38. [PMID: 15654589 DOI: 10.1007/s00221-004-2159-8] [Citation(s) in RCA: 180] [Impact Index Per Article: 9.0] [Received: 02/27/2004] [Accepted: 10/18/2004] [Indexed: 11/28/2022]
Abstract
The great majority of the world's music is metrical, i.e., has periodic structure at multiple time scales. Does the metrical structure of a non-isochronous rhythm improve synchronization with a beat compared to synchronization with an isochronous sequence at the beat period? Beat synchronization is usually associated with auditory stimuli, but are people able to extract a beat from rhythmic visual sequences with metrical structure? We addressed these questions by presenting listeners with rhythmic patterns which were either isochronous or non-isochronous in either the auditory or visual modality, and by asking them to tap to the beat, which was prescribed to occur at 800-ms intervals. For auditory patterns, we found that a strongly metrical structure did not improve overall accuracy of synchronization compared with isochronous patterns of the same beat period, though it did influence the higher-level patterning of taps. Synchronization was impaired in weakly metrical patterns in which some beats were silent. For the visual patterns, we found that participants were generally unable to synchronize to metrical non-isochronous rhythms, or to rapid isochronous rhythms. This suggests that beat perception and synchronization have a special affinity with the auditory system.
Affiliation(s)
- Aniruddh D Patel
- The Neurosciences Institute, 10640 John Jay Hopkins Drive, San Diego, CA 92121, USA.
23
Fiser J, Aslin RN. Encoding Multielement Scenes: Statistical Learning of Visual Feature Hierarchies. J Exp Psychol Gen 2005; 134:521-37. [PMID: 16316289 DOI: 10.1037/0096-3445.134.4.521] [Citation(s) in RCA: 120] [Impact Index Per Article: 6.0] [Indexed: 11/08/2022]
Abstract
The authors investigated how human adults encode and remember parts of multielement scenes composed of recursively embedded visual shape combinations. The authors found that shape combinations that are parts of larger configurations are less well remembered than shape combinations of the same kind that are not embedded. Combined with basic mechanisms of statistical learning, this embeddedness constraint enables the development of complex new features for acquiring internal representations efficiently without being computationally intractable. The resulting representations also encode parts and wholes by chunking the visual input into components according to the statistical coherence of their constituents. These results suggest that a bootstrapping approach of constrained statistical learning offers a unified framework for investigating the formation of different internal representations in pattern and scene perception.
Affiliation(s)
- József Fiser
- Department of Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA.
24
Panzeri S, Petroni F, Bracci E. Exploring structure-function relationships in neocortical networks by means of neuromodelling techniques. Med Eng Phys 2004; 26:699-710. [PMID: 15564107 DOI: 10.1016/j.medengphy.2004.06.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2004] [Accepted: 06/29/2004] [Indexed: 11/17/2022]
Abstract
Determining the neuronal architecture underlying certain visual functions is of fundamental importance for understanding how sensory processing is implemented in the brain. The wealth of anatomical, physiological and biophysical data that is being currently acquired on the neocortex could be used to constrain its functional architecture. However, given the intrinsic complexity and diversity of the data, it is difficult to provide a comprehensive framework to use these data in order to characterize structure-function relationships. Here, we discuss the use of biophysically plausible models of dynamics of neuronal networks, constructed to reflect the known properties of neocortical connectivity and modularity, as a tool to bring together anatomy and physiology. We illustrate the utility and rationale of the neuro-dynamics modelling approach by considering recent studies on the relationship between functional structure of the visual cortex and its response timing, and on the cellular and network origin of neuronal oscillations in the gamma frequency range. We also critically discuss how an interaction between theory and experiments could help this approach to become directly relevant for clinical applications.
Affiliation(s)
- Stefano Panzeri
- Department of Optometry and Neuroscience, UMIST, P.O. Box 88, Manchester M60 1QD, UK.
|
25
|
Abstract
The correlation between relative neocortex size and longevity in mammals encourages a search for a cortical function specifically related to the life-span. A candidate in the domain of permanent and cumulative memory storage is proposed and explored in relation to basic aspects of cortical organization. The pattern of cortico-cortical connectivity between functionally specialized areas and the laminar organization of that connectivity converges on a globally coherent representational space in which contextual embedding of information emerges as an obligatory feature of cortical function. This brings a powerful mode of inductive knowledge within reach of mammalian adaptations. It combines item specificity with classificatory generality, as embodied in "latent semantic analysis" algorithms. Its neural implementation is proposed to depend on an obligatory interaction between the oppositely directed feedforward and feedback currents of cortical activity, in countercurrent fashion. Direct interaction of the two streams along their cortex-wide local interface supports a scheme of "contextual capture" for information storage responsible for the lifelong cumulative growth of a uniquely cortical form of memory termed "personal history." This approach to cortical function helps elucidate key features of cortical organization as well as cognitive aspects of mammalian life history strategies.
Affiliation(s)
- Bjorn Merker
- Department of Psychology, Uppsala University, Uppsala, Sweden.
|
26
|
Eifuku S, De Souza WC, Tamura R, Nishijo H, Ono T. Neuronal Correlates of Face Identification in the Monkey Anterior Temporal Cortical Areas. J Neurophysiol 2004; 91:358-71. [PMID: 14715721 DOI: 10.1152/jn.00198.2003] [Citation(s) in RCA: 95] [Impact Index Per Article: 4.5] [Indexed: 11/22/2022] Open
Abstract
To investigate the neuronal basis underlying face identification, the activity of face neurons in the anterior superior temporal sulcus (STS) and the anterior inferior temporal gyrus (ITG) of macaque monkeys was analyzed during their performance of a face-identification task. The face space was composed by the activities of face neurons during the face-identification task, based on a multidimensional scaling (MDS) method; the face space composed by the anterior STS neurons represented facial views, whereas that composed by the anterior ITG neurons represented facial identity. The temporal correlation between the behavioral reaction time of the animal and the latency of face-related neuronal responses was also analyzed. The response latency of some of the face neurons in the anterior ITG exhibited a significant correlation with the behavioral reaction time, whereas this correlation was not significant in the anterior STS. The correlation of the latency of face-related neuronal responses in the anterior ITG with the behavioral reaction time was not found to be attributed to the correlation between the response latency and the magnitude of the neuronal responses. The present results suggest that the anterior ITG is closely related to judgments of facial identity, and that the anterior STS is closely related to analyses of incoming perceptual information; face identification in monkeys might involve interactions between the two areas.
Affiliation(s)
- Satoshi Eifuku
- Department of Physiology, Faculty of Medicine, Toyama Medical and Pharmaceutical University, 2630 Sugitani, Toyama 930-0194, Japan
|
27
|
Oram MW, Xiao D, Dritschel B, Payne KR. The temporal resolution of neural codes: does response latency have a unique role? Philos Trans R Soc Lond B Biol Sci 2002; 357:987-1001. [PMID: 12217170 PMCID: PMC1693013 DOI: 10.1098/rstb.2002.1113] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.2] [Indexed: 11/12/2022] Open
Abstract
This article reviews the nature of the neural code in non-human primate cortex and assesses the potential for neurons to carry two or more signals simultaneously. Neurophysiological recordings from visual and motor systems indicate that the evidence for a role for precisely timed spikes relative to other spike times (ca. 1-10 ms resolution) is inconclusive. This indicates that the visual system does not carry a signal that identifies whether the responses were elicited when the stimulus was attended or not. Simulations show that the absence of such a signal reduces, but does not eliminate, the increased discrimination between stimuli that are attended compared with when the stimuli are unattended. The increased accuracy asymptotes with increased gain control, indicating limited benefit from increasing attention. The absence of a signal identifying the attentional state under which stimuli were viewed can produce the greatest discrimination between attended and unattended stimuli. Furthermore, the greatest reduction in discrimination errors occurs for a limited range of gain control, again indicating that attention effects are limited. By contrast to precisely timed patterns of spikes where the timing is relative to other spikes, response latency provides a fine temporal resolution signal (ca. 10 ms resolution) that carries information that is unavailable from coarse temporal response measures. Changes in response latency and changes in response magnitude can give rise to different predictions for the patterns of reaction times. The predictions are verified, and it is shown that the standard method for distinguishing executive and slave processes is only valid if the representations of interest, as evidenced by the neural code, are known. Overall, the data indicate that the signalling evident in neural signals is restricted to the spike count and the precise times of spikes relative to stimulus onset (response latency). 
These coding issues have implications for our understanding of cognitive models of attention and the roles of executive and slave systems.
Affiliation(s)
- M W Oram
- School of Psychology, University of St Andrews, St Andrews, Fife KY16 9JU, UK.
|
28
|
A Component Association Architecture for Image Understanding. 2002. [DOI: 10.1007/3-540-46084-5_201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
|
29
|
Abstract
Invariant features of temporally varying signals are useful for analysis and classification. Slow feature analysis (SFA) is a new method for learning invariant or slowly varying features from a vectorial input signal. It is based on a nonlinear expansion of the input signal and application of principal component analysis to this expanded signal and its time derivative. It is guaranteed to find the optimal solution within a family of functions directly and can learn to extract a large number of decorrelated features, which are ordered by their degree of invariance. SFA can be applied hierarchically to process high-dimensional input signals and extract complex features. SFA is applied first to complex cell tuning properties based on simple cell output, including disparity and motion. Then more complicated input-output functions are learned by repeated application of SFA. Finally, a hierarchical network of SFA modules is presented as a simple model of the visual system. The same unstructured network can learn translation, size, rotation, contrast, or, to a lesser degree, illumination invariance for one-dimensional objects, depending on only the training stimulus. Surprisingly, only a few training objects suffice to achieve good generalization to new objects. The generated representation is suitable for object recognition. Performance degrades if the network is trained to learn multiple invariances simultaneously.
Affiliation(s)
- Laurenz Wiskott
- Computational Neurobiology Laboratory, Salk Institute for Biological Studies, San Diego, CA 92168, USA.
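The two SFA steps named in the abstract above (decorrelate the signal, then apply principal component analysis to its time derivative) can be illustrated with a minimal sketch. This is not the authors' implementation: it is restricted to a two-dimensional input, it omits the nonlinear expansion entirely, and the helper names `sym_eig_2x2`, `second_moment`, and `linear_sfa_2d` as well as the toy sine-wave data are assumptions made for illustration.

```python
import math

def sym_eig_2x2(a, b, d):
    """Eigenpairs of the symmetric matrix [[a, b], [b, d]], smallest first."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    theta = 0.5 * math.atan2(2.0 * b, a - d)  # direction of the larger eigenvalue
    v_hi = (math.cos(theta), math.sin(theta))
    v_lo = (-math.sin(theta), math.cos(theta))
    return (mean - radius, v_lo), (mean + radius, v_hi)

def second_moment(u, v):
    """Average product of two equally long sequences."""
    return sum(p * q for p, q in zip(u, v)) / len(u)

def linear_sfa_2d(x1, x2):
    """Return the slowest unit-variance linear feature of a 2-D signal."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    # Step 1: whiten the centered signal (PCA, then scale to unit variance).
    (l_lo, e_lo), (l_hi, e_hi) = sym_eig_2x2(
        second_moment(c1, c1), second_moment(c1, c2), second_moment(c2, c2))
    z1 = [(e_lo[0] * p + e_lo[1] * q) / math.sqrt(l_lo) for p, q in zip(c1, c2)]
    z2 = [(e_hi[0] * p + e_hi[1] * q) / math.sqrt(l_hi) for p, q in zip(c1, c2)]
    # Step 2: PCA on the finite-difference time derivative; the direction
    # with the smallest derivative variance is the slowest feature.
    d1 = [b - a for a, b in zip(z1, z1[1:])]
    d2 = [b - a for a, b in zip(z2, z2[1:])]
    (_, w), _ = sym_eig_2x2(
        second_moment(d1, d1), second_moment(d1, d2), second_moment(d2, d2))
    return [w[0] * p + w[1] * q for p, q in zip(z1, z2)]

# Toy data (assumed): a slow and a fast sine wave, linearly mixed.
n = 1000
t = [2.0 * math.pi * k / n for k in range(n)]
slow = [math.sin(v) for v in t]
fast = [math.sin(11.0 * v) for v in t]
x1 = [s + f for s, f in zip(slow, fast)]
x2 = [2.0 * s - f for s, f in zip(slow, fast)]
slow_feature = linear_sfa_2d(x1, x2)
```

On this toy input `slow_feature` should track the slow sine wave up to sign and scale. Handling higher-dimensional or nonlinearly mixed inputs would need the nonlinear expansion and a general eigensolver, which this sketch leaves out.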
|
30
|
Abstract
We have studied some of the design trade-offs governing visual representations based on spatially invariant conjunctive feature detectors, with an emphasis on the susceptibility of such systems to false-positive recognition errors-Malsburg's classical binding problem. We begin by deriving an analytical model that makes explicit how recognition performance is affected by the number of objects that must be distinguished, the number of features included in the representation, the complexity of individual objects, and the clutter load, that is, the amount of visual material in the field of view in which multiple objects must be simultaneously recognized, independent of pose, and without explicit segmentation. Using the domain of text to model object recognition in cluttered scenes, we show that with corrections for the nonuniform probability and nonindependence of text features, the analytical model achieves good fits to measured recognition rates in simulations involving a wide range of clutter loads, word size, and feature counts. We then introduce a greedy algorithm for feature learning, derived from the analytical model, which grows a representation by choosing those conjunctive features that are most likely to distinguish objects from the cluttered backgrounds in which they are embedded. We show that the representations produced by this algorithm are compact, decorrelated, and heavily weighted toward features of low conjunctive order. Our results provide a more quantitative basis for understanding when spatially invariant conjunctive features can support unambiguous perception in multiobject scenes, and lead to several insights regarding the properties of visual representations optimized for specific recognition tasks.
Affiliation(s)
- B W Mel
- Department of Biomedical Engineering, University of Southern California, Los Angeles, CA 90089, USA
|
31
|
Abstract
We have studied some of the design trade-offs governing visual representations based on spatially invariant conjunctive feature detectors, with an emphasis on the susceptibility of such systems to false-positive recognition errors-Malsburg's classical binding problem. We begin by deriving an analytical model that makes explicit how recognition performance is affected by the number of objects that must be distinguished, the number of features included in the representation, the complexity of individual objects, and the clutter load, that is, the amount of visual material in the field of view in which multiple objects must be simultaneously recognized, independent of pose, and without explicit segmentation. Using the domain of text to model object recognition in cluttered scenes, we show that with corrections for the nonuniform probability and nonindependence of text features, the analytical model achieves good fits to measured recognition rates in simulations involving a wide range of clutter loads, word size, and feature counts. We then introduce a greedy algorithm for feature learning, derived from the analytical model, which grows a representation by choosing those conjunctive features that are most likely to distinguish objects from the cluttered backgrounds in which they are embedded. We show that the representations produced by this algorithm are compact, decorrelated, and heavily weighted toward features of low conjunctive order. Our results provide a more quantitative basis for understanding when spatially invariant conjunctive features can support unambiguous perception in multiobject scenes, and lead to several insights regarding the properties of visual representations optimized for specific recognition tasks.
Affiliation(s)
- B Mel
- Department of Biomedical Engineering, University of Southern California, Los Angeles, CA 90089, USA
|
32
|
Abstract
A novel architecture and set of learning rules for cortical self-organization is proposed. The model is based on the idea that multiple information channels can modulate one another's plasticity. Features learned from bottom-up information sources can thus be influenced by those learned from contextual pathways, and vice versa. A maximum likelihood cost function allows this scheme to be implemented in a biologically feasible, hierarchical neural circuit. In simulations of the model, we first demonstrate the utility of temporal context in modulating plasticity. The model learns a representation that categorizes people's faces according to identity, independent of viewpoint, by taking advantage of the temporal continuity in image sequences. In a second set of simulations, we add plasticity to the contextual stream and explore variations in the architecture. In this case, the model learns a two-tiered representation, starting with a coarse view-based clustering and proceeding to a finer clustering of more specific stimulus features. This model provides a tenable account of how people may perform 3D object recognition in a hierarchical, bottom-up fashion.
Affiliation(s)
- S Becker
- Department of Psychology, Psychology Building, Room 312, McMaster University, 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada.
|
33
|
Abstract
We describe and analyze an appearance-based 3-D object recognition system that avoids some of the problems of previous appearance-based schemes. We describe various large-scale performance tests and report good performance for full-sphere/hemisphere recognition of up to 24 complex, curved objects, robustness against clutter and occlusion, and some intriguing generic recognition behavior. We also establish a protocol that permits performance in the presence of quantifiable amounts of clutter and occlusion to be predicted on the basis of simple score statistics derived from clean test images and pure clutter images.
Affiliation(s)
- R C Nelson
- Department of Computer Science, University of Rochester, NY 14627, USA.
|
34
|
Perrett DI, Oram MW, Ashbridge E. Evidence accumulation in cell populations responsive to faces: an account of generalisation of recognition without mental transformations. Cognition 1998; 67:111-45. [PMID: 9735538 DOI: 10.1016/s0010-0277(98)00015-8] [Citation(s) in RCA: 190] [Impact Index Per Article: 7.0] [Indexed: 02/08/2023]
Abstract
In this paper we analyse the time course of neuronal activity in temporal cortex to the sight of the head and body. Previous studies have already demonstrated the impact of view, orientation and part occlusion on individual cells. We consider the cells as a population providing evidence in the form of neuronal activity for perceptual decisions related to recognition. The time course of neural responses to stimuli provides an explanation of the variation in speed of recognition across different viewing circumstances that is seen in behavioural experiments. A simple unifying explanation of the behavioural effects is that the speed of recognition of an object depends on the rate of accumulation of activity from neurones selective for the object, evoked by a particular viewing circumstance. This in turn depends on the extent that the object has been seen previously under the particular circumstance. For any familiar object, more cells will be tuned to the configuration of the object's features present in the view or views most frequently experienced. Therefore, activity amongst the population of cells selective for the object's appearance will accumulate more slowly when the object is seen in an unusual view, orientation or size. This accounts for the increased time to recognise rotated views without the need to postulate 'mental rotation' or 'transformations' of novel views to align with neural representations of familiar views.
Affiliation(s)
- D I Perrett
- Psychological Laboratory, St. Andrews University, UK.
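The accumulation account in the abstract above can be made concrete with a toy calculation: if fewer cells are tuned to an unusual view, summed activity reaches a fixed decision threshold later. The sketch below is an illustration only, not the authors' model. The function name `time_to_threshold`, the cell counts, the firing rate, and the threshold are all assumed values.

```python
def time_to_threshold(n_cells, rate_per_cell, threshold, dt=1.0, max_time=10_000.0):
    """Time (ms) for summed population activity to reach a decision threshold.

    n_cells:       number of cells responding to this view (assumed value)
    rate_per_cell: spikes per ms contributed by each responding cell
    threshold:     accumulated spike count needed for recognition
    Returns None if the threshold is not reached within max_time.
    """
    total = 0.0
    t = 0.0
    while total < threshold:
        if t >= max_time:
            return None
        total += n_cells * rate_per_cell * dt
        t += dt
    return t

# Hypothetical numbers: fewer cells are tuned to rarely experienced views.
cells_per_view = {"upright": 40, "rotated": 20, "inverted": 10}
recognition_time = {
    view: time_to_threshold(n, rate_per_cell=0.05, threshold=100.0)
    for view, n in cells_per_view.items()
}
for view, ms in recognition_time.items():
    print(f"{view}: {ms} ms")
```

Under these assumptions the recognition time grows as the number of tuned cells shrinks, which reproduces the qualitative pattern of slower recognition for unusual views without any explicit rotation step.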
|
35
|
Oram MW, Földiák P, Perrett DI, Sengpiel F. The 'Ideal Homunculus': decoding neural population signals. Trends Neurosci 1998; 21:259-65. [PMID: 9641539 DOI: 10.1016/s0166-2236(97)01216-2] [Citation(s) in RCA: 165] [Impact Index Per Article: 6.1] [Indexed: 11/27/2022]
Abstract
Information processing in the nervous system involves the activity of large populations of neurons. It is possible, however, to interpret the activity of relatively small numbers of cells in terms of meaningful aspects of the environment. 'Bayesian inference' provides a systematic and effective method of combining information from multiple cells to accomplish this. It is not a model of a neural mechanism (neither are alternative methods, such as the population vector approach) but a tool for analysing neural signals. It does not require difficult assumptions about the nature of the dimensions underlying cell selectivity, about the distribution and tuning of cell responses or about the way in which information is transmitted and processed. It can be applied to any parameter of neural activity (for example, firing rate or temporal pattern). In this review, we demonstrate the power of Bayesian analysis using examples of visual responses of neurons in primary visual and temporal cortices. We show that interaction between correlation in mean responses to different stimuli (signal) and correlation in response variability within stimuli (noise) can lead to marked improvement of stimulus discrimination using population responses.
Affiliation(s)
- M W Oram
- School of Psychology, University of St Andrews, UK
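As a rough illustration of the Bayesian decoding idea in the abstract above, the sketch below combines spike counts from a few cells into a posterior over stimuli. The abstract stresses that the approach does not depend on particular distributional assumptions; the independent Poisson likelihood used here is a simplifying assumption of this sketch, and the function names, tuning values, and observed counts are hypothetical.

```python
import math

def poisson_log_likelihood(count, mean):
    """Log P(count | mean) for a Poisson-distributed spike count."""
    return count * math.log(mean) - mean - math.lgamma(count + 1)

def decode(counts, tuning, prior=None):
    """Return the posterior P(stimulus | counts) as a dict.

    counts: observed spike counts, one per cell
    tuning: dict mapping stimulus -> list of mean counts per cell (all > 0)
    prior:  dict mapping stimulus -> prior probability (uniform if None)
    """
    stimuli = list(tuning)
    if prior is None:
        prior = {s: 1.0 / len(stimuli) for s in stimuli}
    log_post = {}
    for s in stimuli:
        # Assumed independence: the joint likelihood is a product over cells.
        ll = sum(poisson_log_likelihood(c, m) for c, m in zip(counts, tuning[s]))
        log_post[s] = math.log(prior[s]) + ll
    # Subtract the maximum before exponentiating to avoid underflow.
    peak = max(log_post.values())
    unnorm = {s: math.exp(v - peak) for s, v in log_post.items()}
    total = sum(unnorm.values())
    return {s: v / total for s, v in unnorm.items()}

# Hypothetical mean spike counts of three cells for three stimuli.
tuning = {
    "face": [20.0, 5.0, 2.0],
    "hand": [4.0, 18.0, 3.0],
    "object": [3.0, 4.0, 15.0],
}
posterior = decode([17, 6, 1], tuning)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

A non-uniform `prior` can be passed to reflect how often each stimulus occurs. Correlated response variability across cells, which the abstract discusses, is not modelled here.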
|
36
|
Abstract
The hypothesis that cortical processing of the millisecond time range is performed by latency competition between the first spikes produced by neuronal populations is analyzed. First, theorems that describe how the mechanism of latency competition works in a model cortex are presented. The model is a sequence of cortical areas, each of which is an array of neuronal populations that laterally inhibit each other. Model neurons are integrate-and-fire neurons. Second, the model is applied to the ventral pathway of the temporal lobe, and neuronal activity of the superior temporal sulcus of the monkey is reproduced with the model pathway. It consists of seven areas: V1, V2/V3, V4, PIT, CIT, AIT, and STPa. Neural activity predicted with the model is compared with empirical data. There are four main results: (1) Neural responses of the area STPa of the model showed the same fast discrimination between stimuli that the corresponding responses of the monkey did: both were significant within 5 ms of the response onset. (2) The hypothesis requires that the response latency of cortical neurons should be shorter for stronger responses. This requirement was verified by both the model simulation and the empirical data. (3) The model reproduced fast discrimination even when spontaneous random firing of 9 Hz was introduced to all the cells. This suggests that the latency competition performed by neuronal populations is robust. (4) After the first few competitions, the mechanism of latency competition always detected the strongest of input activations with different latencies.
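The core mechanism described in the abstract above, that stronger input produces an earlier first spike and the earliest spike wins, can be sketched with a single leaky integrate-and-fire unit per population. This is a minimal illustration rather than the paper's multi-area model: lateral inhibition is reduced to picking the first unit to fire, and the function names, time constant, threshold, and drive values are assumptions.

```python
def first_spike_latency(drive, threshold=1.0, tau=10.0, dt=0.1, t_max=100.0):
    """First-spike time (ms) of a leaky integrate-and-fire unit.

    Integrates tau * dV/dt = -V + drive with Euler steps from V = 0.
    Returns None if the unit never reaches threshold within t_max.
    """
    v = 0.0
    steps = int(t_max / dt)
    for k in range(1, steps + 1):
        v += dt * (-v + drive) / tau
        if v >= threshold:
            return k * dt
    return None

def latency_competition(drives):
    """Return (winner, latencies): the unit that spikes first wins.

    Lateral inhibition is idealized here: the first spike is assumed
    to suppress every other unit.
    """
    latencies = {name: first_spike_latency(d) for name, d in drives.items()}
    fired = {name: t for name, t in latencies.items() if t is not None}
    winner = min(fired, key=fired.get) if fired else None
    return winner, latencies

# Hypothetical input strengths for three competing populations.
winner, latencies = latency_competition({"A": 3.0, "B": 2.0, "C": 1.5, "D": 0.5})
print(winner, latencies)
```

With these assumed drives the most strongly driven unit fires first, and the unit whose drive stays below threshold never fires, consistent with the requirement in the abstract that latency be shorter for stronger responses.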
|
37
|
Deubel H, Schneider WX, Paprotta I. Selective Dorsal and Ventral Processing: Evidence for a Common Attentional Mechanism in Reaching and Perception. Visual Cognition 1998. [DOI: 10.1080/713756776] [Citation(s) in RCA: 97] [Impact Index Per Article: 3.6] [Indexed: 10/23/2022]
|
38
|
Masutani T, Tsujino H, Koerner E. A Cortical-type Modular Neural Network for Hypothetical Reasoning. Neural Netw 1997; 10:791-814. [PMID: 12662871 DOI: 10.1016/s0893-6080(96)00126-8] [Citation(s) in RCA: 26] [Impact Index Per Article: 0.9] [Indexed: 11/24/2022]
Abstract
We propose a multilayer neural network architecture that can implement the kind of hypothetical reasoning that the cortex seems to perform in making sense of the sensory input. The elementary processing nodes of each homogeneous sheet are not single formal neurons, but complex modules abducted from the functional organization of neocortical columns. As an example, we simulate face recognition in this neocortical architecture. A holistic but coarse initial hypothesis is generated by express forward input description and subsequently refined under the constraints of this hypothesis. Separation of forward input description and feedback generated hypothesis, while using the difference in both descriptions at each of the modular units to control the refinement, enables robust recognition and has the potential for autonomous learning. Copyright 1997 Elsevier Science Ltd.
|
39
|
Johnson MH, Shrager J. Dynamic Plasticity Influences the Emergence of Function in a Simple Cortical Array. Neural Netw 1996; 9:1119-1129. [PMID: 12662587 DOI: 10.1016/0893-6080(96)00033-0] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.6] [Indexed: 10/17/2022]
Abstract
In computational experiments with a simplified cortical array we investigated the factors that give rise to the functional organization of the cerebral cortex during brain development. We show that a dynamical spatial modulation of plasticity in the substrate (i.e., a "wave of plasticity") induces higher functional development in the later-developing parts of the cortical array. This result suggests an account of the role that changes in developmental timing may have in the development of different cortical structures.
|
40
|
|
41
|
Abstract
The development of the concept of feature binding as fundamental to neural dynamics has made possible recent advances in the modeling of difficult problems of perception and brain function. Major weaknesses of past neural modeling (most prominently its inability to work with natural stimuli and its 'learning-time' barrier) have been traced back to improper treatment of the binding issue. Signal synchrony is now seen as playing a major role in binding. Inclusion of temporal binding in neural models has led to recent breakthroughs in solving important perceptual problems. Among them are perceptual segmentation, invariant object recognition and natural language parsing, as well as overcoming the 'learning-time' barrier.
|