1
Price BH, Gavornik JP. Efficient Temporal Coding in the Early Visual System: Existing Evidence and Future Directions. Front Comput Neurosci 2022; 16:929348. PMID: 35874317; PMCID: PMC9298461; DOI: 10.3389/fncom.2022.929348.
Abstract
While it is universally accepted that the brain makes predictions, there is little agreement about how this is accomplished and under which conditions. Accurate prediction requires neural circuits to learn and store spatiotemporal patterns observed in the natural environment, but it is not obvious how such information should be stored, or encoded. Information theory provides a mathematical formalism that can be used to measure the efficiency and utility of different coding schemes for data transfer and storage. This theory shows that codes become efficient when they remove predictable, redundant spatial and temporal information. Efficient coding has been used to understand retinal computations and may also be relevant to understanding more complicated temporal processing in visual cortex. However, the literature on efficient coding in cortex is varied and can be confusing since the same terms are used to mean different things in different experimental and theoretical contexts. In this work, we attempt to provide a clear summary of the theoretical relationship between efficient coding and temporal prediction, and review evidence that efficient coding principles explain computations in the retina. We then apply the same framework to computations occurring in early visuocortical areas, arguing that data from rodents is largely consistent with the predictions of this model. Finally, we review and respond to criticisms of efficient coding and suggest ways that this theory might be used to design future experiments, with particular focus on understanding the extent to which neural circuits make predictions from efficient representations of environmental statistics.
2
Affiliation(s)
- Karin S. Pilz
- School of Psychology, University of Aberdeen, Aberdeen, Scotland, UK
- Ian M. Thornton
- Department of Cognitive Science, Faculty of Media & Knowledge Science, University of Malta, Msida, Malta
3
Wang C, Zhang X, Li Y, Lyu C. Additivity of Feature-Based and Symmetry-Based Grouping Effects in Multiple Object Tracking. Front Psychol 2016; 7:657. PMID: 27199875; PMCID: PMC4854980; DOI: 10.3389/fpsyg.2016.00657.
Abstract
Multiple object tracking (MOT) is an attentional process wherein people track several moving targets among several distractors. Symmetry, an important indicator of regularity, is a general spatial pattern observed in natural and artificial scenes. According to the “laws of perceptual organization” proposed by Gestalt psychologists, regularity is a principle of perceptual grouping, as are similarity and closure. A great deal of research has reported that feature-based similarity grouping (e.g., grouping based on color, size, or shape) among targets in MOT tasks can improve tracking performance. However, no additive feature-based grouping effects have been reported where the tracked objects had two or more features. “Additive effect” refers to a greater grouping effect produced by grouping based on multiple cues instead of a single cue. Can spatial symmetry produce a grouping effect similar to that of feature similarity in MOT tasks? Are the grouping effects based on symmetry and feature similarity additive? This study includes four experiments to address these questions. The results of Experiments 1 and 2 demonstrated automatic symmetry-based grouping effects. More importantly, an additive grouping effect of symmetry and feature similarity was observed in Experiments 3 and 4. Our findings indicate that symmetry can produce an enhanced grouping effect in MOT and facilitate the grouping effect based on color or shape similarity. The “where” and “what” pathways might have played an important role in the additive grouping effect.
Affiliation(s)
- Chundi Wang
- Beijing Key Lab of Applied Experimental Psychology, School of Psychology, Beijing Normal University, Beijing, China
- Xuemin Zhang
- Beijing Key Lab of Applied Experimental Psychology, School of Psychology, Beijing Normal University, Beijing, China; State Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China; Center for Collaboration and Innovation in Brain and Learning Sciences, Beijing Normal University, Beijing, China
- Yongna Li
- Department of Psychology, Renmin University of China, Beijing, China
- Chuang Lyu
- Beijing Key Lab of Applied Experimental Psychology, School of Psychology, Beijing Normal University, Beijing, China
4
Tian M, Grill-Spector K. Spatiotemporal information during unsupervised learning enhances viewpoint invariant object recognition. J Vis 2015; 15:7. PMID: 26024454; DOI: 10.1167/15.6.7.
Abstract
Recognizing objects is difficult because it requires both linking views of an object that can be different and distinguishing objects with similar appearance. Interestingly, people can learn to recognize objects across views in an unsupervised way, without feedback, just from the natural viewing statistics. However, there is intense debate regarding what information during unsupervised learning is used to link object views. Specifically, researchers debate whether temporal proximity, motion, or spatiotemporal continuity among object views during unsupervised learning is beneficial. Here, we untangled the role of each of these factors in unsupervised learning of novel three-dimensional (3-D) objects. We found that after unsupervised training with 24 object views spanning a 180° view space, participants showed significant improvement in their ability to recognize 3-D objects across rotation. Surprisingly, there was no advantage of unsupervised learning with spatiotemporal continuity or motion information over training with temporal proximity. However, we discovered that when participants were trained with just a third of the views spanning the same view space, unsupervised learning via spatiotemporal continuity yielded significantly better recognition performance on novel views than learning via temporal proximity. These results suggest that while it is possible to obtain view-invariant recognition just from observing many views of an object presented in temporal proximity, spatiotemporal information enhances performance by producing representations with broader view tuning than learning via temporal association. Our findings have important implications for theories of object recognition and for the development of computational algorithms that learn from examples.
5
Gavornik JP, Bear MF. Higher brain functions served by the lowly rodent primary visual cortex. Learn Mem 2014; 21:527-33. PMID: 25225298; PMCID: PMC4175492; DOI: 10.1101/lm.034355.114.
Abstract
It has been more than 50 years since the first description of ocular dominance plasticity: the profound modification of primary visual cortex (V1) following temporary monocular deprivation. This discovery immediately attracted the intense interest of neurobiologists focused on the general question of how experience and deprivation modify the brain as a potential substrate for learning and memory. The pace of discovery has quickened considerably in recent years as mice have become the preferred species to study visual cortical plasticity, and new studies have overturned the dogma that primary sensory cortex is immutable after a developmental critical period. Recent work has shown that, in addition to ocular dominance plasticity, adult visual cortex exhibits several forms of response modification previously considered the exclusive province of higher cortical areas. These "higher brain functions" include neural reports of stimulus familiarity, reward-timing prediction, and spatiotemporal sequence learning. Primary visual cortex can no longer be viewed as a simple visual feature detector with static properties determined during early development. Rodent V1 is a rich and dynamic cortical area in which functions normally associated only with "higher" brain regions can be studied at the mechanistic level.
Affiliation(s)
- Jeffrey P Gavornik
- Howard Hughes Medical Institute, The Picower Institute for Learning and Memory, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Mark F Bear
- Howard Hughes Medical Institute, The Picower Institute for Learning and Memory, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
6
On the advantage of being left-handed in volleyball: further evidence of the specificity of skilled visual perception. Atten Percept Psychophys 2012; 74:446-53. PMID: 22147534; DOI: 10.3758/s13414-011-0252-1.
Abstract
High ball speeds and close distances between competitors require athletes in interactive sports to correctly anticipate an opponent's intentions in order to render appropriate reactions. Although it is considered crucial for successful performance, such skill appears impaired when athletes are confronted with a left-handed opponent, possibly because of athletes' reduced perceptual familiarity with rarely encountered left-handed actions. To test this negative perceptual frequency effect hypothesis, we invited 18 skilled and 18 novice volleyball players to predict shot directions of left- and right-handed attacks in a video-based visual anticipation task. In accordance with our predictions, and with recent reports on laterality differences in visual perception, the outcome of left-handed actions was significantly less accurately predicted than the outcome of right-handed attacks. In addition, this left-right bias was most distinct when predictions had to be based on preimpact (i.e., before hand-ball contact) kinematic cues, and skilled players were generally more affected by the opponents' handedness than were novices. The study's findings corroborate the assumption that skilled visual perception is attuned to more frequently encountered actions.
7
Baeck A, Windey I, Op de Beeck HP. The transfer of object learning across exemplars and their orientation is related to perceptual similarity. Vision Res 2012; 68:40-7. PMID: 22819729; DOI: 10.1016/j.visres.2012.06.023.
Abstract
Recognition of objects improves after training. The exact characteristics of this visual learning process remain unclear. We examined to what extent object learning depends on the exact exemplar and orientation used during training. Participants were trained to name object pictures at as short a picture presentation time as possible. The required presentation time diminished over training. After training, participants were tested with a completely new set of objects as well as with two variants of the trained object set, namely an orientation change and a change of the exact exemplar shown. Both manipulations led to a decrease in performance compared to the original picture set. Nevertheless, performance with the manipulated versions of the trained stimuli was better than performance with the completely new set, at least when only one manipulation was performed. The amount of transfer to new images of an object was related to perceptual similarity, but not to pixel overlap or to measurements of similarity in the different layers of a popular hierarchical object recognition model (HMAX). Thus, object learning generalizes only partially over changes in exemplars and orientation, which is consistent with the tuning properties of neurons in object-selective cortical regions and the role of perceptual similarity in these representations.
Affiliation(s)
- Annelies Baeck
- Laboratory of Biological Psychology, University of Leuven (KU Leuven), Tiensestraat 102, 3000 Leuven, Belgium.
8
Chuang LL, Vuong QC, Bülthoff HH. Learned Non-Rigid Object Motion is a View-Invariant Cue to Recognizing Novel Objects. Front Comput Neurosci 2012; 6:26. PMID: 22661939; PMCID: PMC3357528; DOI: 10.3389/fncom.2012.00026.
Abstract
There is evidence that observers use learned object motion to recognize objects. For instance, studies have shown that reversing the learned direction in which a rigid object rotated in depth impaired recognition accuracy. This motion reversal can be achieved by playing animation sequences of moving objects in reverse frame order. In the current study, we used this sequence-reversal manipulation to investigate whether observers encode the motion of dynamic objects in visual memory, and whether such dynamic representations are encoded in a way that is dependent on the viewing conditions. Participants first learned dynamic novel objects, presented as animation sequences. Following learning, they were then tested on their ability to recognize these learned objects when their animation sequence was shown in the same sequence order as during learning or in the reverse sequence order. In Experiment 1, we found that non-rigid motion contributed to recognition performance; that is, sequence-reversal decreased sensitivity across different tasks. In subsequent experiments, we tested the recognition of non-rigidly deforming (Experiment 2) and rigidly rotating (Experiment 3) objects across novel viewpoints. Recognition performance was affected by viewpoint changes for both experiments. Learned non-rigid motion continued to contribute to recognition performance and this benefit was the same across all viewpoint changes. By comparison, learned rigid motion did not contribute to recognition performance. These results suggest that non-rigid motion provides a source of information for recognizing dynamic objects, which is not affected by changes to viewpoint.
Affiliation(s)
- Lewis L Chuang
- Department of Perception, Cognition and Action, Max Planck Institute for Biological Cybernetics Tübingen, Germany
9
Rigid facial motion influences featural, but not holistic, face processing. Vision Res 2012; 57:26-34. PMID: 22342561; DOI: 10.1016/j.visres.2012.01.015.
Abstract
We report three experiments in which we investigated the effect of rigid facial motion on face processing. Specifically, we used the face composite effect to examine whether rigid facial motion influences primarily featural or holistic processing of faces. In Experiments 1-3, participants were first familiarized with dynamic displays in which a target face turned from one side to another; then at test, participants judged whether the top half of a composite face (the top half of the target/foil face aligned or misaligned with the bottom half of a foil face) belonged to the target face. We compared performance in the dynamic condition to various static control conditions in Experiments 1-3, which differed from each other in terms of the display order of the multiple static images or the inter-stimulus interval (ISI) between the images. We found that the size of the face composite effect in the dynamic condition was significantly smaller than that in the static conditions. In other words, the dynamic face display led participants to process the target faces in a part-based manner, and consequently their recognition of the upper portion of the composite face at test suffered less interference from the aligned lower part of the foil face. The findings from the present experiments provide the strongest evidence to date that rigid facial motion mainly influences featural, but not holistic, face processing.
10
Arnold G, Siéroff E. Timing constraints of temporal view association in face recognition. Vision Res 2012; 54:61-7. DOI: 10.1016/j.visres.2011.12.001.
11
Wu B, Klatzky RL, Stetten GD. Mental visualization of objects from cross-sectional images. Cognition 2012; 123:33-49. PMID: 22217386; DOI: 10.1016/j.cognition.2011.12.004.
Abstract
We extended the classic anorthoscopic viewing procedure to test a model of visualization of 3D structures from 2D cross-sections. Four experiments were conducted to examine key processes described in the model: localizing cross-sections within a common frame of reference and spatiotemporal integration of cross-sections into a hierarchical object representation. Participants used a hand-held device to reveal a hidden object as a sequence of cross-sectional images. The process of localization was manipulated by contrasting two displays, in situ vs. ex situ, which differed in whether cross-sections were presented at their source locations or displaced to a remote screen. The process of integration was manipulated by varying the structural complexity of target objects and their components. Experiments 1 and 2 demonstrated visualization of 2D and 3D line-segment objects and verified predictions about display and complexity effects. In Experiments 3 and 4, the visualized forms were familiar letters and numbers. Errors and orientation effects showed that displacing cross-sectional images to a remote display (ex situ viewing) impeded the ability to determine spatial relationships among pattern components, a failure of integration at the object level.
Affiliation(s)
- Bing Wu
- Cognitive Science and Engineering Program, Arizona State University, Mesa, AZ 85212, USA.
12
Abstract
The face-inversion effect (FIE) refers to increased response times or error rates for faces that are presented upside-down relative to those seen in a canonical, upright orientation. Here we report one situation in which this FIE can be amplified when observers are shown dynamic facial expressions, rather than static facial expressions. In two experiments observers were asked to assign gender to a random sequence of un-degraded static or moving faces. Each face was seen both upright and inverted. For static images, this task led to little or no effect of inversion. For moving faces, the cost of inversion was a response time increase of approximately 100 ms relative to upright. Motion thus led to a disadvantage in the context of inversion. The fact that such motion could not be ignored in favour of available form cues suggests that dynamic processing may be mandatory. In two control experiments a difference between static and dynamic inversion was not observed for whole-body stimuli or for human-animal decisions. These latter findings suggest that the processing of upside-down movies is not always more difficult for the visual system than the processing of upside-down static images.
Affiliation(s)
- Ian Thornton
- Psychology Department, Swansea University, Swansea, UK
- Emma Mullins
- Psychology Department, Swansea University, Swansea, UK
- Kara Banahan
- Psychology Department, Swansea University, Swansea, UK
13
Walk this way: Approaching bodies can influence the processing of faces. Cognition 2011; 118:17-31. DOI: 10.1016/j.cognition.2010.09.004.
14
15
16
Setti A, Newell FN. The effect of body and part-based motion on the recognition of unfamiliar objects. Vis Cogn 2010. DOI: 10.1080/13506280902830561.
17
Vuong QC, Friedman A, Plante C. Modulation of viewpoint effects in object recognition by shape and motion cues. Perception 2010; 38:1628-48. PMID: 20120262; DOI: 10.1068/p6430.
Abstract
In three experiments, we examined the role of structural similarity and different types of motion on the efficiency of performing same–different shape judgments across changes in viewpoints. In all experiments, participants judged whether two novel, multi-part objects were structurally identical, and they were to ignore any viewpoint or motion differences between the objects. In experiment 1, participants were affected by viewpoint differences more for structurally similar than structurally distinct objects, but this interaction was mitigated by rigid motion. In experiments 2 and 3, we used only structurally similar objects that moved only some of their parts, either in a similar way between objects within a pair or in distinctive ways. Participants' recognition performance was facilitated by this articulated motion relative to both static and scrambled controls. We conclude that coherent motion facilitates generalisation across different views of dynamic objects under some conditions.
Affiliation(s)
- Quoc C Vuong
- Institute of Neuroscience, University of Newcastle, Newcastle upon Tyne NE2 4HH, UK.
18
Friedman A, Vuong QC, Spetch M. Facilitation by view combination and coherent motion in dynamic object recognition. Vision Res 2009; 50:202-10. PMID: 19925823; DOI: 10.1016/j.visres.2009.11.010.
Abstract
We compared the effect of motion cues on people's ability to: (1) recognize dynamic objects by combining information from more than one view and (2) perform more efficiently on views that followed the global direction of the trained views. Participants learned to discriminate two objects that were either structurally similar or distinct and that were rotating in depth in either a coherent or scrambled motion sequence. The Training views revealed 60 degrees of the object, with a center 30 degrees segment missing. For similar stimuli only, there was a facilitative effect of motion: Performance in the coherent condition was better on views following the training views than on equidistant preceding views. Importantly, the viewpoint between the two training viewpoints was responded to more efficiently than either the Pre- or Post-Training viewpoints for both the coherent and scrambled condition. The results indicate that view combination and processing coherent motion cues may occur through different processes.
Affiliation(s)
- Alinda Friedman
- Department of Psychology, University of Alberta, Edmonton, Alberta, Canada T6G 2E9.
19
The role of sequence order in determining view canonicality for novel wire-frame objects. Atten Percept Psychophys 2009; 71:712-23. PMID: 19429954; DOI: 10.3758/app.71.4.712.
Abstract
Objects are best recognized from so-called "canonical" views. The characteristics of canonical views of arbitrary objects have been qualitatively described using a variety of different criteria, but little is known regarding how these views might be acquired during object learning. We address this issue, in part, by examining the role of object motion in the selection of preferred views of novel objects. Specifically, we adopt a modeling approach to investigate whether or not the sequence of views seen during initial exposure to an object contributes to observers' preferences for particular images in the sequence. In two experiments, we exposed observers to short sequences depicting rigidly rotating novel objects and subsequently collected subjective ratings of view canonicality (Experiment 1) and recall rates for individual views (Experiment 2). Given these two operational definitions of view canonicality, we attempted to fit both sets of behavioral data with a computational model incorporating 3-D shape information (object foreshortening), as well as information relevant to the temporal order of views presented during training (the rate of change for object foreshortening). Both sets of ratings were reasonably well predicted using only 3-D shape; the inclusion of terms that capture sequence order improved model performance significantly.
20
21
View combination in moving objects: The role of motion in discriminating between novel views of similar and distinctive objects by humans and pigeons. Vision Res 2009; 49:594-607. DOI: 10.1016/j.visres.2009.01.019.
22
Chaigneau SE, Barsalou LW, Zamani M. Situational information contributes to object categorization and inference. Acta Psychol (Amst) 2009; 130:81-94. PMID: 19041083; DOI: 10.1016/j.actpsy.2008.10.004.
Abstract
Three experiments demonstrated that situational information contributes to the categorization of functional object categories, as well as to inferences about these categories. When an object was presented in the context of setting and event information, categorization was more accurate than when the object was presented in isolation. Inferences about the object similarly became more accurate as the amount of situational information present during categorization increased. The benefits of situational information were higher when both setting and event information were available than when only setting information was available. These findings indicate that situational information about settings and events is stored with functional object categories in memory. Categorization and inference become increasingly accurate as the information available during categorization matches situational information stored with the category.
Affiliation(s)
- Sergio E Chaigneau
- Escuela de Psicología, Universidad Adolfo Ibáñez, Avenida Diagonal Las Torres 2640, Peñalolén, Santiago, Chile.
23
The integration of higher order form and motion by the human brain. Neuroimage 2008; 42:1529-36. DOI: 10.1016/j.neuroimage.2008.04.265.
24
Liu T. Learning sequence of views of three-dimensional objects: the effect of temporal coherence on object memory. Perception 2008; 36:1320-33. PMID: 18196699; DOI: 10.1068/p5778.
Abstract
How humans recognize objects remains a contentious issue in current research on high-level vision. Here, I test the proposal by Wallis and Bülthoff (1999 Trends in Cognitive Sciences 3 22-31) suggesting that object representations can be learned through temporal association of multiple views of the same object. Participants first studied image sequences of novel, three-dimensional objects in a study block. On each trial, the images were from either an orderly sequence of depth-rotated views of the same object (SS), a scrambled sequence of those views (SR), or a sequence of different objects (RR). Recognition memory was assessed in a following test block. A within-object advantage was consistently observed: greater accuracy in the SR than in the RR condition in all four experiments, and greater accuracy in the SS than in the RR condition in two experiments. Furthermore, spatiotemporal coherence did not produce better recognition than temporal coherence alone (similar or lower accuracy in the SS than in the SR condition). These results suggest that the visual system can use temporal regularity to build invariant object representations, via the temporal-association mechanism.
Affiliation(s)
- Taosheng Liu
- Department of Psychology, New York University, New York 10003, USA.
25
Weigelt S, Kourtzi Z, Kohler A, Singer W, Muckli L. The cortical representation of objects rotating in depth. J Neurosci 2007; 27:3864-74. PMID: 17409251; PMCID: PMC6672396; DOI: 10.1523/jneurosci.0340-07.2007.
Abstract
The perception of motion provides valuable interpolations of the visual scene. This fundamental capacity of the visual system is evident in apparent rotation: by presenting only two images of an object rotated in space, a vivid illusion of a smooth apparent motion in three dimensions can be induced. The unseen interpolated rotation views are filled in by the visual system. In the present study, we identified the cortical network responsible for this filling-in process. We argue that cross talk between areas of the ventral and dorsal visual pathways promote the illusion of smooth apparent rotation. Most interestingly, the network represents the unseen object views. Using functional magnetic resonance adaptation, we are able to show that the cortical network selectively adapts to the illusory object views. Our findings provide strong evidence for cortical representations of three-dimensional rotating objects that are view invariant with respect to the rotation path. Furthermore, our results confirm psychophysical investigations that unseen interpolated rotation views can be primed by apparent motion. By applying functional magnetic resonance adaptation, we show for the first time cortical adaptation to unseen objects. Together, our neuroimaging study advances the understanding of the cortical mechanisms mediating the influence of motion on object processing.
Affiliation(s)
- Sarah Weigelt
- Max Planck Institute for Brain Research, D-60528 Frankfurt am Main, Germany.
26
Bennett DJ, Vuong QC. A stereo advantage in generalizing over changes in viewpoint on object recognition tasks. Percept Psychophys 2007; 68:1082-93. [PMID: 17355033 DOI: 10.3758/bf03193711] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
In four experiments, we examined whether generalization to unfamiliar views was better under stereo viewing or under nonstereo viewing across different tasks and stimuli. In the first three experiments, we used a sequential matching task in which observers matched the identities of shaded tube-like objects. Across Experiments 1-3, we manipulated the presentation method of the nonstereo stimuli (having observers wear an eye patch vs. showing observers the same screen image) and the magnitude of the viewpoint change (30 degrees vs. 38 degrees). In Experiment 4, observers identified "easy" and "hard" rotating wire-frame objects at the individual level under stereo and nonstereo viewing conditions. We found a stereo advantage for generalizing to unfamiliar views in all the experiments. However, in these experiments, performance remained view dependent even under stereo viewing. These results strongly argue against strictly 2-D image-based models of object recognition, at least for the stimuli and recognition tasks used, and suggest that observers used representations that contained view-specific local depth information.
Affiliation(s)
- David J Bennett
- Department of Cognitive and Linguistic Sciences, P.O. Box 1978, Brown University, Providence, RI 02912, USA.
27
Tse PU, Caplovitz GP. Contour discontinuities subserve two types of form analysis that underlie motion processing. PROGRESS IN BRAIN RESEARCH 2007; 154:271-92. [PMID: 17010718 DOI: 10.1016/s0079-6123(06)54015-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/11/2023]
Abstract
Form analysis subserves motion processing in at least two ways: first, in terms of figural segmentation dedicated to solving the problem of figure-to-figure matching over time, and second, in terms of defining trackable features whose unambiguous motion signals can be generalized to ambiguously moving portions of an object. The former is a primarily ventral process involving the lateral occipital complex and also retinotopic areas such as V2 and V4, and the latter is a dorsal process involving V3A. Contour discontinuities, such as corners, deep concavities, maxima of positive curvature, junctions, and terminators, play a central role in both types of form analysis. Transformational apparent motion will be discussed in the context of figural segmentation and matching, and rotational motion in the context of trackable features. In both cases the analysis of form must proceed in parallel with the analysis of motion, in order to constrain the ongoing analysis of motion.
Affiliation(s)
- Peter Ulric Tse
- H B 6207, Moore Hall, Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH 03755, USA.
28
Vuong QC, Tarr MJ. Structural similarity and spatiotemporal noise effects on learning dynamic novel objects. Perception 2006; 35:497-510. [PMID: 16700292 DOI: 10.1068/p5491] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Abstract
The spatiotemporal pattern projected by a moving object is specific to that object, as it depends on both the shape and the dynamics of the object. Previous research has shown that observers learn to make use of this spatiotemporal signature to recognize dynamic faces and objects. In two experiments, we assessed the extent to which the structural similarity of the objects and the presence of spatiotemporal noise affect how these signatures are learned and subsequently used in recognition. Observers first learned to identify novel, structurally distinctive or structurally similar objects that rotated with a particular motion. At test, each learned object moved with its studied motion or with a non-studied motion. In the non-studied motion condition we manipulated either dynamic information alone (experiment 1) or both static and dynamic information (experiment 2). Across both experiments we found that changing the learned motion of an object impaired recognition performance when 3-D shape was similar or when the visual input was noisy during learning. These results are consistent with the hypothesis that observers use learned spatiotemporal signatures and that such information becomes progressively more important as shape information becomes less reliable.
Affiliation(s)
- Quoc C Vuong
- Max Planck Institute for Biological Cybernetics, Spemannstrasse 38, D 72076 Tübingen, Germany.
29
Suganuma M, Yokosawa K. Grouping and trajectory storage in multiple object tracking: impairments due to common item motions. Perception 2006; 35:483-95. [PMID: 16700291 DOI: 10.1068/p5487] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Abstract
In our natural viewing, we notice that objects change their locations across space and time. However, there has been relatively little consideration of the role of motion information in the construction and maintenance of object representations. We investigated this question in the context of the multiple object tracking (MOT) paradigm, wherein observers must keep track of target objects as they move randomly amid featurally identical distractors. In three experiments, we observed impairments in tracking ability when the motions of the target and distractor items shared particular properties. Specifically, we observed impairments when the target and distractor items were in a chasing relationship or moved in a uniform direction. Surprisingly, tracking ability was impaired by these manipulations even when observers failed to notice them. Our results suggest that differentiable trajectory information is an important factor in successful performance of MOT tasks. More generally, these results suggest that various types of common motion can serve as cues to form more global object representations even in the absence of other grouping cues.
Affiliation(s)
- Mutsumi Suganuma
- Department of Psychology, Graduate School of Humanities and Sociology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan.
30
Hsieh PJ, Caplovitz GP, Tse PU. Bistable illusory rebound motion: Event-related functional magnetic resonance imaging of perceptual states and switches. Neuroimage 2006; 32:728-39. [PMID: 16702003 DOI: 10.1016/j.neuroimage.2006.03.047] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2005] [Revised: 03/21/2006] [Accepted: 03/27/2006] [Indexed: 10/24/2022] Open
Abstract
The neural correlates of a recently discovered visual illusion that we call 'illusory rebound motion' (IRM) are described. This illusion is remarkable because motion is perceived in the absence of any net motion energy in the stimulus. When viewing bars alternating between white and black on a gray background, the percept alternates between one of flashing bars (veridical) and the IRM illusion, where the bars appear to shoot back and forth rather like the opening and closing of a zipper. The event-related functional magnetic resonance imaging (fMRI) data reported here reveal that (1) the blood-oxygen-level-dependent (BOLD) signal in the human analog of macaque motion processing area MT (hMT+) increases when there is a perceptual change from "no-IRM" to "see-IRM" and decreases when there is a perceptual change from "see-IRM" to "no-IRM," although the stimulus remains constant; and (2) the BOLD signal in early retinotopic areas (V1, V2, and V3d) shows switch-related activation whenever there is a perceptual change, regardless of whether it is from IRM to no-IRM or vice versa. We conclude that hMT+ is a neural correlate of this novel illusory motion percept because the BOLD signal in hMT+ modulates with the perception of IRM.
Affiliation(s)
- P-J Hsieh
- Department of Psychological and Brain Sciences, Moore Hall, H.B. 6207, Dartmouth College, Hanover, NH 03755, USA
31
Abstract
We investigated the role of dynamic information in human and pigeon object recognition. Both species were trained to discriminate between two objects that each had a characteristic motion, so that either cue could be used to perform the task successfully. The objects were either easy or difficult to decompose into parts. At test, the learned objects could appear in their learned motions, the reverse of the learned motions, or an entirely new motion, or a new object could appear in one of the learned motions. For humans, any change in the learned motion produced a decrement in performance for both the decomposable and the nondecomposable objects, but participants did not respond differentially to new objects that appeared in the learned motions. Pigeons showed the same pattern of responding as did humans for the decomposable objects, except that pigeons responded differentially to new objects in the learned motions. For the nondecomposable objects, pigeons used motion cues exclusively. We suggest that for some types of objects, dynamic information may be weighted differently by pigeons and humans.
Affiliation(s)
- Marcia L Spetch
- Department of Psychology, University of Alberta, Edmonton, Canada.
32
Tse PU. Neural correlates of transformational apparent motion. Neuroimage 2006; 31:766-73. [PMID: 16488628 DOI: 10.1016/j.neuroimage.2005.12.029] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2005] [Revised: 11/29/2005] [Accepted: 12/20/2005] [Indexed: 11/30/2022] Open
Abstract
When a figure discretely and instantaneously changes its shape, observers typically do not perceive the abrupt transition between shapes that in fact occurs. Rather, a continuous shape change is perceived. Although this illusory "transformational apparent motion" (TAM) is a faulty construction of the visual system, it is not arbitrary. From the many possible shape changes that could have been inferred, usually just one is perceived because only one is consistent with the shape-based rules that the visual system uses to (1) segment figures from one another within a scene and (2) match figures to themselves across successive scenes. TAM requires an interaction between neuronal circuits that process form relationships with circuits that compute motion trajectories. In particular, this form-motion interaction must happen before TAM is perceived because the direction of perceived motion is dictated by form relationships among figures in successive images. The present fMRI study (n = 19) provides the first evidence that both form (LOC, posterior fusiform gyrus) and motion (hMT+) processing areas are more active when TAM is perceived than in a control stimulus where it is not. Retinotopic areas (n = 10), hMT+ (n = 7), and LOC (n = 7) were mapped in a subset of subjects. There is greater BOLD response to TAM than to the control condition in V1 and all subsequent retinotopic areas, as well as in hMT+ and the LOC, suggesting that areas that process form interact with hMT+ to construct the perception of moving figures.
Affiliation(s)
- P U Tse
- Department of Psychological and Brain Sciences, H. B. 6207, Moore Hall, Dartmouth College, Hanover, NH 03755, USA.
33
34
Pilz KS, Thornton IM, Bülthoff HH. A search advantage for faces learned in motion. Exp Brain Res 2005; 171:436-47. [PMID: 16331505 DOI: 10.1007/s00221-005-0283-8] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2005] [Accepted: 10/20/2005] [Indexed: 10/25/2022]
Abstract
Recently there has been growing interest in the role that motion might play in the perception and representation of facial identity. Most studies have used old/new recognition tasks. However, especially for non-rigid motion, these studies have often produced contradictory results. Here, we used a delayed visual search paradigm to explore how learning is affected by non-rigid facial motion. In the current studies we trained observers on two frontal-view faces, one moving non-rigidly, the other a static picture. After a delay, observers were asked to identify the targets in static search arrays containing 2, 4, or 6 faces. On a given trial, target and distractor faces could be shown in one of five viewpoints: frontal, or 22 or 45 degrees to the left or right. We found that familiarizing observers with dynamic faces led to a constant reaction-time advantage across all set sizes and viewpoints compared to static familiarization. This suggests that non-rigid motion affects identity decisions even across extended periods of time and changes in viewpoint. Furthermore, it seems that such effects may be difficult to observe using more traditional old/new recognition tasks.
Affiliation(s)
- Karin S Pilz
- Max Planck Institute for Biological Cybernetics, Spemannstr. 38, 72076, Tübingen, Germany.
35
36
Vuong QC, Tarr MJ. Rotation direction affects object recognition. Vision Res 2004; 44:1717-30. [PMID: 15136006 DOI: 10.1016/j.visres.2004.02.002] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2003] [Revised: 12/08/2003] [Indexed: 10/26/2022]
Abstract
What role does dynamic information play in object recognition? To address this question, we probed observers' memory for novel objects rotating in depth. Irrespective of object discriminability, performance was affected by an object's rotation direction. This effect was obtained despite the same shape information and views being shown for different rotation directions. This direction effect was eliminated when either static images or animations that did not depict globally coherent rotation were used. Overall, these results suggest that dynamic information, that is, the spatiotemporal ordering of object views, provides information independent of shape or view information to a recognition system.
Affiliation(s)
- Quoc C Vuong
- Department of Cognitive and Linguistic Sciences, Brown University, 190 Thayer Street, Providence, Rhode Island 02912, USA.
37
Liu T, Slotnick SD, Yantis S. Human MT+ mediates perceptual filling-in during apparent motion. Neuroimage 2004; 21:1772-80. [PMID: 15050597 DOI: 10.1016/j.neuroimage.2003.12.025] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2003] [Revised: 10/29/2003] [Accepted: 12/08/2003] [Indexed: 10/26/2022] Open
Abstract
During apparent motion, spatially distinct items presented in alternation cause the perception of a visual stimulus smoothly traversing the intervening space where no physical stimulus exists. We used fMRI to determine whether the perceptual 'filling-in' that underlies this phenomenon has an early or late cortical locus. Subjects viewed a display comprised of concentric rings that elicited apparent motion (two concentric rings presented in alternation), flicker (the same rings presented simultaneously), or real motion. We independently localized the cortical regions corresponding to the path of apparent motion in early visual areas (V1, V2, VP, V3, V4v, V3A), as well as the human motion processing complex (MT+). Cortical activity in the path of apparent motion in early visual areas was similar in amplitude during both apparent motion and flicker. In contrast, cortical activity in MT+ was higher in amplitude during apparent motion than during flicker, but was lower in amplitude than during real motion. In addition, we observed overlap in the cortical loci of MT+ and the lateral occipital complex (LOC), a region involved in shape and object processing. This overlap suggests that these regions could directly interact and thereby support perceived object continuity during apparent motion.
Affiliation(s)
- Taosheng Liu
- Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD 21218, USA.
38
Knappmeyer B, Thornton IM, Bülthoff HH. The use of facial motion and facial form during the processing of identity. Vision Res 2003; 43:1921-36. [PMID: 12831755 DOI: 10.1016/s0042-6989(03)00236-0] [Citation(s) in RCA: 130] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
Abstract
Previous research has shown that facial motion can carry information about age, gender, emotion and, at least to some extent, identity. By combining recent computer animation techniques with psychophysical methods, we show that during the computation of identity the human face recognition system integrates both types of information: individual non-rigid facial motion and individual facial form. This has important implications for cognitive and neural models of face perception, which currently emphasize a separation between the processing of invariant aspects (facial form) and changeable aspects (facial motion) of faces.
Affiliation(s)
- Barbara Knappmeyer
- Max Planck Institute for Biological Cybernetics, Spemannstr. 38, Tübingen 72076, Germany.
39
Abstract
Although both the object and the observer often move in natural environments, the effect of motion on visual object recognition has not been well documented. The authors examined the effect of a reversal in the direction of rotation on both explicit and implicit memory for novel, 3-dimensional objects. Participants viewed a series of continuously rotating objects and later made either an old-new recognition judgment or a symmetric-asymmetric decision. For both tasks, memory for rotating objects was impaired when the direction of rotation was reversed at test. These results demonstrate that dynamic information can play a role in visual object recognition and suggest that object representations can encode spatiotemporal information.
Affiliation(s)
- Taosheng Liu
- Department of Psychology, Columbia University, New York, New York 10027, USA
40
Abstract
In a series of three experiments, we used a sequential matching task to explore the impact of non-rigid facial motion on the perception of human faces. Dynamic prime images, in the form of short video sequences, facilitated matching responses relative to a single static prime image. This advantage was observed whenever the prime and target showed the same face but an identity match was required across expression (experiment 1) or view (experiment 2). No facilitation was observed for identical dynamic prime sequences when the matching dimension was shifted from identity to expression (experiment 3). We suggest that the observed dynamic advantage, the first reported for non-degraded facial images, arises because the matching task places more emphasis on visual working memory than typical face recognition tasks. More specifically, we believe that representational mechanisms optimised for the processing of motion and/or change-over-time are established and maintained in working memory and that such 'dynamic representations' (Freyd, 1987 Psychological Review 94 427-438) capitalise on the increased information content of the dynamic primes to enhance performance.
Affiliation(s)
- Ian M Thornton
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany.
41
Abstract
A recent study has shown that, when people talk, their changing facial expressions and head movements provide dynamic cues for recognition.
Affiliation(s)
- J Stone
- Psychology Department, Sheffield University, Sheffield S10 2UR, UK.
42
Abstract
We demonstrate that performance on an object recognition task can be explained in terms of observer-specific perceptual profiles. These profiles are derived from a battery of tests, including the effects of stereo, texture, outline (occluding contour), and motion cues on amplitude judgements of curved surfaces. Using a task in which observers learned to recognise 'amoeboid' objects, a multivariate regression analysis revealed that three psychometric variables derived from the test battery account for 74% of the variance in learning rate. These variables are choice reaction time, and the relative dependence of amplitude judgements on motion and outline cues. The implications of these findings for the existence of observer-specific perceptual profiles, and their relation to the fundamental psychophysical competences associated with object recognition are discussed.
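The abstract above reports that a multivariate regression on three psychometric variables accounted for 74% of the variance in learning rate. As a minimal sketch of how such a variance-explained figure is computed (using simulated data, not the authors' data or analysis code), an ordinary least squares fit and its R-squared can be obtained with NumPy:

```python
import numpy as np

# Hypothetical illustration: fit a multivariate linear regression of a
# learning-rate measure on three psychometric predictors and report the
# proportion of variance explained (R^2). All data here are simulated.
rng = np.random.default_rng(0)

n = 40                                   # simulated observers
X = rng.normal(size=(n, 3))              # three predictors (e.g. choice RT
                                         # and two cue-dependence scores)
beta_true = np.array([0.8, -0.5, 0.3])   # arbitrary "true" weights
y = X @ beta_true + rng.normal(scale=0.5, size=n)  # simulated learning rate

# Ordinary least squares with an intercept column.
X1 = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)

# R^2 = 1 - SS_residual / SS_total: the fraction of variance explained,
# the quantity reported as 74% in the abstract.
resid = y - X1 @ beta_hat
r2 = 1 - resid.var() / y.var()
print(round(r2, 3))
```

With noisier data or weaker predictors, `r2` drops toward zero; the abstract's 74% corresponds to an `r2` of 0.74 on the authors' observer data.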
Affiliation(s)
- J V Stone
- Psychology Department, Sheffield University, S10 2UR, Sheffield, UK.