1
Price BH, Gavornik JP. Efficient Temporal Coding in the Early Visual System: Existing Evidence and Future Directions. Front Comput Neurosci 2022; 16:929348. PMID: 35874317; PMCID: PMC9298461; DOI: 10.3389/fncom.2022.929348.
Abstract
While it is universally accepted that the brain makes predictions, there is little agreement about how this is accomplished and under which conditions. Accurate prediction requires neural circuits to learn and store spatiotemporal patterns observed in the natural environment, but it is not obvious how such information should be stored, or encoded. Information theory provides a mathematical formalism that can be used to measure the efficiency and utility of different coding schemes for data transfer and storage. This theory shows that codes become efficient when they remove predictable, redundant spatial and temporal information. Efficient coding has been used to understand retinal computations and may also be relevant to understanding more complicated temporal processing in visual cortex. However, the literature on efficient coding in cortex is varied and can be confusing since the same terms are used to mean different things in different experimental and theoretical contexts. In this work, we attempt to provide a clear summary of the theoretical relationship between efficient coding and temporal prediction, and review evidence that efficient coding principles explain computations in the retina. We then apply the same framework to computations occurring in early visuocortical areas, arguing that data from rodents is largely consistent with the predictions of this model. Finally, we review and respond to criticisms of efficient coding and suggest ways that this theory might be used to design future experiments, with particular focus on understanding the extent to which neural circuits make predictions from efficient representations of environmental statistics.
2
Affiliation(s)
- Karin S. Pilz
- School of Psychology, University of Aberdeen, Aberdeen, Scotland, UK
- Ian M. Thornton
- Department of Cognitive Science, Faculty of Media & Knowledge Science, University of Malta, Msida, Malta
3
Wang C, Zhang X, Li Y, Lyu C. Additivity of Feature-Based and Symmetry-Based Grouping Effects in Multiple Object Tracking. Front Psychol 2016; 7:657. PMID: 27199875; PMCID: PMC4854980; DOI: 10.3389/fpsyg.2016.00657.
Abstract
Multiple object tracking (MOT) is an attentional process wherein people track several moving targets among several distractors. Symmetry, an important indicator of regularity, is a general spatial pattern observed in natural and artificial scenes. According to the “laws of perceptual organization” proposed by Gestalt psychologists, regularity is a principle of perceptual grouping, like similarity and closure. A great deal of research has reported that feature-based similarity grouping (e.g., grouping based on color, size, or shape) among targets in MOT tasks can improve tracking performance. However, no additive feature-based grouping effects have been reported where the tracking objects had two or more features. “Additive effect” refers to a greater grouping effect produced by grouping based on multiple cues instead of one cue. Can spatial symmetry produce a grouping effect similar to that of feature similarity in MOT tasks? Are the grouping effects based on symmetry and feature similarity additive? This study includes four experiments to address these questions. The results of Experiments 1 and 2 demonstrated automatic symmetry-based grouping effects. More importantly, an additive grouping effect of symmetry and feature similarity was observed in Experiments 3 and 4. Our findings indicate that symmetry can produce an enhanced grouping effect in MOT and facilitate the grouping effect based on color or shape similarity. The “where” and “what” pathways might have played an important role in the additive grouping effect.
Affiliation(s)
- Chundi Wang
- Beijing Key Lab of Applied Experimental Psychology, School of Psychology, Beijing Normal University, Beijing, China
- Xuemin Zhang
- Beijing Key Lab of Applied Experimental Psychology, School of Psychology, Beijing Normal University, Beijing, China; State Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China; Center for Collaboration and Innovation in Brain and Learning Sciences, Beijing Normal University, Beijing, China
- Yongna Li
- Department of Psychology, Renmin University of China, Beijing, China
- Chuang Lyu
- Beijing Key Lab of Applied Experimental Psychology, School of Psychology, Beijing Normal University, Beijing, China
4
Wood JN. A smoothness constraint on the development of object recognition. Cognition 2016; 153:140-5. PMID: 27208825; DOI: 10.1016/j.cognition.2016.04.013.
Abstract
Understanding how the brain learns to recognize objects is one of the ultimate goals in the cognitive sciences. To date, however, we have not yet characterized the environmental factors that cause object recognition to emerge in the newborn brain. Here, I present the results of a high-throughput controlled-rearing experiment that examined whether the development of object recognition requires experience with temporally smooth visual objects. When newborn chicks (Gallus gallus) were raised with virtual objects that moved smoothly over time, the chicks developed accurate color recognition, shape recognition, and color-shape binding abilities. In contrast, when newborn chicks were raised with virtual objects that moved non-smoothly over time, the chicks' object recognition abilities were severely impaired. These results provide evidence for a "smoothness constraint" on newborn object recognition. Experience with temporally smooth objects facilitates the development of object recognition.
Affiliation(s)
- Justin N Wood
- University of Southern California, Department of Psychology, 3620 South McClintock Ave., Los Angeles, CA 90089, United States.
5
Reinl M, Bartels A. Perception of temporal asymmetries in dynamic facial expressions. Front Psychol 2015; 6:1107. PMID: 26300807; PMCID: PMC4523710; DOI: 10.3389/fpsyg.2015.01107.
Abstract
In the current study we examined whether timeline-reversals and emotional direction of dynamic facial expressions affect subjective experience of human observers. We recorded natural movies of faces that increased or decreased their expressions of fear, and played them either in the natural frame order or reversed from last to first frame (reversed timeline). This led to four conditions of increasing or decreasing fear, either following the natural or reversed temporal trajectory of facial dynamics. This 2-by-2 factorial design controlled for visual low-level properties, static visual content, and motion energy across the different factors. It allowed us to examine perceptual consequences that would occur if the timeline trajectory of facial muscle movements during the increase of an emotion is not the exact mirror of the timeline during the decrease. It additionally allowed us to study perceptual differences between increasing and decreasing emotional expressions. Perception of these time-dependent asymmetries has not yet been quantified. We found that three emotional measures, emotional intensity, artificialness of facial movement, and convincingness or plausibility of emotion portrayal, were affected by timeline-reversals as well as by the emotional direction of the facial expressions. Our results imply that natural dynamic facial expressions contain temporal asymmetries, and show that deviations from the natural timeline lead to a reduction of perceived emotional intensity and convincingness, and to an increase of perceived artificialness of the dynamic facial expression. In addition, they show that decreasing facial expressions are judged as less plausible than increasing facial expressions. Our findings are relevant for both behavioral and neuroimaging studies, as processing and perception are influenced by temporal asymmetries.
Affiliation(s)
- Andreas Bartels
- Vision and Cognition Lab, Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
6
Tian M, Grill-Spector K. Spatiotemporal information during unsupervised learning enhances viewpoint invariant object recognition. J Vis 2015; 15:7. PMID: 26024454; DOI: 10.1167/15.6.7.
Abstract
Recognizing objects is difficult because it requires both linking views of an object that can be different and distinguishing objects with similar appearance. Interestingly, people can learn to recognize objects across views in an unsupervised way, without feedback, just from the natural viewing statistics. However, there is intense debate regarding what information during unsupervised learning is used to link among object views. Specifically, researchers debate whether temporal proximity, motion, or spatiotemporal continuity among object views during unsupervised learning is beneficial. Here, we untangled the role of each of these factors in unsupervised learning of novel three-dimensional (3-D) objects. We found that after unsupervised training with 24 object views spanning a 180° view space, participants showed significant improvement in their ability to recognize 3-D objects across rotation. Surprisingly, there was no advantage to unsupervised learning with spatiotemporal continuity or motion information over training with temporal proximity. However, we discovered that when participants were trained with just a third of the views spanning the same view space, unsupervised learning via spatiotemporal continuity yielded significantly better recognition performance on novel views than learning via temporal proximity. These results suggest that while it is possible to obtain view-invariant recognition just from observing many views of an object presented in temporal proximity, spatiotemporal information enhances performance by producing representations with broader view tuning than learning via temporal association. Our findings have important implications for theories of object recognition and for the development of computational algorithms that learn from examples.
7
Kawabe T, Maruya K, Fleming RW, Nishida S. Seeing liquids from visual motion. Vision Res 2015; 109:125-38. DOI: 10.1016/j.visres.2014.07.003.
8
Gavornik JP, Bear MF. Higher brain functions served by the lowly rodent primary visual cortex. Learn Mem 2014; 21:527-33. PMID: 25225298; PMCID: PMC4175492; DOI: 10.1101/lm.034355.114.
Abstract
It has been more than 50 years since the first description of ocular dominance plasticity--the profound modification of primary visual cortex (V1) following temporary monocular deprivation. This discovery immediately attracted the intense interest of neurobiologists focused on the general question of how experience and deprivation modify the brain as a potential substrate for learning and memory. The pace of discovery has quickened considerably in recent years as mice have become the preferred species to study visual cortical plasticity, and new studies have overturned the dogma that primary sensory cortex is immutable after a developmental critical period. Recent work has shown that, in addition to ocular dominance plasticity, adult visual cortex exhibits several forms of response modification previously considered the exclusive province of higher cortical areas. These "higher brain functions" include neural reports of stimulus familiarity, reward-timing prediction, and spatiotemporal sequence learning. Primary visual cortex can no longer be viewed as a simple visual feature detector with static properties determined during early development. Rodent V1 is a rich and dynamic cortical area in which functions normally associated only with "higher" brain regions can be studied at the mechanistic level.
Affiliation(s)
- Jeffrey P Gavornik
- Howard Hughes Medical Institute, The Picower Institute for Learning and Memory, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Mark F Bear
- Howard Hughes Medical Institute, The Picower Institute for Learning and Memory, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
9
Wallis G. Toward a unified model of face and object recognition in the human visual system. Front Psychol 2013; 4:497. PMID: 23966963; PMCID: PMC3744012; DOI: 10.3389/fpsyg.2013.00497.
Abstract
Our understanding of the mechanisms and neural substrates underlying visual recognition has made considerable progress over the past 30 years. During this period, accumulating evidence has led many scientists to conclude that objects and faces are recognised in fundamentally distinct ways, and in fundamentally distinct cortical areas. In the psychological literature, in particular, this dissociation has led to a palpable disconnect between theories of how we process and represent the two classes of object. This paper follows a trend in part of the recognition literature to try to reconcile what we know about these two forms of recognition by considering the effects of learning. Taking a widely accepted, self-organizing model of object recognition, this paper explains how such a system is affected by repeated exposure to specific stimulus classes. In so doing, it explains how many aspects of recognition generally regarded as unusual to faces (holistic processing, configural processing, sensitivity to inversion, the other-race effect, the prototype effect, etc.) are emergent properties of category-specific learning within such a system. Overall, the paper describes how a single model of recognition learning can and does produce the seemingly very different types of representation associated with faces and objects.
Affiliation(s)
- Guy Wallis
- Centre for Sensorimotor Neuroscience, School of Human Movement Studies, University of Queensland, QLD, Australia
10
Mice discriminate between stationary and moving 2D shapes: application to the object recognition task to increase attention. Behav Brain Res 2013; 242:95-101. PMID: 23291156; DOI: 10.1016/j.bbr.2012.12.040.
Abstract
Selective attention can be assessed with the novel object recognition (NOR) test. In the standard version of this test, the selection of objects to be used is critical. We created a modified version of NOR, the virtual object recognition test (VORT) in mice, where the 3D objects were replaced with highly discriminable geometrical shapes presented on two 3.5-inch widescreen displays. No difference in the discrimination index (from 5 min to 96 h of inter-trial interval) was found between NOR and VORT. Scopolamine and mecamylamine decreased the discrimination index. Conversely, the discrimination index increased when nicotine was given to mice. No further improvement in the discrimination index was observed when nicotine was injected in mice presented with highly discriminable shapes. To test the possibility that object movements increased mice's attention in the VORT, different movements were applied to the same geometrical shapes previously presented. Mice were able to distinguish among different movements (horizontal, vertical, oblique). Notably, the shapes previously found indistinguishable when stationary were better discriminated when moving. Collectively, these findings indicate that VORT, based on simple virtual geometric shapes, offers the possibility to obtain rapid information on the amnesic/pro-amnestic potential of new drugs. The introduction of motion is a strong cue that makes the task more valuable for studying attention.
11
On the advantage of being left-handed in volleyball: further evidence of the specificity of skilled visual perception. Atten Percept Psychophys 2012; 74:446-53. PMID: 22147534; DOI: 10.3758/s13414-011-0252-1.
Abstract
High ball speeds and close distances between competitors require athletes in interactive sports to correctly anticipate an opponent's intentions in order to render appropriate reactions. Although it is considered crucial for successful performance, such skill appears impaired when athletes are confronted with a left-handed opponent, possibly because of athletes' reduced perceptual familiarity with rarely encountered left-handed actions. To test this negative perceptual frequency effect hypothesis, we invited 18 skilled and 18 novice volleyball players to predict shot directions of left- and right-handed attacks in a video-based visual anticipation task. In accordance with our predictions, and with recent reports on laterality differences in visual perception, the outcome of left-handed actions was significantly less accurately predicted than the outcome of right-handed attacks. In addition, this left-right bias was most distinct when predictions had to be based on preimpact (i.e., before hand-ball contact) kinematic cues, and skilled players were generally more affected by the opponents' handedness than were novices. The study's findings corroborate the assumption that skilled visual perception is attuned to more frequently encountered actions.
12
Sarkheil P, Goebel R, Schneider F, Mathiak K. Emotion unfolded by motion: a role for parietal lobe in decoding dynamic facial expressions. Soc Cogn Affect Neurosci 2012; 8:950-7. PMID: 22962061; DOI: 10.1093/scan/nss092.
Abstract
Facial expressions convey important emotional and social information and are frequently applied in investigations of human affective processing. Dynamic faces may provide higher ecological validity to examine perceptual and cognitive processing of facial expressions. Higher order processing of emotional faces was addressed by varying the task and virtual face models systematically. Blood oxygenation level-dependent activation was assessed using functional magnetic resonance imaging in 20 healthy volunteers while viewing and evaluating either emotion or gender intensity of dynamic face stimuli. A general linear model analysis revealed that high valence activated a network of motion-responsive areas, indicating that visual motion areas support perceptual coding for the motion-based intensity of facial expressions. Comparing the emotion task with the gender discrimination task revealed increased activation of the inferior parietal lobule, which highlights the involvement of parietal areas in processing high-level features of faces. Dynamic emotional stimuli may help to emphasize functions of the hypothesized 'extended' over the 'core' system for face processing.
Affiliation(s)
- Pegah Sarkheil
- Department of Psychiatry, Psychotherapy and Psychosomatics, Aachen University Hospital, Pauwelsstr. 30, 52074 Aachen, Germany.
13
Chuang LL, Vuong QC, Bülthoff HH. Learned Non-Rigid Object Motion is a View-Invariant Cue to Recognizing Novel Objects. Front Comput Neurosci 2012; 6:26. PMID: 22661939; PMCID: PMC3357528; DOI: 10.3389/fncom.2012.00026.
Abstract
There is evidence that observers use learned object motion to recognize objects. For instance, studies have shown that reversing the learned direction in which a rigid object rotated in depth impaired recognition accuracy. This motion reversal can be achieved by playing animation sequences of moving objects in reverse frame order. In the current study, we used this sequence-reversal manipulation to investigate whether observers encode the motion of dynamic objects in visual memory, and whether such dynamic representations are encoded in a way that is dependent on the viewing conditions. Participants first learned dynamic novel objects, presented as animation sequences. Following learning, they were then tested on their ability to recognize these learned objects when their animation sequence was shown in the same sequence order as during learning or in the reverse sequence order. In Experiment 1, we found that non-rigid motion contributed to recognition performance; that is, sequence-reversal decreased sensitivity across different tasks. In subsequent experiments, we tested the recognition of non-rigidly deforming (Experiment 2) and rigidly rotating (Experiment 3) objects across novel viewpoints. Recognition performance was affected by viewpoint changes for both experiments. Learned non-rigid motion continued to contribute to recognition performance and this benefit was the same across all viewpoint changes. By comparison, learned rigid motion did not contribute to recognition performance. These results suggest that non-rigid motion provides a source of information for recognizing dynamic objects, which is not affected by changes to viewpoint.
Affiliation(s)
- Lewis L Chuang
- Department of Perception, Cognition and Action, Max Planck Institute for Biological Cybernetics, Tübingen, Germany
14
Balas B, Kanwisher N, Saxe R. Thin-slice perception develops slowly. J Exp Child Psychol 2012; 112:257-64. PMID: 22417920; DOI: 10.1016/j.jecp.2012.01.002.
Abstract
Body language and facial gesture provide sufficient visual information to support high-level social inferences from "thin slices" of behavior. Given short movies of nonverbal behavior, adults make reliable judgments in a large number of tasks. Here we find that the high precision of adults' nonverbal social perception depends on the slow development, over childhood, of sensitivity to subtle visual cues. Children and adult participants watched short silent clips in which a target child played with Lego blocks either in the (off-screen) presence of an adult or alone. Participants judged whether the target was playing alone or not; that is, they detected the presence of a social interaction (from the behavior of one participant in that interaction). This task allowed us to compare performance across ages with the true answer. Children did not reach adult levels of performance on this task until 9 or 10 years of age, and we observed an interaction between age and video reversal. Adults and older children benefitted from the videos being played in temporal sequence, rather than reversed, suggesting that adults (but not young children) are sensitive to natural movement in social interactions.
Affiliation(s)
- Benjamin Balas
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA.
15
Deepak KS, Sivaswamy J. Automatic assessment of macular edema from color retinal images. IEEE Trans Med Imaging 2012; 31:766-776. PMID: 22167598; DOI: 10.1109/tmi.2011.2178856.
Abstract
Diabetic macular edema (DME) is an advanced symptom of diabetic retinopathy and can lead to irreversible vision loss. In this paper, a two-stage methodology for the detection and classification of DME severity from color fundus images is proposed. DME detection is carried out via a supervised learning approach using the normal fundus images. A feature extraction technique is introduced to capture the global characteristics of the fundus images and discriminate the normal from DME images. Disease severity is assessed using a rotational asymmetry metric by examining the symmetry of macular region. The performance of the proposed methodology and features are evaluated against several publicly available datasets. The detection performance has a sensitivity of 100% with specificity between 74% and 90%. Cases needing immediate referral are detected with a sensitivity of 100% and specificity of 97%. The severity classification accuracy is 81% for the moderate case and 100% for severe cases. These results establish the effectiveness of the proposed solution.
Affiliation(s)
- K Sai Deepak
- Center for Visual Information Technology, International Institute of Information Technology, Hyderabad, AP, India.
16
Rigid facial motion influences featural, but not holistic, face processing. Vision Res 2012; 57:26-34. PMID: 22342561; DOI: 10.1016/j.visres.2012.01.015.
Abstract
We report three experiments in which we investigated the effect of rigid facial motion on face processing. Specifically, we used the face composite effect to examine whether rigid facial motion influences primarily featural or holistic processing of faces. In Experiments 1-3, participants were first familiarized with dynamic displays in which a target face turned from one side to another; then at test, participants judged whether the top half of a composite face (the top half of the target/foil face aligned or misaligned with the bottom half of a foil face) belonged to the target face. We compared performance in the dynamic condition to various static control conditions in Experiments 1-3, which differed from each other in terms of the display order of the multiple static images or the inter-stimulus interval (ISI) between the images. We found that the size of the face composite effect in the dynamic condition was significantly smaller than that in the static conditions. In other words, the dynamic face display led participants to process the target faces in a part-based manner, and consequently their recognition of the upper portion of the composite face at test suffered less interference from the aligned lower part of the foil face. The findings from the present experiments provide the strongest evidence to date that rigid facial motion mainly influences featural, but not holistic, face processing.
17
Mayer KM, Vuong QC. The influence of unattended features on object processing depends on task demand. Vision Res 2012; 56:20-7. PMID: 22306678; DOI: 10.1016/j.visres.2012.01.013.
Abstract
Objects consist of features such as shape, motion and color, all of which can be selectively used for different object processing tasks. The present study investigated whether task demands influenced how well participants attended to features of novel colored dynamic objects that were task-relevant while ignoring those that were task-irrelevant. To address this, we used tasks which had different perceptual, learning and memory demands. The unattended features were systematically changed to measure their effects on how well participants could process the attended feature. In Experiment 1, participants discriminated simultaneously presented objects on the basis of their shape or motion. We found that changes to unattended motion and color did not affect participants' sensitivity to discriminate the attended feature, but changes to unattended shape did. We also found that changes to unattended motion impaired how quickly observers responded. In Experiment 2, participants identified learned objects at the individual level on the basis of their shape or motion. We found that changes to any unattended features affected accuracy and reaction times. Overall, these results point to an important role of task demands in object processing: task demands can influence whether task-irrelevant features affect object-processing performance.
Affiliation(s)
- Katja M Mayer
- Institute of Neuroscience, Newcastle University, UK.
18
Wu B, Klatzky RL, Stetten GD. Mental visualization of objects from cross-sectional images. Cognition 2012; 123:33-49. PMID: 22217386; DOI: 10.1016/j.cognition.2011.12.004.
Abstract
We extended the classic anorthoscopic viewing procedure to test a model of visualization of 3D structures from 2D cross-sections. Four experiments were conducted to examine key processes described in the model, localizing cross-sections within a common frame of reference and spatiotemporal integration of cross sections into a hierarchical object representation. Participants used a hand-held device to reveal a hidden object as a sequence of cross-sectional images. The process of localization was manipulated by contrasting two displays, in situ vs. ex situ, which differed in whether cross sections were presented at their source locations or displaced to a remote screen. The process of integration was manipulated by varying the structural complexity of target objects and their components. Experiments 1 and 2 demonstrated visualization of 2D and 3D line-segment objects and verified predictions about display and complexity effects. In Experiments 3 and 4, the visualized forms were familiar letters and numbers. Errors and orientation effects showed that displacing cross-sectional images to a remote display (ex situ viewing) impeded the ability to determine spatial relationships among pattern components, a failure of integration at the object level.
Affiliation(s)
- Bing Wu
- Cognitive Science and Engineering Program, Arizona State University, Mesa, AZ 85212, USA.
19
Matthews WJ. Stimulus repetition and the perception of time: the effects of prior exposure on temporal discrimination, judgment, and production. PLoS One 2011; 6:e19815. [PMID: 21573020 PMCID: PMC3090413 DOI: 10.1371/journal.pone.0019815] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/06/2011] [Accepted: 04/13/2011] [Indexed: 11/29/2022] Open
Abstract
It has been suggested that repeated stimuli have shorter subjective duration than novel items, perhaps because of a reduction in the neural response to repeated presentations of the same object. Five experiments investigated the effects of repetition on time perception and found further evidence that immediate repetition reduces apparent duration, consistent with the idea that subjective duration is partly based on neural coding efficiency. In addition, the experiments found (a) no effect of repetition on the precision of temporal discrimination, (b) that the effects of repetition disappeared when there was a modest lag between presentations, (c) that, across participants, the size of the repetition effect correlated with temporal discrimination, and (d) that the effects of repetition suggested by a temporal production task were the opposite of those suggested by temporal judgments. The theoretical and practical implications of these results are discussed.
Affiliation(s)
- William J Matthews
- Department of Psychology, University of Essex, Colchester, United Kingdom.
20
Walk this way: Approaching bodies can influence the processing of faces. Cognition 2011; 118:17-31. [DOI: 10.1016/j.cognition.2010.09.004] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2010] [Revised: 09/20/2010] [Accepted: 09/24/2010] [Indexed: 11/20/2022]
21
22
Kourtzi Z, Nakayama K. Distinct mechanisms for the representation of moving and static objects. VISUAL COGNITION 2010. [DOI: 10.1080/13506280143000421] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/17/2022]
23
Continuous transformation learning of translation invariant representations. Exp Brain Res 2010; 204:255-70. [PMID: 20544186 DOI: 10.1007/s00221-010-2309-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2009] [Accepted: 05/21/2010] [Indexed: 01/24/2023]
Abstract
We show that spatial continuity can enable a network to learn translation invariant representations of objects by self-organization in a hierarchical model of cortical processing in the ventral visual system. During 'continuous transformation learning', the active synapses from each overlapping transform are associatively modified onto the set of postsynaptic neurons. Because other transforms of the same object overlap with previously learned exemplars, a common set of postsynaptic neurons is activated by the new transforms, and learning of the new active inputs onto the same postsynaptic neurons is facilitated. We show that the transforms must be close for this to occur; that the temporal order of presentation of each transformed image during training is not crucial for learning to occur; that relatively large numbers of transforms can be learned; and that such continuous transformation learning can be usefully combined with temporal trace training.
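The continuous transformation mechanism described in this abstract can be sketched as a one-layer competitive network with Hebbian updates. The network sizes, learning rate, and one-dimensional "transforms" below are illustrative assumptions, not the paper's hierarchical model of the ventral stream.

```python
import numpy as np

# Sketch of continuous transformation (CT) learning: because consecutive
# transforms of an object share most of their active inputs, the same
# postsynaptic neuron tends to win the competition, and its synapses from
# the currently active inputs are strengthened. Sizes/rates are illustrative.
rng = np.random.default_rng(0)
n_inputs, n_outputs, lr = 100, 10, 0.5

W = rng.random((n_outputs, n_inputs))
W /= np.linalg.norm(W, axis=1, keepdims=True)   # normalised weight vectors

def ct_learn(views, W, lr=lr):
    for x in views:
        winner = np.argmax(W @ x)               # winner-take-all competition
        W[winner] += lr * x                     # Hebbian update of active synapses
        W[winner] /= np.linalg.norm(W[winner])  # synaptic normalisation
    return W

# Overlapping "transforms": a 30-unit bar of activity shifted one input at
# a time, so adjacent views share 29 of their 30 active inputs.
views = []
for shift in range(20):
    x = np.zeros(n_inputs)
    x[shift:shift + 30] = 1.0
    views.append(x)

W = ct_learn(views, W)
winners = [int(np.argmax(W @ x)) for x in views]
# With enough overlap, a small number of neurons (often just one) comes to
# respond across all shifts, i.e. a translation-invariant representation.
```

Presenting the same views in a scrambled order changes little in this sketch, which matches the abstract's point that the temporal order of the transforms is not crucial; what matters is their spatial overlap.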
24
Abstract
Evidence suggests that human time perception is likely to reflect an ensemble of recent temporal experience. For example, prolonged exposure to consistent temporal patterns can adaptively realign the perception of event order, both within and between sensory modalities (e.g. Fujisaki et al., 2004 Nat. Neurosci., 7, 773-778). In addition, the observation that 'a watched pot never boils' serves to illustrate the fact that dynamic shifts in our attentional state can also produce marked distortions in our temporal estimates. In the current study we provide evidence for a hitherto unknown link between adaptation, temporal perception and our attentional state. We show that our ability to use recent sensory history as a perceptual baseline for ongoing temporal judgments is subject to striking top-down modulation via shifts in the observer's selective attention. Specifically, attending to the temporal structure of asynchronous auditory and visual adapting stimuli generates a substantial increase in the temporal recalibration induced by these stimuli. We propose a conceptual framework accounting for our findings whereby attention modulates the perceived salience of temporal patterns. This heightened salience allows the formation of audiovisual perceptual 'objects', defined solely by their temporal structure. Repeated exposure to these objects induces high-level pattern adaptation effects, akin to those found in visual and auditory domains (e.g. Leopold & Bondar (2005) Fitting the Mind to the World: Adaptation and Aftereffects in High-Level Vision. Oxford University Press, Oxford, 189-211; Schweinberger et al. (2008) Curr. Biol., 18, 684-688).
Affiliation(s)
- James Heron
- Bradford School of Optometry and Vision Science, University of Bradford, Bradford, UK.
25
Seeing an unfamiliar face in rotational motion does not aid identity discrimination across viewpoints. Vision Res 2010; 50:854-9. [DOI: 10.1016/j.visres.2010.02.013] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2010] [Accepted: 02/18/2010] [Indexed: 11/19/2022]
26
Setti A, Newell FN. The effect of body and part-based motion on the recognition of unfamiliar objects. VISUAL COGNITION 2010. [DOI: 10.1080/13506280902830561] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
27
Vuong QC, Friedman A, Plante C. Modulation of viewpoint effects in object recognition by shape and motion cues. Perception 2010; 38:1628-48. [PMID: 20120262 DOI: 10.1068/p6430] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
Abstract
In three experiments, we examined the role of structural similarity and different types of motion on the efficiency of performing same-different shape judgments across changes in viewpoints. In all experiments, participants judged whether two novel, multi-part objects were structurally identical, and they were to ignore any viewpoint or motion differences between the objects. In experiment 1, participants were affected by viewpoint differences more for structurally similar than structurally distinct objects, but this interaction was mitigated by rigid motion. In experiments 2 and 3, we used only structurally similar objects that moved only some of their parts, either in a similar way between objects within a pair or in distinctive ways. Participants' recognition performance was facilitated by this articulated motion relative to both static and scrambled controls. We conclude that coherent motion facilitates generalisation across different views of dynamic objects under some conditions.
Affiliation(s)
- Quoc C Vuong
- Institute of Neuroscience, University of Newcastle, Newcastle upon Tyne NE2 4HH, UK.
28
Chan JS, Simões-Franklin C, Garavan H, Newell FN. Static images of novel, moveable objects learned through touch activate visual area hMT+. Neuroimage 2010; 49:1708-16. [DOI: 10.1016/j.neuroimage.2009.09.068] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2009] [Revised: 09/28/2009] [Accepted: 09/29/2009] [Indexed: 11/16/2022] Open
29
Spratling M. Learning Posture Invariant Spatial Representations Through Temporal Correlations. IEEE Trans Auton Ment Dev 2009. [DOI: 10.1109/tamd.2009.2038494] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
30
Friedman A, Vuong QC, Spetch M. Facilitation by view combination and coherent motion in dynamic object recognition. Vision Res 2009; 50:202-10. [PMID: 19925823 DOI: 10.1016/j.visres.2009.11.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2009] [Revised: 10/30/2009] [Accepted: 11/12/2009] [Indexed: 10/20/2022]
Abstract
We compared the effect of motion cues on people's ability to: (1) recognize dynamic objects by combining information from more than one view and (2) perform more efficiently on views that followed the global direction of the trained views. Participants learned to discriminate two objects that were either structurally similar or distinct and that were rotating in depth in either a coherent or scrambled motion sequence. The Training views revealed 60 degrees of the object, with a center 30 degrees segment missing. For similar stimuli only, there was a facilitative effect of motion: Performance in the coherent condition was better on views following the training views than on equidistant preceding views. Importantly, the viewpoint between the two training viewpoints was responded to more efficiently than either the Pre- or Post-Training viewpoints for both the coherent and scrambled condition. The results indicate that view combination and processing coherent motion cues may occur through different processes.
Affiliation(s)
- Alinda Friedman
- Department of Psychology, University of Alberta, Edmonton, Alberta, Canada T6G 2E9.
31
The role of sequence order in determining view canonicality for novel wire-frame objects. Atten Percept Psychophys 2009; 71:712-23. [PMID: 19429954 DOI: 10.3758/app.71.4.712] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Objects are best recognized from so-called "canonical" views. The characteristics of canonical views of arbitrary objects have been qualitatively described using a variety of different criteria, but little is known regarding how these views might be acquired during object learning. We address this issue, in part, by examining the role of object motion in the selection of preferred views of novel objects. Specifically, we adopt a modeling approach to investigate whether or not the sequence of views seen during initial exposure to an object contributes to observers' preferences for particular images in the sequence. In two experiments, we exposed observers to short sequences depicting rigidly rotating novel objects and subsequently collected subjective ratings of view canonicality (Experiment 1) and recall rates for individual views (Experiment 2). Given these two operational definitions of view canonicality, we attempted to fit both sets of behavioral data with a computational model incorporating 3-D shape information (object foreshortening), as well as information relevant to the temporal order of views presented during training (the rate of change for object foreshortening). Both sets of ratings were reasonably well predicted using only 3-D shape; the inclusion of terms that capture sequence order improved model performance significantly.
32
33
View combination in moving objects: The role of motion in discriminating between novel views of similar and distinctive objects by humans and pigeons. Vision Res 2009; 49:594-607. [DOI: 10.1016/j.visres.2009.01.019] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2008] [Revised: 01/26/2009] [Accepted: 01/31/2009] [Indexed: 11/22/2022]
34
The integration of higher order form and motion by the human brain. Neuroimage 2008; 42:1529-36. [DOI: 10.1016/j.neuroimage.2008.04.265] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/17/2008] [Revised: 04/23/2008] [Accepted: 04/25/2008] [Indexed: 11/24/2022] Open
35
Paradis AL, Droulez J, Cornilleau-Pérès V, Poline JB. Processing 3D form and 3D motion: respective contributions of attention-based and stimulus-driven activity. Neuroimage 2008; 43:736-47. [PMID: 18805496 DOI: 10.1016/j.neuroimage.2008.08.027] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2007] [Revised: 07/31/2008] [Accepted: 08/19/2008] [Indexed: 11/30/2022] Open
Abstract
This study aims at segregating the neural substrate for the 3D-form and 3D-motion attributes in structure-from-motion perception, and at disentangling the stimulus-driven and endogenous-attention-driven processing of these attributes. Attention and stimulus were manipulated independently: participants had to detect the transitions of one attribute--form, 3D motion or colour--while the visual stimulus underwent successive transitions of all attributes. We compared the BOLD activity related to form and 3D motion in three conditions: stimulus-driven processing (unattended transitions), endogenous attentional selection (task) or both stimulus-driven processing and attentional selection (attended transitions). In all conditions, the form versus 3D-motion contrasts revealed a clear dorsal/ventral segregation. However, while the form-related activity is consistent with previously described shape-selective areas, the activity related to 3D motion does not encompass the usual "visual motion" areas, but rather corresponds to a high-level motion system, including IPL and STS areas. Second, we found a dissociation between the neural processing of unattended attributes and that involved in endogenous attentional selection. Areas selective for 3D-motion and form showed either increased activity at transitions of these respective attributes or decreased activity when subjects' attention was directed to a competing attribute. We propose that both facilitatory and suppressive mechanisms of attribute selection are involved depending on the conditions driving this selection. Therefore, attentional selection is not limited to an increased activity in areas processing stimulus properties, and may unveil different functional localization from stimulus modulation.
Affiliation(s)
- A-L Paradis
- CNRS, UPR640, Laboratoire de Neurosciences Cognitives et Imagerie Cérébrale, 75013 Paris, France.
36
Liu T. Learning sequence of views of three-dimensional objects: the effect of temporal coherence on object memory. Perception 2008; 36:1320-33. [PMID: 18196699 DOI: 10.1068/p5778] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
Abstract
How humans recognize objects remains a contentious issue in current research on high-level vision. Here, I test the proposal by Wallis and Bülthoff (1999 Trends in Cognitive Sciences 3 22-31) suggesting that object representations can be learned through temporal association of multiple views of the same object. Participants first studied image sequences of novel, three-dimensional objects in a study block. On each trial, the images were from either an orderly sequence of depth-rotated views of the same object (SS), a scrambled sequence of those views (SR), or a sequence of different objects (RR). Recognition memory was assessed in a following test block. A within-object advantage was consistently observed: greater accuracy in the SR than in the RR condition in all four experiments, and greater accuracy in the SS than in the RR condition in two experiments. Furthermore, spatiotemporal coherence did not produce better recognition than temporal coherence alone (similar or lower accuracy in the SS compared to the SR condition). These results suggest that the visual system can use temporal regularity to build invariant object representations via the temporal-association mechanism.
Affiliation(s)
- Taosheng Liu
- Department of Psychology, New York University, New York 10003, USA.
37
Schultz J, Chuang L, Vuong QC. A Dynamic Object-Processing Network: Metric Shape Discrimination of Dynamic Objects by Activation of Occipitotemporal, Parietal, and Frontal Cortices. Cereb Cortex 2007; 18:1302-13. [DOI: 10.1093/cercor/bhm162] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
38
Enhanced Experience of Emotional Arousal in Response to Dynamic Facial Expressions. JOURNAL OF NONVERBAL BEHAVIOR 2007. [DOI: 10.1007/s10919-007-0025-7] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
39
Vuong QC, Tarr MJ. Structural similarity and spatiotemporal noise effects on learning dynamic novel objects. Perception 2006; 35:497-510. [PMID: 16700292 DOI: 10.1068/p5491] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Abstract
The spatiotemporal pattern projected by a moving object is specific to that object, as it depends on both the shape and the dynamics of the object. Previous research has shown that observers learn to make use of this spatiotemporal signature to recognize dynamic faces and objects. In two experiments, we assessed the extent to which the structural similarity of the objects and the presence of spatiotemporal noise affect how these signatures are learned and subsequently used in recognition. Observers first learned to identify novel, structurally distinctive or structurally similar objects that rotated with a particular motion. At test, each learned object moved with its studied motion or with a non-studied motion. In the non-studied motion condition we manipulated either dynamic information alone (experiment 1) or both static and dynamic information (experiment 2). Across both experiments we found that changing the learned motion of an object impaired recognition performance when 3-D shape was similar or when the visual input was noisy during learning. These results are consistent with the hypothesis that observers use learned spatiotemporal signatures and that such information becomes progressively more important as shape information becomes less reliable.
Affiliation(s)
- Quoc C Vuong
- Max Planck Institute for Biological Cybernetics, Spemannstrasse 38, D 72076 Tübingen, Germany.
40
Abstract
We investigated the role of dynamic information in human and pigeon object recognition. Both species were trained to discriminate between two objects that each had a characteristic motion, so that either cue could be used to perform the task successfully. The objects were either easy or difficult to decompose into parts. At test, the learned objects could appear in their learned motions, the reverse of the learned motions, or an entirely new motion, or a new object could appear in one of the learned motions. For humans, any change in the learned motion produced a decrement in performance for both the decomposable and the nondecomposable objects, but participants did not respond differentially to new objects that appeared in the learned motions. Pigeons showed the same pattern of responding as did humans for the decomposable objects, except that pigeons responded differentially to new objects in the learned motions. For the nondecomposable objects, pigeons used motion cues exclusively. We suggest that for some types of objects, dynamic information may be weighted differently by pigeons and humans.
Affiliation(s)
- Marcia L Spetch
- Department of Psychology, University of Alberta, Edmonton, Canada.
41
42
Abstract
This article considers how people judge the identity of objects (e.g., how people decide that a description of an object at one time, t0, belongs to the same object as a description of it at another time, t1). The authors propose a causal continuer model for these judgments, based on an earlier theory by Nozick (1981). According to this model, the 2 descriptions belong to the same object if (a) the object at t1 is among those that are causally close enough to be genuine continuers of the original and (b) it is the closest of these close-enough contenders. A quantitative version of the model makes accurate predictions about judgments of which of a pair of objects is identical to an original (Experiments 1 and 2). The model makes correct qualitative predictions about identity across radical disassembly (Experiment 1) as well as more ordinary transformations (Experiments 2 and 3).
Affiliation(s)
- Lance J Rips
- Psychology Department, Northwestern University, Evanston, IL 60208, USA.
43
Spratling MW. Learning viewpoint invariant perceptual representations from cluttered images. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2005; 27:753-61. [PMID: 15875796 DOI: 10.1109/tpami.2005.105] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/02/2023]
Abstract
In order to perform object recognition, it is necessary to form perceptual representations that are sufficiently specific to distinguish between objects, but that are also sufficiently flexible to generalize across changes in location, rotation, and scale. A standard method for learning perceptual representations that are invariant to viewpoint is to form temporal associations across image sequences showing object transformations. However, this method requires that individual stimuli be presented in isolation and is therefore unlikely to succeed in real-world applications where multiple objects can co-occur in the visual input. This paper proposes a simple modification to the learning method that can overcome this limitation and results in more robust learning of invariant representations.
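The standard temporal-association method that this paper modifies can be sketched with a Földiák-style trace rule, in which Hebbian updates are gated by a temporally smoothed activity. The network sizes, rates, and stimuli below are illustrative assumptions, and the paper's clutter-robust modification itself is not reproduced here.

```python
import numpy as np

# Trace-rule sketch: output activity is smoothed over time, so inputs that
# occur close together in a sequence (e.g. transforms of one isolated
# object) are associated onto the same output neurons. All parameters are
# illustrative assumptions.
rng = np.random.default_rng(1)
n_in, n_out = 60, 8
eta, lr = 0.6, 0.3          # trace persistence and learning rate

W = rng.random((n_out, n_in))
W /= np.linalg.norm(W, axis=1, keepdims=True)

def train_sequence(views, W):
    trace = np.zeros(n_out)
    for x in views:
        y = np.zeros(n_out)
        y[np.argmax(W @ x)] = 1.0              # winner-take-all activity
        trace = (1.0 - eta) * y + eta * trace  # temporally smoothed trace
        W += lr * np.outer(trace, x)           # trace-gated Hebbian update
        W /= np.linalg.norm(W, axis=1, keepdims=True)
    return W

# A transformation sequence: a 20-unit block of activity sliding across
# the input, standing in for an isolated object changing viewpoint.
views = [np.r_[np.zeros(s), np.ones(20), np.zeros(40 - s)]
         for s in range(0, 24, 4)]
W = train_sequence(views, W)
responses = [int(np.argmax(W @ x)) for x in views]
```

The limitation the paper addresses is visible in this scheme: if two objects appear together in the input, the trace binds features of both onto the same output neurons, which is why plain temporal association normally requires stimuli to be presented in isolation.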
44
Abstract
Current research towards retina implants for partial restoration of vision in blind humans with retinal degenerative dysfunctions focuses on implant and stimulation experiments and technologies. In contrast, our approach takes the availability of an epiretinal multi-electrode neural interface for granted and studies the conditions for successful joint information processing of both retinal prosthesis and brain. Our proposed learning retina encoder (RE) includes information processing modules to simulate the complex mapping operation of parts of the 5-layered neural retina and to provide an iterative, perception-based dialog between RE and human subject. Alternative information processing technologies in the learning RE are being described, which allow an individual optimization of the RE mapping operation by means of iterative tuning with learning algorithms in a dialog between implant wearing subject and RE. The primate visual system is modeled by a retina module (RM) composed of spatio-temporal (ST) filters and a central visual system module (VM). RM performs a mapping 1 of an optical pattern P1 in the physical domain onto a retinal output vector R1(t) in a neural domain, whereas VM performs a mapping 2 of R1(t) in a neural domain onto a visual percept P2 in the perceptual domain. Retinal ganglion cell properties represent non-invertible ST filters in RE, which generate ambiguous output signals. VM generates visual percepts only if the corresponding R1(t) is properly encoded, contains sufficient information, and can be disambiguated. Based on the learning RE and the proposed visual system model, a novel retina encoder (RE*) is proposed, which considers both ambiguity removal and miniature eye movements during fixation. Our simulation results suggest that VM requires miniature eye movements under control of the visual system to retrieve unambiguous patterns P2 corresponding to P1. For retina implant applications, RE* can be tuned to generate optimal ganglion cell codes for epiretinal stimulation.
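A single retina-module (RM) spatio-temporal filter of the kind described here can be sketched as a difference-of-Gaussians centre-surround in space followed by a temporal difference. The kernel sizes and sigmas below are illustrative assumptions rather than the encoder's tuned filters.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Normalised 2-D Gaussian, used for centre and surround."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return g / g.sum()

def dog_filter(frame, size=9, sigma_c=1.0, sigma_s=3.0):
    """Centre-surround (DoG) response by direct convolution (valid region)."""
    dog = gaussian_kernel(size, sigma_c) - gaussian_kernel(size, sigma_s)
    h, w = frame.shape
    out = np.zeros((h - size + 1, w - size + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(frame[i:i + size, j:j + size] * dog)
    return out

def st_response(frames):
    """Spatial DoG per frame, then a frame-to-frame difference: a crude
    transient ganglion-cell output R1(t) for an input sequence P1."""
    spatial = [dog_filter(f) for f in frames]
    return [b - a for a, b in zip(spatial, spatial[1:])]

# A bright spot appearing on a blank field drives a transient response;
# once the image is static again, the temporal difference returns to zero.
blank = np.zeros((20, 20))
spot = blank.copy()
spot[10, 10] = 1.0
r = st_response([blank, spot, spot])
```

Because many distinct input patterns map onto similar filter outputs, such ST filters are non-invertible; that ambiguity is what the proposed RE* addresses by exploiting miniature eye movements during fixation.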
Affiliation(s)
- Rolf Eckmiller
- Division of Neural Computation, Department of Computer Science, University of Bonn, 53117 Bonn, Germany
45
Munhall KG, Kroos C, Jozan G, Vatikiotis-Bateson E. Spatial frequency requirements for audiovisual speech perception. Percept Psychophys 2004; 66:574-83. [PMID: 15311657 DOI: 10.3758/bf03194902] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Spatial frequency band-pass and low-pass filtered images of a talker were used in an audiovisual speech-in-noise task. Three experiments tested subjects' use of information contained in the different filter bands with center frequencies ranging from 2.7 to 44.1 cycles/face (c/face). Experiment 1 demonstrated that information from a broad range of spatial frequencies enhanced auditory intelligibility. The frequency bands differed in the degree of enhancement, with a peak being observed in a mid-range band (11-c/face center frequency). Experiment 2 showed that this pattern was not influenced by viewing distance and, thus, that the results are best interpreted in object spatial frequency, rather than in retinal coordinates. Experiment 3 showed that low-pass filtered images could produce a performance equivalent to that produced by unfiltered images. These experiments are consistent with the hypothesis that high spatial resolution information is not necessary for audiovisual speech perception and that a limited range of spatial frequency spectrum is sufficient.
Affiliation(s)
- K G Munhall
- Department of Psychology, Queen's University, Kingston, Ontario, Canada.
46
Vuong QC, Tarr MJ. Rotation direction affects object recognition. Vision Res 2004; 44:1717-30. [PMID: 15136006 DOI: 10.1016/j.visres.2004.02.002] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2003] [Revised: 12/08/2003] [Indexed: 10/26/2022]
Abstract
What role does dynamic information play in object recognition? To address this question, we probed observers' memory for novel objects rotating in depth. Irrespective of object discriminability, performance was affected by an object's rotation direction. This effect was obtained despite the same shape information and views being shown for different rotation directions. This direction effect was eliminated when either static images or animations that did not depict globally coherent rotation were used. Overall, these results suggest that dynamic information, that is, the spatiotemporal ordering of object views, provides information independent of shape or view information to a recognition system.
Affiliation(s)
- Quoc C Vuong
- Department of Cognitive and Linguistic Sciences, Brown University, 190 Thayer Street, Providence, Rhode Island 02912, USA.
47
Jellema T, Perrett DI. Perceptual History Influences Neural Responses to Face and Body Postures. J Cogn Neurosci 2003; 15:961-71. [PMID: 14614807 DOI: 10.1162/089892903770007353] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
We show that under natural viewing, the responses of cells in the temporal lobe of the macaque to the sight of static head and body postures are controlled by the sight of immediately preceding actions. Cells in the anterior part of the superior temporal sulcus responded vigorously to the sight of a face or body posture that followed a particular body action, but not when it followed other actions. The effective action or posture presented in isolation or in different sequences failed to produce a response. Our results demonstrate that cells in the temporal cortex could support the formation of expectations about impending behavior of others.
Affiliation(s)
- Tjeerd Jellema
- Department of Experimental Psychology, Helmholtz Research Institute, Utrecht University, The Netherlands.
48
Abstract
Although both the object and the observer often move in natural environments, the effect of motion on visual object recognition has not been well documented. The authors examined the effect of a reversal in the direction of rotation on both explicit and implicit memory for novel, 3-dimensional objects. Participants viewed a series of continuously rotating objects and later made either an old-new recognition judgment or a symmetric-asymmetric decision. For both tasks, memory for rotating objects was impaired when the direction of rotation was reversed at test. These results demonstrate that dynamic information can play a role in visual object recognition and suggest that object representations can encode spatiotemporal information.
Affiliation(s)
- Taosheng Liu
- Department of Psychology, Columbia University, New York, New York 10027, USA
49
Munhall KG, Servos P, Santi A, Goodale MA. Dynamic visual speech perception in a patient with visual form agnosia. Neuroreport 2002; 13:1793-6. [PMID: 12395125 DOI: 10.1097/00001756-200210070-00020] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
To examine the role of dynamic cues in visual speech perception, a patient with visual form agnosia (DF) was tested with a set of static and dynamic visual displays of three vowels. Five conditions were tested: (1) auditory only which provided only vocal pitch information, (2) dynamic visual only, (3) dynamic audiovisual with vocal pitch information, (4) dynamic audiovisual with full voice information and (5) static visual only images of postures during vowel production. DF showed normal performance in all conditions except the static visual only condition in which she scored at chance. Control subjects scored close to ceiling in this condition. The results suggest that spatiotemporal signatures for objects and events are processed separately from static form cues.
Affiliation(s)
- K G Munhall
- Department of Psychology, Queen's University, Kingston, Ontario, Canada.
50
Abstract
In a series of three experiments, we used a sequential matching task to explore the impact of non-rigid facial motion on the perception of human faces. Dynamic prime images, in the form of short video sequences, facilitated matching responses relative to a single static prime image. This advantage was observed whenever the prime and target showed the same face but an identity match was required across expression (experiment 1) or view (experiment 2). No facilitation was observed for identical dynamic prime sequences when the matching dimension was shifted from identity to expression (experiment 3). We suggest that the observed dynamic advantage, the first reported for non-degraded facial images, arises because the matching task places more emphasis on visual working memory than typical face recognition tasks. More specifically, we believe that representational mechanisms optimised for the processing of motion and/or change-over-time are established and maintained in working memory and that such 'dynamic representations' (Freyd, 1987 Psychological Review 94 427-438) capitalise on the increased information content of the dynamic primes to enhance performance.
Affiliation(s)
- Ian M Thornton
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany.