1
Deng H, Gao Y, Mo L, Mo C. Concurrent attention to hetero-depth surfaces in 3-D visual space is governed by theta rhythm. Psychophysiology 2024; 61:e14494. [PMID: 38041416 DOI: 10.1111/psyp.14494] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Revised: 10/30/2023] [Accepted: 11/08/2023] [Indexed: 12/03/2023]
Abstract
When simultaneously confronted with multiple attentional targets, the visual system employs a time-multiplexing approach in which each target alternates for prioritized access, a mechanism broadly known as rhythmic attentional sampling. For the past decade, rhythmic attentional sampling has received mounting support from converging behavioral and neural findings. However, so compelling are these findings that a critical test ground has long been overshadowed, namely the 3-D visual space, where attention is complicated by extraction of the spatial layout of surfaces extending beyond 2-D planes. It remains unknown how attentional deployment to multiple targets is accomplished in 3-D space. Here, we provided a time-resolved portrait of the behavioral and neural dynamics when participants concurrently attended to two surfaces defined by motion-depth conjunctions. To characterize the moment-to-moment attentional modulation effects, we measured perceptual sensitivity to the hetero-depth surface motions on a fine temporal scale and reconstructed their neural representations using a time-resolved multivariate inverted encoding model. We found that the perceptual sensitivity to the two surface motions rhythmically fluctuated over time at ~4 Hz, with one's enhancement closely tracked by the other's diminishment. Moreover, the behavioral pattern was coupled with an ongoing periodic alternation in strength between the two surface motion representations at the same frequency. Together, our findings provide the first converging evidence of an attentional "pendulum" that rhythmically traverses different stereoscopic depth planes and are indicative of a ubiquitous attentional time multiplexor based on theta rhythm in the 3-D visual space.
Affiliation(s)
- Hongyu Deng
  - School of Psychology, Center for Studies of Psychological Application, South China Normal University, Guangzhou, P.R. China
- Yuan Gao
  - School of Psychology, Center for Studies of Psychological Application, South China Normal University, Guangzhou, P.R. China
- Lei Mo
  - School of Psychology, Center for Studies of Psychological Application, South China Normal University, Guangzhou, P.R. China
- Ce Mo
  - Department of Psychology, Sun-Yat-Sen University, Guangzhou, P.R. China
2
Wu H, Zuo Z, Yuan Z, Zhou T, Zhuo Y, Zheng N, Chen B. Neural representation of gestalt grouping and attention effect in human visual cortex. J Neurosci Methods 2023; 399:109980. [PMID: 37783351 DOI: 10.1016/j.jneumeth.2023.109980] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Revised: 08/29/2023] [Accepted: 09/29/2023] [Indexed: 10/04/2023]
Abstract
BACKGROUND: The brain aggregates meaningless local sensory elements to form meaningful global patterns in a process called perceptual grouping. Current brain imaging studies have found that neural activities in V1 are modulated during visual grouping. However, how grouping is represented in each of the early visual areas, and how attention alters these representations, is still unknown.
NEW METHOD: We adopted MVPA to decode the specific content of perceptual grouping by comparing neural activity patterns between gratings and dot lattice stimuli, which can be grouped by the proximity law. Furthermore, we quantified the grouping effect by defining the strength of grouping, and assessed the effect of attention on grouping.
RESULTS: We found that activity patterns to proximity-grouped stimuli in early visual areas resemble those to grating stimuli with the same orientations. This similarity exists even when there is no attention focused on the stimuli. The results also showed a progressive increase of representational strength of grouping from V1 to V3, and attention modulation of grouping is only significant in V3 among all the visual areas.
COMPARISON WITH EXISTING METHODS: Most of the previous work on perceptual grouping has focused on how activity amplitudes are modulated by grouping. Using MVPA, the present work successfully decoded the contents of neural activity patterns corresponding to proximity grouping stimuli, thus shedding light on the availability of a content-decoding approach in research on perceptual grouping.
CONCLUSIONS: Our work found that the content of the neural activity patterns during perceptual grouping can be decoded in the early visual areas under both attended and unattended tasks, and provides novel evidence that there is cascade processing for proximity grouping from V1 to V3. The strength of grouping was larger in V3 than in any other visual area, and the attention modulation of the strength of grouping was only significant in V3 among all the visual areas, implying that V3 plays an important role in proximity grouping.
Affiliation(s)
- Hao Wu
  - School of Electrical Engineering, Xi'an University of Technology, Xi'an, Shaanxi 710048, China
- Zhentao Zuo
  - State Key Laboratory of Brain and Cognitive Science, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; University of the Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing, China
- Zejian Yuan
  - National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, Xi'an, Shaanxi 710049, China; Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
- Tiangang Zhou
  - State Key Laboratory of Brain and Cognitive Science, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; University of the Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing, China
- Yan Zhuo
  - State Key Laboratory of Brain and Cognitive Science, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; University of the Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing, China
- Nanning Zheng
  - National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, Xi'an, Shaanxi 710049, China; Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
- Badong Chen
  - National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, Xi'an, Shaanxi 710049, China; Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
3
Ke SC, Gupta A, Lo YH, Ting CC, Tseng P. The hidden arrow in the FedEx logo: Do we really unconsciously "see" it? Cogn Res Princ Implic 2023; 8:40. [PMID: 37395853 DOI: 10.1186/s41235-023-00494-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2022] [Accepted: 06/18/2023] [Indexed: 07/04/2023] Open
Abstract
The FedEx logo makes clever use of figure-ground ambiguity to create an "invisible" arrow in the background space between "E" and "x". Most designers believe the hidden arrow can convey an unconscious impression of speed and precision about the FedEx brand, which may influence subsequent behavior. To test this assumption, we designed similar images with hidden arrows to serve as endogenous (but camouflaged) directional cues in a Posner orienting task, where a cueing effect would suggest subliminal processing of the hidden arrow. Overall, we observed no cue congruency effect unless the arrow was explicitly highlighted (Experiment 4). However, there was a general effect of prior knowledge: when people were under pressure to suppress background information, those who knew about the arrow could do so faster in all congruence conditions (i.e., neutral, congruent, incongruent), although they failed to report seeing the arrow during the experiment. This was true in participants from North America who had heard of the FedEx arrow before (Experiments 1 and 3), and also in our Taiwanese sample, who had just been informed of the design (Experiment 2). These results can be well explained by the Biased Competition Model in figure-ground research, and together suggest that (1) people do not unconsciously perceive the FedEx arrow, at least not enough to exhibit a cueing effect in attention, but (2) knowing about the arrow can fundamentally change the way we visually process these negative-space logos in the future, making people react faster to images with negative space regardless of the hidden content.
Affiliation(s)
- Shih-Chiang Ke
  - Graduate Institute of Mind, Brain and Consciousness, Taipei Medical University, Taipei, Taiwan
- Ankit Gupta
  - Graduate Institute of Mind, Brain and Consciousness, Taipei Medical University, Taipei, Taiwan
- Yu-Hui Lo
  - Graduate Institute of Mind, Brain and Consciousness, Taipei Medical University, Taipei, Taiwan
- Chih-Chung Ting
  - Institute of Psychology, University of Hamburg, Hamburg, Germany
  - Center for Research in Experimental Economics and Political Decision Making, University of Amsterdam, Amsterdam, Netherlands
- Philip Tseng
  - Graduate Institute of Mind, Brain and Consciousness, Taipei Medical University, Taipei, Taiwan
  - Cross College Elite Program, National Cheng Kung University, Tainan, Taiwan
  - Department of Psychology, National Taiwan University, Taipei, Taiwan
  - Psychiatric Research Center, Wan Fang Hospital, Taipei Medical University, Taipei, Taiwan
  - Research Center for Mind, Brain & Learning, National Chengchi University, Taipei, Taiwan
4
Huang Z, Zaidi Q. Perceptual scale for transparency: Common fate overrides geometrical and color cues. J Vis 2022; 22:6. [PMID: 35536722 PMCID: PMC9106975 DOI: 10.1167/jov.22.6.6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2021] [Accepted: 04/01/2022] [Indexed: 11/24/2022] Open
Abstract
Objects that pass light through are considered transparent, and we generally expect that the light coming out will match the color of the object. However, when the object is placed on a colored surface, the light coming back to our eyes becomes a composite of surface, illumination, and transparency properties. Despite that, we can often perceive separate overlaid and overlaying layers differing in colors. How neurons separate the information to extract the transparent layer remains unknown, but the physical characteristics of transparent filters generate geometrical and color features in retinal images, which could provide cues for separating layers. We estimated the relative importance of such cues in a perceptual scale for transparency, using stimuli in which X- or T-junctions, different relative motions, and consistent or inconsistent colors cooperated or competed in forced-preference psychophysics experiments. Maximum-likelihood Thurstone scaling revealed that motion increased transparency for X-junctions, but decreased transparency for T-junctions by creating the percept of an opaque patch. However, if the motion of a filter uncovered a dynamically changing but stationary pattern, sharing a common fate with the surround but forming T-junctions, the probability of seeing transparency was almost as high as for moving X-junctions, despite the stimulus being physically improbable. In addition, geometric cues overrode color inconsistency to a great degree. Finally, a linear model of transparency perception as a function of relative motions between filter, overlay, and surround layers, contour continuation, and color consistency, quantified a hierarchy of latent influences on when the filter is seen as a separate transparent layer.
Affiliation(s)
- Zhehao Huang
  - Graduate Center for Vision Research, State University of New York, College of Optometry, New York, New York, USA
- Qasim Zaidi
  - Graduate Center for Vision Research, State University of New York, College of Optometry, New York, New York, USA
5
Hu B, von der Heydt R, Niebur E. Figure-Ground Organization in Natural Scenes: Performance of a Recurrent Neural Model Compared with Neurons of Area V2. eNeuro 2019; 6:ENEURO.0479-18.2019. [PMID: 31167850 PMCID: PMC6635809 DOI: 10.1523/eneuro.0479-18.2019] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/26/2018] [Revised: 04/15/2019] [Accepted: 05/07/2019] [Indexed: 12/02/2022] Open
Abstract
A crucial step in understanding visual input is its organization into meaningful components, in particular object contours and partially occluded background structures. This requires that all contours are assigned to either the foreground or the background (border ownership assignment). While earlier studies showed that neurons in primate extrastriate cortex signal border ownership for simple geometric shapes, recent studies show consistent border ownership coding also for complex natural scenes. In order to understand how the brain performs this task, we developed a biologically plausible recurrent neural network that is fully image computable. Our model uses local edge detector (B) cells and grouping (G) cells whose activity represents proto-objects based on the integration of local feature information. G cells send modulatory feedback connections to those B cells that caused their activation, making the B cells border ownership selective. We found close agreement between our model and neurophysiological results in terms of the timing of border ownership signals (BOSs) as well as the consistency of BOSs across scenes. We also benchmarked our model on the Berkeley Segmentation Dataset and achieved performance comparable to recent state-of-the-art computer vision approaches. Our proposed model provides insight into the cortical mechanisms of figure-ground organization.
Affiliation(s)
- Brian Hu
  - Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, MD 21218
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205
- Rüdiger von der Heydt
  - Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, MD 21218
  - Solomon Snyder Department of Neuroscience, Johns Hopkins University, Baltimore, MD 21205
- Ernst Niebur
  - Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, MD 21218
  - Solomon Snyder Department of Neuroscience, Johns Hopkins University, Baltimore, MD 21205
6
Jeck DM, Qin M, Egeth H, Niebur E. Unique objects attract attention even when faint. Vision Res 2019; 160:60-71. [PMID: 31047908 DOI: 10.1016/j.visres.2019.04.004] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/15/2018] [Revised: 04/11/2019] [Accepted: 04/14/2019] [Indexed: 11/20/2022]
Abstract
Locally contrasting objects, e.g. a red apple surrounded by green apples, attract attention. Does this generalize to differences in feature space? That is, do unique objects, regardless of their location, stand out from a collection of objects that are similar to one another, even when the unique object has lower local contrast with the background than the other objects? Behavioral data indeed show a preference for unique items, but previous experiments enabled viewers to anticipate what response they were "supposed" to give. We developed a new experimental paradigm that minimizes such top-down effects. Pitting local contrast against global uniqueness, we show that unique stimuli attract attention even in unanticipated, never-seen images, and even when the unique stimuli are faint (low contrast). A computational model explains how competition between objects in feature space favors dissimilar objects over those with similar features. The model explains how humans select unique objects, without a loss of performance on natural scenes.
Affiliation(s)
- Daniel M Jeck
  - Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, MD, USA; Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Michael Qin
  - Department of Biomedical Engineering, University of Connecticut at Storrs, USA
- Howard Egeth
  - Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, USA
- Ernst Niebur
  - Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, MD, USA; Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, USA; Solomon Snyder Department of Neuroscience, Johns Hopkins University, Baltimore, MD, USA
7
Self MW, Jeurissen D, van Ham AF, van Vugt B, Poort J, Roelfsema PR. The Segmentation of Proto-Objects in the Monkey Primary Visual Cortex. Curr Biol 2019; 29:1019-1029.e4. [PMID: 30853432 DOI: 10.1016/j.cub.2019.02.016] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 04/26/2018] [Revised: 01/07/2019] [Accepted: 02/05/2019] [Indexed: 11/28/2022]
Abstract
During visual perception, the brain enhances the representations of image regions that belong to figures and suppresses those that belong to the background. Natural images contain many regions that initially appear to be part of a figure when analyzed locally (proto-objects) but are actually part of the background if the whole image is considered. These proto-grounds must be correctly assigned to the background to allow correct shape identification and guide behavior. To understand how the brain resolves this conflict between local and global processing, we recorded neuronal activity from the primary visual cortex (V1) of macaque monkeys while they discriminated between n/u shapes that have a central proto-ground region. We studied the fine-grained spatiotemporal profile of neural activity evoked by the n/u shape and found that neural representation of the object proceeded from a coarse-to-fine resolution. Approximately 100 ms after the stimulus onset, the representation of the proto-ground region was enhanced together with the rest of the n/u surface, but after ∼115 ms, the proto-ground was suppressed back to the level of the background. Suppression of the proto-ground was only present in animals that had been trained to perform the shape-discrimination task, and it predicted the choice of the animal on a trial-by-trial basis. Attention enhanced figure-ground modulation, but it had no effect on the strength of proto-ground suppression. The results indicate that the accuracy of scene segmentation is sharpened by a suppressive process that resolves local ambiguities by assigning proto-grounds to the background.
Affiliation(s)
- Matthew W Self
  - Department of Vision & Cognition, Netherlands Institute for Neuroscience, Meibergdreef 47, 1105 BA Amsterdam, the Netherlands
- Danique Jeurissen
  - Department of Vision & Cognition, Netherlands Institute for Neuroscience, Meibergdreef 47, 1105 BA Amsterdam, the Netherlands; Department of Neuroscience, Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY 10032, USA
- Anne F van Ham
  - Department of Vision & Cognition, Netherlands Institute for Neuroscience, Meibergdreef 47, 1105 BA Amsterdam, the Netherlands
- Bram van Vugt
  - Department of Vision & Cognition, Netherlands Institute for Neuroscience, Meibergdreef 47, 1105 BA Amsterdam, the Netherlands
- Jasper Poort
  - Department of Vision & Cognition, Netherlands Institute for Neuroscience, Meibergdreef 47, 1105 BA Amsterdam, the Netherlands; Department of Psychology, University of Cambridge, Cambridge CB2 3EB, UK
- Pieter R Roelfsema
  - Department of Vision & Cognition, Netherlands Institute for Neuroscience, Meibergdreef 47, 1105 BA Amsterdam, the Netherlands; Department of Integrative Neurophysiology, Center for Neurogenomics and Cognitive Research, VU University, De Boelelaan 1085, 1081HV Amsterdam, the Netherlands; Psychiatry department, Academic Medical Center, Postbus 22660, 1100DD Amsterdam, the Netherlands
8
von der Heydt R, Zhang NR. Figure and ground: how the visual cortex integrates local cues for global organization. J Neurophysiol 2018; 120:3085-3098. [PMID: 30044171 DOI: 10.1152/jn.00125.2018] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/22/2022] Open
Abstract
Inferring figure-ground organization in two-dimensional images may require different complementary strategies. For isolated objects, it has been shown that mechanisms in visual cortex exploit the overall distribution of contours, but in images of cluttered scenes where the grouping of contours is not obvious, that strategy would fail. However, natural scenes contain local features, specifically contour junctions, that may contribute to the definition of object regions. To study the role of local features in the assignment of border ownership, we recorded single-cell activity from visual cortex in awake behaving Macaca mulatta. We tested configurations perceived as two overlapping figures in which T- and L-junctions depend on the direction of overlap, whereas the overall distribution of contours provides no valid information. While recording responses to the occluding contour, we varied direction of overlap and variably masked some of the critical contour features to determine their influences and their interactions. On average, most features influenced the responses consistently, producing either enhancement or suppression depending on border ownership. Different feature types could have opposite effects even at the same location. Features far from the receptive field produced effects as strong as near features and with the same short latency. Summation was highly nonlinear: any single feature produced more than two-thirds of the effect of all features together. These findings reveal fast and highly specific organization mechanisms, supporting a previously proposed model in which "grouping cells" integrate widely distributed edge signals with specific end-stopped signals to modulate the original edge signals by feedback.
NEW & NOTEWORTHY: Seeing objects seems effortless, but defining objects in a scene requires sophisticated neural mechanisms. For isolated objects, the visual cortex groups contours based on overall distribution, but this strategy does not work for cluttered scenes. Here, we demonstrate mechanisms that integrate local contour features like T- and L-junctions to resolve clutter. The process is fast, evaluates widely distributed features, and gives any single feature a decisive influence on figure-ground representation.
Affiliation(s)
- Rüdiger von der Heydt
  - Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, Maryland; Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, Maryland
- Nan R Zhang
  - Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, Maryland
9
Hu B, Niebur E. A recurrent neural model for proto-object based contour integration and figure-ground segregation. J Comput Neurosci 2017; 43:227-242. [PMID: 28924628 PMCID: PMC5693639 DOI: 10.1007/s10827-017-0659-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 03/13/2017] [Revised: 06/22/2017] [Accepted: 09/08/2017] [Indexed: 12/01/2022]
Abstract
Visual processing of objects makes use of both feedforward and feedback streams of information. However, the nature of feedback signals is largely unknown, as is the identity of the neuronal populations in lower visual areas that receive them. Here, we develop a recurrent neural model to address these questions in the context of contour integration and figure-ground segregation. A key feature of our model is the use of grouping neurons whose activity represents tentative objects ("proto-objects") based on the integration of local feature information. Grouping neurons receive input from an organized set of local feature neurons, and project modulatory feedback to those same neurons. Additionally, inhibition at both the local feature level and the object representation level biases the interpretation of the visual scene in agreement with principles from Gestalt psychology. Our model explains several sets of neurophysiological results (Zhou et al. Journal of Neuroscience, 20(17), 6594-6611 2000; Qiu et al. Nature Neuroscience, 10(11), 1492-1499 2007; Chen et al. Neuron, 82(3), 682-694 2014), and makes testable predictions about the influence of neuronal feedback and attentional selection on neural responses across different visual areas. Our model also provides a framework for understanding how object-based attention is able to select both objects and the features associated with them.
Affiliation(s)
- Brian Hu
  - Zanvyl Krieger Mind/Brain Institute and Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA, Tel.: +1 410 516-8640, Fax.: +1 410 516-8648
- Ernst Niebur
  - Zanvyl Krieger Mind/Brain Institute and Solomon Snyder Department of Neuroscience, Johns Hopkins University, Baltimore, MD 21218, USA
10
George D, Lehrach W, Kansky K, Lázaro-Gredilla M, Laan C, Marthi B, Lou X, Meng Z, Liu Y, Wang H, Lavin A, Phoenix DS. A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs. Science 2017; 358:science.aag2612. [PMID: 29074582 DOI: 10.1126/science.aag2612] [Citation(s) in RCA: 73] [Impact Index Per Article: 10.4] [Received: 06/02/2016] [Accepted: 09/08/2017] [Indexed: 11/02/2022]
Abstract
Learning from a few examples and generalizing to markedly different situations are capabilities of human visual intelligence that are yet to be matched by leading machine learning models. By drawing inspiration from systems neuroscience, we introduce a probabilistic generative model for vision in which message-passing-based inference handles recognition, segmentation, and reasoning in a unified way. The model demonstrates excellent generalization and occlusion-reasoning capabilities and outperforms deep neural networks on a challenging scene text recognition benchmark while being 300-fold more data efficient. In addition, the model fundamentally breaks the defense of modern text-based CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart) by generatively segmenting characters without CAPTCHA-specific heuristics. Our model emphasizes aspects such as data efficiency and compositionality that may be important in the path toward general artificial intelligence.
Affiliation(s)
- Dileep George
  - Vicarious AI, 2 Union Square, Union City, CA 94587, USA
- Ken Kansky
  - Vicarious AI, 2 Union Square, Union City, CA 94587, USA
- Xinghua Lou
  - Vicarious AI, 2 Union Square, Union City, CA 94587, USA
- Zhaoshi Meng
  - Vicarious AI, 2 Union Square, Union City, CA 94587, USA
- Yi Liu
  - Vicarious AI, 2 Union Square, Union City, CA 94587, USA
- Huayan Wang
  - Vicarious AI, 2 Union Square, Union City, CA 94587, USA
- Alex Lavin
  - Vicarious AI, 2 Union Square, Union City, CA 94587, USA
11
Consistency of Border-Ownership Cells across Artificial Stimuli, Natural Stimuli, and Stimuli with Ambiguous Contours. J Neurosci 2017; 36:11338-11349. [PMID: 27807174 DOI: 10.1523/jneurosci.1857-16.2016] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 06/08/2016] [Accepted: 09/10/2016] [Indexed: 11/21/2022] Open
Abstract
Segmentation and recognition of objects in a visual scene are two problems that are hard to solve separately from each other. When segmenting an ambiguous scene, it is helpful to already know the present objects and their shapes. However, for recognizing an object in clutter, one would like to consider its isolated segment alone to avoid confounds from features of other objects. Border-ownership cells (Zhou et al., 2000) appear to play an important role in segmentation, as they signal the side-of-figure of artificial stimuli. The present work explores the role of border-ownership cells in dorsal macaque visual areas V2 and V3 in the segmentation of natural object stimuli and locally ambiguous stimuli. We report two major results. First, compared with previous estimates, we found a smaller percentage of cells that were consistent across artificial stimuli used previously. Second, we found that the average response of those neurons that did respond consistently to the side-of-figure of artificial stimuli also consistently signaled, as a population, the side-of-figure for borders of single faces, occluding faces and, with higher latencies, even stimuli with illusory contours, such as Mooney faces and natural faces completely missing local edge information. In contrast, the local edge or the outlines of the face alone could not always evoke a significant border-ownership signal. Our results underscore that border ownership is coded by a population of cells, and indicate that these cells integrate a variety of cues, including low-level features and global object context, to compute the segmentation of the scene.
SIGNIFICANCE STATEMENT: To distinguish different objects in a natural scene, the brain must segment the image into regions corresponding to objects. The so-called "border-ownership" cells appear to be dedicated to this task, as they signal for a given edge on which side the object is that owns it. Here, we report that individual border-ownership cells are unreliable when tested across a battery of artificial stimuli used previously but can signal border-ownership consistently as a population. We show that these border-ownership population signals are also suited for signaling border-ownership for natural objects and at longer latency, even for stimuli without local edge information. Our results suggest that border-ownership cells integrate both local, low-level and global, high-level cues to segment the scene.
12
13
von der Heydt R. Figure-ground organization and the emergence of proto-objects in the visual cortex. Front Psychol 2015; 6:1695. [PMID: 26579062 PMCID: PMC4630502 DOI: 10.3389/fpsyg.2015.01695] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 08/04/2015] [Accepted: 10/20/2015] [Indexed: 11/13/2022] Open
Abstract
A long history of studies of perception has shown that the visual system organizes the incoming information early on, interpreting the 2D image in terms of a 3D world and producing a structure that provides perceptual continuity and enables object-based attention. Recordings from monkey visual cortex show that many neurons, especially in area V2, are selective for border ownership. These neurons are edge selective and have ordinary classical receptive fields (CRF), but in addition their responses are modulated (enhanced or suppressed) depending on the location of a 'figure' relative to the edge in their receptive field. Each neuron has a fixed preference for location on one side or the other. This selectivity is derived from the image context far beyond the CRF. This paper reviews evidence indicating that border ownership selectivity reflects the formation of early object representations ('proto-objects'). The evidence includes experiments showing (1) reversal of border ownership signals with change of perceived object structure, (2) border ownership specific enhancement of responses in object-based selective attention, (3) persistence of border ownership signals in accordance with continuity of object perception, and (4) remapping of border ownership signals across saccades and object movements. Findings 1 and 2 can be explained by hypothetical grouping circuits that sum contour feature signals in search of objectness, and, via recurrent projections, enhance the corresponding low-level feature signals. Findings 3 and 4 might be explained by assuming that the activity of grouping circuits persists and can be remapped. Grouping, persistence, and remapping are fundamental operations of vision. Finding these operations manifest in low-level visual areas challenges traditional views of visual processing. New computational models need to be developed for a comprehensive understanding of the function of the visual cortex.
|
14
|
Marquardt G, Cross ES, de Sousa AA, Edelstein E, Farnè A, Leszczynski M, Patterson M, Quadflieg S. There or not there? A multidisciplinary review and research agenda on the impact of transparent barriers on human perception, action, and social behavior. Front Psychol 2015; 6:1381. [PMID: 26441756 PMCID: PMC4569749 DOI: 10.3389/fpsyg.2015.01381] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/12/2014] [Accepted: 08/28/2015] [Indexed: 11/13/2022] Open
Abstract
Through advances in production and treatment technologies, transparent glass has become an increasingly versatile material and a global hallmark of modern architecture. In the shape of invisible barriers, it defines spaces while simultaneously shaping their lighting, noise, and climate conditions. Despite these unique architectural qualities, little is known regarding the human experience with glass barriers. Is a material that has been described as being simultaneously there and not there from an architectural perspective, actually there and/or not there from perceptual, behavioral, and social points of view? In this article, we review systematic observations and experimental studies that explore the impact of transparent barriers on human cognition and action. In doing so, the importance of empirical and multidisciplinary approaches to inform the use of glass in contemporary architecture is highlighted and key questions for future inquiry are identified.
Affiliation(s)
- Emily S. Cross
- School of Psychology, Bangor University, Bangor, UK
- Department of Social and Cultural Psychology, Behavioural Science Institute, Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
- Alexandra A. de Sousa
- Faculty of Psychology, School of Society, Enterprise, and Environment, Bath Spa University, Somerset, UK
- Eve Edelstein
- College of Architecture, Planning and Landscape Architecture, University of Arizona, Tucson, AZ, USA
- Alessandro Farnè
- ImpAct Team, Lyon Neuroscience Research Center, INSERM U1028, CNRS UMR5292, University Claude Bernard Lyon I, Lyon, France
- Miles Patterson
- Department of Psychology, University of Missouri–St. Louis, St. Louis, MO, USA
- Susanne Quadflieg
- School of Experimental Psychology, University of Bristol, Bristol, UK
- Division of Psychology, New York University Abu Dhabi, Abu Dhabi, UAE
|
15
|
Abstract
Neurons at early stages of the visual cortex signal elemental features, such as pieces of contour, but how these signals are organized into perceptual objects is unclear. Theories have proposed that spiking synchrony between these neurons encodes how features are grouped (binding-by-synchrony), but recent studies did not find the predicted increase in synchrony with binding. Here we propose that features are grouped to "proto-objects" by intrinsic feedback circuits that enhance the responses of the participating feature neurons. This hypothesis predicts synchrony exclusively between feature neurons that receive feedback from the same grouping circuit. We recorded from neurons in macaque visual cortex and used border-ownership selectivity, an intrinsic property of the neurons, to infer whether or not two neurons are part of the same grouping circuit. We found that binding produced synchrony between same-circuit neurons, but not between other pairs of neurons, as predicted by the grouping hypothesis. In a selective attention task, synchrony emerged with ignored as well as attended objects, and higher synchrony was associated with faster behavioral responses, as would be expected from early grouping mechanisms that provide the structure for object-based processing. Thus, synchrony could be produced by automatic activation of intrinsic grouping circuits. However, the binding-related elevation of synchrony was weak compared with its random fluctuations, arguing against synchrony as a code for binding. In contrast, feedback grouping circuits encode binding by modulating the response strength of related feature neurons. Thus, our results suggest a novel coding mechanism that might underlie the proto-objects of perception.
|
16
|
Layton OW, Yazdanbakhsh A. A neural model of border-ownership from kinetic occlusion. Vision Res 2014; 106:64-80. [PMID: 25448117 DOI: 10.1016/j.visres.2014.11.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 05/28/2014] [Revised: 10/29/2014] [Accepted: 11/04/2014] [Indexed: 11/19/2022]
Abstract
Camouflaged animals that have very similar textures to their surroundings are difficult to detect when stationary. However, when an animal moves, humans readily see a figure at a different depth than the background. How do humans perceive a figure breaking camouflage, even though the texture of the figure and its background may be statistically identical in luminance? We present a model that demonstrates how the primate visual system performs figure-ground segregation in extreme cases of breaking camouflage based on motion alone. Border-ownership signals develop as an emergent property in model V2 units whose receptive fields are nearby kinetically defined borders that separate the figure and background. Model simulations support border-ownership as a general mechanism by which the visual system performs figure-ground segregation, despite whether figure-ground boundaries are defined by luminance or motion contrast. The gradient of motion- and luminance-related border-ownership signals explains the perceived depth ordering of the foreground and background surfaces. Our model predicts that V2 neurons, which are sensitive to kinetic edges, are selective to border-ownership (magnocellular B cells). A distinct population of model V2 neurons is selective to border-ownership in figures defined by luminance contrast (parvocellular B cells). B cells in model V2 receive feedback from neurons in V4 and MT with larger receptive fields to bias border-ownership signals toward the figure. We predict that neurons in V4 and MT sensitive to kinetically defined figures play a crucial role in determining whether the foreground surface accretes, deletes, or produces a shearing motion with respect to the background.
Affiliation(s)
- Oliver W Layton
- Department of Cognitive Science, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180, USA; Center for Computational Neuroscience and Neural Technology, Boston University, 677 Beacon Street, Boston, MA 02215, USA
- Arash Yazdanbakhsh
- Center for Computational Neuroscience and Neural Technology, Boston University, 677 Beacon Street, Boston, MA 02215, USA.
|
17
|
Topographic representation of an occluded object and the effects of spatiotemporal context in human early visual areas. J Neurosci 2013; 33:16992-7007. [PMID: 24155304 DOI: 10.1523/jneurosci.1455-12.2013] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Indexed: 11/21/2022] Open
Abstract
Occlusion is a primary challenge facing the visual system in perceiving object shapes in intricate natural scenes. Although behavior, neurophysiological, and modeling studies have shown that occluded portions of objects may be completed at the early stage of visual processing, we have little knowledge on how and where in the human brain the completion is realized. Here, we provide functional magnetic resonance imaging (fMRI) evidence that the occluded portion of an object is indeed represented topographically in human V1 and V2. Specifically, we find the topographic cortical responses corresponding to the invisible object rotation in V1 and V2. Furthermore, by investigating neural responses for the occluded target rotation within precisely defined cortical subregions, we could dissociate the topographic neural representation of the occluded portion from other types of neural processing such as object edge processing. We further demonstrate that the early topographic representation in V1 can be modulated by prior knowledge of a whole appearance of an object obtained before partial occlusion. These findings suggest that primary "visual" area V1 has the ability to process not only visible or virtually (illusorily) perceived objects but also "invisible" portions of objects without concurrent visual sensation such as luminance enhancement to these portions. The results also suggest that low-level image features and higher preceding cognitive context are integrated into a unified topographic representation of occluded portion in early areas.
|
18
|
Perceptual separation of transparent motion components: the interaction of motion, luminance and shape cues. Exp Brain Res 2013; 230:71-86. [PMID: 23831850 DOI: 10.1007/s00221-013-3631-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2012] [Accepted: 06/20/2013] [Indexed: 10/26/2022]
Abstract
Transparency is perceived when two or more objects or surfaces can be separated by the visual system whilst they are presented in the same region of the visual field at the same time. This segmentation of distinct entities on the basis of overlapping local visual cues poses an interesting challenge for the understanding of cortical information processing. In psychophysical experiments, we studied stimuli that contained randomly positioned disc elements, moving at two different speeds in the same direction, to analyse the interaction of cues during the perception of motion transparency. The current work extends findings from previous experiments with sine wave luminance gratings which only vary in one spatial dimension. The reported experiments manipulate low-level cues, like differences in speed or luminance, and what are likely to be higher level cues such as the relative size of the elements or the superposition rules that govern overlapping regions. The mechanism responsible for separation appears to be mediated by combination of the relevant and available cues. Where perceived transparency is stronger, the neural representations of components are inferred to be more distinguishable from each other across what appear to be multiple cue dimensions. The disproportionally large effect on transparency strength of the type of superposition of disc suggests that with this manipulation, there may be enhanced separation above what might be expected from the linear combination of low-level cues in a process we term labelling. A mechanism for transparency perception consistent with the current results would require a minimum of three stages; in addition to the local motion detection and global pooling and separation of motion signals, findings suggest a powerful additional role of higher level separation cues.
|
19
|
Carnecky R, Fuchs R, Mehl S, Jang Y, Peikert R. Smart transparency for illustrative visualization of complex flow surfaces. IEEE Transactions on Visualization and Computer Graphics 2013; 19:838-851. [PMID: 22802119 DOI: 10.1109/tvcg.2012.159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]
Abstract
The perception of transparency and the underlying neural mechanisms have been subject to extensive research in the cognitive sciences. However, we have yet to develop visualization techniques that optimally convey the inner structure of complex transparent shapes. In this paper, we apply the findings of perception research to develop a novel illustrative rendering method that enhances surface transparency nonlocally. Rendering of transparent geometry is computationally expensive since many optimizations, such as visibility culling, are not applicable and fragments have to be sorted by depth for correct blending. In order to overcome these difficulties efficiently, we propose the illustration buffer. This novel data structure combines the ideas of the A and G-buffers to store a list of all surface layers for each pixel. A set of local and nonlocal operators is then used to process these depth-lists to generate the final image. Our technique is interactive on current graphics hardware and is only limited by the available graphics memory. Based on this framework, we present an efficient algorithm for a nonlocal transparency enhancement that creates expressive renderings of transparent surfaces. A controlled quantitative double blind user study shows that the presented approach improves the understanding of complex transparent surfaces significantly.
Affiliation(s)
- Robert Carnecky
- Computer Science Department, ETH Zurich, 8092 Zurich, Switzerland.
|
20
|
O'Herron P, von der Heydt R. Remapping of border ownership in the visual cortex. J Neurosci 2013; 33:1964-74. [PMID: 23365235 PMCID: PMC4086328 DOI: 10.1523/jneurosci.2797-12.2013] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 06/12/2012] [Revised: 09/17/2012] [Accepted: 10/22/2012] [Indexed: 11/21/2022] Open
Abstract
We see objects as having continuity although the retinal image changes frequently. How such continuity is achieved is hard to understand, because neurons in the visual cortex have small receptive fields that are fixed on the retina, which means that a different set of neurons is activated every time the eyes move. Neurons in areas V1 and V2 of the visual cortex signal the local features that are currently in their receptive fields and do not show "remapping" when the image moves. However, subsets of neurons in these areas also carry information about global aspects, such as figure-ground organization. Here we performed experiments to find out whether figure-ground organization is remapped. We recorded single neurons in macaque V1 and V2 in which figure-ground organization is represented by assignment of contours to regions (border ownership). We found previously that border-ownership signals persist when a figure edge is switched to an ambiguous edge by removing the context. We now used this paradigm to see whether border ownership transfers when the ambiguous edge is moved across the retina. In the new position, the edge activated a different set of neurons at a different location in cortex. We found that border ownership was transferred to the newly activated neurons. The transfer occurred whether the edge was moved by a saccade or by moving the visual display. Thus, although the contours are coded in retinal coordinates, their assignment to objects is maintained across movements of the retinal image.
Affiliation(s)
- Philip O'Herron
- Krieger Mind/Brain Institute and Department of Neuroscience, Johns Hopkins University, Baltimore, Maryland 21218, USA.
|
21
|
Affiliation(s)
- Jonathan R Williford
- Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD, USA
|
22
|
Sakai K, Nishimura H, Shimizu R, Kondo K. Consistent and robust determination of border ownership based on asymmetric surrounding contrast. Neural Netw 2012; 33:257-74. [DOI: 10.1016/j.neunet.2012.05.006] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Received: 07/18/2010] [Revised: 04/19/2012] [Accepted: 05/18/2012] [Indexed: 11/25/2022]
|
23
|
Wagemans J, Elder JH, Kubovy M, Palmer SE, Peterson MA, Singh M, von der Heydt R. A century of Gestalt psychology in visual perception: I. Perceptual grouping and figure-ground organization. Psychol Bull 2012; 138:1172-217. [PMID: 22845751 DOI: 10.1037/a0029333] [Citation(s) in RCA: 505] [Impact Index Per Article: 42.1] [Indexed: 11/08/2022]
Abstract
In 1912, Max Wertheimer published his paper on phi motion, widely recognized as the start of Gestalt psychology. Because of its continued relevance in modern psychology, this centennial anniversary is an excellent opportunity to take stock of what Gestalt psychology has offered and how it has changed since its inception. We first introduce the key findings and ideas in the Berlin school of Gestalt psychology, and then briefly sketch its development, rise, and fall. Next, we discuss its empirical and conceptual problems, and indicate how they are addressed in contemporary research on perceptual grouping and figure-ground organization. In particular, we review the principles of grouping, both classical (e.g., proximity, similarity, common fate, good continuation, closure, symmetry, parallelism) and new (e.g., synchrony, common region, element and uniform connectedness), and their role in contour integration and completion. We then review classic and new image-based principles of figure-ground organization, how it is influenced by past experience and attention, and how it relates to shape and depth perception. After an integrated review of the neural mechanisms involved in contour grouping, border ownership, and figure-ground perception, we conclude by evaluating what modern vision science has offered compared to traditional Gestalt psychology, whether we can speak of a Gestalt revival, and where the remaining limitations and challenges lie. A better integration of this research tradition with the rest of vision science requires further progress regarding the conceptual and theoretical foundations of the Gestalt approach, which is the focus of a second review article.
Affiliation(s)
- Johan Wagemans
- University of Leuven (KU Leuven), Laboratory of Experimental Psychology, Tiensestraat 102, Box 3711, BE-3000 Leuven, Belgium.
|
24
|
Dadam J, Albertazzi L, Canal L, Micciolo R. Amodal completion of boundaries in coloured surfaces. Psychologia 2012. [DOI: 10.2117/psysoc.2012.227] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
|
25
|
Zannoli M, Mamassian P. The role of transparency in da Vinci stereopsis. Vision Res 2011; 51:2186-97. [PMID: 21906614 DOI: 10.1016/j.visres.2011.08.014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/07/2011] [Revised: 08/12/2011] [Accepted: 08/16/2011] [Indexed: 10/17/2022]
Abstract
The majority of natural scenes contains zones that are visible to one eye only. Past studies have shown that these monocular regions can be seen at a precise depth even though there are no binocular disparities that uniquely constrain their locations in depth. In the so-called da Vinci stereopsis configuration, the monocular region is a vertical line placed next to a binocular rectangular occluder. The opacity of the occluder has been mentioned to be a necessary condition to obtain da Vinci stereopsis. However, this opacity constraint has never been empirically tested. In the present study, we tested whether da Vinci stereopsis and perceptual transparency can interact using a classical da Vinci configuration in which the opacity of the occluder varied. We used two different monocular objects: a line and a disk. We found no effect of the opacity of the occluder on the perceived depth of the monocular object. A careful analysis of the distribution of perceived depth revealed that the monocular object was perceived at a depth that increased with the distance between the object and the occluder. The analysis of the skewness of the distributions was not consistent with a double fusion explanation, favoring an implication of occlusion geometry in da Vinci stereopsis. A simple model that includes the geometry of the scene could account for the results. In summary, the mechanism responsible to locate monocular regions in depth is not sensitive to the material properties of objects, suggesting that da Vinci stereopsis is solved at relatively early stages of disparity processing.
Affiliation(s)
- Marina Zannoli
- Laboratoire Psychologie de la Perception (CNRS UMR 8158), Université Paris Descartes, France.
|
26
|
Lee K, Choo H. A critical review of selective attention: an interdisciplinary perspective. Artif Intell Rev 2011. [DOI: 10.1007/s10462-011-9278-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 11/29/2022]
|
27
|
Schmidt T, Haberkamp A, Veltkamp GM, Weber A, Seydell-Greenwald A, Schmidt F. Visual processing in rapid-chase systems: image processing, attention, and awareness. Front Psychol 2011; 2:169. [PMID: 21811484 PMCID: PMC3139957 DOI: 10.3389/fpsyg.2011.00169] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 03/02/2011] [Accepted: 07/06/2011] [Indexed: 11/13/2022] Open
Abstract
Visual stimuli can be classified so rapidly that their analysis may be based on a single sweep of feedforward processing through the visuomotor system. Behavioral criteria for feedforward processing can be evaluated in response priming tasks where speeded pointing or keypress responses are performed toward target stimuli which are preceded by prime stimuli. We apply this method to several classes of complex stimuli. (1) When participants classify natural images into animals or non-animals, the time course of their pointing responses indicates that prime and target signals remain strictly sequential throughout all processing stages, meeting stringent behavioral criteria for feedforward processing (rapid-chase criteria). (2) Such priming effects are boosted by selective visual attention for positions, shapes, and colors, in a way consistent with bottom-up enhancement of visuomotor processing, even when primes cannot be consciously identified. (3) Speeded processing of phobic images is observed in participants specifically fearful of spiders or snakes, suggesting enhancement of feedforward processing by long-term perceptual learning. (4) When the perceived brightness of primes in complex displays is altered by means of illumination or transparency illusions, priming effects in speeded keypress responses can systematically contradict subjective brightness judgments, such that one prime appears brighter than the other but activates motor responses as if it was darker. We propose that response priming captures the output of the first feedforward pass of visual signals through the visuomotor system, and that this output lacks some characteristic features of more elaborate, recurrent processing. This way, visuomotor measures may become dissociated from several aspects of conscious vision. We argue that "fast" visuomotor measures predominantly driven by feedforward processing should supplement "slow" psychophysical measures predominantly based on visual awareness.
Affiliation(s)
- Thomas Schmidt
- Faculty of Social Sciences, Psychology I, University of Kaiserslautern, Kaiserslautern, Germany
|
28
|
Su YR, He ZJ, Ooi TL. Boundary contour-based surface integration affected by color. Vision Res 2010; 50:1833-44. [PMID: 20558193 DOI: 10.1016/j.visres.2010.06.004] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 01/14/2010] [Revised: 06/07/2010] [Accepted: 06/08/2010] [Indexed: 11/19/2022]
Abstract
The visual system represents occluded surfaces by integrating the visible and partially occluded fragments with reliance on surface boundary contours. Does surface integration also depend on color similarity? Using displays with aligned images, we found the visual system has a preference to integrate images with the same color to form occluded surfaces and construct illusory occluding surfaces. This results in enhanced shape discrimination of briefly presented stimuli, and a tendency to perceive global motion of the integrated fragments. The contribution of color to surface integration is observed both in equiluminous setting and in non-equiluminous setting, where achromatic contrast exists.
Affiliation(s)
- Yong R Su
- Department of Basic Sciences, Pennsylvania College of Optometry at Salus University, Elkins Park, PA 19027, USA
|
29
|
Response priming driven by local contrast, not subjective brightness. Atten Percept Psychophys 2010; 72:1556-68. [DOI: 10.3758/app.72.6.1556] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/08/2022]
|
30
|
Analysis of the context integration mechanisms underlying figure-ground organization in the visual cortex. J Neurosci 2010; 30:6482-96. [PMID: 20463212 DOI: 10.1523/jneurosci.5168-09.2010] [Citation(s) in RCA: 83] [Impact Index Per Article: 5.9] [Indexed: 11/21/2022] Open
Abstract
Most neurons in visual cortex respond to contrast borders and are orientation selective, and some are also selective for which side of a border is figure and which side is ground ("border ownership coding"). These neurons are influenced by the image context far beyond the classical receptive field (CRF) and as early as 25 ms after the onset of activity in the cortex. The nature of the fast context integration mechanism is not well understood. What parts of a figure contribute to the context effect? What is the structure of the "extraclassical surround"? Is the context information propagated through horizontal fibers within cortex or through reciprocal connections via higher-level areas? To address these questions, we studied border ownership modulation with fragmented figures. Neurons were recorded in areas V1 and V2 of Macaca mulatta under behaviorally induced fixation. Test figures were fragmented rectangles. While one edge was centered on the CRF, the presence of the fragments outside the CRF was varied. The surround fragments produced facilitation on the preferred border ownership side as well as suppression on the nonpreferred side, with approximately 80% of the locations contributing on average. Fragments far from the CRF influenced the responses even in the absence of fragments closer to the CRF, and without the extra delay that would incur from propagation through horizontal fibers. Three principally different models are discussed. The results support a model in which the antagonistic surround influences are produced by reentrant signals from a higher-level area.
|
31
|
Peeling plaids apart: context counteracts cross-orientation contrast masking. PLoS One 2009; 4:e8123. [PMID: 19956546 PMCID: PMC2780729 DOI: 10.1371/journal.pone.0008123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2009] [Accepted: 09/25/2009] [Indexed: 11/19/2022] Open
Abstract
Background: Contrast discrimination for an image is usually harder if another image is superimposed on top. We asked whether such contrast masking may be enhanced or relieved depending on cues promoting integration of both images as a single pattern, versus segmentation into two independent components. Methodology & Principal Findings: Contrast discrimination thresholds for a foveal test grating were sharply elevated in the presence of a perfectly overlapping orthogonally-oriented mask grating. However, thresholds returned to the unmasked baseline when a surround grating was added, having the same orientation and phase of either the test or mask grating. Both such masking and ‘unmasking’ effects were much stronger for moving than static stimuli. Conclusions & Significance: Our results suggest that common-fate motion reinforces the perception of a single coherent plaid pattern, while the surround helps to identify each component independently, thus peeling the plaid apart again. These results challenge current models of early vision, suggesting that higher-level surface organization influences contrast encoding, determining whether the contrast of a grating may be recovered independently from that of its mask.
|
32
|
Xu JP, He ZJ, Ooi TL. Surface boundary contour strengthens image dominance in binocular competition. Vision Res 2009; 50:155-70. [PMID: 19913047 DOI: 10.1016/j.visres.2009.11.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 07/15/2009] [Revised: 11/02/2009] [Accepted: 11/09/2009] [Indexed: 11/28/2022]
Abstract
We used a binocular rivalry stimulus with one half-image having a vertical grating disk surrounded by horizontal grating, and the other half-image having a horizontal grating disk with a variable spatial phase relative to the surrounding horizontal grating. We found that increasing the phase-shift of the horizontal grating disk, which strengthens the boundary contour, progressively increases its predominance. But the predominance is little affected when a constant gray ring (boundary contour) is added onto the rim of the incrementally phase-shifted horizontal grating. This suggests the influence of boundary contour supersedes that of the center-surround-interaction caused by the phase-shift.
Affiliation(s)
- Jingping P Xu
- Department of Psychological and Brain Sciences, University of Louisville, Louisville, KY 40292, USA
|
33
|
Affiliation(s)
- Ken Nakayama
- Department of Psychology, Harvard University, Cambridge, MA 02138, USA.
|
34
|
O'Herron P, von der Heydt R. Short-term memory for figure-ground organization in the visual cortex. Neuron 2009; 61:801-9. [PMID: 19285475 DOI: 10.1016/j.neuron.2009.01.014] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Received: 07/25/2008] [Revised: 10/29/2008] [Accepted: 01/21/2009] [Indexed: 11/25/2022]
Abstract
Whether the visual system uses a buffer to store image information and the duration of that storage have been debated intensely in recent psychophysical studies. The long phases of stable perception of reversible figures suggest a memory that persists for seconds. But persistence of similar duration has not been found in signals of the visual cortex. Here, we show that figure-ground signals in the visual cortex can persist for a second or more after the removal of the figure-ground cues. When new figure-ground information is presented, the signals adjust rapidly, but when a figure display is changed to an ambiguous edge display, the signals decay slowly--a behavior that is characteristic of memory devices. Figure-ground signals represent the layout of objects in a scene, and we propose that a short-term memory for object layout is important in providing continuity of perception in the rapid stream of images flooding our eyes.
Affiliation(s)
- Philip O'Herron
- Krieger Mind/Brain Institute and Department of Neuroscience, Johns Hopkins University, 3400 N Charles Street, Baltimore, MD 21218, USA
|
35
|
Spillmann L. Phenomenology and neurophysiological correlations: two approaches to perception research. Vision Res 2009; 49:1507-21. [PMID: 19303897 DOI: 10.1016/j.visres.2009.02.022] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Received: 07/30/2008] [Revised: 12/09/2008] [Accepted: 02/05/2009] [Indexed: 10/21/2022]
Abstract
This article argues that phenomenological description and neurophysiological correlation complement each other in perception research. Whilst phenomena constitute the material, neuronal mechanisms are indispensable for their explanation. Numerous examples of neurophysiological correlates show that the correlation of phenomenology and neurophysiology is fruitful. Phenomena for which neuronal mechanisms have been found include: (in area V1) filling-in of real and artificial scotomata, contour integration, figure-ground segregation by orientation contrast, amodal completion, and motion transparency; (in V2) modal completion, border ownership, surface transparency, and cyclopean perception; (in V3) alignment in dotted contours, and filling-in with dynamic texture; (in V4) colour constancy; (in MT) shape by accretion/deletion, grouping by coherent motion, apparent motion in motion quartets, motion in apertures, and biological motion. Results suggest that in monkey visual cortex, occlusion cues, including stereo depth, are predominantly processed in lower areas, whereas mechanisms for grouping and motion are primarily represented in higher areas. More correlations are likely to emerge as neuroscientists strive for a better understanding of visual perception. The paper concludes with a review of major achievements in visual neuroscience pertinent to the study of the phenomena under consideration.
Affiliation(s)
- Lothar Spillmann
- Neurozentrum, Neurological Clinic, University Hospital, Freiburg, Germany.
|
36
|
Dong Y, Mihalas S, Qiu F, von der Heydt R, Niebur E. Synchrony and the binding problem in macaque visual cortex. J Vis 2008; 8:30.1-16. [PMID: 19146262 DOI: 10.1167/8.7.30] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.0] [Received: 10/17/2007] [Accepted: 02/27/2008] [Indexed: 11/24/2022] Open
Abstract
We tested the binding-by-synchrony hypothesis, which proposes that object representations are formed by synchronizing spike activity between neurons that code features of the same object. We studied responses of 32 pairs of neurons recorded with microelectrodes 3 mm apart in the visual cortex of macaques performing a fixation task. Upon mapping the receptive fields of the neurons, a quadrilateral was generated so that two of its sides were centered in the receptive fields at the optimal orientations. This one-figure condition was compared with a two-figure condition in which the neurons were stimulated by two separate figures, keeping the local edges in the receptive fields identical. For each neuron, we also determined its border ownership selectivity (H. Zhou, H. S. Friedman, & R. von der Heydt, 2000). We examined both synchronization and correlation at nonzero time lag. After correcting for effects of the firing rate, we found that synchrony did not depend on the binding condition. However, finding synchrony in a pair of neurons was correlated with finding border-ownership selectivity in both members of the pair. This suggests that the synchrony reflected the connectivity in the network that generates border ownership assignment. Thus, we have not found evidence to support the binding-by-synchrony hypothesis.
Affiliation(s)
- Yi Dong
- Zanvyl Krieger Mind/Brain Institute, and Department of Neuroscience, Johns Hopkins University, Baltimore, MD, USA.
|
37
|
Qiu FT, Sugihara T, von der Heydt R. Figure-ground mechanisms provide structure for selective attention. Nat Neurosci 2007; 10:1492-9. [PMID: 17922006 DOI: 10.1038/nn1989] [Citation(s) in RCA: 167] [Impact Index Per Article: 9.8] [Received: 07/23/2007] [Accepted: 09/04/2007] [Indexed: 11/09/2022]
Abstract
Attention depends on figure-ground organization: figures draw attention, whereas shapes of the ground tend to be ignored. Recent research has revealed mechanisms for figure-ground organization in the visual cortex, but how these mechanisms relate to the attention process remains unclear. Here we show that the influences of figure-ground organization and volitional (top-down) attention converge in single neurons of area V2 in Macaca mulatta. Although we found assignment of border ownership for attended and for ignored figures, attentional modulation was stronger when the attended figure was located on the neuron's preferred side of border ownership. When the border between two overlapping figures was placed in the receptive field, responses depended on the side of attention, and enhancement was generally found on the neuron's preferred side of border ownership. This correlation suggests that the neural network that creates figure-ground organization also provides the interface for the top-down selection process.
Affiliation(s)
- Fangtu T Qiu
- Krieger Mind/Brain Institute and Department of Neuroscience, Johns Hopkins University, 3400 North Charles Street, Baltimore, Maryland 21218, USA
|