1
Bun LM, Horwitz GD. Color and luminance processing in V1 complex cells and artificial neural networks. Color Research and Application 2023; 48:841-852. [PMID: 38145033] [PMCID: PMC10746296] [DOI: 10.1002/col.22903]
Abstract
Object recognition by natural and artificial visual systems benefits from the identification of object boundaries. A useful cue for the detection of object boundaries is the superposition of luminance and color edges. To gain insight into the suitability of this cue for object recognition, we examined convolutional neural network models that had been trained to recognize objects in natural images. We focused specifically on units in the second convolutional layer whose activations are invariant to the spatial phase of a sinusoidal grating. Some of these units were tuned for a nonlinear combination of color and luminance, which is broadly consistent with a role in object boundary detection. Others were tuned for luminance alone, but very few were tuned for color alone. A literature review reveals that V1 complex cells have a similar distribution of tuning. We speculate that this pattern of sensitivity provides an efficient basis for object recognition, perhaps by mitigating the effects of lighting on luminance contrast polarity. The absence of a contrast polarity-invariant representation of chromaticity alone suggests that it is redundant with other representations.
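The phase invariance singled out here is commonly captured by the textbook energy model of complex cells: a quadrature pair of Gabor filters whose squared outputs are summed. A minimal numerical sketch of that model (illustrative only; this is not the paper's trained network, and all parameter values are arbitrary):

```python
import numpy as np

def gabor(n, freq, phase):
    """1-D Gabor filter: a Gaussian-windowed sinusoid (freq in cycles/sample)."""
    x = np.arange(n) - n // 2
    sigma = n / 6
    return np.exp(-x ** 2 / (2 * sigma ** 2)) * np.cos(2 * np.pi * freq * x + phase)

def complex_cell(stimulus, freq=0.1):
    """Energy model: squared outputs of a quadrature (even/odd) Gabor pair are
    summed, yielding a response invariant to the grating's spatial phase."""
    n = len(stimulus)
    even = gabor(n, freq, 0.0)          # cosine-phase subunit
    odd = gabor(n, freq, np.pi / 2)     # sine-phase subunit
    return float(stimulus @ even) ** 2 + float(stimulus @ odd) ** 2

# Responses to a grating at the preferred frequency, across spatial phases:
x = np.arange(65) - 32
phases = np.linspace(0, 2 * np.pi, 8, endpoint=False)
energies = [complex_cell(np.cos(2 * np.pi * 0.1 * x + p)) for p in phases]
```

Summing the squared even- and odd-phase responses cancels the phase dependence that either subunit shows alone, which is the invariance used above to identify complex-cell-like units.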
Affiliation(s)
- Luke M. Bun
- Department of Bioengineering
- Washington National Primate Research Center
- Gregory D. Horwitz
- Department of Bioengineering
- Washington National Primate Research Center
- Department of Physiology and Biophysics, University of Washington, Seattle, WA, 98195
2
Peterson MA, Campbell ES. Backward masking implicates cortico-cortical recurrent processes in convex figure context effects and cortico-thalamic recurrent processes in resolving figure-ground ambiguity. Front Psychol 2023; 14:1243405. [PMID: 37809293] [PMCID: PMC10552270] [DOI: 10.3389/fpsyg.2023.1243405]
Abstract
Introduction: Previous experiments purportedly showed that image-based factors like convexity were sufficient for figure assignment. Recently, however, we found that the probability of perceiving a figure on the convex side of a central border was only slightly higher than chance for two-region displays and increased with the number of display regions; this increase was observed only when the concave regions were homogeneously colored. These convex figure context effects (CEs) revealed that figure assignment in these classic displays entails more than a response to local convexity. A Bayesian observer replicated the convex figure CEs using both a convexity object prior and a new, homogeneous background prior, and made the novel prediction that the classic displays in which both the convex and concave regions were homogeneous were ambiguous during perceptual organization.

Methods: Here, we report three experiments investigating the proposed ambiguity and examining how the convex figure CEs unfold over time, with an emphasis on whether they entail recurrent processing. Displays were shown for 100 ms followed by pattern masks after ISIs of 0, 50, or 100 ms. The masking conditions were designed to add noise to recurrent processing and therefore to delay the outcome of processes in which it plays a role. In Exp. 1, participants viewed two- and eight-region displays with homogeneous convex regions (homo-convex displays; the putatively ambiguous displays). In Exp. 2, participants viewed putatively unambiguous hetero-convex displays. In Exp. 3, displays and masks were presented to different eyes, thereby delaying mask interference in the thalamus by up to 100 ms.

Results and discussion: The results of Exps. 1 and 2 are consistent with the interpretation that recurrent processing is involved in generating the convex figure CEs and in resolving the ambiguity of homo-convex displays. The results of Exp. 3 suggested that corticofugal recurrent processing is involved in resolving the ambiguity of homo-convex displays, that cortico-cortical recurrent processes play a role in generating convex figure CEs, and that these two types of recurrent processes operate in parallel. Our results add to evidence that perceptual organization evolves dynamically and reveal that stimuli that seem unambiguous can be ambiguous during perceptual organization.
Affiliation(s)
- Mary A. Peterson
- Department of Psychology, University of Arizona, Tucson, AZ, United States
- Cognitive Science Program, University of Arizona, Tucson, AZ, United States
- Elizabeth Salvagio Campbell
- Department of Psychology, University of Arizona, Tucson, AZ, United States
- Cognitive Science Program, University of Arizona, Tucson, AZ, United States
- College of Medicine Tucson, University of Arizona, Tucson, AZ, United States
3
Segmenting surface boundaries using luminance cues. Sci Rep 2021; 11:10074. [PMID: 33980899] [PMCID: PMC8115076] [DOI: 10.1038/s41598-021-89277-2]
Abstract
Segmenting scenes into distinct surfaces is a basic visual perception task, and luminance differences between adjacent surfaces often provide an important segmentation cue. However, a mean luminance difference between two surfaces may arise not from a sharp change in albedo at their boundary but from a difference in the proportion of small light and dark areas (e.g., texture elements) within each surface; we refer to such a boundary as a luminance texture boundary. Here we investigate the performance of human observers segmenting luminance texture boundaries. We demonstrate that a simple model involving a single stage of filtering cannot explain observer performance unless it incorporates contrast normalization. In additional experiments in which observers segmented luminance texture boundaries while ignoring superimposed luminance step boundaries, we demonstrate that the one-stage model, even with contrast normalization, cannot explain performance. We then present a Filter–Rectify–Filter model positing two cascaded stages of filtering, which fits our data well and explains observers' ability to segment luminance texture boundary stimuli in the presence of interfering luminance step boundaries. We propose that such computations may be useful for boundary segmentation in natural scenes, where shadows often give rise to luminance step edges that do not correspond to surface boundaries.
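The one-stage versus Filter–Rectify–Filter contrast can be illustrated with a generic 1-D sketch (a hypothetical toy stimulus, not the authors' stimuli or fitted model): a fine sinusoidal carrier whose contrast steps at the midpoint has no sustained mean-luminance difference for a single filtering stage to detect, but a filter-rectify-filter cascade recovers the boundary.

```python
import numpy as np

def gaussian(sigma):
    x = np.arange(-4 * sigma, 4 * sigma + 1)
    g = np.exp(-x ** 2 / (2 * sigma ** 2))
    return g / g.sum()

n = 512
x = np.arange(n)
carrier = np.sin(2 * np.pi * 0.25 * x)       # fine texture carrier (period 4 samples)
contrast = np.where(x < n // 2, 0.2, 1.0)    # contrast step: the second-order boundary
stimulus = carrier * contrast                # mean luminance matches on both sides

# One-stage (first-order) analysis: a coarse luminance filter sees no sustained
# difference between the two regions (only a tiny transient at the step).
first_order = np.convolve(stimulus, gaussian(8), mode="same")

# Filter-Rectify-Filter: a filter tuned to the carrier, pointwise rectification
# (squaring), then a second, coarser filter.
stage1 = np.convolve(stimulus, carrier[:9] * np.hanning(9), mode="same")
envelope = np.convolve(stage1 ** 2, gaussian(8), mode="same")

# The second-order boundary now appears as a step in the envelope; its gradient
# peaks at the boundary (edge samples excluded to avoid padding artifacts).
boundary = 64 + int(np.argmax(np.abs(np.gradient(envelope[64:448]))))
```

The squaring between the two stages converts the contrast difference into a response-level difference that the second, coarser filter can localize; without it, the first stage averages the carrier away on both sides of the step.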
4
Abstract
Resting state functional MR imaging methods can provide localization of the language system; however, presurgical functional localization of the language system with task-based functional MR imaging is the current standard of care before resection of brain tumors. These methods provide similar results and comparing them could be helpful for presurgical planning. We combine information from 3 data resources to provide quantitative information on the components of the language system. Tables and figures compare anatomic information, localization information from resting state fMR imaging, and activation patterns in different components of the language system expected from commonly used task fMR imaging experiments.
5
Tang Q, Sang N, Liu H. Learning Nonclassical Receptive Field Modulation for Contour Detection. IEEE Transactions on Image Processing 2019; 29:1192-1203. [PMID: 31536000] [DOI: 10.1109/tip.2019.2940690]
Abstract
This work develops a biologically inspired neural network for contour detection in natural images by combining the nonclassical receptive field modulation mechanism with a deep learning framework. The input image is first convolved with the local feature detectors to produce the classical receptive field responses, and then a corresponding modulatory kernel is constructed for each feature map to model the nonclassical receptive field modulation behaviors. The modulatory effects can activate a larger cortical area and thus allow cortical neurons to integrate a broader range of visual information to recognize complex cases. Additionally, to characterize spatial structures at various scales, a multiresolution technique is used to represent visual field information from fine to coarse. Different scale responses are combined to estimate the contour probability. Our method achieves state-of-the-art results among all biologically inspired contour detection models. This study provides a method for improving visual modeling of contour detection and inspires new ideas for integrating more brain cognitive mechanisms into deep neural networks.
6
DiMattina C, Baker CL. Modeling second-order boundary perception: A machine learning approach. PLoS Comput Biol 2019; 15:e1006829. [PMID: 30883556] [PMCID: PMC6438569] [DOI: 10.1371/journal.pcbi.1006829]
Abstract
Visual pattern detection and discrimination are essential first steps for scene analysis. Numerous human psychophysical studies have modeled visual pattern detection and discrimination by estimating linear templates for classifying noisy stimuli defined by spatial variations in pixel intensities. However, such methods are poorly suited to understanding sensory processing mechanisms for complex visual stimuli such as second-order boundaries defined by spatial differences in contrast or texture. We introduce a novel machine learning framework for modeling human perception of second-order visual stimuli, using image-computable hierarchical neural network models fit directly to psychophysical trial data. This framework is applied to modeling visual processing of boundaries defined by differences in the contrast of a carrier texture pattern, in two different psychophysical tasks: (1) boundary orientation identification, and (2) fine orientation discrimination. Cross-validation analysis is employed to optimize model hyper-parameters, and demonstrate that these models are able to accurately predict human performance on novel stimulus sets not used for fitting model parameters. We find that, like the ideal observer, human observers take a region-based approach to the orientation identification task, while taking an edge-based approach to the fine orientation discrimination task. How observers integrate contrast modulation across orientation channels is investigated by fitting psychophysical data with two models representing competing hypotheses, revealing a preference for a model which combines multiple orientations at the earliest possible stage. Our results suggest that this machine learning approach has much potential to advance the study of second-order visual processing, and we outline future steps towards generalizing the method to modeling visual segmentation of natural texture boundaries. 
This study demonstrates how machine learning methodology can be fruitfully applied to psychophysical studies of second-order visual processing. Many naturally occurring visual boundaries are defined by spatial differences in features other than luminance, for example by differences in texture or contrast. Quantitative models of such “second-order” boundary perception cannot be estimated using the standard regression techniques (known as “classification images”) commonly applied to “first-order”, luminance-defined stimuli. Here we present a novel machine learning approach to modeling second-order boundary perception using hierarchical neural networks. In contrast to previous quantitative studies of second-order boundary perception, we directly estimate network model parameters using psychophysical trial data. We demonstrate that our method can reveal different spatial summation strategies that human observers utilize for different kinds of second-order boundary perception tasks, and can be used to compare competing hypotheses of how contrast modulation is integrated across orientation channels. We outline extensions of the methodology to other kinds of second-order boundaries, including those in natural images.
Affiliation(s)
- Christopher DiMattina
- Computational Perception Laboratory, Department of Psychology, Florida Gulf Coast University, Fort Myers, Florida, United States of America
- Curtis L. Baker
- McGill Vision Research Unit, Department of Ophthalmology, McGill University, Montreal, Quebec, Canada
7
Abstract
Dynamic image deformation produces the perception of a transparent material that appears to deform the background image by light refraction. Since past studies on this phenomenon have mainly used subjective judgment about the presence of a transparent layer, it remains unsolved whether this is a real perceptual transparency effect in the sense that it forms surface representations, as do conventional transparency effects. Visual computation for color and luminance transparency, induced mainly by surface-contour information, can be decomposed into two components: surface formation to determine foreground and background layers, and scission to assign color and luminance to each layer. Here we show that deformation-induced perceptual transparency aids surface formation by color transparency and consequently resolves color scission. We asked observers to report the color of the front layer in a spatial region with a neutral physical color. The layer color could be seen as either reddish or greenish depending on the spatial context producing the color transparency, which was, however, ambiguous about the order of layers. We found that adding to the display a deformation-induced transparency that could specify the front layer significantly biased color scission in the predicted way if and only if the deformation-induced transparency was spatially coincident with the interpretation of color transparency. The results indicate that deformation-induced transparency is indeed a novel type of perceptual transparency that plays a role in surface formation in cooperation with color transparency.
8
Luminance gradient at object borders communicates object location to the human oculomotor system. Sci Rep 2018; 8:1593. [PMID: 29371609] [PMCID: PMC5785482] [DOI: 10.1038/s41598-018-19464-1]
Abstract
The locations of objects in our environment constitute arguably the most important piece of information our visual system must convey to facilitate successful visually guided behaviour. However, the relevant objects are usually not point-like and do not have one unique location attribute. Relatively little is known about how the visual system represents the location of such large objects as visual processing is, both on neural and perceptual level, highly edge dominated. In this study, human observers made saccades to the centres of luminance defined squares (width 4 deg), which appeared at random locations (8 deg eccentricity). The phase structure of the square was manipulated such that the points of maximum luminance gradient at the square's edges shifted from trial to trial. The average saccade endpoints of all subjects followed those shifts in remarkable quantitative agreement. Further experiments showed that the shifts were caused by the edge manipulations, not by changes in luminance structure near the centre of the square or outside the square. We conclude that the human visual system programs saccades to large luminance defined square objects based on edge locations derived from the points of maximum luminance gradients at the square's edges.
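The edge landmark proposed here, the point of maximum luminance gradient, is straightforward to compute for a blurred 1-D edge profile (a hypothetical synthetic profile; the edge position and blur width are arbitrary illustration values):

```python
import numpy as np
from math import erf

# A synthetic 1-D luminance profile: a step edge at x = 100, Gaussian-blurred.
x = np.arange(200)
edge_position, blur = 100, 5.0
luminance = np.array(
    [0.5 * (1 + erf((xi - edge_position) / (blur * np.sqrt(2)))) for xi in x]
)

# The landmark the oculomotor system appears to use: the point of maximum
# luminance gradient along the profile.
gradient = np.gradient(luminance)
located = int(np.argmax(np.abs(gradient)))
```

For a symmetric blur the gradient peak coincides with the nominal edge position; manipulating the phase structure of the profile, as in the study, shifts this peak, and the reported saccade endpoints followed such shifts.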
9
Jennings BJ, Kingdom FAA. Chromatic blur perception in the presence of luminance contrast. Vision Res 2017; 135:34-42. [PMID: 28450052] [DOI: 10.1016/j.visres.2017.04.006]
Abstract
Hel-Or showed that blurring the chromatic but not the luminance layer of an image of a natural scene failed to elicit any impression of blur. Subsequent studies have suggested that this effect is due either to chromatic blur being masked by spatially contiguous luminance edges in the scene (Journal of Vision 13 (2013) 14), or to a relatively compressed transducer function for chromatic blur (Journal of Vision 15 (2015) 6). To test between the two explanations we conducted experiments using as stimuli both images of natural scenes as well as simple edges. First, we found that in color-and-luminance images of natural scenes more chromatic blur was needed to perceptually match a given level of blur in an isoluminant, i.e. colour-only scene. However, when the luminance layer in the scene was rotated relative to the chromatic layer, thus removing the colour-luminance edge correlations, the matched blur levels were near equal. Both results are consistent with Sharman et al.'s explanation. Second, when observers matched the blurs of luminance-only with isoluminant scenes, the matched blurs were equal, against Kingdom et al.'s prediction. Third, we measured the perceived blur in a square-wave as a function of (i) contrast (ii) number of luminance edges and (iii) the relative spatial phase between the colour and luminance edges. We found that the perceived chromatic blur was dependent on both relative phase and the number of luminance edges, or dependent on the luminance contrast if only a single edge is present. We conclude that this Hel-Or effect is largely due to masking of chromatic blur by spatially contiguous luminance edges.
Affiliation(s)
- Ben J Jennings
- McGill Vision Research, Department of Ophthalmology, Montreal General Hospital, McGill University, Montreal, Quebec, Canada.
- Frederick A A Kingdom
- McGill Vision Research, Department of Ophthalmology, Montreal General Hospital, McGill University, Montreal, Quebec, Canada
10
Mély DA, Kim J, McGill M, Guo Y, Serre T. A systematic comparison between visual cues for boundary detection. Vision Res 2016; 120:93-107. [PMID: 26748113] [DOI: 10.1016/j.visres.2015.11.007]
Abstract
The detection of object boundaries is a critical first step for many visual processing tasks. Multiple cues (we consider luminance, color, motion and binocular disparity) available in the early visual system may signal object boundaries but little is known about their relative diagnosticity and how to optimally combine them for boundary detection. This study thus aims at understanding how early visual processes inform boundary detection in natural scenes. We collected color binocular video sequences of natural scenes to construct a video database. Each scene was annotated with two full sets of ground-truth contours (one set limited to object boundaries and another set which included all edges). We implemented an integrated computational model of early vision that spans all considered cues, and then assessed their diagnosticity by training machine learning classifiers on individual channels. Color and luminance were found to be most diagnostic while stereo and motion were least. Combining all cues yielded a significant improvement in accuracy beyond that of any cue in isolation. Furthermore, the accuracy of individual cues was found to be a poor predictor of their unique contribution for the combination. This result suggested a complex interaction between cues, which we further quantified using regularization techniques. Our systematic assessment of the accuracy of early vision models for boundary detection together with the resulting annotated video dataset should provide a useful benchmark towards the development of higher-level models of visual processing.
Affiliation(s)
- David A Mély
- Brown University, Providence, RI 02912, United States; Department of Cognitive, Linguistic and Psychological Sciences, United States.
- Junkyung Kim
- Brown University, Providence, RI 02912, United States; Department of Cognitive, Linguistic and Psychological Sciences, United States.
- Mason McGill
- Brown University, Providence, RI 02912, United States; Department of Cognitive, Linguistic and Psychological Sciences, United States.
- Yuliang Guo
- Brown University, Providence, RI 02912, United States; Department of Engineering, United States.
- Thomas Serre
- Brown University, Providence, RI 02912, United States; Department of Cognitive, Linguistic and Psychological Sciences, United States; Brown Institute for Brain Science, United States.
11
HFS: Hierarchical Feature Selection for Efficient Image Segmentation. Computer Vision – ECCV 2016. [DOI: 10.1007/978-3-319-46487-9_53]
12
Mottaghi R, Fidler S, Yuille A, Urtasun R, Parikh D. Human-Machine CRFs for Identifying Bottlenecks in Scene Understanding. IEEE Transactions on Pattern Analysis and Machine Intelligence 2016; 38:74-87. [PMID: 26656579] [DOI: 10.1109/tpami.2015.2437377]
Abstract
Recent trends in image understanding have pushed for scene understanding models that jointly reason about various tasks such as object detection, scene recognition, shape analysis, contextual reasoning, and local appearance based classifiers. In this work, we are interested in understanding the roles of these different tasks in improved scene understanding, in particular semantic segmentation, object detection and scene recognition. Towards this goal, we "plug-in" human subjects for each of the various components in a conditional random field model. Comparisons among various hybrid human-machine CRFs give us indications of how much "head room" there is to improve scene understanding by focusing research efforts on various individual tasks.
13
Sharman RJ, McGraw PV, Peirce JW. Cue Combination of Conflicting Color and Luminance Edges. i-Perception 2015; 6:2041669515621215. [PMID: 27551364] [PMCID: PMC4975110] [DOI: 10.1177/2041669515621215]
Abstract
Abrupt changes in the color or luminance of a visual image potentially indicate object boundaries. Here, we consider how these cues to the visual "edge" location are combined when they conflict. We measured the extent to which localization of a compound edge can be predicted from a simple maximum likelihood estimation model using the reliability of chromatic (L-M) and luminance signals alone. Maximum likelihood estimation accurately predicted the pattern of results across a range of contrasts. Predictions consistently overestimated the relative influence of the luminance cue; although L-M is often considered a poor cue for localization, it was used more than expected. This need not indicate that the visual system is suboptimal but that its priors about which cue is more useful are not flat. This may be because, although strong changes in chromaticity typically represent object boundaries, changes in luminance can be caused by either a boundary or a shadow.
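The maximum likelihood estimation model referred to above has a standard closed form: each cue is weighted by its reliability (inverse variance), and the combined estimate is more reliable than either cue alone. A sketch of that textbook rule (not the authors' fitted parameters; the example numbers below are invented):

```python
import numpy as np

def mle_combine(estimates, sigmas):
    """Inverse-variance (reliability) weighted cue combination.

    Each cue gives a location estimate corrupted by Gaussian noise with
    standard deviation sigmas[i]; the maximum-likelihood combination
    weights each cue by 1/sigma_i**2, and its own standard deviation is
    smaller than that of any individual cue.
    """
    reliabilities = 1.0 / np.asarray(sigmas, dtype=float) ** 2
    weights = reliabilities / reliabilities.sum()
    combined = float(weights @ np.asarray(estimates, dtype=float))
    combined_sigma = float(np.sqrt(1.0 / reliabilities.sum()))
    return combined, combined_sigma

# Invented example: luminance edge at 0, chromatic edge at 10 (arbitrary units),
# with the luminance cue twice as precise as the chromatic cue.
location, sigma = mle_combine(estimates=[0.0, 10.0], sigmas=[1.0, 2.0])
```

With these numbers the combined location sits at 2.0, pulled four times as strongly toward the more reliable cue. The deviation reported above, the chromatic cue being used more than predicted, corresponds to giving L-M a larger weight than 1/sigma² alone would assign, i.e., a non-flat prior over cues.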
Affiliation(s)
- Paul V McGraw
- School of Psychology, University of Nottingham, University Park, UK
14
Tadin D. Suppressive mechanisms in visual motion processing: From perception to intelligence. Vision Res 2015; 115:58-70. [PMID: 26299386] [DOI: 10.1016/j.visres.2015.08.005]
Abstract
Perception operates on an immense amount of incoming information that greatly exceeds the brain's processing capacity. Because of this fundamental limitation, the ability to suppress irrelevant information is a key determinant of perceptual efficiency. Here, I will review a series of studies investigating suppressive mechanisms in visual motion processing, namely perceptual suppression of large, background-like motions. These spatial suppression mechanisms are adaptive, operating only when sensory inputs are sufficiently robust to guarantee visibility. Converging correlational and causal evidence links these behavioral results with inhibitory center-surround mechanisms, notably those in cortical area MT. Spatial suppression is abnormally weak in several special populations, including the elderly and individuals with schizophrenia, a deficit that is evidenced by better-than-normal direction discriminations of large moving stimuli. Theoretical work shows that this abnormal weakening of spatial suppression should result in motion segregation deficits, but direct behavioral support of this hypothesis is lacking. Finally, I will argue that the ability to suppress information is a fundamental neural process that applies not only to perception but also to cognition in general. Supporting this argument, I will discuss recent research that shows individual differences in spatial suppression of motion signals strongly predict individual variations in IQ scores.
Affiliation(s)
- Duje Tadin
- Department of Brain and Cognitive Sciences, University of Rochester, Rochester, NY 14627, USA; Center for Visual Science, University of Rochester, Rochester, NY 14627, USA; Department of Ophthalmology, University of Rochester School of Medicine, Rochester, NY 14642, USA.
15
Yang KF, Li CY, Li YJ. Multifeature-based surround inhibition improves contour detection in natural images. IEEE Transactions on Image Processing 2014; 23:5020-5032. [PMID: 25291794] [DOI: 10.1109/tip.2014.2361210]
Abstract
To perform visual tasks like contour detection effectively, the visual system normally needs to integrate multiple visual features. Physiological studies have revealed that, for a large number of neurons in the primary visual cortex (V1) of monkeys and cats, responses elicited by stimuli placed within the classical receptive field (CRF) are substantially modulated, normally inhibited, when the CRF and its surround (the non-CRF) differ along various local features. The exquisite sensitivity of V1 neurons to the center-surround stimulus configuration is thought to serve important perceptual functions, including contour detection. In this paper, we propose a biologically motivated model to improve the performance of perceptually salient contour detection. The main contribution is a multifeature-based center-surround framework in which the surround inhibition weights of individual features, including orientation, luminance, and luminance contrast, are combined according to a scale-guided strategy, and the combined weights are then used to modulate the final surround inhibition of the neurons. The performance was compared with that of single-cue-based models and other existing methods (especially other biologically motivated ones). The results show that combining multiple cues substantially improves contour detection compared with models using a single cue. In general, luminance and luminance contrast contribute much more than orientation to the specific task of contour extraction, at least in gray-scale natural images.
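The center-surround logic described above can be caricatured in one dimension (a toy sketch of generic surround inhibition, not the paper's multifeature, scale-guided model; all names and parameter values are invented): responses are suppressed where the surround carries the same feature energy as the center, so edges inside dense texture are inhibited while isolated contours and texture-region borders survive.

```python
import numpy as np

def surround_inhibition(feature_map, center_sigma=1.0, surround_sigma=4.0):
    """Toy 1-D surround inhibition: suppress feature responses wherever the
    surround carries the same feature energy as the center (texture interior),
    sparing locations where center and surround differ (isolated contours,
    texture-region borders)."""
    def smooth(v, sigma):
        x = np.arange(-4 * int(sigma), 4 * int(sigma) + 1)
        g = np.exp(-x ** 2 / (2 * sigma ** 2))
        return np.convolve(v, g / g.sum(), mode="same")

    center = smooth(feature_map, center_sigma)
    surround = smooth(feature_map, surround_sigma)
    inhibition = np.minimum(center, surround)  # energy shared by center and surround
    return np.maximum(feature_map - inhibition, 0.0)

# Invented test pattern: a dense block of texture edges plus one isolated contour.
edges = np.zeros(200)
edges[40:80] = 1.0   # cluttered texture interior
edges[140] = 1.0     # isolated contour
out = surround_inhibition(edges)
```

Taking the minimum of center and surround energy as the inhibition term is one crude way to express "suppress only what the surround shares with the center"; the paper instead combines feature-specific inhibition weights across orientation, luminance, and contrast.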
16
Abstract
The spatial resolution of disparity perception is poor compared to luminance perception, yet we do not notice that depth edges are more blurry than luminance edges. Is this because the two cues are combined by the visual system? Subjects judged the locations of depth-defined or luminance-defined edges, which were separated by up to 5.6 min of arc. The perceived edge location was a function of the depth-defined edge and the luminance-defined edge, with the luminance edge tending to play a larger role. Our data are compatible with but not completely explained by an optimal cue-combination model that gives more reliable cues a heavier weight. Both edge cues (depth and luminance) contribute to the final percept, with an adaptive weighting depending on the task and the acuity with which each cue is perceived.
Affiliation(s)
- Alan E Robinson
- Department of Psychology, University of California, San Diego, La Jolla, CA, USA.
17
Seeing and hearing a word: combining eye and ear is more efficient than combining the parts of a word. PLoS One 2013; 8:e64803. [PMID: 23734220] [PMCID: PMC3667182] [DOI: 10.1371/journal.pone.0064803]
Abstract
To understand why human sensitivity for complex objects is so low, we study how word identification combines eye and ear or parts of a word (features, letters, syllables). Our observers identify printed and spoken words presented concurrently or separately. When researchers measure threshold (energy of the faintest visible or audible signal) they may report either sensitivity (one over the human threshold) or efficiency (ratio of the best possible threshold to the human threshold). When the best possible algorithm identifies an object (like a word) in noise, its threshold is independent of how many parts the object has. But, with human observers, efficiency depends on the task. In some tasks, human observers combine parts efficiently, needing hardly more energy to identify an object with more parts. In other tasks, they combine inefficiently, needing energy nearly proportional to the number of parts, over a 60∶1 range. Whether presented to eye or ear, efficiency for detecting a short sinusoid (tone or grating) with few features is a substantial 20%, while efficiency for identifying a word with many features is merely 1%. Why? We show that the low human sensitivity for words is a cost of combining their many parts. We report a dichotomy between inefficient combining of adjacent features and efficient combining across senses. Joining our results with a survey of the cue-combination literature reveals that cues combine efficiently only if they are perceived as aspects of the same object. Observers give different names to adjacent letters in a word, and combine them inefficiently. Observers give the same name to a word’s image and sound, and combine them efficiently. The brain’s machinery optimally combines only cues that are perceived as originating from the same object. Presumably such cues each find their own way through the brain to arrive at the same object representation.
Collapse
|
18
|
Hamilton-Fletcher G, Ward J. Representing Colour Through Hearing and Touch in Sensory Substitution Devices. Multisens Res 2013; 26:503-32. [PMID: 24800410 DOI: 10.1163/22134808-00002434] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
Abstract
Visual sensory substitution devices (SSDs) allow visually-deprived individuals to navigate and recognise the ‘visual world’; SSDs also provide opportunities for psychologists to study modality-independent theories of perception. At present, most research has focused on encoding greyscale vision. However, at the low spatial resolutions received by SSD users, colour information enhances object-ground segmentation and provides more stable cues for scene and object recognition. Many attempts have been made to encode colour information in tactile or auditory modalities, but many of these studies exist in isolation. This review brings together a wide variety of tactile and auditory approaches to representing colour. We examine how each device constructs ‘colour’ relative to veridical human colour perception and report previous experiments using these devices. Theoretical approaches to encoding and transferring colour information through sound or touch are discussed for future devices, covering alternative stimulation approaches, perceptually distinct dimensions and intuitive cross-modal correspondences.
Collapse
Affiliation(s)
| | - Jamie Ward
- School of Psychology and Sackler Centre for Consciousness Science, University of Sussex, UK
| |
Collapse
|
19
|
Barbot A, Landy MS, Carrasco M. Differential effects of exogenous and endogenous attention on second-order texture contrast sensitivity. J Vis 2012; 12:6. [PMID: 22895879 DOI: 10.1167/12/8/6] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
Abstract
The visual system can use a rich variety of contours to segment visual scenes into distinct perceptually coherent regions. However, successfully segmenting an image is a computationally expensive process. Previously we have shown that exogenous attention--the more automatic, stimulus-driven component of spatial attention--helps extract contours by enhancing contrast sensitivity for second-order, texture-defined patterns at the attended location, while reducing sensitivity at unattended locations, relative to a neutral condition. Interestingly, the effects of exogenous attention depended on the second-order spatial frequency of the stimulus. At parafoveal locations, attention enhanced second-order contrast sensitivity to relatively high, but not to low, second-order spatial frequencies. In the present study, we investigated whether endogenous attention--the more voluntary, conceptually driven component of spatial attention--affects second-order contrast sensitivity, and if so, whether its effects are similar to those of exogenous attention. To that end, we compared the effects of exogenous and endogenous attention on the sensitivity to second-order, orientation-defined texture patterns of either high or low second-order spatial frequencies. The results show that, like exogenous attention, endogenous attention enhances second-order contrast sensitivity at the attended location and reduces it at unattended locations. However, whereas the effects of exogenous attention are a function of the second-order spatial frequency content, endogenous attention affected second-order contrast sensitivity independent of the second-order spatial frequency content. This finding supports the notion that both exogenous and endogenous attention can affect second-order contrast sensitivity, but that endogenous attention is more flexible, benefiting performance under different conditions.
Collapse
Affiliation(s)
- Antoine Barbot
- Department of Psychology, New York University, New York, NY, USA.
| | | | | |
Collapse
|
20
|
Abstract
We examined the interaction between motion and stereo cues to depth order along object boundaries. Relative depth was conveyed by a change in the speed of image motion across a boundary (motion parallax), the disappearance of features on a surface moving behind an occluding object (motion occlusion), or a difference in the stereo disparity of adjacent surfaces. We compared the perceived depth orders for different combinations of cues, incorporating conditions with conflicting depth orders and conditions with varying reliability of the individual cues. We observed large differences in performance between subjects, ranging from those whose depth order judgments were driven largely by the stereo disparity cues to those whose judgments were dominated by motion occlusion. The relative strength of these cues influenced individual subjects' behavior in conditions of cue conflict and reduced reliability.
Collapse
|
21
|
Saarela TP, Landy MS. Combination of texture and color cues in visual segmentation. Vision Res 2012; 58:59-67. [PMID: 22387319 DOI: 10.1016/j.visres.2012.01.019] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/08/2011] [Revised: 01/16/2012] [Accepted: 01/24/2012] [Indexed: 11/29/2022]
Abstract
The visual system can use various cues to segment the visual scene into figure and background. We studied how human observers combine two of these cues, texture and color, in visual segmentation. In our task, the observers identified the orientation of an edge that was defined by a texture difference, a color difference, or both (cue combination). In a fourth condition, both texture and color information were available, but the texture and color edges were not spatially aligned (cue conflict). Performance markedly improved when the edges were defined by two cues, compared to the single-cue conditions. Observers only benefited from the two cues, however, when they were spatially aligned. A simple signal-detection model that incorporates interactions between texture and color processing accounts for the performance in all conditions. In a second experiment, we studied whether the observers are able to ignore a task-irrelevant cue in the segmentation task or whether it interferes with performance. Observers identified the orientation of an edge defined by one cue and were instructed to ignore the other cue. Three types of trial were intermixed: neutral trials, in which the second cue was absent; congruent trials, in which the second cue signaled the same edge as the target cue; and conflict trials, in which the second cue signaled an edge orthogonal to the target cue. Performance improved when the second cue was congruent with the target cue. Performance was impaired when the second cue was in conflict with the target cue, indicating that observers could not discount the second cue. We conclude that texture and color are not processed independently in visual segmentation.
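The paper's own model includes interactions between texture and color processing; the standard no-interaction baseline it is measured against is quadratic summation of independent cues in signal detection theory. A minimal sketch of that baseline (our notation, not the authors' fitted model):

```python
import math

def combined_dprime(d_texture: float, d_color: float) -> float:
    """Optimal combination of two statistically independent cues:
    sensitivities (d') add in quadrature. Observed performance above
    this baseline would indicate interaction between the cues."""
    return math.sqrt(d_texture ** 2 + d_color ** 2)

# Two equally informative cues improve d' by a factor of sqrt(2), not 2:
print(combined_dprime(1.0, 1.0))
```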
Collapse
Affiliation(s)
- Toni P Saarela
- Department of Psychology and Center for Neural Science, New York University, New York, NY, USA
| | | |
Collapse
|
22
|
|
23
|
Barbot A, Landy MS, Carrasco M. Exogenous attention enhances 2nd-order contrast sensitivity. Vision Res 2011; 51:1086-98. [PMID: 21356228 DOI: 10.1016/j.visres.2011.02.022] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2010] [Revised: 02/18/2011] [Accepted: 02/23/2011] [Indexed: 01/02/2023]
Abstract
Natural scenes contain a rich variety of contours that the visual system extracts to segregate the retinal image into perceptually coherent regions. Covert spatial attention helps extract contours by enhancing contrast sensitivity for 1st-order, luminance-defined patterns at attended locations, while reducing sensitivity at unattended locations, relative to neutral attention allocation. However, humans are also sensitive to 2nd-order patterns such as spatial variations of texture, which are predominant in natural scenes and cannot be detected by linear mechanisms. We assess whether and how exogenous attention--the involuntary and transient capture of spatial attention--affects the contrast sensitivity of channels sensitive to 2nd-order, texture-defined patterns. Using 2nd-order, texture-defined stimuli, we demonstrate that exogenous attention increases 2nd-order contrast sensitivity at the attended location, while decreasing it at unattended locations, relative to a neutral condition. By manipulating both 1st- and 2nd-order spatial frequency, we find that the effects of attention depend both on 2nd-order spatial frequency of the stimulus and the observer's 2nd-order spatial resolution at the target location. At parafoveal locations, attention enhances 2nd-order contrast sensitivity to high, but not to low 2nd-order spatial frequencies; at peripheral locations attention also enhances sensitivity to low 2nd-order spatial frequencies. Control experiments rule out the possibility that these effects might be due to an increase in contrast sensitivity at the 1st-order stage of visual processing. Thus, exogenous attention affects 2nd-order contrast sensitivity at both attended and unattended locations.
Collapse
Affiliation(s)
- Antoine Barbot
- Department of Psychology, New York University, New York, NY 10003, United States.
| | | | | |
Collapse
|
24
|
Kida T, Tanaka E, Takeshima Y, Kakigi R. Neural representation of feature synergy. Neuroimage 2010; 55:669-80. [PMID: 21111826 DOI: 10.1016/j.neuroimage.2010.11.054] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2010] [Revised: 11/11/2010] [Accepted: 11/16/2010] [Indexed: 10/18/2022] Open
Abstract
Interactive, non-linear cooperation between different feature dimensions (feature synergy) has been studied in psychophysics, but its neural mechanism is unknown. The present study investigated the neural representation of feature synergy between two second-order visual features by combining electroencephalography (EEG) with signal detection theory (SDT). Two kinds of 27-by-27 arrays of Gabor patches were presented in random order: a reference stimulus, which had no segregated region, and a target stimulus, whose inner region differed in spatial frequency, orientation, or both from the surround. Subjects performed a Yes-No discrimination of whether the inner region was different from the surround while EEG signals were recorded from 62 locations. When the SDT measure showed feature synergy, EEG activity showed a long-lasting enhancement starting at 130 ms around the inferior temporal region. In contrast, no EEG modulation was observed when feature synergy was absent. Thus, our combined approach demonstrates that non-linear cooperation between different features is represented by neural activity starting at 130 ms post-stimulus in the ventral visual stream.
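The SDT sensitivity measure for a Yes-No discrimination like the one described is conventionally d' = z(hit rate) − z(false-alarm rate). A minimal sketch of that textbook formula (this is generic SDT, not the paper's specific analysis pipeline):

```python
from statistics import NormalDist

def dprime(hit_rate: float, false_alarm_rate: float) -> float:
    """Yes-No sensitivity: the difference of z-transformed hit and
    false-alarm rates. Rates must lie strictly between 0 and 1."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)

# Chance performance gives d' = 0; 84% hits with 16% false alarms
# gives a d' of roughly 2.
print(dprime(0.5, 0.5))
print(dprime(0.84, 0.16))
```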
Collapse
Affiliation(s)
- Tetsuo Kida
- Department of Integrative Physiology, National Institute for Physiological Sciences, Myodaiji, Okazaki, Japan.
| | | | | | | |
Collapse
|
25
|
Abstract
A fundamental goal of visual neuroscience is to identify the neural pathways representing different image features. It is widely argued that the early stages of these pathways represent linear features of the visual scene and that the nonlinearities necessary to represent complex visual patterns are introduced later in cortex. We tested this by comparing the responses of subcortical and cortical neurons to interference patterns constructed by summing sinusoidal gratings. Although a linear mechanism can detect the component gratings, a nonlinear mechanism is required to detect an interference pattern resulting from their sum. Consistent with in vitro retinal ganglion cell recordings, we found that interference patterns are represented subcortically by cat LGN Y-cells, but not X-cells. Linear and nonlinear tuning properties of LGN Y-cells were then characterized and compared quantitatively with those of cortical area 18 neurons responsive to interference patterns. This comparison revealed a high degree of similarity between the two neural populations, including the following: (1) the representation of similar spatial frequencies in both their linear and nonlinear responses, (2) comparable orientation selectivity for the high spatial frequency carrier of interference patterns, and (3) the same difference in their temporal frequency selectivity for drifting gratings versus the envelope of interference patterns. The present findings demonstrate that the nonlinear subcortical Y-cell pathway represents complex visual patterns and likely underlies cortical responses to interference patterns. We suggest that linear and nonlinear mechanisms important for encoding visual scenes emerge in parallel through distinct pathways originating at the retina.
Collapse
|
26
|
Electrophysiological correlates of figure–ground segregation directly reflect perceptual saliency. Vision Res 2010; 50:509-21. [DOI: 10.1016/j.visres.2009.12.013] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2009] [Revised: 10/27/2009] [Accepted: 12/29/2009] [Indexed: 11/20/2022]
|
27
|
Straube S, Fahle M. The electrophysiological correlate of saliency: Evidence from a figure-detection task. Brain Res 2010; 1307:89-102. [DOI: 10.1016/j.brainres.2009.10.043] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2009] [Revised: 10/12/2009] [Accepted: 10/16/2009] [Indexed: 11/24/2022]
|
28
|
Martín A, Barraza JF, Colombo EM. The effect of spatial layout on motion segmentation. Vision Res 2009; 49:1613-9. [PMID: 19336241 DOI: 10.1016/j.visres.2009.03.020] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/12/2008] [Revised: 03/18/2009] [Accepted: 03/24/2009] [Indexed: 11/27/2022]
Abstract
We present a series of experiments exploring the effect of the stimulus spatial configuration on speed discrimination and on two different types of segmentation, for random dot patterns. In the first experiment, we find that parsing the image decreases speed discrimination thresholds, as first shown by Verghese and Stone [Verghese, P., & Stone, L. (1997). Spatial layout affects speed discrimination threshold. Vision Research, 37(4), 397-406; Verghese, P., & Stone, L. S. (1996). Perceived visual speed constrained by image segmentation. Nature, 381, 161-163] for sinusoidal gratings. In the second experiment, we study how the spatial configuration affects a subject's ability to localize an illusory contour defined by two surfaces with different speeds. Results show that the speed difference necessary to localize the contour decreases as the stimulus patches are separated. The third experiment involves transparency. Our results show little or no effect for this condition. We explain the first and second experiments in the framework of the model of Bravo and Watamaniuk [Bravo, M., & Watamaniuk, S. (1995). Evidence for two speed signals: a coarse local signal for segregation and a precise global signal for discrimination. Vision Research, 35(12), 1691-1697], who proposed that motion computation consists of at least two stages: an initial computation of coarse local speeds followed by an integration stage. We propose that the more precise estimate of speed obtained from the integration stage is used to produce a new, refined segmentation of the image, perhaps through a feedback loop. Our data suggest that this third stage would not apply to the processing of transparency.
Collapse
Affiliation(s)
- Andrés Martín
- Departamento de Luminotecnia, Luz y Visión, FACET, Universidad Nacional de Tucumán, Av. Independencia 1800, San Miguel de Tucuman, Argentina.
| | | | | |
Collapse
|
29
|
Chung STL, Li RW, Levi DM. Learning to identify near-threshold luminance-defined and contrast-defined letters in observers with amblyopia. Vision Res 2008; 48:2739-50. [PMID: 18824189 PMCID: PMC2642955 DOI: 10.1016/j.visres.2008.09.009] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2008] [Revised: 09/03/2008] [Accepted: 09/08/2008] [Indexed: 11/24/2022]
Abstract
We assessed whether the sensitivity for identifying luminance-defined and contrast-defined letters improved with training in a group of amblyopic observers who had passed the critical period of development. In Experiment 1, we tracked the contrast threshold for identifying luminance-defined letters with training in a group of 11 amblyopic observers. Following training, six observers showed a reduction in thresholds, averaging 20%, for identifying luminance-defined letters. This improvement transferred extremely well to the untrained task of identifying contrast-defined letters (average improvement=38%) but did not transfer to an acuity measurement. Seven of the 11 observers were subsequently trained on identifying contrast-defined letters in Experiment 2. Following training, five of these seven observers demonstrated a further improvement, averaging 17%, for identifying contrast-defined letters. This improvement did not transfer to the untrained task of identifying luminance-defined letters. Our findings are consistent with predictions based on the locus of learning for first- and second-order stimuli according to the filter-rectifier-filter model of second-order visual processing.
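The filter-rectifier-filter model mentioned in the last sentence (often written filter-rectify-filter, FRF) can be sketched in one dimension. The kernels and signal below are arbitrary toy values chosen for illustration, not fitted model parameters:

```python
def convolve(signal, kernel):
    """Valid-mode 1-D convolution (no padding)."""
    n = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(n))
            for i in range(len(signal) - n + 1)]

def frf_response(signal, first_kernel, second_kernel):
    """Filter-rectify-filter cascade for second-order stimuli:
    a fine-scale linear filter, full-wave rectification, then a
    coarse-scale filter that recovers the contrast envelope."""
    stage1 = convolve(signal, first_kernel)       # first-stage linear filter
    rectified = [abs(x) for x in stage1]          # pointwise nonlinearity
    return convolve(rectified, second_kernel)     # second-stage filter

# A luminance-balanced carrier averages to zero under a purely linear
# mechanism but survives the rectification stage:
print(frf_response([1, -1, 1, -1], [1, -1], [1, 1]))  # [4, 4]
```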
Collapse
Affiliation(s)
- Susana T L Chung
- School of Optometry, University of California, 360 Minor Hall, Berkeley, CA 94720-2020, USA.
| | | | | |
Collapse
|
30
|
Fan Z, Harris J. Perceived spatial displacement of motion-defined contours in peripheral vision. Vision Res 2008; 48:2793-804. [PMID: 18824016 DOI: 10.1016/j.visres.2008.09.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2007] [Revised: 09/05/2008] [Accepted: 09/09/2008] [Indexed: 10/21/2022]
Abstract
The perceived displacement of motion-defined contours in peripheral vision was examined in four experiments. In Experiment 1, in line with Ramachandran and Anstis' finding [Ramachandran, V. S., & Anstis, S. M. (1990). Illusory displacement of equiluminous kinetic edges. Perception, 19, 611-616], the border between a field of drifting dots and a static dot pattern was apparently displaced in the same direction as the movement of the dots. When a uniform dark area was substituted for the static dots, a similar displacement was found, but it was smaller and statistically insignificant. In Experiment 2, the border between two fields of dots moving in opposite directions was displaced in the direction of motion of the dots in the more eccentric field, so that the location of a boundary defined by a diverging pattern is perceived as more eccentric, and that defined by a converging pattern as less eccentric. Two explanations for this effect (that the displacement reflects a greater weight given to the more eccentric motion, or that the region containing stronger centripetal motion components expands perceptually into that containing centrifugal motion) were tested in Experiment 3 by varying the velocity of the more eccentric region. The results favoured the explanation based on the expansion of an area in centripetal motion. Experiment 4 showed that the difference in perceived location was unlikely to be due to differences in the discriminability of contours in diverging and converging patterns, and confirmed that this effect is due to a difference between centripetal and centrifugal motion rather than to motion components in other directions. Our results provide new evidence for a bias towards centripetal motion in human vision and suggest that the direction of motion-induced displacement of edges is not always in the direction of an adjacent moving pattern.
Collapse
Affiliation(s)
- Zhao Fan
- School of Psychology and Clinical Language Sciences, University of Reading, Whiteknights, Reading RG6 6AL, UK.
| | | |
Collapse
|
31
|
Sofou A, Maragos P. Generalized flooding and Multicue PDE-based image segmentation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2008; 17:364-76. [PMID: 18270125 DOI: 10.1109/tip.2007.916156] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/16/2023]
Abstract
Image segmentation remains an important but hard-to-solve problem, since it is application-dependent and usually no a priori information is available about the image structure. Moreover, the increasing quality demands that image analysis tasks place on segmentation results make it necessary to employ multiple cues. In this paper, we incorporate cues such as intensity contrast, region size, and texture into the segmentation procedure and obtain improved results compared to using the individual cues separately. We emphasize the overall segmentation procedure and propose efficient simplification operators and feature extraction schemes capable of quantifying important characteristics, such as geometrical complexity, rate of change in local contrast variations, and orientation, that eventually favor the final segmentation result. Based on the well-known morphological paradigm of watershed transform segmentation, which exploits intensity contrast and region size criteria, we investigate its partial differential equation (PDE) formulation and extend it to satisfy various flooding criteria, thus making it applicable to a wider range of images. Going a step further, we introduce a segmentation scheme that couples contrast criteria in flooding with texture information. The proposed scheme is modeled via PDEs, and the available contrast and texture information is incorporated efficiently by selecting an appropriate cartoon-texture image decomposition. The resulting coupled segmentation scheme is driven by two separate image components: a cartoon component U (carrying contrast information) and a texture component V. Its performance is demonstrated through a complete set of experimental results and substantiated using quantitative and qualitative criteria.
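The watershed-by-flooding idea underlying this work can be illustrated with its simplest discrete relative: steepest-descent labeling on a 1-D height (e.g., gradient-magnitude) profile. This sketch is the classical morphological watershed, not the authors' PDE formulation, and the helper names are ours:

```python
def descend(heights, i):
    """Follow the steepest downhill neighbour until reaching a local minimum."""
    while True:
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < len(heights)]
        best = min(neighbors, key=lambda j: heights[j])
        if heights[best] < heights[i]:
            i = best
        else:
            return i

def watershed_1d(heights):
    """Label each position with the catchment basin (local minimum)
    it drains to; adjacent basins meet at watershed ridges."""
    basin_of_minimum = {}
    labels = []
    for i in range(len(heights)):
        m = descend(heights, i)
        if m not in basin_of_minimum:
            basin_of_minimum[m] = len(basin_of_minimum) + 1
        labels.append(basin_of_minimum[m])
    return labels

# A profile with two valleys splits into two regions; the central
# ridge is the boundary between them.
print(watershed_1d([3, 2, 1, 2, 3, 2, 1, 2, 3]))
```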
Collapse
Affiliation(s)
- Anastasia Sofou
- School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece.
| | | |
Collapse
|
32
|
Durant S, Zanker JM. Combining direction and speed for the localisation of visual motion defined contours. Vision Res 2008; 48:1053-60. [DOI: 10.1016/j.visres.2007.12.021] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2007] [Revised: 12/20/2007] [Accepted: 12/29/2007] [Indexed: 10/22/2022]
|
33
|
Crowding between first- and second-order letters in amblyopia. Vision Res 2008; 48:788-98. [PMID: 18241910 DOI: 10.1016/j.visres.2007.12.011] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2007] [Revised: 12/07/2007] [Accepted: 12/14/2007] [Indexed: 11/20/2022]
Abstract
To test whether first- and second-order stimuli are processed independently in amblyopic vision, we measured thresholds for identifying a target letter flanked by two letters for all combinations of first- and second-order targets and flankers. We found that (1) the magnitude of crowding is greater for second- than for first-order letters for target and flankers of the same order type; (2) substantial but asymmetric cross-over crowding occurs such that stronger crowding is found for a second-order letter flanked by first-order letters than for the converse; (3) the spatial extent of crowding is independent of the order type of the letters. Our findings are consistent with the hypothesis that crowding results from an abnormal integration of target and flankers beyond the stage of feature detection, which takes place over a large distance in amblyopic vision.
Collapse
|
34
|
Holcombe AO, Cavanagh P. Independent, synchronous access to color and motion features. Cognition 2008; 107:552-80. [PMID: 18206865 DOI: 10.1016/j.cognition.2007.11.006] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2007] [Revised: 10/22/2007] [Accepted: 11/05/2007] [Indexed: 11/29/2022]
Abstract
We investigated the role of attention in pairing superimposed visual features. When moving dots alternate in color and in motion direction, reports of the perceived color and motion reveal an asynchrony: the most accurate reports occur when the motion change precedes the associated color change by approximately 100 ms [Moutoussis, K., & Zeki, S. (1997). A direct demonstration of perceptual asynchrony in vision. Proceedings of the Royal Society of London B, 264, 393-399]. This feature-binding asynchrony was probed by manipulating endogenous and exogenous attention. First, endogenous attention was manipulated by changing which feature dimension observers were instructed to attend to first. This had little effect on the asynchrony. Second, exogenous attention was manipulated by briefly presenting a ring around the target, cueing the report of the color and motion seen within the ring. This reduced or eliminated the apparent latency difference between color and motion. Accuracy was best predicted by the timing of each feature relative to the cue rather than by the timing of the two features relative to each other, suggesting independent attentional access to the two features with an exogenous attention cue. The timing of attentional cueing affected feature-pairing reports as much as the timing of the features themselves.
Collapse
Affiliation(s)
- Alex O Holcombe
- School of Psychology, University of Sydney, Sydney, NSW 2006, Australia.
| | | |
Collapse
|
35
|
Wenderoth P. The role of implicit axes of bilateral symmetry in orientation processing. AUSTRALIAN JOURNAL OF PSYCHOLOGY 2007. [DOI: 10.1080/00049539708260463] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
|
36
|
Song Y, Baker CL. Neuronal response to texture- and contrast-defined boundaries in early visual cortex. Vis Neurosci 2007; 24:65-77. [PMID: 17430610 DOI: 10.1017/s0952523807070113] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2006] [Accepted: 01/24/2007] [Indexed: 11/06/2022]
Abstract
Natural scenes contain a variety of visual cues that facilitate boundary perception (e.g., luminance, contrast, and texture). Here we explore whether single neurons in early visual cortex can process both contrast and texture cues. We recorded neural responses in cat A18 to both illusory contours formed by abutting gratings (ICs, texture-defined) and contrast-modulated gratings (CMs, contrast-defined). We found that if a neuron responded to one of the two stimuli, it also responded to the other. These neurons signaled similar contour orientation, spatial frequency, and movement direction of the two stimuli. A given neuron also exhibited similar selectivity for spatial frequency of the fine, stationary grating components (carriers) of the stimuli. These results suggest that the cue-invariance of early cortical neurons extends to different kinds of texture or contrast cues, and might arise from a common nonlinear mechanism.
Collapse
Affiliation(s)
- Yuning Song
- McGill Vision Research Unit, Department of Ophthalmology, McGill University, Montréal, Québec, Canada
| | | |
Collapse
|
37
|
Chung STL, Li RW, Levi DM. Crowding between first- and second-order letter stimuli in normal foveal and peripheral vision. J Vis 2007; 7:10.1-13. [PMID: 18217825 DOI: 10.1167/7.2.10] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2006] [Accepted: 12/14/2007] [Indexed: 11/24/2022] Open
Abstract
Evidence that the detection of first- and second-order visual stimuli is processed by separate pathways abounds. This study asked whether first- and second-order stimuli remain independent at the stage of processing where crowding occurs. We measured thresholds for identifying a first-order (luminance defined) or second-order (contrast defined) target letter in the presence of two second- or first-order flanking letters. For comparison, we also measured thresholds when the target and flanking letters were all first or second order. Contrast of the flankers was 1.6 times their respective contrast thresholds. Measurements were obtained at the fovea and 10 degrees in the lower visual field of four normally sighted observers. Two observers were also tested at 10 degrees nasal visual field. As expected, in both the fovea and periphery, the magnitude of crowding (threshold elevation) was maximal at the closest letter separation and decreased as letter separation increased. The magnitude of crowding was greater for second- than for first-order target letters, independent of the order type of flankers; however, the critical distance for crowding was similar for first- and second-order letters. Substantial crossover crowding occurred when the target and flanking letters were of different order type. Our finding of substantial interaction between first- and second-order stimuli suggests that the processing of these stimuli is not independent at the stage of processing at which crowding occurs.
Collapse
Affiliation(s)
- Susana T L Chung
- College of Optometry & Center for Neuro-Engineering and Cognitive Science, University of Houston, Houston, TX, USA.
| | | | | |
Collapse
|
38
|
Meinhardt G, Persike M, Mesenholl B, Hagemann C. Cue combination in a combined feature contrast detection and figure identification task. Vision Res 2006; 46:3977-93. [PMID: 16962156 DOI: 10.1016/j.visres.2006.07.009] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2005] [Revised: 06/15/2006] [Accepted: 07/18/2006] [Indexed: 10/24/2022]
Abstract
Target figures defined by feature contrast in spatial frequency, orientation, or both cues had to be detected in Gabor random fields, and their shape had to be identified, in a dual-task paradigm. Performance improved with increasing feature contrast and was strongly correlated between the two tasks. Subjects performed significantly better with combined cues than with single cues. The improvement due to cue summation was stronger than predicted by the assumption of independent feature-specific mechanisms, and increased with the performance level achieved with single cues until it was limited by ceiling effects. Further, cue summation was also strongly correlated between tasks: when there was a benefit from the additional cue in feature contrast detection, there was also a benefit in figure identification. For the same performance level achieved with single cues, cue summation was generally larger in figure identification than in feature contrast detection, indicating a greater benefit when processes of shape and surface formation are involved. Our results suggest that cue combination improves spatial form completion and figure-ground segregation in noisy environments, and therefore leads to more stable object vision.
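The "independent feature-specific mechanisms" prediction that the authors compare against is commonly the probability-summation rule: with two independent detectors, a target is missed only if both mechanisms miss it. A minimal sketch of that generic rule (our formulation, not the paper's fitted model):

```python
def probability_summation(p_sf: float, p_ori: float) -> float:
    """Predicted detection probability for a target carrying both a
    spatial-frequency and an orientation cue, assuming independent
    detectors: the target is missed only when both mechanisms fail."""
    return 1.0 - (1.0 - p_sf) * (1.0 - p_ori)

# Cues detected 60% and 50% of the time alone predict about 80%
# together; summation beyond this baseline suggests the cues are
# not processed by independent mechanisms.
print(probability_summation(0.6, 0.5))
```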
Affiliation(s)
- Günter Meinhardt
- Johannes Gutenberg Universität, FB02, Department of Psychology, Methods Section, Staudinger Weg 9, Mainz, Germany.
39
Mysore SG, Vogels R, Raiguel SE, Orban GA. Processing of kinetic boundaries in macaque V4. J Neurophysiol 2005; 95:1864-80. [PMID: 16267116 DOI: 10.1152/jn.00627.2005] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
We used gratings and shapes defined by relative motion to study selectivity for static kinetic boundaries in macaque V4 neurons. Kinetic gratings were generated by random pixels moving in opposite directions in neighboring bars, either parallel to the orientation of the boundary (parallel kinetic grating) or perpendicular to it (orthogonal kinetic grating). Neurons were also tested with static, luminance-defined gratings to establish cue invariance. In addition, we used eight shapes defined either by relative motion or by luminance contrast, as used previously to test cue invariance in the infero-temporal (IT) cortex. A sizeable fraction (10-20%) of the V4 neurons responded selectively to kinetic patterns. Most neurons selective for kinetic contours had receptive fields (RFs) within the central 10 degrees of the visual field. Neurons selective for the orientation of kinetic gratings were defined as having similar orientation preferences for the two types of kinetic gratings, and the vast majority of these neurons also retained the same orientation preference for luminance-defined gratings. Likewise, kinetic-shape-selective neurons had similar shape preferences whether the shape was defined by relative motion or by luminance contrast, demonstrating cue-invariant form processing in V4. Although shape selectivity was weaker in V4 than has been reported in the IT cortex, cue invariance was similar in the two areas, suggesting that the invariance of IT for luminance and motion cues originates in V4. The neurons selective for kinetic patterns tended to be clustered within dorsal V4.
Affiliation(s)
- Santosh G Mysore
- Lab. voor Neuro- en Psychofysiologie, K.U.Leuven Medical School, Campus Gasthuisberg, Leuven B-3000, Belgium
40
Poirier FJAM, Frost BJ. Global orientation aftereffect in multi-attribute displays: implications for the binding problem. Vision Res 2005; 45:497-506. [PMID: 15610753 DOI: 10.1016/j.visres.2004.09.022] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2003] [Revised: 05/18/2004] [Indexed: 11/29/2022]
Abstract
We investigated the binding problem (e.g., the combination of edge information across attributes) using an orientation aftereffect (OAE) paradigm. Horizontal layers of vertical edges were phase-shifted to create a global near-vertical orientation. Multi-attribute displays were created by alternating the attribute defining the edges (e.g., luminance, colour, texture, or motion) across layers. OAE magnitude depended only on the attributes used in the adaptation phase; the similarity of attributes between the adaptation and testing phases had no significant effect. Moreover, compared with single-attribute conditions, the cooperation between attributes was moderate. These results favour segregation models of the binding mechanism.
Affiliation(s)
- Frédéric J A M Poirier
- Centre for Vision Research, Neurodynamics and Vision Lab, Computer Science and Engineering Building, Room B0002E, 4700 Keele Street, Toronto (ON), M3J 1P3, Canada.
41
Heron J, Whitaker D, McGraw PV. Sensory uncertainty governs the extent of audio-visual interaction. Vision Res 2005; 44:2875-84. [PMID: 15380993 DOI: 10.1016/j.visres.2004.07.001] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2004] [Revised: 06/23/2004] [Indexed: 11/17/2022]
Abstract
Auditory signals have been shown to exert a marked influence on visual perception in a wide range of tasks. However, the mechanisms of these interactions are, at present, poorly understood. Here we present a series of experiments where a temporal cue within the auditory domain can significantly affect the localisation of a moving visual target. To investigate the mechanism of this interaction, we first modulated the spatial positional uncertainty of the visual target by varying its size. When visual positional uncertainty was low (small target size), auditory signals had little or no influence on perceived visual location. However, with increasing visual uncertainty (larger target sizes), auditory signals exerted a significantly greater influence on perceived visual location. We then altered the temporal profile of the auditory signal by modulating the spread of its Gaussian temporal envelope. Introducing this temporal uncertainty to the auditory signal greatly reduced its effect on visual localisation judgements. These findings support the view that the relative uncertainty in individual sensory domains governs the perceptual outcome of multisensory integration.
Affiliation(s)
- J Heron
- Department of Optometry, University of Bradford, BD7 1DP, UK.
42
Abstract
Objects in the visual scene are defined by different cues such as colour and motion. Through the integration of these cues the visual system is able to utilize different sources of information, thus enhancing its ability to discriminate objects from their backgrounds. In the following experiments, we investigate the neural mechanisms of cue integration in the human. We show, using functional magnetic resonance imaging (fMRI), that both colour and motion defined shapes activate the lateral occipital complex (LOC) and that shapes defined by both colour and motion simultaneously activate the anterior-ventral margins of this area more strongly than shapes defined by either cue alone. This suggests that colour and motion cues are integrated in the LOC and possibly a neighbouring, more anterior, region. We support this result using an fMR adaptation technique, demonstrating that a region of the LOC adapts on repeated presentations of a shape regardless of the cue that is used to define it and even if the cue is varied. This result raises the possibility that the LOC contains cue-invariant neurons that respond to shapes regardless of the cue that is used to define them. We propose that such neurons could integrate signals from different cues, making them more responsive to objects defined by more than one cue, thus increasing the ability of the observer to recognize them.
Affiliation(s)
- Matthew W Self
- Anatomy Department, Wellcome Department of Imaging Neuroscience, University College London, Gower Street, London WC1E6BT, UK.
43
Meinhardt G, Schmidt M, Persike M, Röers B. Feature synergy depends on feature contrast and objecthood. Vision Res 2004; 44:1843-50. [PMID: 15145678 DOI: 10.1016/j.visres.2004.04.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/08/2003] [Revised: 03/30/2004] [Indexed: 10/26/2022]
Abstract
Pairs of texture figures, defined by contrast in spatial frequency, orientation, or both cues (redundant texture definition), had to be detected within a homogeneous Gabor field. In line with expectation, we find better detection performance for arrangements with higher feature contrast along the border where the figures abut. Redundantly defined figures show synergy: a significant performance increase compared with the prediction of independent processing of orientation and spatial frequency cues. As found in previous studies [Spatial Vision 16 (2003) 459; Vision Research (submitted for publication)], this performance advantage is negatively correlated with visibility. In particular, figures with high border feature contrast are easily detectable but show weak synergy, whereas figures with low border feature contrast are barely detectable but benefit remarkably from redundant texture definition. Closer analysis reveals that the form of the figures is also crucial: as long as they maintain a clear two-dimensional shape, the synergy effect is only marginally affected by variation of figure size and border length, but when they degrade to one-dimensional Gabor element arrays, synergy almost completely vanishes. The results imply that both factors, low visibility and objecthood, are critical for feature synergy. We conclude that facilitation across feature domains serves to segregate figure from ground when the signal from a single domain is too weak to enable object detection, and that it vanishes under conditions of stable object vision.
Affiliation(s)
- Günter Meinhardt
- Westf. Wilhelms Universität, FB07 Psychologie, Fliednerstr. 21, D-48149 Münster, Germany.
44
Martin DR, Fowlkes CC, Malik J. Learning to detect natural image boundaries using local brightness, color, and texture cues. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2004; 26:530-549. [PMID: 15460277 DOI: 10.1109/tpami.2004.1273918] [Citation(s) in RCA: 459] [Impact Index Per Article: 23.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/24/2023]
Abstract
The goal of this work is to accurately detect and localize boundaries in natural scenes using local image measurements. We formulate features that respond to characteristic changes in brightness, color, and texture associated with natural boundaries. In order to combine the information from these features in an optimal way, we train a classifier using human labeled images as ground truth. The output of this classifier provides the posterior probability of a boundary at each image location and orientation. We present precision-recall curves showing that the resulting detector significantly outperforms existing approaches. Our two main results are 1) that cue combination can be performed adequately with a simple linear model and 2) that a proper, explicit treatment of texture is required to detect boundaries in natural images.
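The paper's first main result, that a simple linear model suffices for cue combination, can be sketched as follows. This is a hedged illustration, not the trained detector from the paper: the feature values, weights, and threshold below are hypothetical.

```python
import math

def boundary_posterior(features, weights, bias):
    """Posterior probability of a boundary from a linear logistic
    combination of local cue responses (e.g. brightness, color,
    and texture gradients at one location and orientation)."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def precision_recall(posteriors, labels, threshold):
    """One point on a precision-recall curve: threshold the
    per-location posteriors and score them against 0/1
    human-labeled ground truth."""
    tp = fp = fn = 0
    for p, y in zip(posteriors, labels):
        if p >= threshold and y:
            tp += 1
        elif p >= threshold:
            fp += 1
        elif y:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall
```

Sweeping the threshold from 0 to 1 traces out the precision-recall curve the paper uses to compare detectors.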
Affiliation(s)
- David R Martin
- Computer Science Department, 460 Fulton Hall, Boston College, 140 Commonwealth Ave., Chestnut Hill, MA 02167, USA.
45
Roncato S, Casco C. The influence of contrast and spatial factors in the perceived shape of boundaries. ACTA ACUST UNITED AC 2003; 65:1252-72. [PMID: 14710960 DOI: 10.3758/bf03194850] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
When an edge can be perceived to continue either with a collinear edge of the opposite contrast polarity or with a noncollinear edge of the same contrast polarity, observers perceive an alignment between the edges of the same contrast polarity, even though they are noncollinear. Using several stimulus configurations and both free and tachistoscopic viewing, we studied the luminance and spatial factors affecting the perceived distortion and binding. The results showed that two noncollinear edges tended to align when they had the same contrast polarity (Experiment 1A) and to misalign when they had opposite contrast polarity (Experiment 2), provided that (1) they were separated by a distance larger than 1 arcmin and smaller than 3-4 arcmin (for all configurations) and (2) they laterally overlapped by about 7 arcmin (Experiment 1B). The results also showed that the direction of apparent distortion depended on the direction of overlap. The results of Experiment 3 ruled out a local attraction/repulsion explanation and instead supported the suggestion that the interaction involved the global edges, or parts of them, producing either an inward tilt, which made edges of the same contrast polarity appear aligned, or an outward tilt, which made edges of opposite contrast polarity appear more misaligned. From the overlap and distance limits found, it can be inferred that for two noncollinear contours to join perceptually, the tilt must not exceed 18 degrees, a limit compatible with the orientation bandwidth of contrast-sensitive early cortical mechanisms.
Affiliation(s)
- Sergio Roncato
- Dipartimento di Psicologia Generale, Università di Padova, Padova, Italy.
46
Ichikawa M, Saida S, Osa A, Munechika K. Integration of binocular disparity and monocular cues at near threshold level. Vision Res 2003; 43:2439-49. [PMID: 12972394 DOI: 10.1016/s0042-6989(03)00432-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
Abstract
We examined the dependency of the integration of multiple depth cues upon the combined cues and upon the consistency of depth information from different cues. For each observer, depth thresholds were measured by the use of stimuli in which different depth cues (motion parallax, binocular disparity, and monocular configuration) specified the surface undulating sinusoidally with different spatial frequencies and different phases. Analysis of d' showed that the performance was better than the prediction of probability summation only when parallax and disparity cues specified an undulation with the same spatial frequency and same phase. The probability summation model overestimated the performance for the other conditions of combination of disparity and parallax, and for all of the conditions of combination of disparity and monocular configuration. These results suggest that the improvement in depth perception caused by integration of multiple cues depends on the type of combined cues, and that the visual system possibly integrates the depth information from different cues at different stages of the visual processing.
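The probability-summation benchmark used in this d' analysis can be sketched as follows. This is a hedged illustration of the standard construction, not the authors' exact model: each single-cue d' is converted to a guessing-corrected detection probability, the probabilities are combined under independence, and the result is compared with combined-cue performance.

```python
import math

def pc_from_dprime(d):
    """Proportion correct in a two-alternative task from
    sensitivity d': Pc = Phi(d' / sqrt(2))."""
    return 0.5 * (1.0 + math.erf(d / 2.0))

def summation_prediction(d1, d2):
    """Combined-cue proportion correct predicted by probability
    summation over independent mechanisms, with a guessing
    correction for the 0.5 chance level."""
    q1 = 2.0 * pc_from_dprime(d1) - 1.0   # corrected detection prob, cue 1
    q2 = 2.0 * pc_from_dprime(d2) - 1.0   # corrected detection prob, cue 2
    q = 1.0 - (1.0 - q1) * (1.0 - q2)     # independent-detectors combination
    return 0.5 * (1.0 + q)
```

Observed combined-cue performance above `summation_prediction(d1, d2)` is the signature of genuine cue integration; performance at or below it is consistent with independent processing.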
Affiliation(s)
- Makoto Ichikawa
- Department of Perceptual Sciences and Design Engineering, Yamaguchi University, 2-16-1 Tokiwadai, Ube, Yamaguchi 755-8611, Japan.
47
Abstract
Visual binding of edge segments embedded in noise and created by luminance, motion, and disparity contrasts was studied in three experiments. The results showed that path formation was limited by the same rules across all attributes tested. The first experiment showed that binding could be accomplished with each attribute used in isolation. The second experiment showed that closed paths were easier to detect than open paths, irrespective of the attributes used to create the path elements. No additive effects were found in either Experiment 1 or 2 when the path elements were created with several attributes superimposed at the same positions, compared with when only one attribute was used along the path. In Experiment 3 it was found that when a second attribute was added between the positions of the first attribute along the path, so that two attributes alternated along the path, path-detection performance was better than expected from probability summation estimated from the single-attribute conditions. These results provide evidence for attribute-invariant Gestalt laws and offer clues about the underlying neural mechanisms.
Affiliation(s)
- Leo Poom
- Department of Psychology, Uppsala University, Box 1225, SE-751 42, Uppsala, Sweden.
48
Faubert J, Bellefeuille A. Aging effects on intra- and inter-attribute spatial frequency information for luminance, color, and working memory. Vision Res 2002; 42:369-78. [PMID: 11809488 DOI: 10.1016/s0042-6989(01)00292-9] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
Abstract
Visual working memory (VWM) for spatial frequency information was assessed in young and older observers. In the first experiment, we assessed the effect of a memory mask on a VWM task. We found no effect of the mask on retention for either the young or the older group. This argues against the 'inhibition' hypothesis of aging with regard to visual processing, which suggests that the elderly should have difficulty inhibiting irrelevant information. We conclude that the suggestion of an inefficient inhibition process in aging, derived from evidence obtained in higher-level WM tasks, cannot be generalized to the discrimination of basic patterns in VWM. The second experiment focused on processing resources within VWM by assessing VWM for intra-attribute (color or luminance) and inter-attribute (color and luminance) defined spatial frequency information. Results show that retention of spatial frequency information in VWM is robust for both the younger and the older group, regardless of the defining attribute. Thresholds were significantly higher in the inter-attribute condition, indicating increased processing demands for this task and suggesting that these attributes are initially processed in parallel. Older observers showed higher discrimination thresholds than young observers in all conditions, indicating a deficit in perceptual abilities rather than in VWM for basic stimuli. The threshold difference for the older group was largest in the inter-attribute condition, suggesting that older observers show more deficits on visual tasks with increased processing demands.
Affiliation(s)
- Jocelyn Faubert
- Département de psychologie et Ecole d'optométrie, Université de Montréal, 3744 Jean-Brillant, Montreal, Que., Canada H3C 1P1.
49
Landy MS, Kojima H. Ideal cue combination for localizing texture-defined edges. JOURNAL OF THE OPTICAL SOCIETY OF AMERICA. A, OPTICS, IMAGE SCIENCE, AND VISION 2001; 18:2307-20. [PMID: 11551065 DOI: 10.1364/josaa.18.002307] [Citation(s) in RCA: 80] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]
Abstract
Many visual tasks can be carried out by using several sources of information. The most accurate estimates of scene properties require the observer to utilize all available information and to combine the information sources in an optimal manner. Two experiments are described that required observers to judge the relative locations of two texture-defined edges (a vernier task). The edges were signaled by a change across the edge in two texture properties [either frequency and orientation (Experiment 1) or contrast and orientation (Experiment 2)]. The reliability of each cue was controlled by varying the distance over which the change (in frequency, orientation, or contrast) occurred, a kind of "texture blur." In some conditions, the position of the edge signaled by one cue was shifted relative to the other ("perturbation analysis"). An ideal-observer model, previously used in studies of depth perception and color constancy, was fitted to the data. Although the fit can be rejected relative to some more elaborate models, especially given the large quantity of data, this model accounts for most trends in the data. A second, suboptimal model that switches between the available cues from trial to trial does a poor job of accounting for the data.
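The core of the ideal-observer model, minimum-variance cue combination, can be sketched briefly. In this hedged illustration (numbers are made up, not fitted values from the paper), each cue's edge-location estimate is weighted in inverse proportion to its variance, which the experiments manipulated via texture blur.

```python
def ideal_combination(x1, var1, x2, var2):
    """Reliability-weighted (minimum-variance) combination of two
    noisy estimates of edge location; returns the combined estimate
    and its variance, which never exceeds the better single cue's."""
    r1, r2 = 1.0 / var1, 1.0 / var2   # reliabilities = inverse variances
    w1 = r1 / (r1 + r2)               # weight on cue 1
    x = w1 * x1 + (1.0 - w1) * x2
    return x, 1.0 / (r1 + r2)
```

For instance, cues at locations 0 and 1 with variances 1 and 4 combine to an estimate of 0.2 with variance 0.8, below either single-cue variance; the perturbation analysis in the paper measures exactly these weights.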
Affiliation(s)
- M S Landy
- Department of Psychology and Center for Neural Science, New York University, New York 10003, USA.
50
Abstract
A new visual phenomenon, inter-attribute illusory (completed) contours, is demonstrated. Contour completions are perceived between any combination of spatially separate pairs of inducing elements (Kanizsa-like 'pacman' figures) defined either by pictorial cues (luminance contrast or offset gratings), temporal contrast (motion, second-order motion, or 'phantom' contours), or binocular-disparity contrast. In a first experiment, observers reported the perceived occurrence of contour completion for all pair combinations of inducing elements. In a second experiment, they rated the perceived clarity of the completed contours. Both methods generated similar results: contour completions were perceived even when the inducing elements were defined by different attributes. Ratings of inter-attribute clarity were no lower than in either of the two corresponding intra-attribute conditions and appeared to be the average of those two ratings. The results provide evidence for the existence of attribute-invariant Gestalt processes and, on a mechanistic level, indicate that the completion process operates on attribute-invariant contour detectors.
Affiliation(s)
- L Poom
- Uppsala University, Department of Psychology, Sweden.