1. Stereoscopic depth constancy for physical objects and their virtual counterparts. J Vis 2022; 22:9. PMID: 35315875; PMCID: PMC8944385; DOI: 10.1167/jov.22.4.9.
Abstract
Stereopsis plays an important role in depth perception; if disparity is appropriately scaled for viewing distance, disparity-defined depth should not vary with distance. However, studies of stereoscopic depth constancy often report systematic distortions in depth judgments over distance, particularly for virtual stimuli. Our aim was to understand how depth estimation is affected by viewing distance and display-based cue conflicts by replicating physical objects as virtual counterparts. To this end, we measured perceived depth using virtual textured half-cylinders and identical three-dimensional (3D) printed versions at two viewing distances under monocular and binocular conditions. Virtual stimuli were viewed using a mirror stereoscope and an Oculus Rift head-mounted display (HMD), while physical stimuli were viewed in a controlled test environment. Depth judgments were similar in both virtual apparatuses, which suggests that variations in the viewing geometry and optics of the HMD have little impact on perceived depth. When viewing physical stimuli binocularly, judgments were accurate and exhibited stereoscopic depth constancy. However, in all cases, depth was underestimated for virtual stimuli and failed to achieve depth constancy. It is clear that depth constancy is complete only for cue-rich physical stimuli and that the failure of constancy in virtual stimuli is due to the presence of the vergence-accommodation conflict. Further, our post hoc analysis revealed that prior experience with virtual and physical environments had a strong effect on depth judgments. That is, performance in virtual environments was enhanced by limited exposure to a related task using physical objects.
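The geometric scaling these experiments probe can be written down directly: a small relative disparity d viewed at distance D with interocular separation I corresponds to a depth interval of roughly dD²/I, so constancy requires the visual system to rescale disparity by the square of viewing distance. A minimal numeric sketch (illustrative values, not code from the study):

```python
def depth_from_disparity(disparity_rad, distance_m, ipd_m=0.065):
    """Small-angle approximation: the physical depth interval implied by
    a relative horizontal disparity (radians) at a viewing distance."""
    return disparity_rad * distance_m ** 2 / ipd_m

# The same retinal disparity implies four times the depth at twice the
# distance, so unscaled disparity predicts a failure of depth constancy.
near = depth_from_disparity(0.001, 0.5)   # 1 mrad disparity at 50 cm
far = depth_from_disparity(0.001, 1.0)    # 1 mrad disparity at 100 cm
```

One common reading of the underestimation reported for virtual stimuli is that D is under-registered when accommodation conflicts with vergence, which compresses the recovered depth by this quadratic factor.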
2. Using Microsaccades to Estimate Task Difficulty During Visual Search of Layered Surfaces. IEEE Trans Vis Comput Graph 2020; 26:2904-2918. PMID: 30835226; DOI: 10.1109/tvcg.2019.2901881.
Abstract
We develop an approach to using microsaccade dynamics for the measurement of task difficulty/cognitive load imposed by a visual search task of a layered surface. Previous studies provide converging evidence that task difficulty/cognitive load can influence microsaccade activity. We corroborate this notion. Specifically, we explore this relationship during visual search for features embedded in a terrain-like surface, with the eyes allowed to move freely during the task. We make two relevant contributions. First, we validate an approach to distinguishing between the ambient and focal phases of visual search. We show that this spectrum of visual behavior can be quantified by a single previously reported estimator, known as Krejtz's K coefficient. Second, we use ambient/focal segments based on K as a moderating factor for microsaccade analysis in response to task difficulty. We find that during the focal phase of visual search (a) microsaccade magnitude increases significantly, and (b) microsaccade rate decreases significantly, with increased task difficulty. We conclude that the combined use of K and microsaccade analysis may be helpful in building effective tools that provide an indication of the level of cognitive activity within a task while the task is being performed.
3. A dataset of stereoscopic images and ground-truth disparity mimicking human fixations in peripersonal space. Sci Data 2017; 4:170034. PMID: 28350382; PMCID: PMC5369322; DOI: 10.1038/sdata.2017.34.
Abstract
Binocular stereopsis is the ability of a visual system, belonging to a live being or a machine, to interpret the different visual information deriving from two eyes/cameras for depth perception. From this perspective, the ground-truth information about three-dimensional visual space, which is hardly available, is an ideal tool both for evaluating human performance and for benchmarking machine vision algorithms. In the present work, we implemented a rendering methodology in which the camera pose mimics realistic eye pose for a fixating observer, thus including convergent eye geometry and cyclotorsion. The virtual environment we developed relies on highly accurate 3D virtual models, and its full controllability allows us to obtain the stereoscopic pairs together with the ground-truth depth and camera pose information. We thus created a stereoscopic dataset: GENUA PESTO-GENoa hUman Active fixation database: PEripersonal space STereoscopic images and grOund truth disparity. The dataset aims to provide a unified framework useful for a number of problems relevant to human and computer vision, from scene exploration and eye movement studies to 3D scene reconstruction.
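The fixating-observer camera geometry described here can be illustrated with a small helper that computes the vergence angle for two eyes fixating the same 3-D point. The eye layout and interocular distance below are illustrative assumptions, not the dataset's actual rendering parameters:

```python
import math

def vergence_angle(fixation, ipd=0.065):
    """Angle between the two lines of sight when eyes placed at
    (+/- ipd/2, 0, 0) both fixate the given 3-D point (metres)."""
    x, y, z = fixation
    left = (x + ipd / 2, y, z)    # fixation point relative to left eye
    right = (x - ipd / 2, y, z)   # fixation point relative to right eye
    dot = sum(l * r for l, r in zip(left, right))
    nl = math.sqrt(sum(c * c for c in left))
    nr = math.sqrt(sum(c * c for c in right))
    return math.acos(max(-1.0, min(1.0, dot / (nl * nr))))
```

For a point straight ahead at distance D this reduces to 2·atan(ipd/2D); nearer fixations in peripersonal space produce larger convergence, which is exactly the regime the dataset samples.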
4. Dominance of orientation over frequency in the perception of 3-D slant and shape. PLoS One 2013; 8:e64958. PMID: 23741436; PMCID: PMC3669012; DOI: 10.1371/journal.pone.0064958.
Abstract
In images of textured three-dimensional surfaces, pattern changes can be characterized as changes in orientation and spatial frequency, features for which neurons in primary visual cortex are classically selective. Previously, we have demonstrated that correct 3-D shape perception is contingent on the visibility of orientation flows that run parallel to the surface curvature. We sought to determine the relative contributions of orientation modulations (OMs) and frequency modulations (FMs) for the detection of slant and shape from 3-D surfaces. Results show that (1) when OM and FM indicated inconsistent degrees of surface slant or curvature, observer responses were consistent with the slant or curvature specified by OM, even if the FM indicated a slant or curvature of the same degree in the opposite direction; and (2) for slanted surfaces, OM information dictated slant perception at both shallow and steep slants, while FM information was effective only for steep slants. Together, these results point to a dominant role of OM information in the perception of 3-D slant and shape.
5. Effects of texture component orientation on orientation flow visibility for 3-D shape perception. PLoS One 2013; 8:e53556. PMID: 23301085; PMCID: PMC3536750; DOI: 10.1371/journal.pone.0053556.
Abstract
In images of textured 3-D surfaces, orientation flows created by the texture components parallel to the surface slant play a critical role in conveying the surface slant and shape. This study examines the visibility of these orientation flows in complex patterns. Specifically, we examine the effect of orientation of neighboring texture components on orientation flow visibility. Complex plaids consisting of gratings equally spaced in orientation were mapped onto planar and curved surfaces. The visibility of the component that creates the orientation flows was quantified by measuring its contrast threshold (CT) while varying the combination of neighboring components present in the pattern. CTs were consistently lowest only when components closest in orientation to that of the orientation flows were subtracted from the pattern. This finding suggests that a previously reported frequency-selective cross-orientation suppression mechanism involved with the perception of 3-D shape from texture is affected by proximity in orientation of concurrent texture components.
6. Representation of 3-D surface orientation by velocity and disparity gradient cues in area MT. J Neurophysiol 2012; 107:2109-22. PMID: 22219031; DOI: 10.1152/jn.00578.2011.
Abstract
Neural coding of the three-dimensional (3-D) orientation of planar surface patches may be an important intermediate step in constructing representations of complex 3-D surface structure. Spatial gradients of binocular disparity, image velocity, and texture provide potent cues to the 3-D orientation (tilt and slant) of planar surfaces. Previous studies have described neurons in both dorsal and ventral stream areas that are selective for surface tilt based on one or more of these gradient cues. However, relatively little is known about whether single neurons provide consistent information about surface orientation from multiple gradient cues. Moreover, it is unclear how neural responses to combinations of surface orientation cues are related to responses to the individual cues. We measured responses of middle temporal (MT) neurons to random dot stimuli that simulated planar surfaces at a variety of tilts and slants. Four cue conditions were tested: disparity, velocity, and texture gradients alone, as well as all three gradient cues combined. Many neurons showed robust tuning for surface tilt based on disparity and velocity gradients, with relatively little selectivity for texture gradients. Some neurons showed consistent tilt preferences for disparity and velocity cues, whereas others showed large discrepancies. Responses to the combined stimulus were generally well described as a weighted linear sum of responses to the individual cues, even when disparity and velocity preferences were discrepant. These findings suggest that area MT contains a rudimentary representation of 3-D surface orientation based on multiple cues, with single neurons implementing a simple cue integration rule.
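The descriptive model used here, a combined-cue response expressed as a weighted linear sum of single-cue responses, can be sketched with ordinary least squares. The tuning curves and weights below are synthetic stand-ins, not the recorded MT data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single-cue tuning curves (spikes/s) across 8 surface tilts.
r_disp = rng.uniform(5, 40, 8)
r_vel = rng.uniform(5, 40, 8)
r_tex = rng.uniform(5, 40, 8)

# Simulated combined-cue response generated by a known weighted sum
# plus a baseline offset.
true_w = np.array([0.6, 0.5, 0.1])
r_comb = true_w @ np.vstack([r_disp, r_vel, r_tex]) + 2.0

# Recover the cue weights and offset with ordinary least squares,
# mirroring the weighted-linear-sum description in the abstract.
X = np.column_stack([r_disp, r_vel, r_tex, np.ones(8)])
w, *_ = np.linalg.lstsq(X, r_comb, rcond=None)
```

With noiseless synthetic data the fit recovers the generating weights exactly; applied to real tuning curves, the relative sizes of the fitted weights quantify each cue's contribution even when the cues' tilt preferences disagree.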
7. Perceived slant of binocularly viewed large-scale surfaces: a common model from explicit and implicit measures. J Vis 2011; 10:13. PMID: 21188784; DOI: 10.1167/10.14.13.
Abstract
It is known that the perceived slants of large distal surfaces, such as hills, are exaggerated and that the exaggeration increases with distance. In a series of two experiments, we parametrically investigated the effect of viewing distance and slant on perceived slant using a high-fidelity virtual environment. An explicit numerical estimation method and an implicit aspect-ratio approach were separately used to assess the perceived optical slant of simulated large-scale surfaces with different slants and viewing distances while gaze direction was fixed. The results showed that perceived optical slant increased logarithmically with viewing distance and the increase was proportionally greater for shallow slants. At each viewing distance, perceived optical slant could be approximately fit by linear functions of actual slant that were parallel across distances. These linear functions demonstrated a fairly constant gain of about 1.5 and an intercept that increased logarithmically with distance. A comprehensive three-parameter model based on the present data provides a good fit to a number of previous empirical observations measured in real environments.
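The reported regularities suggest a compact functional form: perceived optical slant is linear in actual slant with a constant gain near 1.5 and an intercept that grows with the logarithm of viewing distance. A sketch with placeholder coefficients (k and b are illustrative, not the paper's fitted values):

```python
import math

def perceived_optical_slant(slant_deg, distance_m, gain=1.5, k=8.0, b=0.0):
    """Three-parameter form implied by the abstract: a linear function
    of actual slant whose gain is constant (~1.5) across distances and
    whose intercept increases logarithmically with viewing distance.
    k and b are hypothetical placeholders, not fitted values."""
    return gain * slant_deg + k * math.log(distance_m) + b
```

Two properties of the data fall out of this form: the slant-response lines at different distances are parallel (the distance term is additive), and the proportional exaggeration (perceived/actual) is larger for shallow slants, since a fixed intercept looms larger against a small baseline slant.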
8. Visualizing 3D objects from 2D cross sectional images displayed in-situ versus ex-situ. J Exp Psychol Appl 2010; 16:45-59. PMID: 20350043; DOI: 10.1037/a0018373.
Abstract
The present research investigates how mental visualization of a 3D object from 2D cross sectional images is influenced by displacing the images from the source object, as is customary in medical imaging. Three experiments were conducted to assess people's ability to integrate spatial information over a series of cross sectional images in order to visualize an object posed in 3D space. Participants used a hand-held tool to reveal a virtual rod as a sequence of cross-sectional images, which were displayed either directly in the space of exploration (in-situ) or displaced to a remote screen (ex-situ). They manipulated a response stylus to match the virtual rod's pitch (vertical slant), yaw (horizontal slant), or both. Consistent with the hypothesis that spatial colocation of image and source object facilitates mental visualization, we found that although single dimensions of slant were judged accurately with both displays, judging pitch and yaw simultaneously produced differences in systematic error between in-situ and ex-situ displays. Ex-situ imaging also exhibited errors such that the magnitude of the response was approximately correct but the direction was reversed. Regression analysis indicated that the in-situ judgments were primarily based on spatiotemporal visualization, while the ex-situ judgments relied on an ad hoc, screen-based heuristic. These findings suggest that in-situ displays may be useful in clinical practice by reducing error and facilitating the ability of radiologists to visualize 3D anatomy from cross sectional images.
9. Text scaffolds for effective surface labeling. IEEE Trans Vis Comput Graph 2008; 14:1675-1682. PMID: 18989025; DOI: 10.1109/tvcg.2008.168.
Abstract
In this paper we introduce a technique for applying textual labels to 3D surfaces. An effective labeling must balance the conflicting goals of conveying the shape of the surface while being legible from a range of viewing directions. Shape can be conveyed by placing the text as a texture directly on the surface, providing shape cues, meaningful landmarks and minimally obstructing the rest of the model. But rendering such surface text is problematic both in regions of high curvature, where text would be warped, and in highly occluded regions, where it would be hidden. Our approach achieves both labeling goals by applying surface labels to a 'text scaffold', a surface explicitly constructed to hold the labels. Text scaffolds conform to the underlying surface whenever possible, but can also float above problem regions, allowing them to be smooth while still conveying the overall shape. This paper provides methods for constructing scaffolds from a variety of input sources, including meshes, constructive solid geometry, and scalar fields. These sources are first mapped into a distance transform, which is then filtered and used to construct a new mesh on which labels are either manually or automatically placed. In the latter case, annotated regions of the input surface are associated with proximal regions on the new mesh, and labels placed using cartographic principles.
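The pipeline sketched in the abstract, mapping the input to a distance field, filtering it, and taking an offset level set as the scaffold, can be illustrated on a toy volume. The sphere, grid resolution, and 6-neighbour smoothing below are stand-ins for the paper's actual input sources and filter:

```python
import numpy as np

# Toy stand-in for the paper's pipeline: geometry -> distance field ->
# filtered field -> offset level set that "floats" above the surface.
n = 32
ax = np.linspace(-1, 1, n)
x, y, z = np.meshgrid(ax, ax, ax, indexing="ij")
dist = np.sqrt(x**2 + y**2 + z**2) - 0.5   # signed distance to a sphere

# Cheap 6-neighbour averaging stands in for the paper's filtering step,
# which smooths the field over problem regions (high curvature, occlusion).
smooth = dist.copy()
smooth[1:-1, 1:-1, 1:-1] = (
    dist[:-2, 1:-1, 1:-1] + dist[2:, 1:-1, 1:-1]
    + dist[1:-1, :-2, 1:-1] + dist[1:-1, 2:, 1:-1]
    + dist[1:-1, 1:-1, :-2] + dist[1:-1, 1:-1, 2:]
) / 6.0

# Scaffold voxels: near an isosurface offset above the input surface;
# a meshing step (e.g. marching cubes) would turn this band into the
# label-carrying surface.
offset = 0.1
scaffold = np.abs(smooth - offset) < (2.0 / n)
```

The offset level set conforms to the underlying shape where the field is well behaved but, after filtering, stays smooth over regions where text placed directly on the surface would warp or hide.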
10. Grid with a view: optimal texturing for perception of layered surface shape. IEEE Trans Vis Comput Graph 2007; 13:1656-1663. PMID: 17968122; DOI: 10.1109/tvcg.2007.70559.
Abstract
We present the results of two controlled studies comparing layered surface visualizations under various texture conditions. The task was to estimate surface normals, measured by accuracy of a hand-set surface normal probe. A single surface visualization was compared with the two-surfaces case under conditions of no texture and with projected grid textures. Variations in relative texture spacing on top and bottom surfaces were compared, as well as opacity of the top surface. Significant improvements are found for the textured cases over non-textured surfaces. Either larger or thinner top-surface textures, and lower top surface opacities are shown to give less bottom surface error. Top surface error appears to be highly resilient to changes in texture. Given the results we also present an example of how appropriate textures might be useful in volume visualization.
11.
Abstract
A novel illusion in apparent size is reported. We asked observers to estimate the width and depth of vertically oriented elliptic cylinders depicted with texture or luminance gradients (experiment 1), or the height of horizontally oriented elliptic cylinders depicted with binocular disparity (experiment 2). The estimated width or height of cylinders showed systematic shrinkage in the direction of the gradual depth change. The dissimilarity of 2-D appearance amongst our stimuli implies a large variation in spatial-frequency components and brightness contrasts, eliminating the possibility that these parameters contributed to the illusion. Also, the mechanism inappropriately triggered by pictorial depth cues (eg size scaling) may be irrelevant, because the illusion was obtained even when binocular disparity alone specified the shape of the cylinders. The illusion demonstrated here suggests that our visual system may determine the size of 3-D objects by accounting for their depth structures.
12. A neural model of 3D shape-from-texture: Multiple-scale filtering, boundary grouping, and surface filling-in. Vision Res 2007; 47:634-72. PMID: 17275061; DOI: 10.1016/j.visres.2006.10.024.
Abstract
A neural model is presented of how cortical areas V1, V2, and V4 interact to convert a textured 2D image into a representation of curved 3D shape. Two basic problems are solved to achieve this: (1) Patterns of spatially discrete 2D texture elements are transformed into a spatially smooth surface representation of 3D shape. (2) Changes in the statistical properties of texture elements across space induce the perceived 3D shape of this surface representation. This is achieved in the model through multiple-scale filtering of a 2D image, followed by a cooperative-competitive grouping network that coherently binds texture elements into boundary webs at the appropriate depths using a scale-to-depth map and a subsequent depth competition stage. These boundary webs then gate filling-in of surface lightness signals in order to form a smooth 3D surface percept. The model quantitatively simulates challenging psychophysical data about perception of prolate ellipsoids [Todd, J., & Akerstrom, R. (1987). Perception of three-dimensional form from patterns of optical texture. Journal of Experimental Psychology: Human Perception and Performance, 13(2), 242-255]. In particular, the model represents a high degree of 3D curvature for a certain class of images, all of whose texture elements have the same degree of optical compression, in accordance with percepts of human observers. Simulations of 3D percepts of an elliptical cylinder, a slanted plane, and a photo of a golf ball are also presented.
13. Texturing of layered surfaces for optimal viewing. IEEE Trans Vis Comput Graph 2006; 12:1125-32. PMID: 17080843; DOI: 10.1109/tvcg.2006.183.
Abstract
This paper is a contribution to the literature on perceptually optimal visualizations of layered three-dimensional surfaces. Specifically, we develop guidelines for generating texture patterns, which, when tiled on two overlapped surfaces, minimize confusion in depth-discrimination and maximize the ability to localize distinct features. We design a parameterized texture space and explore this texture space using a "human in the loop" experimental approach. Subjects are asked to rate their ability to identify Gaussian bumps on both upper and lower surfaces of noisy terrain fields. Their ratings direct a genetic algorithm, which selectively searches the texture parameter space to find fruitful areas. Data collected from these experiments are analyzed to determine what combinations of parameters work well and to develop texture generation guidelines. Data analysis methods include ANOVA, linear discriminant analysis, decision trees, and parallel coordinates. To confirm the guidelines, we conduct a post-analysis experiment, where subjects rate textures following our guidelines against textures violating the guidelines. Across all subjects, textures following the guidelines consistently produce high rated textures on an absolute scale, and are rated higher than those that did not follow the guidelines.
14. An approach to the perceptual optimization of complex visualizations. IEEE Trans Vis Comput Graph 2006; 12:509-21. PMID: 16805260; DOI: 10.1109/tvcg.2006.58.
Abstract
This paper proposes a new experimental framework within which evidence regarding the perceptual characteristics of a visualization method can be collected, and describes how this evidence can be explored to discover principles and insights to guide the design of perceptually near-optimal visualizations. We make the case that each of the current approaches for evaluating visualizations is limited in what it can tell us about optimal tuning and visual design. We go on to argue that our new approach is better suited to optimizing the kinds of complex visual displays that are commonly created in visualization. Our method uses human-in-the-loop experiments to selectively search through the parameter space of a visualization method, generating large databases of rated visualization solutions. Data mining is then used to extract results from the database, ranging from highly specific exemplar visualizations for a particular data set, to more broadly applicable guidelines for visualization design. We illustrate our approach using a recent study of optimal texturing for layered surfaces viewed in stereo and in motion. We show that a genetic algorithm is a valuable way of guiding the human-in-the-loop search through visualization parameter space. We also demonstrate several useful data mining methods including clustering, principal component analysis, neural networks, and statistical comparisons of functions of parameters.
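The human-in-the-loop search can be sketched as a plain genetic algorithm in which the fitness call is the subject's rating of a rendered candidate. Here a synthetic quadratic peak stands in for the human rater, and all parameters (population size, mutation scale, generation count) are illustrative, not the study's settings:

```python
import random

random.seed(1)

def rate(params):
    """Stand-in for the human rating step: in the experiments a subject
    scores the rendered visualization; here a synthetic peak at 0.7 in
    every parameter dimension replaces the human judgment."""
    return -sum((p - 0.7) ** 2 for p in params)

def evolve(pop_size=20, dims=4, generations=30):
    pop = [[random.random() for _ in range(dims)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=rate, reverse=True)
        parents = pop[: pop_size // 2]          # keep the top-rated half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            child = [random.choice(g) for g in zip(a, b)]   # uniform crossover
            i = random.randrange(dims)                       # mutate one gene
            child[i] = min(1.0, max(0.0, child[i] + random.gauss(0, 0.1)))
            children.append(child)
        pop = parents + children
    return max(pop, key=rate)

best = evolve()
```

Because the top-rated parents are carried over unchanged, the best rating never regresses between generations; the rated candidates accumulated along the way form exactly the kind of database the paper then mines for design guidelines.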
15. The effects of field of view on the perception of 3D slant from texture. Vision Res 2005; 45:1501-17. PMID: 15781069; DOI: 10.1016/j.visres.2005.01.003.
Abstract
Observers judged the apparent signs and magnitudes of surface slant from monocular textured images of convex or concave dihedral angles with fields of view varying between 5° and 60°. The results revealed that increasing the field of view or the regularity of the surface texture produced large increases in the magnitude of the perceptual gain (i.e., the judged slant divided by the ground truth). Additional regression analyses also revealed that observers' slant judgments were highly correlated with the range of texture densities (or spatial frequencies) in each display, which accounted for 96% of the variance among the different possible dihedral angles and fields of view.
16.
Abstract
The failure of shape constancy from stereoscopic information is widely reported in the literature. In this study we investigate how shape constancy is influenced by the size of the object and by the shape of the object's surface. Participants performed a shape-judgment task on objects of five sizes with three different surface shapes. The shapes used were: a frontoparallel rectangle, a triangular ridge surface, and a cylindrical surface, all of which contained the same maximum depth information, but different variations in depth across the surface. The results showed that, generally, small objects appear stretched and large objects appear squashed along the depth dimension. We also found a larger variance in shape judgments for rectangular stimuli than for cylindrical and ridge-shaped stimuli, suggesting that, when performing shape judgments with cylindrical and ridge-shaped stimuli, observers rely on a higher-order shape representation.
17. Conveying shape with texture: experimental investigations of texture's effects on shape categorization judgments. IEEE Trans Vis Comput Graph 2004; 10:471-483. PMID: 18579974; DOI: 10.1109/tvcg.2004.5.
Abstract
In this paper, we describe the results of two comprehensive controlled observer experiments intended to yield insight into the following question: If we could design the ideal texture pattern to apply to an arbitrary smoothly curving surface in order to enable its 3D shape to be most accurately and effectively perceived, what would the characteristics of that texture pattern be? We begin by reviewing the results of our initial study in this series, which were presented at the 2003 IEEE Symposium on Information Visualization, and offer an expanded analysis of those findings. We continue by presenting the results of a follow-on study in which we sought to more specifically investigate the separate and combined influences on shape perception of particular texture components, with the goal of obtaining a clearer view of their potential information carrying capacities. In each study, we investigated the observers' ability to identify the intrinsic shape category of a surface patch (elliptical, hyperbolic, cylindrical, or flat) and its extrinsic surface orientation (convex, concave, both, or neither). In our first study, we compared performance under eight different texture type conditions, plus two projection conditions (perspective or orthographic) and two viewing conditions (head-on or oblique). In this study, we found that: 1) Shape perception was better facilitated, in general, by the bidirectional "principal direction grid" pattern than by any of the seven other patterns tested; 2) shape type classification accuracy remained high under the orthographic projection condition for some texture types when the viewpoint was oblique; 3) perspective projection was required for accurate surface orientation classification; and 4) shape classification accuracy was higher when the surface patches were oriented at a (generic) oblique angle to the line of sight than when they were oriented (in a nongeneric pose) to face the viewpoint straight on. 
In our second study, we compared performance under eight new texture type conditions, redesigned to facilitate gathering insight into the cumulative effects of specific individual directional components in a wider variety of multidirectional texture patterns. In this follow-on study, we found that shape classification accuracy was equivalently good under a variety of test patterns that included components following either the first or first and second principal directions, in addition to other directions, suggesting that a principal direction grid texture is not the only possible "best option" for enhancing shape representation.
18.
Abstract
Most existing computational models of the visual perception of three-dimensional shape from texture are based on assumed constraints about how texture is distributed on visible surfaces. The research described in the present article was designed to investigate how violations of these assumptions influence human perception. Observers were presented with images of smoothly curved surfaces depicted with different types of texture, whose distribution of surface markings could be both anisotropic and inhomogeneous. Observers judged the pattern of ordinal depth on each object by marking local maxima and minima along designated scan lines. They also judged the apparent magnitudes of relative depth between designated probe points on the surface. The results revealed a high degree of accuracy and reliability in all conditions, except for a systematic underestimation of the overall magnitude of surface relief. These findings suggest that human perception of three-dimensional shape from texture is much more robust than would be reasonable to expect based on current computational models of this phenomenon.
19.
Abstract
We document the limitations of isotropic textures in conveying three-dimensional shape. We measured the perceived shape and pitch of upright and pitched corrugated surfaces overlaid with different classes of isotropic textures: patterns containing isotropic texture elements, isotropically filtered noise patterns, and patterns containing ellipses or lines of all orientations. Frequency modulations arising from surface slant were incorrectly interpreted as changes in surface distance, resulting in concavities being misclassified as convexities, and right and left slants as concavities. In addition, images of pitched surfaces exhibited oriented flows that confound surface shape and surface pitch. Observers related oriented flow patterns to particular surface shapes with a bias for perceiving convex surfaces. When concave and convex curvatures were concurrently visible, the number of correct shape classifications increased slightly. Isotropic textures thus convey correct 3-D shapes of developable surfaces only in some conditions, and the same perceptual strategies lead to non-veridical percepts in other conditions.
20.
Abstract
Texture can be an effective source of information for perception of slant and curvature. A computational assumption required for some texture cues is that the texture must be flat along the surface. Many textures violate this assumption and have some sort of texture relief: variations perpendicular to the surface. Examples include grass, which has vertical elements, and scattered rocks, which are volumetric elements with 3-D shapes. Previous studies of perception of slant from texture have not addressed the case of textures with relief. The experiments reported here test judgments of slant for textures with various types of relief, including textures composed of bumps, columns, and oriented elements. The presence of texture relief was found to affect judgments, indicating that perception of slant from texture is not robust to violations of the flat-texture assumption. For bumps and oriented elements, slant was underestimated relative to matching flat textures, while for column textures, which had visible flat top faces, perceived slant was equal to or greater than for flat textures. The differences can be explained by the way different types of texture relief affect the amount of optical compression in the projected image, which would be consistent with results from previous experiments using cue conflicts in flat textures. These results provide further evidence that compression contributes to perception of slant from texture.
|
21
|
The stereoscopic anisotropy: individual differences and underlying mechanisms. J Exp Psychol Hum Percept Perform 2002; 28:469-76. [PMID: 11999867 DOI: 10.1037/0096-1523.28.2.469] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Observers are more sensitive to variations in the depth of stereoscopic surfaces in a vertical than in a horizontal direction; however, there are large individual differences in this anisotropy. The authors measured discrimination thresholds for surfaces slanted about a vertical axis or inclined about a horizontal axis for 50 observers. Orientation and spatial frequency discrimination thresholds were also measured. For most observers, thresholds were lower for inclination than for slant and lower for orientation than for spatial frequency. There was a positive correlation between the 2 anisotropies, resulting from positive correlations between (a) orientation and inclination thresholds and (b) spatial frequency and slant thresholds. These results support the notion that surface inclination and slant perception is in part limited by the sensitivity of orientation and spatial frequency mechanisms.
|
22
|
Abstract
This paper uses visual, empirical and formal methods (Li & Zaidi, Vision Research, 40 (2000) 217; Li & Zaidi, Vision Research, 41 (22) (2001a) 2927) to examine the roles of oriented texture components in conveying veridical percepts of concave and convex surfaces that are pitched towards or away from the observer. The results show that pairs of components, oriented symmetrically around the axis of maximum curvature, combine to provide the geodesic orientation modulations that are critical for veridical shape perception. The degree of pitch determines the orientations of the critical pair of components. Perspective is crucial to the veridical perception of concavities and convexities, regardless of the degree of pitch. The results of this paper reconfirm that veridical shape perception depends on extracting critical patterns of oriented energy, but also show that the class of textures capable of conveying veridical percepts of developable shapes in general views is even more restricted than that identified by Li and Zaidi (Journal of Optical Society of America A, 18 (2001b), 2430).
|
23
|
Abstract
Li and Zaidi (Vision Research 40 (2000) 217; 41 (2001) 1519) have recently argued that there are two necessary conditions for the perception of 3D shape from texture: (1) the texture pattern must have a disproportionate amount of energy along directions of principal curvature; and (2) the surface must be viewed with a noticeable amount of perspective. In the present article we present evidence that these conclusions are only valid under a limited set of non-generic viewing conditions. Other relevant factors that need to be considered in this context include the distribution of curvature on an object's surface and the set of possible viewing directions from which it can be observed. For generic viewing directions and patterns of curvature, the perception of surface curvature from texture is only minimally affected by the orientation spectrum of the texture pattern or the amount of perspective in its optical projection. Li and Zaidi (Vision Research 41 (2001) 1519) have also identified two characteristic patterns of image contours, which they claim to be the only possible source of information within textured images for determining the direction of surface slant or the sign of surface curvature. In the present article we attempt to show that these characteristic patterns can only arise in natural vision for a limited set of non-generic viewing directions. We also review several other factors that can influence the perceived direction of slant or the perceived sign of curvature, which have been identified previously by other investigators.
|
24
|
Veridicality of three-dimensional shape perception predicted from amplitude spectra of natural textures. JOURNAL OF THE OPTICAL SOCIETY OF AMERICA. A, OPTICS, IMAGE SCIENCE, AND VISION 2001; 18:2430-2447. [PMID: 11583260 DOI: 10.1364/josaa.18.002430] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]
Abstract
We show that the amplitude spectrum of a texture pattern, regardless of its phase spectrum, can be used to predict whether the pattern will convey the veridical three-dimensional (3-D) shape of the surface on which it lies. Patterns from the Brodatz collection of natural textures were overlaid on a flat surface that was then corrugated in depth and projected in perspective. Perceived ordinal shapes, reconstructed from a series of local relative depth judgments, showed that only about a third of the patterns conveyed veridical shape. The phase structure of each pattern was then randomized. Simulated concavities and convexities were presented for both the Brodatz and the phase-randomized patterns in a global shape identification task. The concordance between the shapes perceived from the Brodatz patterns and their phase-randomized versions was 80-88%, showing that the capacity for a pattern to correctly convey concavities and convexities is independent of phase information and that the amplitude spectrum contains all the information required to determine whether a pattern will convey veridical 3-D shape. A measure of the discrete oriented energy centered on the axis of maximum curvature was successful in identifying textures that convey veridical shape.
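The predictive statistic described in this abstract can be approximated directly from an image's Fourier amplitude spectrum. The sketch below is illustrative only: `oriented_energy` and its parameters are assumptions, not the authors' measure. It sums amplitude within a narrow orientation band around a candidate axis of maximum curvature and expresses it as a fraction of total amplitude:

```python
import numpy as np

def oriented_energy(image, axis_deg=0.0, half_width_deg=5.0):
    """Fraction of Fourier amplitude within +/-half_width_deg of axis_deg."""
    amp = np.abs(np.fft.fftshift(np.fft.fft2(image)))
    h, w = image.shape
    fy, fx = np.meshgrid(np.fft.fftshift(np.fft.fftfreq(h)),
                         np.fft.fftshift(np.fft.fftfreq(w)), indexing='ij')
    theta = np.degrees(np.arctan2(fy, fx)) % 180.0
    # Circular orientation distance in [0, 90] degrees.
    dist = np.abs(((theta - axis_deg + 90.0) % 180.0) - 90.0)
    band = dist <= half_width_deg
    return amp[band].sum() / amp.sum()

# A grating that varies only along x concentrates nearly all of its
# spectral amplitude near the 0-degree orientation band.
x = np.arange(64)
grating = np.tile(np.sin(2 * np.pi * x / 8), (64, 1))
```

A phase-randomized version of `grating` would yield the same value, which is the point of the abstract: the statistic depends on the amplitude spectrum alone.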
|
25
|
Erratum to "Information limitations in perception of shape from texture". [Vision Research 41 (2001) 1519-1534]. Vision Res 2001; 41:2927-42. [PMID: 11701185 DOI: 10.1016/s0042-6989(01)00255-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
Abstract
Li and Zaidi (Li, A., and Zaidi, Q. (2000) Vision Research, 40, 217-242) showed that the veridical perception of the 3-dimensional (3D) shape of a corrugated surface from texture cues is entirely dependent on the visibility of critical patterns of oriented energy. These patterns are created by perspective projection of surface markings oriented along lines of maximum 3D curvature. In images missing these orientation modulations, observers confused concavities with convexities, and leftward slants with rightward slants. In this paper, it is shown that these results were a direct consequence of the physical information conveyed by different oriented components of the texture pattern. For texture patterns consisting of single gratings of arbitrary spatial frequency and orientation, equations are derived from perspective geometry that describe the local spatial frequency and orientation for any slant at any height above and below eye level. The analysis shows that only gratings oriented within a few degrees of the axis of maximum curvature exhibit distinct patterns of orientation modulations for convex, concave, and leftward and rightward slanted portions of a corrugated surface. All other gratings exhibit patterns of frequency and orientation modulations that are distinct for curvatures on the one hand and slants on the other, but that are nearly identical for curvatures of different sign, and nearly identical for slants of different direction. The perceived shape of surfaces was measured in a 5AFC paradigm (concave, convex, leftward slant, rightward slant, and flat-frontoparallel). Observers perceived all five shapes correctly only for gratings oriented within a few degrees of the axis of maximum curvature. For all other oriented gratings, observers could distinguish curvatures from slants, but could not distinguish signs of curvature or directions of slant. 
These results demonstrate that human observers utilize the shape information provided by texture components along both critical and non-critical orientations.
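The orientation modulations discussed above arise from perspective projection of surface markings. The following sketch is a simplified pinhole-camera illustration of that geometry, not the paper's derived equations; the function names and camera model are assumptions:

```python
import numpy as np

def project(point, f=1.0):
    """Perspective projection of a 3-D point onto the image plane."""
    x, y, z = point
    return np.array([f * x / z, f * y / z])

def image_orientation(p_surface, direction, f=1.0, eps=1e-4):
    """Image orientation (deg) of a tiny surface segment at p_surface."""
    p = np.asarray(p_surface, dtype=float)
    d = np.asarray(direction, dtype=float)
    a = project(p, f)
    b = project(p + eps * d, f)
    dx, dy = b - a
    return float(np.degrees(np.arctan2(dy, dx)))

# A horizontal marking straight ahead projects with zero orientation;
# the same marking on an off-axis, receding patch projects tilted.
```

Sampling `image_orientation` along a corrugated surface traces out exactly the kind of orientation-modulation pattern whose visibility the abstract identifies as critical.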
|
26
|
Abstract
This study aimed at quantifying diminished depth perception in telemedicine due to the two-dimensional image and to devise coping strategies for the problem. Two hundred and thirty-five patients in the telemedicine room of a Minor Accident and Treatment Service were studied. The magnitude of impaired depth perception was noted. Seven coping strategies were used and the resolution of the problem was measured. Depth perception was judged to be less than 90% of binocular vision in 235 cases. This improved to more than 90% of binocular vision in 99 of the 235 cases (42.13%) when using all strategies. Improvement by rotation of the camera 30 degrees at a time in the axial plane was the most useful strategy and it occurred in all 235 (100%) cases. Light adjustment and angulation occurred in 206 of 235 cases (87.66%). Comparison with the opposite side helped in 179 of 235 cases (76.17%), skin color and texture in 139 of 235 cases (59.15%), shutting one eye in 103 of 235 cases (43.83%), enlarging the image in 85 of 235 cases (36.17%), and diminishing depth of field of lens in 77 of 235 cases (32.77%). Other visual cues occurred in 63 of 235 cases (26.81%). Impaired depth perception is a significant problem in telemedicine. It can be improved to make a confident diagnosis in most cases by adopting a variety of strategies that are described in this paper.
|
27
|
Abstract
Li and Zaidi (Li, A., and Zaidi, Q. (2000) Vision Research, 40, 217-242) showed that the veridical perception of the 3-dimensional (3D) shape of a corrugated surface from texture cues is entirely dependent on the visibility of critical patterns of oriented energy. These patterns are created by perspective projection of surface markings oriented along lines of maximum 3D curvature. In images missing these orientation modulations, observers confused concavities with convexities, and leftward slants with rightward slants. In this paper, it is shown that these results were a direct consequence of the physical information conveyed by different oriented components of the texture pattern. For texture patterns consisting of single gratings of arbitrary spatial frequency and orientation, equations are derived from perspective geometry that describe the local spatial frequency and orientation for any slant at any height above and below eye level. The analysis shows that only gratings oriented within a few degrees of the axis of maximum curvature exhibit distinct patterns of orientation modulations for convex, concave, and leftward and rightward slanted portions of a corrugated surface. All other gratings exhibit patterns of frequency and orientation modulations that are distinct for curvatures on the one hand and slants on the other, but that are nearly identical for curvatures of different sign, and nearly identical for slants of different direction. The perceived shape of surfaces was measured in a 5AFC paradigm (concave, convex, leftward slant, rightward slant, and flat-frontoparallel). Observers perceived all five shapes correctly only for gratings oriented within a few degrees of the axis of maximum curvature. For all other oriented gratings, observers could distinguish curvatures from slants, but could not distinguish signs of curvature or directions of slant. 
These results demonstrate that human observers utilize the shape information provided by texture components along both critical and non-critical orientations.
|
28
|
Perceptual learning without feedback and the stability of stereoscopic slant estimation. Perception 2001; 30:95-114. [PMID: 11257982 DOI: 10.1068/p3163] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/28/2022]
Abstract
Subjects were examined for practice effects in a stereoscopic slant-estimation task involving surfaces that comprised a large portion of the visual field. In most subjects slant estimation was significantly affected by practice, but only when an isolated surface (an absolute disparity gradient) was present in the visual field. When a second, unslanted, surface was visible (providing a second disparity gradient and thereby also a relative disparity gradient) none of the subjects exhibited practice effects. Apparently, stereoscopic slant estimation is more robust or stable over time in the presence of a second surface than in its absence. In order to relate the practice effects, which occurred without feedback, to perceptual learning, results are interpreted within a cue-interaction framework. In this paradigm the contribution of a cue depends on its reliability. It is suggested that normally absolute disparity gradients contribute relatively little to perceived slant and that subjects learn to increase this contribution by utilizing proprioceptive information. It is argued that--given the limited computational power of the brain--a relatively small contribution of absolute disparity gradients in perceived slant enhances the stability of stereoscopic slant perception.
|
29
|
Abstract
We study the hypothesis that observers can use haptic percepts as a standard against which the relative reliabilities of visual cues can be judged, and that these reliabilities determine how observers combine depth information provided by these cues. Using a novel visuo-haptic virtual reality environment, subjects viewed and grasped virtual objects. In Experiment 1, subjects were trained under motion relevant conditions, during which haptic and visual motion cues were consistent whereas haptic and visual texture cues were uncorrelated, and texture relevant conditions, during which haptic and texture cues were consistent whereas haptic and motion cues were uncorrelated. Subjects relied more on the motion cue after motion relevant training than after texture relevant training, and more on the texture cue after texture relevant training than after motion relevant training. Experiment 2 studied whether or not subjects could adapt their visual cue combination strategies in a context-dependent manner based on context-dependent consistencies between haptic and visual cues. Subjects successfully learned two cue combination strategies in parallel, and correctly applied each strategy in its appropriate context. Experiment 3, which was similar to Experiment 1 except that it used a more naturalistic experimental task, yielded the same pattern of results as Experiment 1 indicating that the findings do not depend on the precise nature of the experimental task. Overall, the results suggest that observers can involuntarily compare visual and haptic percepts in order to evaluate the relative reliabilities of visual cues, and that these reliabilities determine how cues are combined during three-dimensional visual perception.
|
30
|
Abstract
The role of disparity-perspective cue conflict in depth contrast was examined. A central square and a surrounding frame were observed in a stereoscope. Five conditions were compared: (1) only disparity was introduced into either the centre or surround stimulus, (2) only perspective was introduced into the centre or surround, (3) concordant perspective and disparity were introduced into the centre or surround, (4) disparity was introduced into one stimulus and perspective into the other, and (5) only the centre stimulus was presented with horizontal shear disparity and perspective manipulated independently. The results show that individual differences in depth contrast were related to individual differences in the weighting of disparity and perspective in the single-stimulus conditions. We conclude that conflict between disparity and perspective contributes to depth contrast. However, significant depth contrast occurred when there was no disparity-perspective cue conflict, indicating that this cue conflict is not the sole mechanism producing depth contrast.
|
31
|
Role of chromaticity, contrast, and local orientation cues in the perception of density. Perception 2000; 29:581-600. [PMID: 10992955 DOI: 10.1068/p3043] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/28/2022]
Abstract
We compared the role of the red-green, blue-yellow, and luminance post-receptoral mechanisms in the perception of density. The task requires the comparison of densities between two stimuli composed of oriented bandpass elements, pseudo-randomly scattered across an area of constant size. The perception of density differences was measured by a temporal 2AFC procedure for all pairs of mechanisms and for four possible densities. We found that stimuli of identical physical densities are not perceived equally: there is a consistent bias in favour of blue-yellow stimuli which are perceived as significantly more dense than red-green and achromatic stimuli. We considered three factors that could have differentially affected the density perception of blue-yellow stimuli: an increase in the perceived size of the individual blue-yellow elements, a perceived contrast difference, and the presence of local orientation cues. We found that the increased perceived density of the blue-yellow stimuli occurred despite the fact that there was no increase in perceived size of the individual elements, and remained despite corrections for the two other factors. We conclude that the significant increase in perceived density for the blue-yellow mechanism is a global effect, associated with a perceived colour 'melting' of the elements in the array. Our data were fitted with the occupancy model of Allik and Tuulmets (1991, Perception & Psychophysics 49 303-314) and we found that blue-yellow stimuli have a greater 'occupancy' than red-green or achromatic stimuli.
|
32
|
Abstract
This paper presents empirical support for a new observer model of inferring three-dimensional shape from monocular texture cues. By measuring observers' abilities to estimate the relative three-dimensional curvature along a textured surface from two-dimensional projected images, and concurrently examining the local spectral changes occurring in the projected image for various texture patterns, we have found that correlated changes in oriented energy along lines corresponding to the lines of maximum and minimum curvature of the surface are crucial for conveying the three-dimensional shape of the surface. Energy along these lines of maximum and minimum curvature can be used to compute the orientation of local surface patches. Texture patterns consisting of simple and complex sinusoidal gratings and plaids, and filtered noise were drawn onto a surface that was corrugated sinusoidally in depth about the horizontal axis and projected in perspective onto an image plane. The perceived relative surface curvature was reconstructed from measurements of local ordinal depth around a central fixation point at 12 different phases of the corrugation. Our results show that: (1) it is neither necessary nor sufficient to identify individual texture elements or texture gradients in order to extract the shape of the surface; (2) one-dimensional frequency modulation is insufficient for conveying complex three-dimensional shape. (3) Veridical ordinal depth is seen only when the projected pattern contains changes in oriented energy along lines corresponding to projected lines of maximum curvature of the surface. (4) For a surface corrugated in depth about the horizontal axis, this pattern of oriented energy arises from energy along the vertical direction in the global Fourier transform of the pre-corrugated pattern. (5) Local orientation changes across lines of minimum curvature can be also critical for conveying shape. 
(6) These correlated orientation changes along lines of maximum and minimum curvature are entirely lost in parallel projection. Hence texture is a useful cue for shape if the image is a perspective projection. (7) Only some natural textures will provide sufficient monocular cues to support veridical shape inferences, and this can be predicted from their global Fourier transforms.
|
33
|
Abstract
Previous investigators have shown that observers' visual cue combination strategies are remarkably flexible in the sense that these strategies adapt on the basis of the estimated reliabilities of the visual cues. However, these researchers have not addressed how observers' acquire these estimated reliabilities. This article studies observers' abilities to learn cue combination strategies. Subjects made depth judgments about simulated cylinders whose shapes were indicated by motion and texture cues. Because the two cues could indicate different shapes, it was possible to design tasks in which one cue provided useful information for making depth judgments, whereas the other cue was irrelevant. The results of experiment 1 suggest that observers' cue combination strategies are adaptable as a function of training; subjects adjusted their cue combination rules to use a cue more heavily when the cue was informative on a task versus when the cue was irrelevant. Experiment 2 demonstrated that experience-dependent adaptation of cue combination rules is context-sensitive. On trials with presentations of short cylinders, one cue was informative, whereas on trials with presentations of tall cylinders, the other cue was informative. The results suggest that observers can learn multiple cue combination rules, and can learn to apply each rule in the appropriate context. Experiment 3 demonstrated a possible limitation on the context-sensitivity of adaptation of cue combination rules. One cue was informative on trials with presentations of cylinders at a left oblique orientation, whereas the other cue was informative on trials with presentations of cylinders at a right oblique orientation. The results indicate that observers did not learn to use different cue combination rules in different contexts under these circumstances. 
These results are consistent with the hypothesis that observers' visual systems are biased to learn to perceive in the same way views of bilaterally symmetric objects that differ solely by a symmetry transformation. Taken in conjunction with the results of Experiment 2, this means that the visual learning mechanism underlying cue combination adaptation is biased such that some sets of statistics are more easily learned than others.
|
34
|
Abstract
We report the results of a depth-matching experiment in which subjects were asked to adjust the height of an ellipse until it matched the depth of a simulated cylinder defined by texture and motion cues. In one-third of the trials the shape of the cylinder was primarily given by motion information, in another one-third of the trials it was given by texture information, and on the remaining trials it was given by both sources of information. Two optimal cue combination models are described where optimality is defined in terms of Bayesian statistics. The parameter values of the models are set based on subjects' responses on trials when either the motion cue or the texture cue was informative. These models provide predictions of subjects' responses on trials when both cues were informative. The results indicate that one of the optimal models provides a good fit to the subjects' data, and the second model provides an exceptional fit. Because the predictions of the optimal models closely match the experimental data, we conclude that observers' cue-combination strategies are indeed optimal, at least under the conditions studied here.
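The Bayesian-optimal combination this abstract tests reduces, in its standard linear form, to weighting each cue's estimate by its inverse variance. The sketch below is a minimal illustration of that rule; the numbers are invented, not data from the study:

```python
import numpy as np

def combine_cues(estimates, variances):
    """Minimum-variance linear combination: weight by inverse variance."""
    estimates = np.asarray(estimates, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    weights /= weights.sum()  # normalize so the weights sum to 1
    return float(weights @ estimates)

# Motion cue: depth 10 cm with low noise; texture cue: 14 cm, noisier.
# The combined estimate is pulled toward the more reliable motion cue.
combined = combine_cues([10.0, 14.0], [1.0, 4.0])
```

Fitting the per-cue variances on single-cue trials and then predicting the two-cue trials, as the abstract describes, is what makes the model testable without free parameters.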
|
35
|
Abstract
To pick up 3-D aspects of pictures is arguably the most difficult problem concerning tactile pictorial perception by the blind. The aim of the experiments reported was to examine the potential utility of texture gradients in this context. Since there is no theoretical basis for predicting absolute values of 3-D properties from 2-D patterns read by the finger pads, the abilities of participants to perceive gradients lying between known maxima and minima were assessed. Experiment 1 involved blindfolded sighted participants making verbal magnitude estimations of texture-gradient magnitudes corresponding to plane surfaces at different slants. In experiment 2 the participants' task was to orient a surface at a slant corresponding to the texture gradients depicted tactually, and experiment 3 required early-blind participants to attempt the same task. The results revealed that participants can scale the magnitudes of texture gradients with high precision and that they can also accurately produce surface slants from depictions, providing the extreme conditions are clearly defined and there are opportunities for learning. Texture gradients appear as informative to the blind as they do to the sighted. To what extent these data can be generalised to other gradients and textures or to other projections of 3-D scenes remains to be investigated.
|
36
|
Abstract
The slant of a stereoscopically defined surface cannot be determined solely from horizontal disparities or from derived quantities such as horizontal size ratio (HSR). There are four other signals that, in combination with horizontal disparity, could in principle allow an unambiguous estimate of slant: the vergence and version of the eyes, the vertical size ratio (VSR), and the horizontal gradient of VSR. Another useful signal is provided by perspective slant cues. The determination of perceived slant can be modeled as a weighted combination of three estimates based on those signals: a perspective estimate, a stereoscopic estimate based on HSR and VSR, and a stereoscopic estimate based on HSR and sensed eye position. In a series of experiments, we examined human observers' use of the two stereoscopic means of estimation. Perspective cues were rendered uninformative. We found that VSR and sensed eye position are both used to interpret the measured horizontal disparities. When the two are placed in conflict, the visual system usually gives more weight to VSR. However, when VSR is made difficult to measure by using short stimuli or stimuli composed of vertical lines, the visual system relies on sensed eye position. A model in which the observer's slant estimate is a weighted average of the slant estimate based on HSR and VSR and the one based on HSR and eye position accounted well for the data. The weights varied across viewing conditions because the informativeness of the signals they employ vary from one situation to another.
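One of the stereoscopic estimates described above combines HSR with VSR. A commonly cited small-angle approximation is slant ≈ -arctan((1/mu) * ln(HSR/VSR)), with mu the vergence angle in radians; treat that exact form as an assumption here, since the paper develops the full weighted-combination model:

```python
import numpy as np

def slant_from_hsr_vsr(hsr, vsr, vergence_rad):
    """Slant (radians) implied by horizontal and vertical size ratios."""
    return float(-np.arctan(np.log(hsr / vsr) / vergence_rad))

# With HSR == VSR (frontoparallel surface, symmetric gaze) the
# estimated slant is zero; HSR > VSR signals a nonzero slant.
```

A parallel estimator substituting sensed eye position for VSR, plus a perspective estimate, would then be averaged with condition-dependent weights, as the abstract's model does.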
|
37
|
Ideal observer perturbation analysis reveals human strategies for inferring surface orientation from texture. Vision Res 1998; 38:2635-56. [PMID: 12116709 DOI: 10.1016/s0042-6989(97)00415-x] [Citation(s) in RCA: 50] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
Abstract
Optical texture patterns contain three quasi-independent cues to planar surface orientation: perspective scaling, projective foreshortening and density. The purpose of this work was to estimate the perceptual weights assigned to these texture cues for discriminating surface orientation and to measure the visual system's reliance on an isotropy assumption in interpreting foreshortening information. A novel analytical technique is introduced which takes advantage of the natural cue perturbations inherent in stochastic texture stimuli to estimate cue weights and measure the influence of an isotropy assumption. Ideal observers were derived which compute the exact information content of the different texture cues in the stimuli used in the experiments and which either did or did not rely on an assumption of surface texture isotropy. Simulations of the ideal observers using the same stimuli shown to subjects in a slant discrimination task provided trial-by-trial estimates of the natural cue perturbations which were inherent in the stimuli. By back-correlating subjects' judgments with the different ideal observer estimates, we were able to estimate both the weights given to each cue by subjects and the strength of subjects' prior assumptions of isotropy. In all of the conditions tested, we found that subjects relied primarily on the foreshortening cue. A small, but significant weight was given to scaling information and no significant weight was given to density information. In conditions in which the surface textures deviated from isotropy by random amounts from stimulus to stimulus, subjects' judgments correlated well with the estimates of an ideal observer which incorrectly assumed surface texture isotropy. This correlation was not complete, however, suggesting that a soft form of the isotropy constraint was used.
Moreover, the correlation was significantly lower for textures containing higher-order information about surface orientation (skew of rectangular texture elements). The results of the analysis clearly implicate texture foreshortening as a primary cue for perceiving surface slant from texture and suggest that the visual system incorporates a strong, though not complete, bias to interpret surface textures as isotropic in its inference of surface slant from texture. They further suggest that local texture skew, when available in an image, contributes significantly to perceptual estimates of surface orientation.
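The back-correlation idea above can be illustrated with synthetic data: regress trial-by-trial responses on per-cue ideal-observer estimates to recover each cue's weight. Everything below is simulated for illustration; the weights and noise level are invented, not the paper's results:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
# Per-trial natural cue perturbations (one column per texture cue).
foreshortening = rng.normal(size=n)
scaling = rng.normal(size=n)
density = rng.normal(size=n)
X = np.column_stack([foreshortening, scaling, density])

# Simulated observer: heavy reliance on foreshortening, some on
# scaling, none on density, plus internal noise.
true_w = np.array([0.8, 0.2, 0.0])
responses = X @ true_w + rng.normal(scale=0.1, size=n)

# Least-squares regression recovers the cue weights.
w_hat, *_ = np.linalg.lstsq(X, responses, rcond=None)
```

The recovered `w_hat` approximates `true_w`, which is the logic by which the study infers that subjects weight foreshortening most heavily.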
|
38
|
The perception of scale-dependent and scale-independent surface structure from binocular disparity, texture, and shading. Perception 1998; 27:147-66. [PMID: 9709448 DOI: 10.1068/p270147] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/08/2023]
Abstract
The integration of binocular disparity, shading, and texture was measured for two different aspects of three-dimensional structure: (1) shape index, which is a measure of scale-independent structure, and (2) curvedness, which is a measure of scale-dependent structure. Binocular disparity was found to contribute significantly more to judged shape index than it does to judged curvedness, and shading and texture were both found to contribute more to judged curvedness than to judged shape index. These results demonstrate that different cues do not contribute equally to different aspects of perceived surface structure. This finding suggests that, for the case of linear integration, multiple cues to three-dimensional structure do not combine on the basis of a single type of representation shared by all the 'shape-from-X' processes in the visual system.
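Shape index and curvedness have standard definitions in terms of the two principal curvatures (Koenderink's parameterization); the sketch below uses those textbook formulas, not code from the study:

```python
import numpy as np

def shape_index(k1, k2):
    """Scale-independent shape index in [-1, 1] (cup to cap)."""
    k_max, k_min = max(k1, k2), min(k1, k2)
    return float((2.0 / np.pi) * np.arctan2(k_max + k_min, k_max - k_min))

def curvedness(k1, k2):
    """Scale-dependent magnitude of curvature."""
    return float(np.sqrt((k1 ** 2 + k2 ** 2) / 2.0))

# Doubling both principal curvatures doubles curvedness but leaves the
# shape index unchanged: a small sphere and a large sphere share a
# shape index of 1 (a convex "cap") while differing in curvedness.
```

Separating these two quantities is what lets the study ask whether each cue feeds both aspects of perceived structure equally.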
|
39
|
Surface orientation from texture: ideal observers, generic observers and the information content of texture cues. Vision Res 1998; 38:1655-82. [PMID: 9747502 DOI: 10.1016/s0042-6989(97)00324-6] [Citation(s) in RCA: 95] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Perspective views of textured, planar surfaces provide a number of cues about the orientations of the surfaces. These include the information created by perspective scaling of texture elements (scaling), the information created by perspective foreshortening of texels (foreshortening) and, for textures composed of discrete elements, the information created by the effects of both scaling and foreshortening on the relative positions of texels (position). We derive a general form for ideal observers for each of these cues as they appear in images of spatially extended textures (e.g. those composed of solid 2-D figures). As an application of the formulation, we derive a set of 'generic' observers which we show perform near optimally for images of a broad range of surface textures, without special prior knowledge about the statistics of the textures. Using simulations of ideal observers, we analyze the informational structure of texture cues, including a quantification of lower bounds on reliability for the three different cues, how cue reliability varies with slant angle and how it varies with field of view. We also quantify how strongly the reliability of the foreshortening cue depends on a prior assumption of isotropy. Finally, we extend the analysis to a naturalistic class of textures, showing that the information content of textures particularly suited to psychophysical investigation can be quantified, at least to a first-order approximation. The results provide an important computational foundation for psychophysical work on perceiving surface orientation from texture.
40
Abstract
In order to quantify the ability of the human visual system to use texture information to perceive planar surface orientation, I measured subjects' ability to discriminate planar surface slant (angle away from the fronto-parallel) for a variety of different types of textures and in a number of different viewing conditions. I measured the subjects' discrimination performance as a function of surface slant, field of view size and surface texture structure. I compared the subjects' performance with that of ideal observers derived for each of the available texture cues--texel position, scaling and foreshortening. The results can be summarized by four points: (i) subjects' discrimination performance improves dramatically with increasing surface slant, tracking the performance of the ideal observers; (ii) subjects can integrate texture information over a large range of visual angles; (iii) comparisons between human subjects and ideal observers show that the human observers rely to some degree on foreshortening information; and (iv) similar comparisons show that in using foreshortening information, subjects rely to some extent on a prior assumption of isotropy.
41
Abstract
With a horizontal magnifier before one eye, a frontoparallel surface appears rotated about a vertical axis (geometric effect). With a vertical magnifier, apparent rotation is opposite in direction (induced effect); to restore the appearance of frontoparallelism, the surface must be rotated away from the magnified eye. The induced effect is interesting because it was thought until recently that vertical disparities do not play an important role in surface perception. As with the geometric effect, the required rotation for the induced effect increases linearly up to approximately 4% magnification; unlike the geometric effect, it plateaus at approximately 8%. Current theory explains the linear portion: vertical size ratios (VSRs) are used to compensate for changes in horizontal size ratios (HSRs) that accompany eccentric gaze, so changes in VSR cause changes in perceived slant. The theory does not explain the plateau. We demonstrate that it results from differing slant estimates obtained by use of various retinal and extra-retinal signals. When perspective cues to slant are minimized or sensed eye position is consistent with VSR, the induced and geometric effects have similar magnitudes even at large magnifications.
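The linear portion of both effects can be illustrated with the standard disparity-based slant formula from this literature, S = -atan[(1/mu) ln(HSR/VSR)], where mu is the vergence angle. The formula, function name, and vergence value below are illustrative assumptions, not taken from the abstract; note the formula captures only the linear regime, not the reported plateau.

```python
import math

def predicted_slant(hsr, vsr, vergence_rad):
    """Slant estimate (radians) from horizontal and vertical size ratios,
    using the standard formula S = -atan((1/mu) * ln(HSR / VSR))."""
    return -math.atan(math.log(hsr / vsr) / vergence_rad)

# Geometric effect: a 4% horizontal magnifier before one eye (HSR = 1.04).
geometric = predicted_slant(1.04, 1.00, vergence_rad=0.1)
# Induced effect: a 4% vertical magnifier (VSR = 1.04) predicts an
# equal-magnitude rotation of opposite sign.
induced = predicted_slant(1.00, 1.04, vergence_rad=0.1)
```

Under this sketch the two effects are exactly mirror-symmetric, consistent with the abstract's observation that they are similar in magnitude in the linear range.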
42
Abstract
We examine two models for human perception of shape from texture, based on two assumptions about the surface texture: isotropy and homogeneity. Observers made orientation judgments on planar textured surfaces. Surface textures were either isotropic or anisotropically stretched or compressed. If subjects used an isotropy assumption, they would make biased orientation estimates for the anisotropic textures. In some conditions some observers showed no bias for the anisotropic textures relative to the isotropic textures. In general, even when the observers showed bias, the biases were significantly less than those predicted if the observer used only deviation from isotropy as a cue. Observers appear to use both the deviation from isotropy and a texture gradient or affine texture distortion cue for shape from texture.
43
Surface gradients, contours and the perception of surface attitude in images of complex scenes. Perception 1996; 25:701-13. [PMID: 8888302 DOI: 10.1068/p250701] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/02/2023]
Abstract
Sophisticated computer graphics were used to generate images of three-dimensional blocks-world scenes to investigate the perception of surface attitude. Three types of surface primitive (planar blocks, cylinders, and ellipsoids) were combined to form structured settings. The experiments were designed to investigate whether surface-based information such as gradients in shading and texture provide any significant advantage in attitude judgments over information derived from object contours. Images of shaded, textured, and line-drawn surfaces formed the stimulus set. The subjects' task consisted of setting an attitude probe on different parts of the scene so that the probe appeared to be locally coplanar with the perceived surfaces. Analysis of settings according to attitude components, slant and tilt, shows remarkable agreement in slant settings for the shaded and line-drawn scenes but poor correlation between shaded and textured scenes. Similarly, tilt was also easily judged in shaded and line-drawn scenes and the experiments indicate that explicit surface boundaries are important for stable tilt perception. In general, the results suggest that, for the simple surfaces employed here, surface cues provide little extra information beyond that which is derived from contours.
44
Characterization of the spatial-frequency spectrum in the perception of shape from texture. JOURNAL OF THE OPTICAL SOCIETY OF AMERICA. A, OPTICS, IMAGE SCIENCE, AND VISION 1995; 12:1208-1224. [PMID: 7769507 DOI: 10.1364/josaa.12.001208] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/22/2023]
Abstract
The major cue to shape from texture is the compression of texture as a function of surface curvature [J. Exp. Psychol. 13, 242 (1987); Vision Res. 33, 827 (1993)]. A number of computational models have been proposed in which compression is measured by detection of changes in the spatial-frequency spectrum [Comput. Graphics Image Process. 5 (1976)]. We propose that the visual system uses a strategy of characterizing the frequency spectrum by a simple set of measures and of tracking the changes in this characterization rather than determining changes in the shape of the actual spectra. Our evidence is based on a number of psychophysical demonstrations that use stimuli with specifically tailored frequency spectra, constructed from white noise filtered in the frequency domain. Our evidence suggests that the visual system determines the average peak frequency of the spectrum, fp, and uses this measure as its characterization. Changes in fp are strongly correlated with the degree of surface curvature, and, over a range of stimuli, fp takes account of the variance in local estimates of the frequency spectrum. One computes fp by determining the peak frequency at each spatial location and then averaging these frequency values over a local spatial region. We show that fp is related to the second-order moment but is more biologically plausible and shows superior ability to function in the presence of noise. As a test of this model, we have constructed a neural network architecture for computing shape from texture. Our model is limited to orthographically projected, homogeneous textures without in-surface rotation. The early stages of the model consist of multiple simple-cell units tuned to different orientations and spatial frequencies. We show that these simple cells are inadequate for the determination of compression but that the outputs of complex-cell-like units, after normalization, generate estimates of surface slant and tilt. The network shows qualitative agreement with human perception of shape from texture over a wide range of real and artificial stimuli.
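The fp computation described in the abstract (find the peak frequency at each spatial location, then average over a region) can be sketched with a windowed FFT. The patch size, Hanning window, and non-overlapping tiling below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def average_peak_frequency(image, patch=32):
    """Estimate fp: the radial spatial frequency (cycles/pixel) of the
    strongest non-DC component in each local patch, averaged over the
    image (non-overlapping windowed-FFT tiles)."""
    h, w = image.shape
    win = np.hanning(patch)[:, None] * np.hanning(patch)[None, :]
    fy = np.fft.fftfreq(patch)[:, None]
    fx = np.fft.fftfreq(patch)[None, :]
    radial = np.hypot(fy, fx)            # radial frequency of each FFT bin
    peaks = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            spec = np.abs(np.fft.fft2(image[y:y + patch, x:x + patch] * win))
            spec[0, 0] = 0.0             # ignore the DC component
            peaks.append(radial.flat[np.argmax(spec)])
    return float(np.mean(peaks))
```

A more compressed (foreshortened) texture should yield a larger fp, which is the monotonic relation between fp and surface slant that the model exploits.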
45
Abstract
One of the first attempts to develop a formal model of depth cue integration is to be found in Maloney and Landy's [(1989) Proceedings of the SPIE: Visual communications and image processing, Part 2 (pp. 1154-1163)] "human depth combination rule". They advocate that the combination of depth cues by the visual system is best described by a weighted linear model. The present experiments tested whether the linear combination rule applies to the integration of texture and shading. As would be predicted by a linear combination rule, the weight assigned to the shading cue did not vary as a function of its curvature value. However, the weight assigned to the texture cue varied systematically as a function of the curvature values of both cues. Here we describe a non-linear model which provides a better fit to the data. Redescribing the stimuli in terms of depth rather than curvature reduced the goodness of fit for all models tested. These results support the hypothesis that the locus of cue integration is a curvature map, rather than a depth map. We conclude that the linear combination rule does not generalize to the integration of shading and texture, and that for these cues it is likely that integration occurs after the recovery of surface curvature.
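A minimal sketch of the weighted linear combination rule under discussion, assuming the standard inverse-variance (reliability-proportional) choice of weights; the function name and example values are illustrative, not the model fits reported in the abstract:

```python
def combine_cues(estimates, variances):
    """Weighted linear cue combination: each cue's weight is
    proportional to its reliability (inverse variance), and the
    weights sum to one."""
    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)
    weights = [r / total for r in reliabilities]
    combined = sum(w * e for w, e in zip(weights, estimates))
    return combined, weights

# Texture says 10 depth units, shading says 12; texture is twice as
# reliable, so it receives weight 2/3 and shading weight 1/3.
depth, weights = combine_cues([10.0, 12.0], [1.0, 2.0])
```

The abstract's finding that the texture weight varies systematically with the curvature values of both cues is exactly what this fixed-weight linear form cannot capture, which motivates the authors' non-linear alternative.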
46
Abstract
We describe an ideal observer model for estimating "shape from texture" which is derived from the principles of statistical information. For a given family of surface shapes, measures of statistical information can be computed for two different texture cues--density and orientation of texels. These measures can be used to predict lower bounds on the variance of shape judgements of "ideal" and human observers. They can also predict optimal weights for cue integration for the inference of shape from texture. These weights are directly proportional to the information carried by each cue. The ideal observer model therefore predicts that the variance of subjects' responses in a psychophysical shape judgement task should reflect the statistical importance of individual texture cues. Our results show that human performance in shape judgements for a one-parameter family of parabolic cylinders is often better than what an ideal observer achieves using a density cue alone. Therefore other information, for example the compression cue, must be used by human observers. For the first time, such results have been obtained without recourse to the unnatural cue conflict paradigms used in previous experiments. The model makes further predictions for the perception of planar slanted surfaces in the case of wide field of view.
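The abstract's claims that cue information predicts both a variance lower bound and the optimal integration weights can be written down directly. This is a generic Fisher-information sketch for independent cues, not the paper's specific density/orientation derivation:

```python
def combined_variance_bound(informations):
    """Cramer-Rao style bound: Fisher informations from independent
    cues add, so the variance of the best combined estimate is bounded
    below by the reciprocal of the total information."""
    return 1.0 / sum(informations)

def optimal_weights(informations):
    """Optimal cue weights are directly proportional to the
    information carried by each cue."""
    total = sum(informations)
    return [i / total for i in informations]
```

On this account, human response variance below the density-only bound (as reported above) implies that additional information, such as the compression cue, must be contributing.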
47
Abstract
Global shape judgements were employed to examine the combination of stereopsis and shape-from-texture in the determination of three-dimensional shape. Adding textural variations to stereograms increased perceived depth. Thus, texture was not simply vetoed by the strong stereo cue. In experiments where the depth specified by texture was incongruent with that specified by stereo, the data were well described by a weighted linear combination rule. Although only a small weight was assigned to texture, this weight was somewhat greater at a farther viewing distance. This could be a consequence of the decreased reliability of stereopsis at far viewing distances.