1
Kemp JT, Cesanek E, Domini F. Perceiving depth from texture and disparity cues: Evidence for a non-probabilistic account of cue integration. J Vis 2023; 23:13. [PMID: 37486299 PMCID: PMC10382782 DOI: 10.1167/jov.23.7.13] [Received: 11/16/2022] [Accepted: 06/12/2023]
Abstract
Bayesian inference theories have been extensively used to model how the brain derives three-dimensional (3D) information from ambiguous visual input. In particular, the maximum likelihood estimation (MLE) model combines estimates from multiple depth cues according to their relative reliability to produce the most probable 3D interpretation. Here, we tested an alternative theory of cue integration, termed the intrinsic constraint (IC) theory, which postulates that the visual system derives the most stable, not most probable, interpretation of the visual input amid variations in viewing conditions. The vector sum model provides a normative approach for achieving this goal where individual cue estimates are components of a multidimensional vector whose norm determines the combined estimate. Individual cue estimates are not accurate but are related to distal 3D properties through a deterministic mapping. In three experiments, we show that the IC theory can more adeptly account for 3D cue integration than MLE models. In Experiment 1, we show systematic biases in the perception of depth from texture and depth from binocular disparity. Critically, we demonstrate that the vector sum model predicts an increase in perceived depth when these cues are combined. In Experiment 2, we illustrate the IC theory's radical reinterpretation of the just noticeable difference (JND) and test the related vector sum model prediction of the classic finding of smaller JNDs for combined-cue versus single-cue stimuli. In Experiment 3, we confirm the vector sum prediction that biases found in cue integration experiments cannot be attributed to flatness cues, as the MLE model predicts.
Affiliation(s)
- Jovan T Kemp
  - Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, USA
- Evan Cesanek
  - Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Fulvio Domini
  - Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, USA
  - Italian Institute of Technology, Rovereto, Italy

2
Domini F. The case against probabilistic inference: a new deterministic theory of 3D visual processing. Philos Trans R Soc Lond B Biol Sci 2023; 378:20210458. [PMID: 36511407 PMCID: PMC9745883 DOI: 10.1098/rstb.2021.0458]
Abstract
How the brain derives 3D information from inherently ambiguous visual input remains the fundamental question of human vision. The past two decades of research have addressed this question as a problem of probabilistic inference, the dominant model being maximum-likelihood estimation (MLE). This model assumes that independent depth-cue modules derive noisy but statistically accurate estimates of 3D scene parameters that are combined through a weighted average. Cue weights are adjusted based on the system's representation of each module's output variability. Here I demonstrate that the MLE model fails to account for important psychophysical findings and, importantly, misinterprets the just noticeable difference, a hallmark measure of stimulus discriminability, to be an estimate of perceptual uncertainty. I propose a new theory, termed Intrinsic Constraint, which postulates that the visual system does not derive the most probable interpretation of the visual input, but rather, the most stable interpretation amid variations in viewing conditions. This goal is achieved with the Vector Sum model, which represents individual cue estimates as components of a multi-dimensional vector whose norm determines the combined output. This model accounts for the psychophysical findings cited in support of MLE, while predicting existing and new findings that contradict the MLE model. This article is part of a discussion meeting issue 'New approaches to 3D vision'.
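As a rough illustration of the contrast drawn in this abstract, the sketch below places a reliability-weighted average next to a vector-norm combination. It is not the author's implementation: the function names, the inverse-variance weighting, the choice of the Euclidean norm, and the sample numbers are all assumptions made for the example, and the single-cue estimates are assumed to be expressed in the same units.

```python
import math


def mle_combine(estimates, sigmas):
    """Reliability-weighted average of single-cue estimates.

    Assumption: reliability is taken as inverse variance (1 / sigma**2),
    the usual convention for this kind of weighted average.
    """
    reliabilities = [1.0 / (s ** 2) for s in sigmas]
    total = sum(reliabilities)
    return sum(r * e for r, e in zip(reliabilities, estimates)) / total


def vector_sum_combine(estimates):
    """Combined output as the norm of the vector of single-cue estimates.

    Assumption: the Euclidean norm is used and each component is already
    scaled as the model requires; the abstract only says "norm".
    """
    return math.sqrt(sum(e ** 2 for e in estimates))


if __name__ == "__main__":
    # Hypothetical single-cue depth estimates (arbitrary units) and noise levels.
    texture, disparity = 3.0, 4.0
    sigma_texture, sigma_disparity = 2.0, 1.0

    print("weighted average:", mle_combine([texture, disparity],
                                           [sigma_texture, sigma_disparity]))
    print("vector norm:", vector_sum_combine([texture, disparity]))
```

With positive single-cue estimates, the weighted average always falls between the smallest and largest input, whereas the norm is at least as large as the largest input. That difference is why only the second rule yields more depth when a cue is added.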
Affiliation(s)
- Fulvio Domini
  - CLPS, Brown University, 190 Thayer Street Providence, Rhode Island 02912-9067, USA

3
Rigutti S, Stragà M, Jez M, Baldassi G, Carnaghi A, Miceu P, Fantoni C. Don't worry, be active: how to facilitate the detection of errors in immersive virtual environments. PeerJ 2018; 6:e5844. [PMID: 30397547 PMCID: PMC6211266 DOI: 10.7717/peerj.5844] [Received: 03/07/2018] [Accepted: 09/26/2018]
Abstract
The current research aims to study the link between the type of vision experienced in a collaborative immersive virtual environment (active vs. multiple passive), the type of error one looks for during a cooperative multi-user exploration of a design project (affordance vs. perceptual violations), and the type of setting in which the users perform (field in Experiment 1 vs. laboratory in Experiment 2). The relevance of this link is backed by the lack of conclusive evidence on an active vs. passive vision advantage in cooperative search tasks within software based on immersive virtual reality (IVR). Using a yoking paradigm based on the mixed usage of simultaneous active and multiple passive viewings, we found that the likelihood of error detection in a complex 3D environment was characterized by an active vs. multi-passive viewing advantage depending on: (1) the degree of knowledge dependence of the type of error the passive/active observers were looking for (low for perceptual violations vs. high for affordance violations), as the advantage tended to manifest itself irrespective of the setting for affordance, but not for perceptual violations; and (2) the degree of social desirability possibly induced by the setting in which the task was performed, as the advantage occurred irrespective of the type of error in the laboratory (Experiment 2) but not in the field (Experiment 1) setting. Results are relevant to future development of cooperative software based on IVR used for supporting the design review. A multi-user design review experience in which designers, engineers and end-users all cooperate actively within the IVR wearing their own head mounted display seems more suitable for the detection of relevant errors than standard systems characterized by a mixed usage of active and passive viewing.
Affiliation(s)
- Sara Rigutti
  - Department of Life Sciences, Psychology Unit "Gaetano Kanizsa", University of Trieste, Trieste, Italy
- Marta Stragà
  - Department of Life Sciences, Psychology Unit "Gaetano Kanizsa", University of Trieste, Trieste, Italy
- Marco Jez
  - Area Science Park, Arsenal S.r.L, Trieste, Italy
- Giulio Baldassi
  - Department of Life Sciences, Psychology Unit "Gaetano Kanizsa", University of Trieste, Trieste, Italy
- Andrea Carnaghi
  - Department of Life Sciences, Psychology Unit "Gaetano Kanizsa", University of Trieste, Trieste, Italy
- Piero Miceu
  - Area Science Park, Arsenal S.r.L, Trieste, Italy
- Carlo Fantoni
  - Department of Life Sciences, Psychology Unit "Gaetano Kanizsa", University of Trieste, Trieste, Italy

4
Derzsi Z, Volcic R. MOTOM toolbox: MOtion Tracking via Optotrak and Matlab. J Neurosci Methods 2018; 308:129-134. [DOI: 10.1016/j.jneumeth.2018.07.007] [Received: 03/04/2018] [Revised: 07/06/2018] [Accepted: 07/07/2018]

5
Fantoni C, Caudek C, Domini F. Perceived surface slant is systematically biased in the actively-generated optic flow. PLoS One 2012; 7:e33911. [PMID: 22479473 PMCID: PMC3316515 DOI: 10.1371/journal.pone.0033911] [Received: 10/27/2011] [Accepted: 02/19/2012]
Abstract
Humans make systematic errors in the 3D interpretation of the optic flow in both passive and active vision. These systematic distortions can be predicted by a biologically-inspired model which disregards self-motion information resulting from head movements (Caudek, Fantoni, & Domini 2011). Here, we tested two predictions of this model: (1) A plane that is stationary in an earth-fixed reference frame will be perceived as changing its slant if the movement of the observer's head causes a variation of the optic flow; (2) a surface that rotates in an earth-fixed reference frame will be perceived to be stationary, if the surface rotation is appropriately yoked to the head movement so as to generate a variation of the surface slant but not of the optic flow. Both predictions were corroborated by two experiments in which observers judged the perceived slant of a random-dot planar surface during egomotion. We found qualitatively similar biases for monocular and binocular viewing of the simulated surfaces, although, in principle, the simultaneous presence of disparity and motion cues allows for a veridical recovery of surface slant.
Affiliation(s)
- Carlo Fantoni
  - Center for Neuroscience and Cognitive Systems@UniTn, Istituto Italiano di Tecnologia, Rovereto, Italy.

6
Domini F, Shah R, Caudek C. Do we perceive a flattened world on the monitor screen? Acta Psychol (Amst) 2011; 138:359-66. [PMID: 21986481 DOI: 10.1016/j.actpsy.2011.07.007] [Received: 03/04/2011] [Revised: 07/27/2011] [Accepted: 07/29/2011]
Abstract
The current model of three-dimensional perception hypothesizes that the brain integrates the depth cues in a statistically optimal fashion through a weighted linear combination with weights proportional to the reliabilities obtained for each cue in isolation (Landy, Maloney, Johnston, & Young, 1995). Even though many investigations support this theoretical framework, some recent empirical findings are at odds with this view (e.g., Domini, Caudek, & Tassinari, 2006). Failures of linear cue integration have been attributed to cue-conflict and to unmodelled cues to flatness present in computer-generated displays. We describe two cue-combination experiments designed to test the integration of stereo and motion cues, in the presence of consistent or conflicting blur and accommodation information (i.e., when flatness cues are either absent, with physical stimuli, or present, with computer-generated displays). In both conditions, we replicated the results of Domini et al. (2006): The amount of perceived depth increased as more cues were available, also producing an over-estimation of depth in some conditions. These results can be explained by the Intrinsic Constraint model, but not by linear cue combination.
Affiliation(s)
- Fulvio Domini
  - Department of Cognitive, Linguistic & Psychological Sciences, Brown University, Providence, RI 02912, USA.

7
Abstract
Our vision remains stable even though the movements of our eyes, head and bodies create a motion pattern on the retina. One of the most important, yet basic, feats of the visual system is to correctly determine whether this retinal motion is owing to real movement in the world or rather our own self-movement. This problem has occupied many great thinkers, such as Descartes and Helmholtz, at least since the time of Alhazen. This theme issue brings together leading researchers from animal neurophysiology, clinical neurology, psychophysics and cognitive neuroscience to summarize the state of the art in the study of visual stability. Recently, there has been significant progress in understanding the limits of visual stability in humans and in identifying many of the brain circuits involved in maintaining a stable percept of the world. Clinical studies and new experimental methods, such as transcranial magnetic stimulation, now make it possible to test the causal role of different brain regions in creating visual stability and also allow us to measure the consequences when the mechanisms of visual stability break down.
Affiliation(s)
- David Melcher
  - Faculty of Cognitive Science, University of Trento, Italy.

8
Caudek C, Fantoni C, Domini F. Bayesian modeling of perceived surface slant from actively-generated and passively-observed optic flow. PLoS One 2011; 6:e18731. [PMID: 21533197 PMCID: PMC3077406 DOI: 10.1371/journal.pone.0018731] [Received: 11/18/2010] [Accepted: 03/11/2011]
Abstract
We measured perceived depth from the optic flow (a) when showing a stationary physical or virtual object to observers who moved their head at a normal or slower speed, and (b) when simulating the same optic flow on a computer and presenting it to stationary observers. Our results show that perceived surface slant is systematically distorted, for both the active and the passive viewing of physical or virtual surfaces. These distortions are modulated by head translation speed, with perceived slant increasing directly with the local velocity gradient of the optic flow. This empirical result allows us to determine the relative merits of two alternative approaches aimed at explaining perceived surface slant in active vision: an "inverse optics" model that takes head motion information into account, and a probabilistic model that ignores extra-retinal signals. We compare these two approaches within the framework of Bayesian theory. The "inverse optics" Bayesian model produces veridical slant estimates if the optic flow and the head translation velocity are measured with no error; because of the influence of a "prior" for flatness, the slant estimates become systematically biased as the measurement errors increase. The Bayesian model, which ignores the observer's motion, always produces distorted estimates of surface slant. Interestingly, the predictions of this second model, not those of the first one, are consistent with our empirical findings. The present results suggest that (a) in active vision perceived surface slant may be the product of probabilistic processes which do not guarantee the correct solution, and (b) extra-retinal signals may be mainly used for a better measurement of retinal information.
Affiliation(s)
- Corrado Caudek
  - Department of Psychology, Università degli Studi di Firenze, Firenze, Italy.

9
Integration of disparity and velocity information for haptic and perceptual judgments of object depth. Acta Psychol (Amst) 2011; 136:300-10. [PMID: 21237442 DOI: 10.1016/j.actpsy.2010.12.003] [Received: 03/16/2010] [Revised: 12/09/2010] [Accepted: 12/10/2010]
Abstract
Do reach-to-grasp (prehension) movements require a metric representation of three-dimensional (3D) layouts and objects? We propose a model relying only on direct sensory information to account for the planning and execution of prehension movements in the absence of haptic feedback and when the hand is not visible. In the present investigation, we isolate relative motion and binocular disparity information from other depth cues and we study their efficacy for reach-to-grasp movements and visual judgments. We show that (i) the amplitude of the grasp increases when relative motion is added to binocular disparity information, even if depth from disparity information is already veridical, and (ii) similar distortions of derived depth are found for haptic tasks and perceptual judgments. With a quantitative test, we demonstrate that our results are consistent with the Intrinsic Constraint model and do not require 3D metric inferences (Domini, Caudek, & Tassinari, 2006). By contrast, the linear cue integration model (Landy, Maloney, Johnston, & Young, 1995) cannot explain the present results, even if the flatness cues are taken into account.

10
Jain A, Backus BT. Experience affects the use of ego-motion signals during 3D shape perception. J Vis 2010; 10:10.14.30. [PMID: 21191132 DOI: 10.1167/10.14.30]
Abstract
Experience has long-term effects on perceptual appearance (Q. Haijiang, J. A. Saunders, R. W. Stone, & B. T. Backus, 2006). We asked whether experience affects the appearance of structure-from-motion stimuli when the optic flow is caused by observer ego-motion. Optic flow is an ambiguous depth cue: a rotating object and its oppositely rotating, depth-inverted dual generate similar flow. However, the visual system exploits ego-motion signals to prefer the percept of an object that is stationary over one that rotates (M. Wexler, F. Panerai, I. Lamouret, & J. Droulez, 2001). We replicated this finding and asked whether this preference for stationarity, the "stationarity prior," is modulated by experience. During training, two groups of observers were exposed to objects with identical flow, but that were either stationary or moving as determined by other cues. The training caused identical test stimuli to be seen preferentially as stationary or moving by the two groups, respectively. We then asked whether different priors can exist independently at different locations in the visual field. Observers were trained to see objects either as stationary or as moving at two different locations. Observers' stationarity bias at the two respective locations was modulated in the directions consistent with training. Thus, the utilization of extraretinal ego-motion signals for disambiguating optic flow signals can be updated as the result of experience, consistent with the updating of a Bayesian prior for stationarity.
Affiliation(s)
- Anshul Jain
  - SUNY Eye Institute and Graduate Center for Vision Research, SUNY College of Optometry, New York, NY 10036, USA.

11
Di Luca M, Domini F, Caudek C. Inconsistency of perceived 3D shape. Vision Res 2010; 50:1519-31. [PMID: 20470815 DOI: 10.1016/j.visres.2010.05.006] [Received: 02/05/2010] [Revised: 05/05/2010] [Accepted: 05/05/2010]
Abstract
Internal consistency of local depth, slant, and curvature judgments was studied by asking participants to match two 3D surfaces rendered by different mixtures of 3D cues (velocity, texture, and shading). We found that perceptual judgments were not consistent with each other, with cue-specific distortions. Adding multiple cues did not eliminate the inconsistencies of the judgments. These results can be predicted by the Intrinsic Constraint (IC) model according to which the perceptual metric local estimates are a monotonically increasing function of the Signal-to-Noise Ratio of the optimal combination of direct information of 3D shape (Domini, Caudek, & Tassinari, 2006).
Affiliation(s)
- M Di Luca
  - Max Planck Institute for Biological Cybernetics, Tuebingen, Germany