1. White DN, Burge J. How distinct sources of nuisance variability in natural images and scenes limit human stereopsis. PLoS Comput Biol 2025;21:e1012945. PMID: 40233309. DOI: 10.1371/journal.pcbi.1012945.
Abstract
Stimulus variability, a form of nuisance variability, is a primary source of perceptual uncertainty in everyday natural tasks. How do different properties of natural images and scenes contribute to this uncertainty? Using binocular disparity as a model system, we report a systematic investigation of how various forms of natural stimulus variability impact performance in a stereo-depth discrimination task. With stimuli sampled from a stereo-image database of real-world scenes having pixel-by-pixel ground-truth distance data, three human observers completed two closely related double-pass psychophysical experiments. In the two experiments, each human observer responded twice to ten thousand unique trials, in which twenty thousand unique stimuli were presented. New analytical methods reveal, from these data, the specific and nearly dissociable effects of two distinct sources of natural stimulus variability (variation in luminance-contrast patterns and variation in local-depth structure) on discrimination performance, as well as the relative importance of stimulus-driven variability and internal noise in determining performance limits. Between-observer analyses show that both stimulus-driven sources of uncertainty are responsible for a large proportion of total variance, have strikingly similar effects on different people, and, surprisingly, make stimulus-by-stimulus responses more predictable (not less). The consistency across observers raises the intriguing prospect that image-computable models can make reasonably accurate performance predictions in natural viewing. Overall, the findings provide a rich picture of the stimulus factors that contribute to human perceptual performance in natural scenes. The approach should have broad application to other animal models and other sensory-perceptual tasks with natural or naturalistic stimuli.
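
The double-pass logic admits a compact illustration. Below is a minimal sketch (an illustrative simplification, not the authors' analysis: it assumes one continuous response per stimulus rather than binary choices) of how between-pass repeatability splits total response variance into stimulus-driven and internal-noise components.

```python
import numpy as np

def double_pass_variance_split(pass1, pass2):
    """Split response variance into stimulus-driven and internal-noise parts.

    pass1, pass2: one continuous response per stimulus, collected in two
    passes through the same stimuli. Stimulus effects repeat across passes;
    internal noise does not, so the between-pass correlation estimates the
    fraction of variance that is stimulus-driven.
    """
    rho = np.corrcoef(pass1, pass2)[0, 1]
    total_var = 0.5 * (np.var(pass1) + np.var(pass2))
    return rho * total_var, (1.0 - rho) * total_var  # (stimulus, internal)

# Demo with synthetic data: 10,000 stimuli, equal stimulus and noise variance.
rng = np.random.default_rng(0)
stim_effect = rng.normal(size=10_000)
r1 = stim_effect + rng.normal(size=10_000)
r2 = stim_effect + rng.normal(size=10_000)
print(double_pass_variance_split(r1, r2))  # roughly (1.0, 1.0)
```
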
Affiliation(s)
- David N White
- Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Department of Electrical Engineering & Computer Science, York University, Toronto, Ontario, Canada
- Johannes Burge
- Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America

2. Herrera-Esposito D, Burge J. Optimal Estimation of Local Motion-in-Depth with Naturalistic Stimuli. J Neurosci 2025;45:e0490242024. PMID: 39592236. PMCID: PMC11841760. DOI: 10.1523/jneurosci.0490-24.2024.
Abstract
Estimating the motion of objects in depth is important for behavior and is strongly supported by binocular visual cues. To understand both how the brain should estimate motion in depth and how natural constraints shape and limit performance in two local 3D motion tasks, we develop image-computable ideal observers from a large number of binocular video clips created from a dataset of natural images. The observers spatiotemporally filter the videos and nonlinearly decode 3D motion from the filter responses. The optimal filters and decoder are dictated by the task-relevant image statistics and are specific to each task. Multiple findings emerge. First, two distinct filter subpopulations are spontaneously learned for each task. For 3D speed estimation, filters emerge for processing either changing disparities over time or interocular velocity differences, cues that are used by humans. For 3D direction estimation, filters emerge for discriminating either left-right or toward-away motion. Second, the filter responses, conditioned on the latent variable, are well-described as jointly Gaussian, and the covariance of the filter responses carries the information about the task-relevant latent variable. Quadratic combination is thus necessary for optimal decoding, which can be implemented by biologically plausible neural computations. Finally, the ideal observer yields nonobvious, and in some cases counterintuitive, patterns of performance like those exhibited by humans. Important characteristics of human 3D motion processing and estimation may therefore result from optimal information processing in the early visual system.
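
The decoding step described above, zero-mean jointly Gaussian filter responses whose covariance carries the information, can be sketched directly; the covariance matrices `covs` are hypothetical stand-ins for the task-trained model's conditional covariances.

```python
import numpy as np

def quadratic_log_likelihoods(r, covs):
    """Log-likelihood of a filter-response vector r under each latent value,
    assuming responses are zero-mean jointly Gaussian with latent-dependent
    covariance. Because only the covariance changes, the optimal decoder is
    quadratic in r."""
    out = []
    for C in covs:
        _, logdet = np.linalg.slogdet(C)
        quad = r @ np.linalg.solve(C, r)
        out.append(-0.5 * (quad + logdet + r.size * np.log(2 * np.pi)))
    return np.array(out)

# Maximum-likelihood readout over a grid of latent values:
# est = latent_grid[np.argmax(quadratic_log_likelihoods(r, covs))]
```
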
Affiliation(s)
- Johannes Burge
- Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania 19104
- Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania 19104
- Bioengineering Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania 19104

3. Ni L, Burge J. Feature-specific divisive normalization improves natural image encoding for depth perception. bioRxiv [preprint] 2024:2024.09.05.611536. PMID: 39345647. PMCID: PMC11429615. DOI: 10.1101/2024.09.05.611536.
Abstract
Vision science and visual neuroscience seek to understand how stimulus and sensor properties limit the precision with which behaviorally relevant latent variables are encoded and decoded. In the primate visual system, binocular disparity (the canonical cue for stereo-depth perception) is initially encoded by a set of binocular receptive fields with a range of spatial frequency preferences. Here, with a stereo-image database having ground-truth disparity information at each pixel, we examine how response normalization and receptive field properties determine the fidelity with which binocular disparity is encoded in natural scenes. We quantify encoding fidelity by computing the Fisher information carried by the normalized receptive field responses. Several findings emerge from an analysis of the response statistics. First, broadband (or feature-unspecific) normalization yields Laplace-distributed receptive field responses, and narrowband (or feature-specific) normalization yields Gaussian-distributed receptive field responses. Second, the Fisher information in narrowband-normalized responses is larger than in broadband-normalized responses by a scale factor that grows with population size. Third, the most useful spatial frequency decreases with stimulus size, and the range of spatial frequencies that is useful for encoding a given disparity decreases with disparity magnitude, consistent with neurophysiological findings. Fourth, the predicted patterns of psychophysical performance and absolute detection thresholds match human performance with natural and artificial stimuli. The current computational efforts establish a new functional role for response normalization, and bring us closer to understanding the principles that should govern the design of neural systems that support perception in natural scenes.
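
A toy rendering of the broadband-versus-narrowband distinction (the filter set, band grouping, and normalization pools here are placeholder assumptions, not the paper's exact scheme):

```python
import numpy as np

def normalize(drive, stimulus, mode="broadband", band_index=None, eps=1e-8):
    """Divisive normalization of linear receptive-field drives.

    broadband  (feature-unspecific): divide every drive by global stimulus energy.
    narrowband (feature-specific): divide each drive by the pooled energy of
    drives sharing its frequency band; band_index labels each filter's band.
    """
    drive = np.asarray(drive, dtype=float)
    if mode == "broadband":
        return drive / (np.linalg.norm(stimulus) + eps)
    out = np.empty_like(drive)
    for b in np.unique(band_index):
        sel = np.asarray(band_index) == b
        out[sel] = drive[sel] / (np.linalg.norm(drive[sel]) + eps)
    return out
```
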
Affiliation(s)
- Long Ni
- Department of Psychology, University of Pennsylvania, Philadelphia, PA
- Johannes Burge
- Department of Psychology, University of Pennsylvania, Philadelphia, PA
- Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, PA
- Bioengineering Graduate Group, University of Pennsylvania, Philadelphia, PA

4. Clark DA, Fitzgerald JE. Optimization in Visual Motion Estimation. Annu Rev Vis Sci 2024;10:23-46. PMID: 38663426. PMCID: PMC11998607. DOI: 10.1146/annurev-vision-101623-025432.
Abstract
Sighted animals use visual signals to discern directional motion in their environment. Motion is not directly detected by visual neurons; it must instead be computed from light signals that vary over space and time. This makes visual motion estimation a near-universal neural computation, and decades of research have revealed much about the algorithms and mechanisms that generate directional signals. The idea that sensory systems are optimized for performance in natural environments has deeply impacted this research. In this article, we review the many ways that optimization has been used to quantitatively model visual motion estimation and reveal its underlying principles. We emphasize that no single optimization theory has dominated the literature. Instead, researchers have adeptly incorporated different computational demands and biological constraints that are pertinent to the specific brain system and animal model under study. The successes and failures of the resulting optimization models have thereby provided insights into how computational demands and biological constraints together shape neural computation.
Affiliation(s)
- Damon A Clark
- Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, Connecticut, USA
- James E Fitzgerald
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, Virginia, USA
- Department of Neurobiology, Northwestern University, Evanston, Illinois, USA

5. Burge J, Cormack LK. Continuous psychophysics shows millisecond-scale visual processing delays are faithfully preserved in movement dynamics. J Vis 2024;24(5):4. PMID: 38722274. PMCID: PMC11094763. DOI: 10.1167/jov.24.5.4.
Abstract
Image differences between the eyes can cause interocular discrepancies in the speed of visual processing. Millisecond-scale differences in visual processing speed can cause dramatic misperceptions of the depth and three-dimensional direction of moving objects. Here, we develop a monocular and binocular continuous target-tracking psychophysics paradigm that can quantify such tiny differences in visual processing speed. Human observers continuously tracked a target undergoing Brownian motion with a range of luminance levels in each eye. Suitable analyses recover the time course of the visuomotor response in each condition, the dependence of visual processing speed on luminance level, and the temporal evolution of processing differences between the eyes. Importantly, using a direct within-observer comparison, we show that continuous target-tracking and traditional forced-choice psychophysical methods provide estimates of interocular delays that agree on average to within a fraction of a millisecond. Thus, visual processing delays are preserved in the movement dynamics of the hand. Finally, we show analytically, and partially confirm experimentally, that differences between the temporal impulse response functions in the two eyes predict how lateral target motion causes misperceptions of motion in depth and associated tracking responses. Because continuous target tracking can accurately recover millisecond-scale differences in visual processing speed and has multiple advantages over traditional psychophysics, it should facilitate the study of temporal processing in the future.
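
The core analysis can be sketched as a velocity cross-correlogram: for a Brownian target, position differences (velocities) are white, so cross-correlating target velocity with cursor velocity traces out the visuomotor impulse response. The function below is an illustrative sketch under that assumption, not the authors' code.

```python
import numpy as np

def tracking_impulse_response(target_pos, cursor_pos, max_lag):
    """Estimate the visuomotor impulse response from continuous tracking.

    Differencing position gives velocity; for a Brownian target, velocity is
    white, so the cross-correlation of target velocity with cursor velocity
    at positive lags is proportional to the impulse response."""
    tv = np.diff(np.asarray(target_pos, dtype=float))
    cv = np.diff(np.asarray(cursor_pos, dtype=float))
    tv -= tv.mean()
    cv -= cv.mean()
    n = tv.size
    return np.array([tv[: n - k] @ cv[k:] / (n - k) for k in range(max_lag)])
```
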
Affiliation(s)
- Johannes Burge
- Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, PA, USA
- Bioengineering Graduate Group, University of Pennsylvania, Philadelphia, PA, USA
- Lawrence K Cormack
- Department of Psychology, University of Texas at Austin, Austin, TX, USA
- Center for Perceptual Systems, University of Texas at Austin, Austin, TX, USA
- Institute for Neuroscience, University of Texas at Austin, Austin, TX, USA

6. Burge J, Burge T. Shape, perspective, and what is and is not perceived: Comment on Morales, Bax, and Firestone (2020). Psychol Rev 2023;130:1125-1136. PMID: 35549319. PMCID: PMC11366222. DOI: 10.1037/rev0000363.
Abstract
Psychology and philosophy have long reflected on the role of perspective in vision. Since the dawn of modern vision science (roughly, since Helmholtz in the late 1800s), scientific explanations in vision have focused on understanding the computations that transform the sensed retinal image into percepts of the three-dimensional environment. The standard view in the science is that distal properties, both viewpoint-independent properties of the environment (object shape) and viewpoint-dependent relational properties (3D orientation relative to the viewer), are perceptually represented, and that properties of the proximal stimulus (in vision, the retinal image) are not. This view is woven into the nature of scientific explanation in perceptual psychology, and has guided impressive advances over the past 150 years. A recently published article suggests that in shape perception, the standard view must be revised. It argues, on the basis of new empirical data, that a new entity, perspectival shape, should be introduced into scientific explanations of shape perception. Specifically, the article's centrally advertised claim is that, in addition to distal shape, perspectival shape is perceived. We argue that this claim rests on a series of mistakes. Problems in experimental design entail that the article provides no empirical support for any claims regarding either perspective or the perception of shape. There are further problems in scientific reasoning and conceptual development. Detailing these criticisms and explaining how science treats these issues are meant to clarify method and theory, and to improve exchanges between the science and philosophy of perception.
Affiliation(s)
- Johannes Burge
- Department of Psychology, University of Pennsylvania
- Neuroscience Graduate Group, University of Pennsylvania
- Bioengineering Graduate Group, University of Pennsylvania
- Tyler Burge
- Department of Philosophy, University of California, Los Angeles

7. Scaliti E, Pullar K, Borghini G, Cavallo A, Panzeri S, Becchio C. Kinematic priming of action predictions. Curr Biol 2023:S0960-9822(23)00687-5. PMID: 37339628. DOI: 10.1016/j.cub.2023.05.055.
Abstract
The ability to anticipate what others will do next is crucial for navigating social, interactive environments. Here, we develop an experimental and analytical framework to measure the implicit readout of prospective intention information from movement kinematics. Using a primed action categorization task, we first demonstrate implicit access to intention information by establishing a novel form of priming, which we term kinematic priming: subtle differences in movement kinematics prime action prediction. Next, using data collected from the same participants in a forced-choice intention discrimination task 1 h later, we quantify single-trial intention readout (the amount of intention information read by individual perceivers in individual kinematic primes) and assess whether it can be used to predict the amount of kinematic priming. We demonstrate that the amount of kinematic priming, as indexed by both response times (RTs) and initial fixations to a given probe, is directly proportional to the amount of intention information read by the individual perceiver at the single-trial level. These results demonstrate that human perceivers have rapid, implicit access to intention information encoded in movement kinematics and highlight the potential of our approach to reveal the computations that permit the readout of this information with single-subject, single-trial resolution.
Affiliation(s)
- Eugenio Scaliti
- Center for Human Technologies, Fondazione Istituto Italiano di Tecnologia, Via Enrico Melen, 83, 16152 Genova, Italy; Department of Neurology, University Medical Center Hamburg-Eppendorf (UKE), Martinistrasse 52, 20246 Hamburg, Germany
- Kiri Pullar
- Center for Human Technologies, Fondazione Istituto Italiano di Tecnologia, Via Enrico Melen, 83, 16152 Genova, Italy
- Giulia Borghini
- Center for Human Technologies, Fondazione Istituto Italiano di Tecnologia, Via Enrico Melen, 83, 16152 Genova, Italy
- Andrea Cavallo
- Center for Human Technologies, Fondazione Istituto Italiano di Tecnologia, Via Enrico Melen, 83, 16152 Genova, Italy; Department of Psychology, Università degli Studi di Torino, Via Giuseppe Verdi, 10, 10124 Torino, Italy
- Stefano Panzeri
- Center for Human Technologies, Fondazione Istituto Italiano di Tecnologia, Via Enrico Melen, 83, 16152 Genova, Italy; Department of Excellence for Neural Information Processing, Center for Molecular Neurobiology (ZMNH), University Medical Center Hamburg-Eppendorf (UKE), Falkenried 94, 20251 Hamburg, Germany.
- Cristina Becchio
- Center for Human Technologies, Fondazione Istituto Italiano di Tecnologia, Via Enrico Melen, 83, 16152 Genova, Italy; Department of Neurology, University Medical Center Hamburg-Eppendorf (UKE), Martinistrasse 52, 20246 Hamburg, Germany.

8. Lieber JD, Lee GM, Majaj NJ, Movshon JA. Sensitivity to naturalistic texture relies primarily on high spatial frequencies. J Vis 2023;23(2):4. PMID: 36745452. PMCID: PMC9910384. DOI: 10.1167/jov.23.2.4.
Abstract
Natural images contain information at multiple spatial scales. Though we understand how early visual mechanisms split multiscale images into distinct spatial frequency channels, we do not know how the outputs of these channels are processed further by mid-level visual mechanisms. We have recently developed a texture discrimination task that uses synthetic, multiscale, "naturalistic" textures to isolate these mid-level mechanisms. Here, we use three experimental manipulations (image blur, image rescaling, and eccentric viewing) to show that perceptual sensitivity to naturalistic structure is strongly dependent on features at high object spatial frequencies (measured in cycles/image). As a result, sensitivity depends on a texture acuity limit, a property of the visual system that sets the highest retinal spatial frequency (measured in cycles/degree) at which observers can detect naturalistic features. A model observer analysis of the texture images shows that naturalistic image features at high object spatial frequencies carry more task-relevant information than those at low object spatial frequencies. That is, the dependence of sensitivity on high object spatial frequencies is a property of the texture images, rather than a property of the visual system. Accordingly, we find human observers' ability to extract naturalistic information (their efficiency) is similar for all object spatial frequencies. We conclude that the mid-level mechanisms that underlie perceptual sensitivity effectively extract information from all image features below the texture acuity limit, regardless of their retinal and object spatial frequency.
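
The object-versus-retinal spatial-frequency bookkeeping behind these claims reduces to a single conversion. The sketch below uses made-up example numbers; the acuity value it alludes to is hypothetical, not a figure from the paper.

```python
def retinal_sf(object_sf_cpi, image_size_deg):
    """Convert object spatial frequency (cycles/image) to retinal spatial
    frequency (cycles/degree) for an image subtending image_size_deg."""
    return object_sf_cpi / image_size_deg

# A 32 cycles/image feature in a 4-deg image lands at 8 cycles/deg on the
# retina; shrinking the image to 1 deg pushes it to 32 cycles/deg, where it
# may exceed the observer's texture acuity limit and stop contributing.
print(retinal_sf(32, 4.0), retinal_sf(32, 1.0))  # 8.0 32.0
```
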
Affiliation(s)
- Justin D Lieber
- Center for Neural Science, New York University, New York, NY, USA
- Gerick M Lee
- Center for Neural Science, New York University, New York, NY, USA
- Najib J Majaj
- Center for Neural Science, New York University, New York, NY, USA

9. Chin BM, Burge J. Perceptual consequences of interocular differences in the duration of temporal integration. J Vis 2022;22(12):12. PMID: 36355360. PMCID: PMC9652723. DOI: 10.1167/jov.22.12.12.
Abstract
Temporal differences in visual information processing between the eyes can cause dramatic misperceptions of motion and depth. Processing delays between the eyes cause the Pulfrich effect: oscillating targets in the frontal plane are misperceived as moving along near-elliptical motion trajectories in depth (Pulfrich, 1922). Here, we explain a previously reported but poorly understood variant: the anomalous Pulfrich effect. When this variant is perceived, the illusory motion trajectory appears oriented left- or right-side back in depth, rather than aligned with the true direction of motion. Our data indicate that this perceived misalignment is due to interocular differences in neural temporal integration periods, as opposed to interocular differences in delay. For oscillating motion, differences in the duration of temporal integration dampen the effective motion amplitude in one eye relative to the other. In a dynamic analog of the geometric effect in stereo-surface-orientation perception (Ogle, 1950), the different motion amplitudes cause the perceived misorientation of the motion trajectories. Forced-choice psychophysical experiments, conducted with both different spatial frequencies and different onscreen motion damping in the two eyes, show that the perceived misorientation in depth is associated with the eye having greater motion damping. A target-tracking experiment provided more direct evidence that the anomalous Pulfrich effect is caused by interocular differences in temporal integration and delay. These findings highlight the computational hurdles posed to the visual system by temporal differences in sensory processing. Future work will explore how the visual system overcomes these challenges to achieve accurate perception.
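
The proposed mechanism, longer temporal integration in one eye damping that eye's effective motion amplitude, can be illustrated with boxcar integration of sinusoidal motion. The boxcar window and the parameter values are assumptions for illustration only.

```python
import numpy as np

def effective_eye_position(t, amp, freq, integration_s):
    """Effective position of a sinusoidally moving target after boxcar
    temporal integration: averaging sin(2*pi*f*t) over a window T damps
    the amplitude by sinc(f*T) and delays it by T/2."""
    gain = np.sinc(freq * integration_s)  # numpy sinc(x) = sin(pi x)/(pi x)
    return amp * gain * np.sin(2 * np.pi * freq * (t - integration_s / 2))

# Different integration durations in the two eyes yield different effective
# amplitudes (and delays); the interocular position difference is the neural
# disparity that drives the misoriented-in-depth percept.
t = np.linspace(0, 1, 1000)
left = effective_eye_position(t, amp=1.0, freq=1.0, integration_s=0.02)
right = effective_eye_position(t, amp=1.0, freq=1.0, integration_s=0.10)
disparity = left - right
```
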
Affiliation(s)
- Benjamin M Chin
- Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- Johannes Burge
- Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, PA, USA
- Bioengineering Graduate Group, University of Pennsylvania, Philadelphia, PA, USA

10. Singh V, Burge J, Brainard DH. Equivalent noise characterization of human lightness constancy. J Vis 2022;22(5):2. PMID: 35394508. PMCID: PMC8994201. DOI: 10.1167/jov.22.5.2.
Abstract
A goal of visual perception is to provide stable representations of task-relevant scene properties (e.g. object reflectance) despite variation in task-irrelevant scene properties (e.g. illumination and reflectance of other nearby objects). To study such stability in the context of the perceptual representation of lightness, we introduce a threshold-based psychophysical paradigm. We measure how thresholds for discriminating the achromatic reflectance of a target object (task-relevant property) in rendered naturalistic scenes are impacted by variation in the reflectance functions of background objects (task-irrelevant property), using a two-alternative forced-choice paradigm in which the reflectance of the background objects is randomized across the two intervals of each trial. We control the amount of background reflectance variation by manipulating a statistical model of naturally occurring surface reflectances. For low background object reflectance variation, discrimination thresholds were nearly constant, indicating that observers' internal noise determines threshold in this regime. As background object reflectance variation increases, its effects start to dominate performance. A model based on signal detection theory allows us to express the effects of task-irrelevant variation in terms of the equivalent noise, that is, relative to the intrinsic precision of the task-relevant perceptual representation. The results indicate that although naturally occurring background object reflectance variation does intrude on the perceptual representation of target object lightness, the effect is modest: within a factor of two of the equivalent noise level set by internal noise.
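
The signal-detection account maps onto the standard equivalent-noise formula. A minimal sketch, with an arbitrary sensitivity constant k and illustrative (not fitted) noise values:

```python
import numpy as np

def equivalent_noise_threshold(sigma_ext, sigma_int, k=1.0):
    """Equivalent-noise model: threshold is flat while internal noise
    dominates and rises once external (task-irrelevant) variation takes over.
    T(sigma_ext) = k * sqrt(sigma_int**2 + sigma_ext**2)"""
    return k * np.sqrt(sigma_int ** 2 + np.asarray(sigma_ext, dtype=float) ** 2)

# The "equivalent noise" is the external variation at which threshold exceeds
# its internal-noise floor by sqrt(2), i.e. sigma_ext == sigma_int.
print(equivalent_noise_threshold([0.0, 1.0, 4.0], sigma_int=1.0))
```
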
Affiliation(s)
- Vijay Singh
- Department of Physics, North Carolina Agricultural and Technical State University, Greensboro, NC, USA
- Computational Neuroscience Initiative, University of Pennsylvania, Philadelphia, PA, USA
- Johannes Burge
- Computational Neuroscience Initiative, University of Pennsylvania, Philadelphia, PA, USA
- Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, PA, USA
- Bioengineering Graduate Group, University of Pennsylvania, Philadelphia, PA, USA
- David H Brainard
- Computational Neuroscience Initiative, University of Pennsylvania, Philadelphia, PA, USA
- Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, PA, USA
- Bioengineering Graduate Group, University of Pennsylvania, Philadelphia, PA, USA

11. Zhang LQ, Cottaris NP, Brainard DH. An image reconstruction framework for characterizing initial visual encoding. eLife 2022;11:e71132. PMID: 35037622. PMCID: PMC8846596. DOI: 10.7554/elife.71132.
Abstract
We developed an image-computable observer model of the initial visual encoding that operates on natural image input, based on the framework of Bayesian image reconstruction from the excitations of the retinal cone mosaic. Our model extends previous work on ideal observer analysis and evaluation of performance beyond psychophysical discrimination, takes into account the statistical regularities of the visual environment, and provides a unifying framework for answering a wide range of questions regarding the visual front end. Using the error in the reconstructions as a metric, we analyzed variation in the number of different photoreceptor types on the human retina as an optimal design problem. In addition, the reconstructions allow both visualization and quantification of information loss due to physiological optics and cone mosaic sampling, and how these vary with eccentricity. Furthermore, in simulations of color deficiencies and interferometric experiments, we found that the reconstructed images provide a reasonable proxy for modeling subjects' percepts. Lastly, we used the reconstruction-based observer for the analysis of psychophysical thresholds, and found notable interactions between spatial frequency and chromatic direction in the resulting spatial contrast sensitivity function. Our method is widely applicable to experiments and applications in which the initial visual encoding plays an important role.
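
A linear-Gaussian caricature of the reconstruction step helps fix ideas: the actual model uses a retinal encoding simulation, Poisson-like cone noise, and a natural-image prior, whereas the sketch below assumes Gaussian noise and a Gaussian prior so the MAP estimate has a closed form.

```python
import numpy as np

def map_reconstruction(render, excitations, noise_prec=1.0, prior_prec=0.1):
    """MAP image estimate from cone excitations under a linear-Gaussian model.

    render: (n_cones, n_pixels) matrix mapping an image to mean excitations.
    Maximizing log-likelihood + log-prior gives a ridge-regression-style
    closed form: x = (s R'R + p I)^(-1) s R'y."""
    n_pix = render.shape[1]
    A = noise_prec * (render.T @ render) + prior_prec * np.eye(n_pix)
    b = noise_prec * (render.T @ excitations)
    return np.linalg.solve(A, b)
```
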
Affiliation(s)
- Ling-Qi Zhang
- Department of Psychology, University of Pennsylvania, Philadelphia, United States
- Nicolas P Cottaris
- Department of Psychology, University of Pennsylvania, Philadelphia, United States
- David H Brainard
- Department of Psychology, University of Pennsylvania, Philadelphia, United States

12. Human visual motion perception shows hallmarks of Bayesian structural inference. Sci Rep 2021;11:3714. PMID: 33580096. PMCID: PMC7881251. DOI: 10.1038/s41598-021-82175-7.
Abstract
Motion relations in visual scenes carry an abundance of behaviorally relevant information, but little is known about how humans identify the structure underlying a scene's motion in the first place. We studied the computations governing human motion structure identification in two psychophysics experiments and found that perception of motion relations showed hallmarks of Bayesian structural inference. At the heart of our research lies a tractable task design that enabled us to reveal the signatures of probabilistic reasoning about latent structure. We found that a choice model based on the task's Bayesian ideal observer accurately matched many facets of human structural inference, including task performance, perceptual error patterns, single-trial responses, participant-specific differences, and subjective decision confidence, especially when motion scenes were ambiguous and when object motion was hierarchically nested within other moving reference frames. Our work can guide future neuroscience experiments to reveal the neural mechanisms underlying higher-level visual motion perception.
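
At its simplest, Bayesian structural inference is posterior model comparison over a small set of candidate motion structures. The sketch below assumes the per-structure log-likelihoods are already computed (in the paper they come from the generative motion model):

```python
import numpy as np

def structure_posterior(log_liks, log_priors):
    """Posterior over candidate motion structures S given observed motion x:
    p(S|x) is proportional to p(x|S) p(S), computed stably in log space."""
    lp = np.asarray(log_liks, dtype=float) + np.asarray(log_priors, dtype=float)
    lp -= lp.max()            # guard against numerical underflow
    p = np.exp(lp)
    return p / p.sum()
```
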

13. Rodriguez-Lopez V, Dorronsoro C, Burge J. Contact lenses, the reverse Pulfrich effect, and anti-Pulfrich monovision corrections. Sci Rep 2020;10:16086. PMID: 32999323. PMCID: PMC7527565. DOI: 10.1038/s41598-020-71395-y.
Abstract
Interocular differences in image blur can cause processing speed differences that lead to dramatic misperceptions of the distance and three-dimensional direction of moving objects. This recently discovered illusion (the reverse Pulfrich effect) is caused by optical conditions induced by monovision, a common correction for presbyopia. Fortunately, anti-Pulfrich monovision corrections, which darken the blurring lens, can eliminate the illusion for many viewing conditions. However, the reverse Pulfrich effect and the efficacy of anti-Pulfrich corrections have been demonstrated only with trial lenses. This situation should be addressed for clinical and scientific reasons. First, it is important to replicate these effects with contact lenses, the most common method for delivering monovision. Second, trial lenses of different powers, unlike contacts, can cause large magnification differences between the eyes. To confidently attribute the reverse Pulfrich effect to interocular optical blur differences, and to ensure that previously reported effect sizes are reliable, one must control for magnification. Here, in a within-observer study with five separate experiments, we demonstrate that (1) contact lenses and trial lenses induce indistinguishable reverse Pulfrich effects, (2) anti-Pulfrich corrections are equally effective when induced by contact and trial lenses, and (3) magnification differences do not cause or impact the Pulfrich effect.
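
The size of the (classic or reverse) Pulfrich illusion follows from simple geometry: a processing delay in one eye pairs each image of a moving target with a spatially offset image from the other eye. The helper and example numbers below are illustrative.

```python
def pulfrich_disparity_deg(velocity_deg_per_s, delay_ms):
    """Neural disparity from an interocular processing delay: a target moving
    laterally at v is effectively offset between the eyes by v * dt."""
    return velocity_deg_per_s * (delay_ms / 1000.0)

# Example: 8 deg/s lateral motion with a 4 ms interocular delay yields a
# 0.032 deg (~1.9 arcmin) disparity, easily large enough to see in depth.
print(pulfrich_disparity_deg(8.0, 4.0))  # 0.032
```
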
Affiliation(s)
- Victor Rodriguez-Lopez
- Institute of Optics, Spanish National Research Council (IO-CSIC), Madrid, Spain
- Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- Carlos Dorronsoro
- Institute of Optics, Spanish National Research Council (IO-CSIC), Madrid, Spain
- 2Eyes Vision SL, Madrid, Spain
- Johannes Burge
- Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, PA, USA
- Bioengineering Graduate Group, University of Pennsylvania, Philadelphia, PA, USA

14. Kim S, Burge J. Natural scene statistics predict how humans pool information across space in surface tilt estimation. PLoS Comput Biol 2020;16:e1007947. PMID: 32579559. PMCID: PMC7340327. DOI: 10.1371/journal.pcbi.1007947.
Abstract
Visual systems estimate the three-dimensional (3D) structure of scenes from information in two-dimensional (2D) retinal images. Visual systems use multiple sources of information to improve the accuracy of these estimates, including statistical knowledge of the probable spatial arrangements of natural scenes. Here, we examine how 3D surface tilts are spatially related in real-world scenes, and show that humans pool information across space when estimating surface tilt in accordance with these spatial relationships. We develop a hierarchical model of surface tilt estimation that is grounded in the statistics of tilt in natural scenes and images. The model computes a global tilt estimate by pooling local tilt estimates within an adaptive spatial neighborhood. The spatial neighborhood in which local estimates are pooled changes according to the value of the local estimate at a target location. The hierarchical model provides more accurate estimates of ground-truth tilt in natural scenes and provides a better account of human performance than the local estimates. Taken together, the results imply that the human visual system pools information about surface tilt across space in accordance with natural scene statistics.

Visual systems estimate three-dimensional (3D) properties of scenes from two-dimensional images on the retinas. To solve this difficult problem as accurately as possible, visual systems use many available sources of information, including information about how the 3D properties of the world are spatially arranged. This manuscript reports a systematic analysis of 3D surface tilt in natural scenes, a model of surface tilt estimation that makes use of these scene statistics, and human psychophysical data on the estimation of surface tilt from natural images. The results show that the regularities present in the natural environment predict both how to maximize the accuracy of tilt estimation and how to maximize the prediction of human performance. This work contributes to a growing line of work that establishes links between rigorous measurements of natural scenes and the function of sensory and perceptual systems.
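
A stripped-down stand-in for the pooling stage: the paper's neighborhood is adaptive (it changes with the local estimate), whereas the sketch below takes fixed weights as input and treats tilt as a circular variable.

```python
import numpy as np

def pool_tilt(local_tilts_deg, weights=None):
    """Weighted circular mean of local surface-tilt estimates, a fixed-weight
    stand-in for pooling local estimates into a global tilt estimate."""
    a = np.deg2rad(np.asarray(local_tilts_deg, dtype=float))
    w = np.ones_like(a) if weights is None else np.asarray(weights, dtype=float)
    mean = np.arctan2((w * np.sin(a)).sum(), (w * np.cos(a)).sum())
    return np.rad2deg(mean) % 360.0
```
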
Affiliation(s)
- Seha Kim
- Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Johannes Burge
- Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Bioengineering Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America

15. Burge J. Image-computable ideal observers for tasks with natural stimuli. Annu Rev Vis Sci 2020;6:491-517.
Abstract
An ideal observer is a theoretical model observer that performs a specific sensory-perceptual task optimally, making the best possible use of the available information given physical and biological constraints. An image-computable ideal observer (pixels in, estimates out) is a particularly powerful type of ideal observer that explicitly models the flow of visual information from the stimulus-encoding process to the eventual decoding of a sensory-perceptual estimate. Image-computable ideal observer analyses underlie some of the most important results in vision science. However, most of what we know from ideal observers about visual processing and performance derives from relatively simple tasks and relatively simple stimuli. This review describes recent efforts to develop image-computable ideal observers for a range of tasks with natural stimuli and shows how these observers can be used to predict and understand perceptual and neurophysiological performance. The reviewed results establish principled links among models of neural coding, computational methods for dimensionality reduction, and sensory-perceptual performance in tasks with natural stimuli.
Affiliation(s)
- Johannes Burge
- Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Bioengineering Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA