1. Herrera-Esposito D, Burge J. Optimal Estimation of Local Motion-in-Depth with Naturalistic Stimuli. J Neurosci 2025;45:e0490242024. PMID: 39592236; PMCID: PMC11841760; DOI: 10.1523/jneurosci.0490-24.2024.
Abstract
Estimating the motion of objects in depth is important for behavior and is strongly supported by binocular visual cues. To understand both how the brain should estimate motion in depth and how natural constraints shape and limit performance in two local 3D motion tasks, we develop image-computable ideal observers from a large number of binocular video clips created from a dataset of natural images. The observers spatiotemporally filter the videos and nonlinearly decode 3D motion from the filter responses. The optimal filters and decoder are dictated by the task-relevant image statistics and are specific to each task. Multiple findings emerge. First, two distinct filter subpopulations are spontaneously learned for each task. For 3D speed estimation, filters emerge for processing either changing disparities over time or interocular velocity differences, cues that are used by humans. For 3D direction estimation, filters emerge for discriminating either left-right or toward-away motion. Second, the filter responses, conditioned on the latent variable, are well-described as jointly Gaussian, and the covariance of the filter responses carries the information about the task-relevant latent variable. Quadratic combination is thus necessary for optimal decoding, which can be implemented by biologically plausible neural computations. Finally, the ideal observer yields nonobvious, and in some cases counterintuitive, patterns of performance like those exhibited by humans. Important characteristics of human 3D motion processing and estimation may therefore result from optimal information processing in the early visual system.
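The key computational claim in this abstract (filter responses that are conditionally Gaussian with zero mean, where the response covariance rather than the mean carries the task-relevant information, so that optimal decoding requires quadratic combination of the responses) can be illustrated with a minimal sketch. All numbers, labels, and function names below are hypothetical illustrations, not values from the paper:

```python
import math

def log_likelihood(r, C):
    """Log-likelihood of a zero-mean bivariate Gaussian with covariance C,
    evaluated at the filter-response vector r = (r1, r2)."""
    (a, b), (c, d) = C
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]  # 2x2 matrix inverse
    quad = sum(r[i] * inv[i][j] * r[j] for i in range(2) for j in range(2))
    return -0.5 * quad - 0.5 * math.log((2 * math.pi) ** 2 * det)

def decode(r, class_covs):
    """Pick the latent value whose covariance best explains r. With zero
    means, the decision depends only on the quadratic form r C^-1 r,
    i.e., on quadratic combination of the filter responses."""
    return max(class_covs, key=lambda k: log_likelihood(r, class_covs[k]))

# Hypothetical covariances: one latent value yields weakly correlated
# filter responses, the other strongly correlated ones.
covs = {"weak": [[1.0, 0.1], [0.1, 1.0]],
        "strong": [[1.0, 0.9], [0.9, 1.0]]}

print(decode((1.2, 1.1), covs))   # correlated responses -> "strong"
print(decode((1.2, -1.1), covs))  # anti-correlated responses -> "weak"
```

A linear readout of the responses could not make this distinction, since both classes share the same (zero) mean; the information lives entirely in the second-order statistics, which is why a quadratic decoder is needed.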
Affiliation(s)
- Johannes Burge
- Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania 19104
- Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania 19104
- Bioengineering Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania 19104
2. Wen P, Landy MS, Rokers B. Identifying cortical areas that underlie the transformation from 2D retinal to 3D head-centric motion signals. Neuroimage 2023;270:119909. PMID: 36801370; PMCID: PMC10061442; DOI: 10.1016/j.neuroimage.2023.119909.
Abstract
Accurate motion perception requires that the visual system integrate the 2D retinal motion signals received by the two eyes into a single representation of 3D motion. However, most experimental paradigms present the same stimulus to the two eyes, signaling motion limited to a 2D fronto-parallel plane. Such paradigms are unable to dissociate the representation of 3D head-centric motion signals (i.e., 3D object motion relative to the observer) from the associated 2D retinal motion signals. Here, we used stereoscopic displays to present separate motion signals to the two eyes and examined their representation in visual cortex using fMRI. Specifically, we presented random-dot motion stimuli that specified various 3D head-centric motion directions. We also presented control stimuli, which matched the motion energy of the retinal signals but were inconsistent with any 3D motion direction. We decoded motion direction from BOLD activity using a probabilistic decoding algorithm. We found that 3D motion direction signals can be reliably decoded in three major clusters in the human visual system. Critically, in early visual cortex (V1-V3), we found no significant difference in decoding performance between stimuli specifying 3D motion directions and the control stimuli, suggesting that these areas represent the 2D retinal motion signals rather than 3D head-centric motion itself. In voxels in and surrounding hMT and IPS0, however, decoding performance was consistently superior for stimuli that specified 3D motion directions compared to control stimuli. Our results reveal the parts of the visual processing hierarchy that are critical for the transformation of retinal into 3D head-centric motion signals and suggest a role for IPS0 in their representation, in addition to its sensitivity to 3D object structure and static depth.
Affiliation(s)
- Puti Wen
- Psychology, New York University Abu Dhabi, United Arab Emirates
- Michael S Landy
- Department of Psychology and Center for Neural Science, New York University, United States
- Bas Rokers
- Psychology, New York University Abu Dhabi, United Arab Emirates; Department of Psychology and Center for Neural Science, New York University, United States
3. Whritner JA, Czuba TB, Cormack LK, Huk AC. Spatiotemporal integration of isolated binocular three-dimensional motion cues. J Vis 2021;21:2. PMID: 34468705; PMCID: PMC8419873; DOI: 10.1167/jov.21.10.2.
Abstract
Two primary binocular cues, based either on the velocities seen by the two eyes or on temporal changes in binocular disparity, support the perception of three-dimensional (3D) motion. Although these cues support 3D motion perception in different perceptual tasks or regimes, stimulus cross-cue contamination and/or substantial differences in spatiotemporal structure have complicated interpretations. We introduce novel psychophysical stimuli that cleanly isolate the cues, based on a design introduced in oculomotor work (Sheliga, Quaia, FitzGibbon, & Cumming, 2016). We then use these stimuli to characterize and compare the temporal and spatial integration properties of velocity- and disparity-based mechanisms. On average, temporal integration of velocity-based cues progressed more than twice as quickly as that of disparity-based cues; performance in each pure-cue condition saturated at approximately 200 ms and approximately 500 ms, respectively. This temporal distinction suggests that disparity-based 3D direction judgments may include a post-sensory stage involving additional integration time in some observers, whereas velocity-based judgments are rapid and seem to be more purely sensory in nature. Thus, these two binocular mechanisms appear to support 3D motion perception with distinct temporal properties, reflecting differential mixtures of sensory and decision contributions. Spatial integration profiles for the two mechanisms were similar, and on the scale of receptive fields in area MT. Consistent with prior work, there were substantial individual differences, which we interpret as both sensory and cognitive variations across subjects, further clarifying the case for distinct sets of both cue-specific sensory and cognitive mechanisms. The pure-cue stimuli presented here lay the groundwork for further investigations of velocity- and disparity-based contributions to 3D motion perception.
Affiliation(s)
- Jake A Whritner
- Center for Perceptual Systems, Department of Psychology, The University of Texas at Austin, Austin, TX, USA
- Thaddeus B Czuba
- Center for Perceptual Systems, Department of Psychology, The University of Texas at Austin, Austin, TX, USA
- Lawrence K Cormack
- Center for Perceptual Systems, Department of Psychology, The University of Texas at Austin, Austin, TX, USA
- Alexander C Huk
- Center for Perceptual Systems, Departments of Neuroscience & Psychology, The University of Texas at Austin, Austin, TX, USA
4. Himmelberg MM, Segala FG, Maloney RT, Harris JM, Wade AR. Decoding Neural Responses to Motion-in-Depth Using EEG. Front Neurosci 2020;14:581706. PMID: 33362456; PMCID: PMC7758252; DOI: 10.3389/fnins.2020.581706.
Abstract
Two stereoscopic cues that underlie the perception of motion-in-depth (MID) are changes in retinal disparity over time (CD) and interocular velocity differences (IOVD). These cues have independent spatiotemporal sensitivity profiles, depend upon different low-level stimulus properties, and are potentially processed along separate cortical pathways. Here, we ask whether these MID cues code for different motion directions: do they give rise to discriminable patterns of neural signals, and is there evidence for their convergence onto a single "motion-in-depth" pathway? To answer this, we use a decoding algorithm to test whether, and when, patterns of electroencephalogram (EEG) signals measured across the full scalp, generated in response to CD- and IOVD-isolating stimuli moving toward or away in depth, can be distinguished. We find that both MID cue type and 3D-motion direction can be decoded at different points in the EEG timecourse and that direction decoding cannot be accounted for by static disparity information. Remarkably, we find evidence for late processing convergence: IOVD motion direction can be decoded relatively late in the timecourse based on a decoder trained on CD stimuli, and vice versa. We conclude that early CD and IOVD direction decoding performance is dependent upon fundamentally different low-level stimulus features, but that later stages of decoding performance may be driven by a central, shared pathway that is agnostic to these features. Overall, these data are the first to show that neural responses to CD and IOVD cues that move toward and away in depth can be decoded from EEG signals, and that different aspects of MID cues contribute to decoding performance at different points along the EEG timecourse.
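The cross-cue decoding logic this abstract relies on (train a decoder on responses to CD stimuli, test it on responses to IOVD stimuli) can be sketched with a toy nearest-centroid classifier. The two-channel patterns and labels below are invented for illustration; the study used full-scalp EEG patterns and a different decoding algorithm:

```python
import math

def centroids(X, y):
    """Mean response pattern per direction label."""
    sums, counts = {}, {}
    for x, lbl in zip(X, y):
        counts[lbl] = counts.get(lbl, 0) + 1
        sums[lbl] = [s + v for s, v in zip(sums.get(lbl, [0.0] * len(x)), x)]
    return {lbl: [s / counts[lbl] for s in sums[lbl]] for lbl in sums}

def classify(x, cents):
    """Assign x to the label with the nearest centroid."""
    return min(cents, key=lambda lbl: math.dist(x, cents[lbl]))

# Toy "scalp" patterns (2 channels per trial); values are illustrative only.
cd_X = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
cd_y = ["toward", "toward", "away", "away"]
iovd_X = [[0.8, 0.3], [0.3, 0.8]]
iovd_y = ["toward", "away"]

cents = centroids(cd_X, cd_y)                       # train on CD trials
predictions = [classify(x, cents) for x in iovd_X]  # test on IOVD trials
accuracy = sum(p == t for p, t in zip(predictions, iovd_y)) / len(iovd_y)
print(accuracy)  # 1.0: the direction pattern transfers across cue types
```

Above-chance transfer in a scheme like this is what licenses an inference of a shared, cue-agnostic stage: a decoder never shown IOVD responses can still read motion direction from them.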
Affiliation(s)
- Marc M Himmelberg
- Department of Psychology, University of York, York, United Kingdom; Department of Psychology, New York University, New York, NY, United States
- Ryan T Maloney
- Department of Psychology, University of York, York, United Kingdom
- Julie M Harris
- School of Psychology and Neuroscience, University of St. Andrews, Fife, United Kingdom
- Alex R Wade
- Department of Psychology, University of York, York, United Kingdom; York Biomedical Research Institute, University of York, York, United Kingdom
5. Héjja-Brichard Y, Rima S, Rapha E, Durand JB, Cottereau BR. Stereomotion Processing in the Nonhuman Primate Brain. Cereb Cortex 2020;30:4528-4543. PMID: 32227117; DOI: 10.1093/cercor/bhaa055.
Abstract
The cortical areas that process disparity-defined motion-in-depth (i.e., cyclopean stereomotion [CSM]) were characterized with functional magnetic resonance imaging (fMRI) in two awake, behaving macaques. The experimental protocol was similar to previous human neuroimaging studies. We contrasted the responses to dynamic random-dot patterns that continuously changed their binocular disparity over time with those to a control condition that shared the same properties, except that the temporal frames were shuffled. A whole-brain voxel-wise analysis revealed that in all four cortical hemispheres, three areas showed consistent sensitivity to CSM. Two of them were localized respectively in the lower bank of the superior temporal sulcus (CSMSTS) and on the neighboring infero-temporal gyrus (CSMITG). The third area was situated in the posterior parietal cortex (CSMPPC). Additional region-of-interest (ROI) analyses within retinotopic areas defined in both animals indicated weaker but significant responses to CSM within the MT cluster (most notably in areas MSTv and FST). Altogether, our results are in agreement with previous findings in both humans and macaques and suggest that the cortical areas that process CSM are relatively well preserved between the two primate species.
Affiliation(s)
- Yseult Héjja-Brichard
- Centre de Recherche Cerveau et Cognition, Université de Toulouse, 31052 Toulouse, France; Centre National de la Recherche Scientifique, 31055 Toulouse, France
- Samy Rima
- Centre de Recherche Cerveau et Cognition, Université de Toulouse, 31052 Toulouse, France; Centre National de la Recherche Scientifique, 31055 Toulouse, France
- Emilie Rapha
- Centre de Recherche Cerveau et Cognition, Université de Toulouse, 31052 Toulouse, France; Centre National de la Recherche Scientifique, 31055 Toulouse, France
- Jean-Baptiste Durand
- Centre de Recherche Cerveau et Cognition, Université de Toulouse, 31052 Toulouse, France; Centre National de la Recherche Scientifique, 31055 Toulouse, France
- Benoit R Cottereau
- Centre de Recherche Cerveau et Cognition, Université de Toulouse, 31052 Toulouse, France; Centre National de la Recherche Scientifique, 31055 Toulouse, France
| |
Collapse
|
6. Kaestner M, Maloney RT, Wailes-Newson KH, Bloj M, Harris JM, Morland AB, Wade AR. Asymmetries between achromatic and chromatic extraction of 3D motion signals. Proc Natl Acad Sci U S A 2019;116:13631-13640. PMID: 31209058; PMCID: PMC6612918; DOI: 10.1073/pnas.1817202116.
Abstract
Motion in depth (MID) can be cued by high-resolution changes in binocular disparity over time (CD) and by low-resolution interocular velocity differences (IOVD). Computational differences between these two mechanisms suggest that they may be implemented in visual pathways with different spatial and temporal resolutions. Here, we used fMRI to examine how achromatic and S-cone signals contribute to human MID perception. Both CD and IOVD stimuli evoked responses in a widespread network that included early visual areas, parts of the dorsal and ventral streams, and motion-selective area hMT+. Crucially, however, we measured an interaction between MID type and chromaticity. fMRI CD responses were largely driven by achromatic stimuli, but IOVD responses were better driven by isoluminant S-cone inputs. In our psychophysical experiments, when S-cone and achromatic stimuli were matched for perceived contrast, participants were equally sensitive to the MID in achromatic and S-cone IOVD stimuli. In comparison, they were relatively insensitive to S-cone CD. These findings provide evidence that MID mechanisms asymmetrically draw on information in precortical pathways. An early opponent motion signal optimally conveyed by the S-cone pathway may provide a substantial contribution to the IOVD mechanism.
Affiliation(s)
- Milena Kaestner
- Department of Psychology, University of York, YO10 5DD York, United Kingdom
- York Neuroimaging Centre, University of York, YO10 5DD York, United Kingdom
- Ryan T Maloney
- Department of Psychology, University of York, YO10 5DD York, United Kingdom
- York Neuroimaging Centre, University of York, YO10 5DD York, United Kingdom
- Kirstie H Wailes-Newson
- Department of Psychology, University of York, YO10 5DD York, United Kingdom
- York Neuroimaging Centre, University of York, YO10 5DD York, United Kingdom
- Marina Bloj
- School of Optometry and Vision Sciences, University of Bradford, BD7 1DP Bradford, United Kingdom
- Julie M Harris
- School of Psychology and Neuroscience, University of St. Andrews, KY16 9JP St. Andrews, United Kingdom
- Antony B Morland
- Department of Psychology, University of York, YO10 5DD York, United Kingdom
- York Neuroimaging Centre, University of York, YO10 5DD York, United Kingdom
- York Biomedical Research Institute, University of York, YO10 5DD York, United Kingdom
- Alex R Wade
- Department of Psychology, University of York, YO10 5DD York, United Kingdom
- York Neuroimaging Centre, University of York, YO10 5DD York, United Kingdom
- York Biomedical Research Institute, University of York, YO10 5DD York, United Kingdom
7. Joo SJ, Greer DA, Cormack LK, Huk AC. Eye-specific pattern-motion signals support the perception of three-dimensional motion. J Vis 2019;19:27. PMID: 31013523; PMCID: PMC6482860; DOI: 10.1167/19.4.27.
Abstract
An object moving through three-dimensional (3D) space typically yields different patterns of velocities in each eye. For an interocular velocity difference cue to be used, some instances of real 3D motion in the environment (e.g., when a moving object is partially occluded) would require an interocular velocity difference computation that operates on motion signals that are not only monocular (or eye specific) but also depend on each eye's two-dimensional (2D) direction being estimated over regions larger than the size of V1 receptive fields (i.e., global pattern motion). We investigated this possibility using 3D motion aftereffects (MAEs) with stimuli comprising many small, drifting Gabor elements. Conventional frontoparallel (2D) MAEs were local—highly sensitive to the test elements being in the same locations as the adaptor (Experiment 1). In contrast, 3D MAEs were robust to the test elements being in different retinal locations than the adaptor, indicating that 3D motion processing involves relatively global spatial pooling of motion signals (Experiment 2). The 3D MAEs were strong even when the local elements were in unmatched locations across the two eyes during adaptation, as well as when the adapting stimulus elements were randomly oriented, and specified global motion via the intersection of constraints (Experiment 3). These results bolster the notion of eye-specific computation of 2D pattern motion (involving global pooling of local, eye-specific motion signals) for the purpose of computing 3D motion, and highlight the idea that classically “late” computations such as pattern motion can be done in a manner that retains information about the eye of origin.
Affiliation(s)
- Sung Jun Joo
- Department of Psychology, Pusan National University, Busan, Republic of Korea
- Devon A Greer
- Center for Perceptual Systems, The University of Texas at Austin, Austin, TX, USA
- Lawrence K Cormack
- Center for Perceptual Systems, The University of Texas at Austin, Austin, TX, USA; Department of Psychology, The University of Texas at Austin, Austin, TX, USA
- Alexander C Huk
- Center for Perceptual Systems, The University of Texas at Austin, Austin, TX, USA; Department of Psychology, The University of Texas at Austin, Austin, TX, USA; Department of Neuroscience, The University of Texas at Austin, Austin, TX, USA
8. Thompson L, Ji M, Rokers B, Rosenberg A. Contributions of binocular and monocular cues to motion-in-depth perception. J Vis 2019;19:2. PMID: 30836382; PMCID: PMC6402382; DOI: 10.1167/19.3.2.
Abstract
Intercepting and avoiding moving objects requires accurate motion-in-depth (MID) perception. Such motion can be estimated based on both binocular and monocular cues. Because previous studies largely characterized sensitivity to these cues individually, their relative contributions to MID perception remain unclear. Here we measured sensitivity to binocular, monocular, and combined cue MID stimuli using a motion coherence paradigm. We first confirmed prior reports of substantial variability in binocular MID cue sensitivity across the visual field. The stimuli were matched for eccentricity and speed, suggesting that this variability has a neural basis. Second, we determined that monocular MID cue sensitivity also varied considerably across the visual field. A major component of this variability was geometric: An MID stimulus produces the largest motion signals in the eye contralateral to its visual field location. This resulted in better monocular discrimination performance when the contralateral rather than ipsilateral eye was stimulated. Third, we found that monocular cue sensitivity generally exceeded, and was independent of, binocular cue sensitivity. Finally, contralateral monocular cue sensitivity was found to be a strong predictor of combined cue sensitivity. These results reveal distinct factors constraining the contributions of binocular and monocular cues to three-dimensional motion perception.
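The geometric component of the monocular asymmetry described above (an approaching object off the midline sweeps through a larger retinal angle in the contralateral eye) follows directly from viewing geometry and can be checked numerically. The interocular distance and viewing distances below are assumed values for illustration, not parameters from the study:

```python
import math

def retinal_sweep(eye_x, obj_x, z_start, z_end):
    """Change in horizontal retinal angle (radians) for an eye at lateral
    position eye_x, as a point at lateral position obj_x approaches
    from depth z_start to z_end (all positions in cm)."""
    angle = lambda z: math.atan2(obj_x - eye_x, z)
    return abs(angle(z_end) - angle(z_start))

IOD = 6.4  # assumed interocular distance, cm
left_eye, right_eye = -IOD / 2, IOD / 2

# A point in the RIGHT visual field (10 cm right of the midline)
# approaching the observer from 100 cm to 90 cm.
sweep_contra = retinal_sweep(left_eye, 10.0, 100.0, 90.0)  # left eye
sweep_ipsi = retinal_sweep(right_eye, 10.0, 100.0, 90.0)   # right eye
print(sweep_contra > sweep_ipsi)  # True: contralateral eye sees more motion
```

Because the contralateral eye views the point at a larger lateral offset, the same change in depth produces roughly twice the angular sweep in this configuration, consistent with the better contralateral-eye discrimination the authors report.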
Affiliation(s)
- Lowell Thompson
- Department of Psychology, University of Wisconsin–Madison, Madison, WI, USA; Department of Neuroscience, School of Medicine and Public Health, University of Wisconsin–Madison, Madison, WI, USA
- Mohan Ji
- Department of Psychology, University of Wisconsin–Madison, Madison, WI, USA
- Bas Rokers
- Department of Psychology, University of Wisconsin–Madison, Madison, WI, USA
- Ari Rosenberg
- Department of Neuroscience, School of Medicine and Public Health, University of Wisconsin–Madison, Madison, WI, USA
9. Maloney RT, Kaestner M, Bruce A, Bloj M, Harris JM, Wade AR. Sensitivity to Velocity- and Disparity-Based Cues to Motion-In-Depth With and Without Spared Stereopsis in Binocular Visual Impairment. Invest Ophthalmol Vis Sci 2018;59:4375-4383. PMID: 30193309; DOI: 10.1167/iovs.17-23692.
Abstract
Purpose: Two binocular sources of information serve motion-in-depth (MID) perception: changes in disparity over time (CD), and interocular velocity differences (IOVD). While CD requires the computation of small spatial disparities, IOVD could be computed from a much lower-resolution signal. IOVD signals therefore might still be available under conditions of binocular vision impairment (BVI) with limited or no stereopsis, for example, amblyopia.
Methods: Sensitivity to CD and IOVD was measured in adults who had undergone therapy to correct optical misalignment or amblyopia in childhood (n = 16), as well as in normal-vision controls with good stereoacuity (n = 8). Observers discriminated the interval containing a smoothly oscillating MID "test" stimulus from a "control" stimulus in a two-interval forced-choice paradigm.
Results: Of the BVI observers with no static stereoacuity (n = 9), one displayed evidence for sensitivity to IOVD only; otherwise there was no sensitivity for either CD or IOVD in the group. BVI observers with measurable stereoacuity (n = 7) generally displayed a pattern resembling the control group, showing similar sensitivity for both cues. A neutral density filter placed in front of the fixing eye in a subset of BVI observers did not improve performance.
Conclusions: In one BVI observer there was preserved sensitivity to IOVD but not CD, though overall only those BVI observers with at least gross stereopsis were able to detect disparity- or velocity-based cues to MID. The results imply that these logically distinct information sources are somehow coupled, and that in some cases BVI observers with no stereopsis may still retain sensitivity to IOVD.
Affiliation(s)
- Ryan T Maloney
- Department of Psychology, The University of York, York, United Kingdom
- Milena Kaestner
- Department of Psychology, The University of York, York, United Kingdom
- Alison Bruce
- Bradford Institute for Health Research, Bradford Teaching Hospitals NHS Foundation Trust, Bradford, United Kingdom
- Marina Bloj
- School of Optometry and Vision Science, University of Bradford, Bradford, United Kingdom
- Julie M Harris
- School of Psychology and Neuroscience, University of St. Andrews, St. Andrews, United Kingdom
- Alex R Wade
- Department of Psychology, The University of York, York, United Kingdom
10.
Abstract
The visual system must recover important properties of the external environment if its host is to survive. Because the retinae are effectively two-dimensional but the world is three-dimensional (3D), the patterns of stimulation both within and across the eyes must be used to infer the distal stimulus (the environment) in all three dimensions. Moreover, animals and elements in the environment move, which means the input contains rich temporal information. Here, in addition to reviewing the literature, we discuss how and why prior work has focused on purported isolated systems (e.g., stereopsis) or cues (e.g., horizontal disparity) that do not necessarily map elegantly on to the computations and complex patterns of stimulation that arise when visual systems operate within the real world. We thus also introduce the binoptic flow field (BFF) as a description of the 3D motion information available in realistic environments, which can foster the use of ecologically valid yet well-controlled stimuli. Further, it can help clarify how future studies can more directly focus on the computations and stimulus properties the visual system might use to support perception and behavior in a dynamic 3D world.
Affiliation(s)
- Jonas Knöll
- The University of Texas at Austin, Texas 78757