1. Scott H, Murphy AJ, Briggs F, Snyder AC. Using Generative Models of Naturalistic Scenes to Sample Neural Population Tuning Manifolds. Eur J Neurosci 2025; 61:e70088. PMID: 40162802. DOI: 10.1111/ejn.70088.
Abstract
Investigations into sensory coding in the visual system have typically relied on either simple, unnatural visual stimuli or natural images. Simple stimuli, such as Gabor patches, have been effective for studying single neurons in early visual areas such as V1 but seldom produce large responses from mid-level visual neurons or from neural populations with diverse tuning. Many types of "naturalistic" image models have been developed recently, which bridge the gap between overly simple stimuli and experimentally infeasible natural images. These stimuli can vary along a large number of feature dimensions, introducing new challenges when mapping those features to neural activity. This "curse of dimensionality" is exacerbated when neural responses are themselves high dimensional, such as when recording neural populations with implanted multielectrode arrays. We propose a method that searches high-dimensional stimulus spaces to characterize neural population manifolds in a closed-loop experimental design. In each block, stimuli were generated with a deep neural network, using neural responses to previous stimuli to predict the relationship between the latent space of the image model and neural responses. We found that the latent variables of the deep generative image model exhibited stronger linear relationships with neural activity than various alternative forms of image compression. This result reinforces the potential of deep generative image models for efficient characterization of high-dimensional tuning manifolds in visual neural populations.
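The closed-loop recipe this abstract describes — fit a simple map from generative-model latents to population responses, then use that map to choose the next block's stimuli — has a compact core. Below is a toy sketch with random stand-ins, not the authors' model or code; all names, dimensions, and the "strongest-direction" selection rule are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (not the authors' data): Z holds latent vectors of stimuli
# already shown; R holds population responses, simulated here as a noisy
# linear function of the latents.
n_stim, n_latent, n_neurons = 200, 8, 30
Z = rng.normal(size=(n_stim, n_latent))
W_true = rng.normal(size=(n_latent, n_neurons))        # hidden tuning map
R = Z @ W_true + 0.5 * rng.normal(size=(n_stim, n_neurons))

# Closed-loop block update: fit a linear map from latents to responses
# using the data collected so far.
W_hat, *_ = np.linalg.lstsq(Z, R, rcond=None)

# Propose the next stimulus along the latent direction predicted to drive
# the population most strongly (one simple selection heuristic).
drive = np.linalg.norm(W_hat, axis=1)                  # predicted drive per latent dim
next_z = np.zeros(n_latent)
next_z[np.argmax(drive)] = 2.0                         # probe that direction
print(next_z)
```

In the experiment proper, `next_z` would be decoded back into an image by the generative model and shown in the following block; here it simply illustrates the fit-then-select loop.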
Affiliation(s)
- Hayden Scott
- Brain and Cognitive Sciences, University of Rochester, Rochester, New York, USA
- Center for Visual Science, University of Rochester, Rochester, New York, USA
- Allison J Murphy
- Center for Visual Science, University of Rochester, Rochester, New York, USA
- Neuroscience, University of Rochester Medical Center, Rochester, New York, USA
- Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, Maryland, USA
- Farran Briggs
- Brain and Cognitive Sciences, University of Rochester, Rochester, New York, USA
- Center for Visual Science, University of Rochester, Rochester, New York, USA
- Neuroscience, University of Rochester Medical Center, Rochester, New York, USA
- Laboratory of Sensorimotor Research, National Eye Institute, Bethesda, Maryland, USA
- Adam C Snyder
- Brain and Cognitive Sciences, University of Rochester, Rochester, New York, USA
- Center for Visual Science, University of Rochester, Rochester, New York, USA
- Neuroscience, University of Rochester Medical Center, Rochester, New York, USA
2. Srinath R, Ni AM, Marucci C, Cohen MR, Brainard DH. Orthogonal neural representations support perceptual judgments of natural stimuli. Sci Rep 2025; 15:5316. PMID: 39939679. PMCID: PMC11821992. DOI: 10.1038/s41598-025-88910-8.
Abstract
In natural visually guided behavior, observers must separate relevant information from a barrage of irrelevant information. Many studies have investigated the neural underpinnings of this ability using artificial stimuli presented on blank backgrounds. Natural images, however, contain task-irrelevant background elements that might interfere with the perception of object features. Recent studies suggest that visual feature estimation can be modeled through the linear decoding of task-relevant information from visual cortex. So, if the representations of task-relevant and irrelevant features are not orthogonal in the neural population, then variation in the task-irrelevant features would impair task performance. We tested this hypothesis using human psychophysics and monkey neurophysiology combined with parametrically variable naturalistic stimuli. We demonstrate that (1) the neural representation of one feature (the position of an object) in visual area V4 is orthogonal to those of several background features, (2) the ability of human observers to precisely judge object position was largely unaffected by those background features, and (3) many features of the object and the background (and of objects from a separate stimulus set) are orthogonally represented in V4 neural population responses. Our observations are consistent with the hypothesis that orthogonal neural representations can support stable perception of object features despite the richness of natural visual scenes.
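The orthogonality argument above has a simple linear-algebra core: if the population axis carrying the task-relevant feature is orthogonal to the axis along which an irrelevant feature varies, a fixed linear readout of the relevant feature is untouched by irrelevant variation. A minimal numerical sketch (hypothetical encoding axes, not the recorded V4 data):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical population: each neuron's response is a linear mix of a
# task-relevant feature (object position) and an irrelevant background
# feature, along two fixed population axes.
n_neurons = 50
w_pos = rng.normal(size=n_neurons)            # axis encoding object position

# Construct a background axis orthogonal to the position axis (Gram-Schmidt).
w_bg = rng.normal(size=n_neurons)
w_bg -= ((w_bg @ w_pos) / (w_pos @ w_pos)) * w_pos
assert abs(w_bg @ w_pos) < 1e-9               # axes are orthogonal

def population_response(position, background):
    """Noise-free population response to a (position, background) pair."""
    return position * w_pos + background * w_bg

def readout(r):
    """Fixed linear readout along the position-encoding axis."""
    return r @ w_pos / (w_pos @ w_pos)

r_same_bg = population_response(position=0.7, background=0.0)
r_diff_bg = population_response(position=0.7, background=5.0)  # large irrelevant change
print(readout(r_same_bg), readout(r_diff_bg))  # both recover position = 0.7
```

If `w_bg` were not orthogonalized, the second readout would shift with the background value, which is exactly the interference the paper tests for and fails to find.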
Affiliation(s)
- Ramanujan Srinath
- Department of Neurobiology and Neuroscience Institute, The University of Chicago, Chicago, IL, 60637, USA
- Amy M Ni
- Department of Neurobiology and Neuroscience Institute, The University of Chicago, Chicago, IL, 60637, USA
- Department of Psychology, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Claire Marucci
- Department of Psychology, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Marlene R Cohen
- Department of Neurobiology and Neuroscience Institute, The University of Chicago, Chicago, IL, 60637, USA
- David H Brainard
- Department of Psychology, University of Pennsylvania, Philadelphia, PA, 19104, USA
3. Namima T, Kempkes E, Zamarashkina P, Owen N, Pasupathy A. High-Density Recording Reveals Sparse Clusters (But Not Columns) for Shape and Texture Encoding in Macaque V4. J Neurosci 2025; 45:e1893232024. PMID: 39562041. PMCID: PMC11780345. DOI: 10.1523/jneurosci.1893-23.2024.
Abstract
Macaque area V4 includes neurons that exhibit exquisite selectivity for visual form and surface texture, but their functional organization across laminae is unknown. We used high-density Neuropixels probes in two awake monkeys (one female and one male) to characterize the shape and texture tuning of dozens of neurons simultaneously across layers. We found sporadic clusters of neurons that exhibit similar tuning for shape and texture: ∼20% exhibited similar tuning with their neighbors. Importantly, these clusters were confined to a few layers, seldom "columnar" in structure. This was the case even when neurons were strongly driven and exhibited robust contrast invariance for shape and texture tuning. We conclude that functional organization in area V4 is not columnar for shape and texture stimulus features and in general organization may be at a coarser stimulus category scale (e.g., selectivity for stimuli with vs without 3D cues) and a coarser spatial scale (assessed by optical imaging), rather than at a fine scale in terms of similarity in single-neuron tuning for specific features. We speculate that this may be a direct consequence of the great diversity of inputs integrated by V4 neurons to build variegated tuning manifolds in a high-dimensional space.
Affiliation(s)
- Tomoyuki Namima
- Department of Biological Structure and Washington National Primate Research Center, University of Washington, Seattle, Washington 98195
- Graduate School of Frontier Biosciences, Osaka University, Suita 565-0871, Japan
- Center for Information and Neural Networks, National Institute of Information and Communications Technology, Suita 565-0871, Japan
- Erin Kempkes
- Department of Biological Structure and Washington National Primate Research Center, University of Washington, Seattle, Washington 98195
- Polina Zamarashkina
- Department of Biological Structure and Washington National Primate Research Center, University of Washington, Seattle, Washington 98195
- Natalia Owen
- Undergraduate Neuroscience Program, University of Washington, Seattle, Washington 98195
- Anitha Pasupathy
- Department of Biological Structure and Washington National Primate Research Center, University of Washington, Seattle, Washington 98195
4. Li B, Todo Y, Tang Z. Artificial Visual System for Stereo-Orientation Recognition Based on Hubel-Wiesel Model. Biomimetics (Basel) 2025; 10:38. PMID: 39851754. PMCID: PMC11762170. DOI: 10.3390/biomimetics10010038.
Abstract
Stereo-orientation selectivity is a fundamental neural mechanism in the brain that plays a crucial role in perception. However, because the recognition of high-dimensional spatial information typically occurs in higher-order cortex, we still know little about the mechanisms underlying stereo-orientation selectivity and lack a strategy for modeling it. A classical explanation for two-dimensional orientation selectivity in the primary visual cortex is the Hubel-Wiesel model, a cascading neural connection structure. The local-to-global information aggregation principle of the Hubel-Wiesel model not only contributed to neurophysiology but also inspired developments in computer vision. In this paper, we provide a clear and efficient conceptual understanding of stereo-orientation selectivity, propose a quantitative explanation for its generation based on local-to-global information aggregation in the Hubel-Wiesel model, and develop an artificial visual system (AVS) for stereo-orientation recognition. Our approach models depth-selective cells that receive depth information, simple stereo-orientation-selective cells that combine distinct depth inputs to generate various forms of local stereo-orientation selectivity, and complex stereo-orientation-selective cells that integrate the same local information to generate global stereo-orientation selectivity. Simulation results demonstrate that our AVS is effective at stereo-orientation recognition and robust against spatial noise jitter. The AVS achieved over 90% overall accuracy on noisy data in orientation recognition tasks, significantly outperforming deep models. In addition, the AVS enhances deep models' performance, robustness, and stability in 3D object recognition tasks. Notably, the AVS improved the TransNeXt model's overall performance from 73.1% to 97.2% on the 3D-MNIST dataset and from 56.1% to 86.4% on the 3D-Fashion-MNIST dataset. Our explanation of the generation of stereo-orientation selectivity offers a reliable, explainable, and robust approach for extracting spatial features and provides a straightforward modeling method for neural computation research.
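The Hubel-Wiesel cascade invoked in this abstract — simple cells detect a local oriented feature, complex cells pool over position to gain invariance — is easy to caricature in 2D. A minimal sketch with a toy kernel and image (an illustration of the cascade principle, not the paper's AVS):

```python
import numpy as np

def simple_cells(image, kernel):
    """Simple-cell layer: local oriented-feature responses via valid correlation."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def complex_cell(simple_map):
    """Complex-cell layer: max-pool over position for local invariance."""
    return simple_map.max()

# A crude vertical-edge detector standing in for an orientation-selective RF.
vert = np.array([[-1.0, 0.0, 1.0]] * 3)

img = np.zeros((8, 8))
img[:, 4:] = 1.0  # image containing a vertical luminance edge

v_response = complex_cell(simple_cells(img, vert))
h_response = complex_cell(simple_cells(img, vert.T))  # horizontal detector
print(v_response, h_response)  # vertical detector wins on a vertical edge
```

The paper's contribution is to lift this same local-to-global aggregation from 2D orientation to stereo (depth-dependent) orientation; the pooling logic is unchanged.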
Affiliation(s)
- Bin Li
- Division of Electrical Engineering and Computer Science, Kanazawa University, Kanazawa-shi 920-1192, Japan
- Yuki Todo
- Faculty of Electrical, Information and Communication Engineering, Kanazawa University, Kanazawa-shi 920-1192, Japan
- Zheng Tang
- Institute of AI for Industries, Chinese Academy of Sciences, 168 Tianquan Road, Nanjing 211100, China
- School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
5. Nielsen KJ, Connor CE. How Shape Perception Works, in Two Dimensions and Three Dimensions. Annu Rev Vis Sci 2024; 10:47-68. PMID: 38848596. DOI: 10.1146/annurev-vision-112823-031607.
Abstract
The ventral visual pathway transforms retinal images into neural representations that support object understanding, including exquisite appreciation of precise 2D pattern shape and 3D volumetric shape. We articulate a framework for understanding the goals of this transformation and how they are achieved by neural coding at successive ventral pathway stages. The critical goals are (a) radical compression to make shape information communicable across axonal bundles and storable in memory, (b) explicit coding to make shape information easily readable by the rest of the brain and thus accessible for cognition and behavioral control, and (c) representational stability to maintain consistent perception across highly variable viewing conditions. We describe how each transformational step in ventral pathway vision serves one or more of these goals. This three-goal framework unifies discoveries about ventral shape processing into a neural explanation for our remarkable experience of shape as a vivid, richly detailed aspect of the natural world.
Affiliation(s)
- Kristina J Nielsen
- Krieger Mind/Brain Institute and Department of Neuroscience, Johns Hopkins University, Baltimore, Maryland, USA
- Charles E Connor
- Krieger Mind/Brain Institute and Department of Neuroscience, Johns Hopkins University, Baltimore, Maryland, USA
6. Srinath R, Czarnik MM, Cohen MR. Coordinated Response Modulations Enable Flexible Use of Visual Information. bioRxiv [Preprint] 2024:2024.07.10.602774. PMID: 39071390. PMCID: PMC11275750. DOI: 10.1101/2024.07.10.602774.
Abstract
We use sensory information in remarkably flexible ways. We can generalize by ignoring task-irrelevant features, report different features of a stimulus, and use different actions to report a perceptual judgment. These forms of flexible behavior are associated with small modulations of the responses of sensory neurons. While the existence of these response modulations is indisputable, efforts to understand their function have been largely relegated to theory, where they have been posited to change information coding or to enable downstream neurons to read out different visual and cognitive information using flexible weights. Here, we tested these ideas using a rich, flexible behavioral paradigm and multi-neuron, multi-area recordings in primary visual cortex (V1) and mid-level visual area V4. We discovered that response modulations in V4 (but not V1) contain the ingredients necessary to enable flexible behavior, but not via the previously hypothesized mechanisms. Instead, we demonstrated that these response modulations are precisely coordinated across the population such that downstream neurons have ready access to the correct information to flexibly guide behavior without changes to information coding or synapses. Our results suggest a novel computational role for task-dependent response modulations: they enable flexible behavior by changing the information that gets out of a sensory area, not by changing information coding within it.
Affiliation(s)
- Ramanujan Srinath
- Department of Neurobiology and Neuroscience Institute, The University of Chicago, Chicago, IL 60637, USA
- Martyna M. Czarnik
- Department of Neurobiology and Neuroscience Institute, The University of Chicago, Chicago, IL 60637, USA
- Current affiliation: Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
- Marlene R. Cohen
- Department of Neurobiology and Neuroscience Institute, The University of Chicago, Chicago, IL 60637, USA
7. Srinath R, Ni AM, Marucci C, Cohen MR, Brainard DH. Orthogonal neural representations support perceptual judgements of natural stimuli. bioRxiv [Preprint] 2024:2024.02.14.580134. PMID: 38464018. PMCID: PMC10925131. DOI: 10.1101/2024.02.14.580134.
Abstract
In natural behavior, observers must separate relevant information from a barrage of irrelevant information. Many studies have investigated the neural underpinnings of this ability using artificial stimuli presented on simple backgrounds. Natural viewing, however, carries a set of challenges that are inaccessible using artificial stimuli, including neural responses to background objects that are task-irrelevant. An emerging body of evidence suggests that the visual abilities of humans and animals can be modeled through the linear decoding of task-relevant information from visual cortex. This idea suggests the hypothesis that irrelevant features of a natural scene should impair performance on a visual task only if their neural representations intrude on the linear readout of the task-relevant feature, as would occur if the representations of task-relevant and irrelevant features are not orthogonal in the underlying neural population. We tested this hypothesis using human psychophysics and monkey neurophysiology, in response to parametrically variable naturalistic stimuli. We demonstrate that 1) the neural representation of one feature (the position of a central object) in visual area V4 is orthogonal to those of several background features, 2) the ability of human observers to precisely judge object position was largely unaffected by task-irrelevant variation in those background features, and 3) many features of the object and the background are orthogonally represented by V4 neural responses. Our observations are consistent with the hypothesis that orthogonal neural representations can support stable perception of objects and features despite the tremendous richness of natural visual scenes.
Affiliation(s)
- Ramanujan Srinath
- equal contribution
- Department of Neurobiology and Neuroscience Institute, The University of Chicago, Chicago, IL 60637, USA
- Amy M. Ni
- equal contribution
- Department of Neurobiology and Neuroscience Institute, The University of Chicago, Chicago, IL 60637, USA
- Department of Psychology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Claire Marucci
- Department of Psychology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Marlene R. Cohen
- Department of Neurobiology and Neuroscience Institute, The University of Chicago, Chicago, IL 60637, USA
- equal contribution
- David H. Brainard
- Department of Psychology, University of Pennsylvania, Philadelphia, PA 19104, USA
- equal contribution
8. Cadena SA, Willeke KF, Restivo K, Denfield G, Sinz FH, Bethge M, Tolias AS, Ecker AS. Diverse task-driven modeling of macaque V4 reveals functional specialization towards semantic tasks. PLoS Comput Biol 2024; 20:e1012056. PMID: 38781156. PMCID: PMC11115319. DOI: 10.1371/journal.pcbi.1012056.
Abstract
Responses to natural stimuli in area V4-a mid-level area of the visual ventral stream-are well predicted by features from convolutional neural networks (CNNs) trained on image classification. This result has been taken as evidence for the functional role of V4 in object classification. However, we currently do not know if and to what extent V4 plays a role in solving other computational objectives. Here, we investigated normative accounts of V4 (and V1 for comparison) by predicting macaque single-neuron responses to natural images from the representations extracted by 23 CNNs trained on different computer vision tasks including semantic, geometric, 2D, and 3D types of tasks. We found that V4 was best predicted by semantic classification features and exhibited high task selectivity, while the choice of task was less consequential to V1 performance. Consistent with traditional characterizations of V4 function that show its high-dimensional tuning to various 2D and 3D stimulus directions, we found that diverse non-semantic tasks explained aspects of V4 function that are not captured by individual semantic tasks. Nevertheless, jointly considering the features of a pair of semantic classification tasks was sufficient to yield one of our top V4 models, solidifying V4's main functional role in semantic processing and suggesting that V4's selectivity to 2D or 3D stimulus properties found by electrophysiologists can result from semantic functional goals.
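The modeling recipe this abstract rests on — regress each neuron's responses on features from a task-trained network, then compare prediction quality across tasks — reduces to regularized linear regression. A self-contained sketch in which random features stand in for CNN activations (all sizes, names, and the in-sample scoring are invented for illustration; the paper uses held-out data and real network features):

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in "CNN features" for each image, and simulated neural responses
# that are a noisy linear function of those features.
n_images, n_features, n_neurons = 300, 64, 20
X = rng.normal(size=(n_images, n_features))
B_true = rng.normal(size=(n_features, n_neurons))
Y = X @ B_true + rng.normal(size=(n_images, n_neurons))

# Ridge regression: one linear readout per neuron from the feature space.
lam = 1.0  # ridge penalty
B_hat = np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ Y)

# Score: mean correlation between predicted and observed responses.
# Comparing this score across feature sets from differently trained networks
# is what identifies which task best "explains" an area.
Y_hat = X @ B_hat
score = np.mean([np.corrcoef(Y[:, k], Y_hat[:, k])[0, 1] for k in range(n_neurons)])
print(round(score, 3))
```

Swapping `X` for activations from networks trained on semantic versus geometric tasks, and cross-validating the score, gives the task comparison described in the abstract.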
Affiliation(s)
- Santiago A. Cadena
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Göttingen, Germany
- Institute for Theoretical Physics and Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
- Bernstein Center for Computational Neuroscience, Tübingen, Germany
- International Max Planck Research School for Intelligent Systems, Tübingen, Germany
- Konstantin F. Willeke
- Bernstein Center for Computational Neuroscience, Tübingen, Germany
- International Max Planck Research School for Intelligent Systems, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany
- Kelli Restivo
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Neuroscience, Baylor College of Medicine, Houston, Texas, United States of America
- George Denfield
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Neuroscience, Baylor College of Medicine, Houston, Texas, United States of America
- Fabian H. Sinz
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Göttingen, Germany
- Bernstein Center for Computational Neuroscience, Tübingen, Germany
- International Max Planck Research School for Intelligent Systems, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany
- Matthias Bethge
- Institute for Theoretical Physics and Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
- Bernstein Center for Computational Neuroscience, Tübingen, Germany
- Andreas S. Tolias
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Neuroscience, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Electrical and Computer Engineering, Rice University, Houston, Texas, United States of America
- Alexander S. Ecker
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Göttingen, Germany
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
9. Cheng A, Sokol S, Connor CE. An inferotemporal coding strategy robust to partial object occlusion. bioRxiv [Preprint] 2024:2024.04.09.588746. PMID: 38645189. PMCID: PMC11030457. DOI: 10.1101/2024.04.09.588746.
Abstract
Object coding in primate ventral pathway cortex progresses in sparseness/compression/efficiency, from many orientation signals in V1, to fewer 2D/3D part signals in V4, to still fewer multi-part configuration signals in AIT (anterior inferotemporal cortex).1-11 This progression could lead to individual neurons exclusively selective for unique objects, the sparsest code for identity, especially for highly familiar, important objects.12-18 To test this, we trained macaque monkeys to discriminate 8 simple letter-like shapes in a match-to-sample task, a design in which one-to-one coding of letters by neurons could streamline behavior. Performance increased from chance to >80% correct over a period of weeks, after which AIT neurons showed clear learning effects, with increased selectivity for multi-part configurations within the trained alphabet shapes. But these neurons were not exclusively tuned for unique letters based on training, since their responsiveness generalized to different, non-trained shapes containing the same configurations. This multi-part configuration coding limit in AIT is not maximally sparse, but it could explain the robustness of primate vision to partial object occlusion, which is common in the natural world and problematic for computer vision. Multi-part configurations are highly diagnostic of identity, and neural signals for various partial object structures can provide different but equally sufficient evidence for whole object identity across most occlusion conditions.
10. Namima T, Kempkes E, Zamarashkina P, Owen N, Pasupathy A. High-density recording reveals sparse clusters (but not columns) for shape and texture encoding in macaque V4. bioRxiv [Preprint] 2023:2023.10.15.562424. PMID: 37904996. PMCID: PMC10614825. DOI: 10.1101/2023.10.15.562424.
Abstract
Macaque area V4 includes neurons that exhibit exquisite selectivity for visual form and surface texture, but their functional organization across laminae is unknown. We used high-density Neuropixels probes in two awake monkeys to characterize shape and texture tuning of dozens of neurons simultaneously across layers. We found sporadic clusters of neurons that exhibit similar tuning for shape and texture: ~20% exhibited similar tuning with their neighbors. Importantly, these clusters were confined to a few layers, seldom 'columnar' in structure. This was the case even when neurons were strongly driven and exhibited robust contrast invariance for shape and texture tuning. We conclude that functional organization in area V4 is not columnar for shape and texture stimulus features, and in general organization may be at a coarse scale (e.g., encoding of 2D vs 3D shape) rather than at a fine scale in terms of similarity in tuning for specific features (as in the orientation columns in V1). We speculate that this may be a direct consequence of the great diversity of inputs integrated by V4 neurons to build variegated tuning manifolds in a high-dimensional space.
Affiliation(s)
- Tomoyuki Namima
- Department of Biological Structure and Washington National Primate Research Center, University of Washington, Seattle, WA 98195, USA
- Graduate School of Frontier Biosciences, Osaka University, and Center for Information and Neural Networks, National Institute of Information and Communications Technology, Suita, Osaka, 565-0871, Japan
- Erin Kempkes
- Department of Biological Structure and Washington National Primate Research Center, University of Washington, Seattle, WA 98195, USA
- Polina Zamarashkina
- Department of Biological Structure and Washington National Primate Research Center, University of Washington, Seattle, WA 98195, USA
- Natalia Owen
- Undergraduate Neuroscience Program, University of Washington, Seattle, WA 98195, USA
- Anitha Pasupathy
- Department of Biological Structure and Washington National Primate Research Center, University of Washington, Seattle, WA 98195, USA
11. Rosenberg A, Thompson LW, Doudlah R, Chang TY. Neuronal Representations Supporting Three-Dimensional Vision in Nonhuman Primates. Annu Rev Vis Sci 2023; 9:337-359. PMID: 36944312. DOI: 10.1146/annurev-vision-111022-123857.
Abstract
The visual system must reconstruct the dynamic, three-dimensional (3D) world from ambiguous two-dimensional (2D) retinal images. In this review, we synthesize current literature on how the visual system of nonhuman primates performs this transformation through multiple channels within the classically defined dorsal (where) and ventral (what) pathways. Each of these channels is specialized for processing different 3D features (e.g., the shape, orientation, or motion of objects, or the larger scene structure). Despite the common goal of 3D reconstruction, neurocomputational differences between the channels impose distinct information-limiting constraints on perception. Convergent evidence further points to the little-studied area V3A as a potential branchpoint from which multiple 3D-fugal processing channels diverge. We speculate that the expansion of V3A in humans may have supported the emergence of advanced 3D spatial reasoning skills. Lastly, we discuss future directions for exploring 3D information transmission across brain areas and experimental approaches that can further advance the understanding of 3D vision.
Affiliation(s)
- Ari Rosenberg
- Department of Neuroscience, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Lowell W Thompson
- Department of Neuroscience, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Raymond Doudlah
- Department of Neuroscience, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Ting-Yu Chang
- School of Medicine, National Defense Medical Center, Taipei, Taiwan
12. Emonds AMX, Srinath R, Nielsen KJ, Connor CE. Object representation in a gravitational reference frame. eLife 2023; 12:e81701. PMID: 37561119. PMCID: PMC10414968. DOI: 10.7554/elife.81701.
Abstract
When your head tilts laterally, as in sports, reaching, and resting, your eyes counterrotate less than 20%, and thus eye images rotate, over a total range of about 180°. Yet, the world appears stable and vision remains normal. We discovered a neural strategy for rotational stability in anterior inferotemporal cortex (IT), the final stage of object vision in primates. We measured object orientation tuning of IT neurons in macaque monkeys tilted +25 and -25° laterally, producing ~40° difference in retinal image orientation. Among IT neurons with consistent object orientation tuning, 63% remained stable with respect to gravity across tilts. Gravitational tuning depended on vestibular/somatosensory but also visual cues, consistent with previous evidence that IT processes scene cues for gravity's orientation. In addition to stability across image rotations, an internal gravitational reference frame is important for physical understanding of a world where object position, posture, structure, shape, movement, and behavior interact critically with gravity.
Affiliation(s)
- Alexandriya MX Emonds: Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, United States; Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, United States
- Ramanujan Srinath: Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, United States; Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, United States
- Kristina J Nielsen: Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, United States; Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, United States
- Charles E Connor: Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, United States; Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, United States
13
Encoding of 3D physical dimensions by face-selective cortical neurons. Proc Natl Acad Sci U S A 2023; 120:e2214996120. [PMID: 36802419] [PMCID: PMC9992780] [DOI: 10.1073/pnas.2214996120] [Indexed: 02/23/2023]
Abstract
Neurons throughout the primate inferior temporal (IT) cortex respond selectively to visual images of faces and other complex objects. The response magnitude of neurons to a given image often depends on the size at which the image is presented, usually on a flat display at a fixed distance. While such size sensitivity might simply reflect the angular subtense of retinal image stimulation in degrees, one unexplored possibility is that it tracks the real-world geometry of physical objects, such as their size and distance to the observer in centimeters. This distinction bears fundamentally on the nature of object representation in IT and on the scope of visual operations supported by the ventral visual pathway. To address this question, we assessed the response dependency of neurons in the macaque anterior fundus (AF) face patch to the angular versus physical size of faces. We employed a macaque avatar to stereoscopically render three-dimensional (3D) photorealistic faces at multiple sizes and distances, including a subset of size/distance combinations designed to cast the same size retinal image projection. We found that most AF neurons were modulated principally by the 3D physical size of the face rather than its two-dimensional (2D) angular size on the retina. Further, most neurons responded strongest to extremely large and small faces, rather than to those of normal size. Together, these findings reveal a graded encoding of physical size among face patch neurons, providing evidence that category-selective regions of the primate ventral visual pathway participate in a geometric analysis of real-world objects.
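The key manipulation here, size/distance combinations that cast the same retinal image, follows from the geometry of visual angle: doubling a face's physical size while doubling its distance leaves the angular subtense unchanged. A minimal sketch (hypothetical numbers, not the authors' stimulus-rendering code):

```python
import math

def angular_size_deg(physical_cm: float, distance_cm: float) -> float:
    """Visual angle subtended by an object of a given physical size at a given viewing distance."""
    return math.degrees(2 * math.atan(physical_cm / (2 * distance_cm)))

# Scaling physical size and distance together preserves the retinal projection:
near = angular_size_deg(10.0, 57.0)    # 10 cm face at 57 cm
far = angular_size_deg(20.0, 114.0)    # 20 cm face at 114 cm
print(round(near, 2), round(far, 2))   # both ~10 degrees of visual angle
```

A neuron tuned to 2D angular size should respond identically to these two stimuli; a neuron tuned to 3D physical size, as most AF neurons were, should not.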
14
Zhang Y, Schriver KE, Hu JM, Roe AW. Spatial frequency representation in V2 and V4 of macaque monkey. eLife 2023; 12:e81794. [PMID: 36607323] [PMCID: PMC9848390] [DOI: 10.7554/elife.81794] [Received: 07/12/2022] [Accepted: 01/05/2023] [Indexed: 01/07/2023]
Abstract
Spatial frequency (SF) is an important attribute of the visual scene and a defining feature of visual processing channels. However, many questions remain unsolved about how extrastriate areas of primate visual cortex encode this fundamental information. Here, using intrinsic signal optical imaging in visual areas V2 and V4 of macaque monkeys, we quantify the relationship between SF maps and (1) visual topography and (2) color and orientation maps. We find that in orientation regions, low to high SF is mapped orthogonally to orientation; in color regions, which are reported to contain orthogonal axes of color and lightness, low SFs tend to be represented more frequently than high SFs. This supports a population-based SF fluctuation related to the 'color/orientation' organizations. We propose a generalized hypercolumn model across cortical areas, comprising two orthogonal parameters along with additional associated parameters.
Affiliation(s)
- Ying Zhang: Department of Neurosurgery of the Second Affiliated Hospital, Interdisciplinary Institute of Neuroscience and Technology, School of Medicine, Zhejiang University, Hangzhou, China; Key Laboratory of Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China
- Kenneth E Schriver: Department of Neurosurgery of the Second Affiliated Hospital, Interdisciplinary Institute of Neuroscience and Technology, School of Medicine, Zhejiang University, Hangzhou, China; Key Laboratory of Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China; MOE Frontier Science Center for Brain Science and Brain-Machine Integration, School of Brain Science and Brain Medicine, Zhejiang University, Hangzhou, China
- Jia Ming Hu: Department of Neurosurgery of the Second Affiliated Hospital, Interdisciplinary Institute of Neuroscience and Technology, School of Medicine, Zhejiang University, Hangzhou, China; Key Laboratory of Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China; MOE Frontier Science Center for Brain Science and Brain-Machine Integration, School of Brain Science and Brain Medicine, Zhejiang University, Hangzhou, China
- Anna Wang Roe: Department of Neurosurgery of the Second Affiliated Hospital, Interdisciplinary Institute of Neuroscience and Technology, School of Medicine, Zhejiang University, Hangzhou, China; Key Laboratory of Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China; MOE Frontier Science Center for Brain Science and Brain-Machine Integration, School of Brain Science and Brain Medicine, Zhejiang University, Hangzhou, China
15
Chen YC, Deza A, Konkle T. How big should this object be? Perceptual influences on viewing-size preferences. Cognition 2022; 225:105114. [DOI: 10.1016/j.cognition.2022.105114] [Received: 01/12/2021] [Revised: 03/24/2022] [Accepted: 03/28/2022] [Indexed: 11/29/2022]
16
Hatanaka G, Inagaki M, Takeuchi RF, Nishimoto S, Ikezoe K, Fujita I. Processing of visual statistics of naturalistic videos in macaque visual areas V1 and V4. Brain Struct Funct 2022; 227:1385-1403. [PMID: 35286478] [PMCID: PMC9046337] [DOI: 10.1007/s00429-022-02468-z] [Received: 05/31/2021] [Accepted: 02/02/2022] [Indexed: 11/25/2022]
Abstract
Natural scenes are characterized by diverse image statistics, including various parameters of the luminance histogram, outputs of Gabor-like filters, and pairwise correlations between the filter outputs of different positions, orientations, and scales (Portilla–Simoncelli statistics). Some of these statistics capture the response properties of visual neurons. However, it remains unclear to what extent such statistics can explain neural responses to natural scenes and how neurons that are tuned to these statistics are distributed across the cortex. Using two-photon calcium imaging and an encoding-model approach, we addressed these issues in macaque visual areas V1 and V4. For each imaged neuron, we constructed an encoding model to mimic its responses to naturalistic videos. By extracting Portilla–Simoncelli statistics through outputs of both filters and filter correlations, and by computing an optimally weighted sum of these outputs, the model successfully reproduced responses in a subpopulation of neurons. We evaluated the selectivities of these neurons by quantifying the contributions of each statistic to visual responses. Neurons whose responses were mainly determined by Gabor-like filter outputs (low-level statistics) were abundant at most imaging sites in V1. In V4, the relative contribution of higher order statistics, such as cross-scale correlation, was increased. Preferred image statistics varied markedly across V4 sites, and the response similarity of two neurons at individual imaging sites gradually declined with increasing cortical distance. The results indicate that natural scene analysis progresses from V1 to V4, and neurons sharing preferred image statistics are locally clustered in V4.
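The encoding-model approach described here reduces, at its core, to fitting an optimally weighted sum of image-statistic outputs to each neuron's measured responses. A minimal sketch with toy numbers (plain stochastic gradient descent on squared error; the feature values, learning rate, and data are illustrative assumptions, not the authors' pipeline):

```python
# Linear encoding model: predict a neuron's response as a weighted sum of
# image-statistic outputs, with weights fit by stochastic gradient descent.

def fit_weights(features, responses, lr=0.01, epochs=5000):
    """Fit w so that sum(w[i] * x[i]) approximates each measured response."""
    w = [0.0] * len(features[0])
    for _ in range(epochs):
        for x, y in zip(features, responses):
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

# Toy "statistics" for four stimuli; responses generated by weights (2, -1):
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2]]
resp = [2.0, -1.0, 1.0, 0.8]
weights = fit_weights(feats, resp)
print([round(w, 2) for w in weights])   # recovers roughly [2.0, -1.0]
```

In the study, the per-statistic weight magnitudes play the role of "contributions of each statistic to visual responses" used to characterize each neuron's selectivity.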
Affiliation(s)
- Gaku Hatanaka: Graduate School of Frontier Biosciences, Osaka University, Suita, Osaka, 565-0871, Japan
- Mikio Inagaki: Graduate School of Frontier Biosciences, Osaka University, Suita, Osaka, 565-0871, Japan; Center for Information and Neural Networks, Osaka University and National Institute of Information and Communications Technology, Suita, Osaka, 565-0871, Japan
- Ryosuke F Takeuchi: Graduate School of Frontier Biosciences, Osaka University, Suita, Osaka, 565-0871, Japan
- Shinji Nishimoto: Graduate School of Frontier Biosciences, Osaka University, Suita, Osaka, 565-0871, Japan; Center for Information and Neural Networks, Osaka University and National Institute of Information and Communications Technology, Suita, Osaka, 565-0871, Japan
- Koji Ikezoe: Graduate School of Frontier Biosciences, Osaka University, Suita, Osaka, 565-0871, Japan; Center for Information and Neural Networks, Osaka University and National Institute of Information and Communications Technology, Suita, Osaka, 565-0871, Japan; Faculty of Medicine, University of Yamanashi, Chuo, Yamanashi, 409-3898, Japan
- Ichiro Fujita: Graduate School of Frontier Biosciences, Osaka University, Suita, Osaka, 565-0871, Japan; Center for Information and Neural Networks, Osaka University and National Institute of Information and Communications Technology, Suita, Osaka, 565-0871, Japan
17
Ohara M, Kim J, Koida K. The Role of Specular Reflections and Illumination in the Perception of Thickness in Solid Transparent Objects. Front Psychol 2022; 13:766056. [PMID: 35250710] [PMCID: PMC8891632] [DOI: 10.3389/fpsyg.2022.766056] [Received: 08/28/2021] [Accepted: 01/17/2022] [Indexed: 11/24/2022]
Abstract
Specular reflections and refractive distortions are complex image properties of solid transparent objects, but despite this complexity, we readily perceive the 3D shapes of these objects (e.g., glass and clear plastic). We have found in past work that relevant sources of scene complexity have differential effects on 3D shape perception, with specular reflections increasing perceived thickness, and refractive distortions decreasing perceived thickness. In an object with both elements, such as glass, the two optical properties may complement each other to support reliable perception of 3D shape. We investigated the relative dominance of specular reflection and refractive distortions in the perception of shape. Surprisingly, the ratio of specular reflection to refractive component was almost equal to that of ordinary glass and ice, which promote correct percepts of 3D shape. The results were also explained by the variance in local RMS contrast in stimulus images but may depend on overall luminance and contrast of the surrounding light field.
Affiliation(s)
- Masakazu Ohara: Department of Computer Science and Engineering, Toyohashi University of Technology, Toyohashi, Japan
- Juno Kim: School of Optometry and Vision Science, University of New South Wales, Sydney, NSW, Australia
- Kowa Koida: Department of Computer Science and Engineering, Toyohashi University of Technology, Toyohashi, Japan; Electronics-Inspired Interdisciplinary Research Institute, Toyohashi University of Technology, Toyohashi, Japan
18
Function-specific projections from V2 to V4 in macaques. Brain Struct Funct 2022; 227:1317-1330. [PMID: 34978607] [DOI: 10.1007/s00429-021-02440-3] [Received: 03/12/2021] [Accepted: 12/08/2021] [Indexed: 11/02/2022]
Abstract
Previous studies have revealed modular projections from area V2 to area V4 in macaques. Specifically, V2 neurons in cytochrome oxidase (CO)-rich thin and CO-sparse pale stripes project to distinct regions in V4. However, how these modular projections relate to the functional subcompartments of V4 remains unclear. In this study, we injected retrograde fluorescent tracers into V4 regions with different functional properties (color, orientation, and direction) that were identified with intrinsic signal optical imaging (ISOI). We examined the labeled neurons in area V2 and their locations with respect to the CO patterns. Covariation was observed between the functional properties of the V4 injection sites and the numbers of labeled neurons in particular CO stripes. This covariation indicates that the color domains in V4 mainly received inputs from thin stripes in V2, whereas V4 orientation domains received inputs from pale stripes. Although motion-sensitive domains are present in both V2 and V4, our results did not reveal a functionally specific feedforward projection between them. These results confirm previous findings of modular projections from V2 to V4 and provide functional specificity for these anatomical projections. Together, these findings indicate that color and form remain separate in ventral mid-level visual processing.
19
Goutcher R, Barrington C, Hibbard PB, Graham B. Binocular vision supports the development of scene segmentation capabilities: Evidence from a deep learning model. J Vis 2021; 21:13. [PMID: 34289490] [PMCID: PMC8300045] [DOI: 10.1167/jov.21.7.13] [Received: 03/17/2021] [Accepted: 06/16/2021] [Indexed: 11/24/2022]
Abstract
The application of deep learning techniques has led to substantial progress in solving a number of critical problems in machine vision, including fundamental problems of scene segmentation and depth estimation. Here, we report a novel deep neural network model, capable of simultaneous scene segmentation and depth estimation from a pair of binocular images. By manipulating the arrangement of binocular image pairs, presenting the model with standard left-right image pairs, identical image pairs or swapped left-right images, we show that performance levels depend on the presence of appropriate binocular image arrangements. Segmentation and depth estimation performance are both impaired when images are swapped. Segmentation performance levels are maintained, however, for identical image pairs, despite the absence of binocular disparity information. Critically, these performance levels exceed those found for an equivalent, monocularly trained, segmentation model. These results provide evidence that binocular image differences support both the direct recovery of depth and segmentation information, and the enhanced learning of monocular segmentation signals. This finding suggests that binocular vision may play an important role in visual development. Better understanding of this role may hold implications for the study and treatment of developmentally acquired perceptual impairments.
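The three input arrangements compared in this study (standard, identical, and swapped left-right pairs) are straightforward to express. A minimal sketch (hypothetical helper, not the authors' code) of how such conditions might be constructed before feeding a two-input model:

```python
def make_pair(left, right, condition):
    """Arrange a binocular image pair for one experimental condition."""
    if condition == "standard":
        return (left, right)   # normal binocular disparity
    if condition == "identical":
        return (left, left)    # binocular input, but zero disparity
    if condition == "swapped":
        return (right, left)   # disparity sign inverted
    raise ValueError(f"unknown condition: {condition}")

print(make_pair("img_L", "img_R", "swapped"))   # ('img_R', 'img_L')
```

The study's logic follows from these conditions: impaired performance on swapped pairs but preserved segmentation on identical pairs isolates the contribution of disparity from that of simply having two inputs.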
Affiliation(s)
- Ross Goutcher: Psychology Division, Faculty of Natural Sciences, University of Stirling, Stirling, UK
- Christian Barrington: Psychology Division, Faculty of Natural Sciences, University of Stirling, Stirling, UK; Computing Science and Mathematics Division, Faculty of Natural Sciences, University of Stirling, Stirling, UK
- Paul B Hibbard: Department of Psychology, University of Essex, Colchester, UK
- Bruce Graham: Computing Science and Mathematics Division, Faculty of Natural Sciences, University of Stirling, Stirling, UK
20
Leopold DA, Afraz A. Neurophysiology: The Three-Dimensional Building Blocks of Object Vision. Curr Biol 2021; 31:R9-R11. [PMID: 33434491] [DOI: 10.1016/j.cub.2020.10.064] [Indexed: 11/25/2022]
Abstract
A new study of the macaque visual cortex has revealed that visual area V4 performs substantial analysis of solid shape structure. The findings draw new attention to the embedding of local three-dimensional shape analysis into the early cortical stages of visual processing.
Affiliation(s)
- David A Leopold: Laboratory of Neuropsychology, National Institute of Mental Health, Bethesda, MD 20892, USA
- Arash Afraz: Laboratory of Neuropsychology, National Institute of Mental Health, Bethesda, MD 20892, USA
21
Krauss P, Maier A. Will We Ever Have Conscious Machines? Front Comput Neurosci 2020; 14:556544. [PMID: 33414712] [PMCID: PMC7782472] [DOI: 10.3389/fncom.2020.556544] [Received: 05/06/2020] [Accepted: 11/26/2020] [Indexed: 01/09/2023]
Abstract
The question of whether artificial beings or machines could become self-aware or conscious has been debated for centuries. The main problem is that self-awareness cannot be observed from an outside perspective, so the distinction between genuine self-awareness and a merely clever imitation cannot be drawn without access to knowledge of the mechanism's inner workings. We investigate common machine learning approaches with respect to their potential ability to become self-aware. We find that many important algorithmic steps toward machines with a core consciousness have already been taken.
Affiliation(s)
- Patrick Krauss: Neuroscience Lab, University Hospital Erlangen, Erlangen, Germany; Cognitive Computational Neuroscience Group, Chair of Linguistics, Friedrich-Alexander University Erlangen-Nürnberg (FAU), Erlangen, Germany
- Andreas Maier: Chair of Machine Intelligence, Friedrich-Alexander University Erlangen-Nürnberg (FAU), Erlangen, Germany