1
Ai H, Lin W, Liu C, Chen N, Zhang P. Mesoscale functional organization and connectivity of color, disparity, and naturalistic texture in human second visual area. eLife 2025; 13:RP93171. [PMID: 40111254; PMCID: PMC11925451; DOI: 10.7554/elife.93171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/22/2025]
Abstract
Although parallel processing has been extensively studied in the low-level geniculostriate pathway and the high-level dorsal and ventral visual streams, less is known at the intermediate-level visual areas. In this study, we employed high-resolution fMRI at 7T to investigate the columnar and laminar organizations for color, disparity, and naturalistic texture in the human secondary visual cortex (V2), and their informational connectivity with lower- and higher-order visual areas. Although fMRI activations in V2 showed reproducible interdigitated color-selective thin and disparity-selective thick 'stripe' columns, we found no clear evidence of columnar organization for naturalistic textures. Cortical depth-dependent analyses revealed the strongest color-selectivity in the superficial layers of V2, along with both feedforward and feedback informational connectivity with V1 and V4. Disparity selectivity was similar across different cortical depths of V2, which showed significant feedforward and feedback connectivity with V1 and V3ab. Interestingly, the selectivity for naturalistic texture was strongest in the deep layers of V2, with significant feedback connectivity from V4. Thus, while local circuitry within cortical columns is crucial for processing color and disparity information, feedback signals from V4 are involved in generating the selectivity for naturalistic textures in area V2.
Affiliation(s)
- Hailin Ai
  - Department of Psychological and Cognitive Sciences, Tsinghua University, Beijing, China
- Weiru Lin
  - State Key Laboratory of Brain and Cognitive Science, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Changsha, China
- Chengwen Liu
  - Department of Psychology and Cognition and Human Behavior Key Laboratory of Hunan Province, Hunan Normal University, Hunan, China
  - Center for Mind & Brain Sciences, Hunan Normal University, Changsha, China
- Nihong Chen
  - Department of Psychological and Cognitive Sciences, Tsinghua University, Beijing, China
  - THU-IDG/McGovern Institute for Brain Research, Tsinghua University, Beijing, China
- Peng Zhang
  - State Key Laboratory of Brain and Cognitive Science, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Changsha, China
  - Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
2
Srinath R, Ni AM, Marucci C, Cohen MR, Brainard DH. Orthogonal neural representations support perceptual judgments of natural stimuli. Sci Rep 2025; 15:5316. [PMID: 39939679; PMCID: PMC11821992; DOI: 10.1038/s41598-025-88910-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2024] [Accepted: 01/31/2025] [Indexed: 02/14/2025]
Abstract
In natural visually guided behavior, observers must separate relevant information from a barrage of irrelevant information. Many studies have investigated the neural underpinnings of this ability using artificial stimuli presented on blank backgrounds. Natural images, however, contain task-irrelevant background elements that might interfere with the perception of object features. Recent studies suggest that visual feature estimation can be modeled through the linear decoding of task-relevant information from visual cortex. So, if the representations of task-relevant and irrelevant features are not orthogonal in the neural population, then variation in the task-irrelevant features would impair task performance. We tested this hypothesis using human psychophysics and monkey neurophysiology combined with parametrically variable naturalistic stimuli. We demonstrate that (1) the neural representation of one feature (the position of an object) in visual area V4 is orthogonal to those of several background features, (2) the ability of human observers to precisely judge object position was largely unaffected by those background features, and (3) many features of the object and the background (and of objects from a separate stimulus set) are orthogonally represented in V4 neural population responses. Our observations are consistent with the hypothesis that orthogonal neural representations can support stable perception of object features despite the richness of natural visual scenes.
Affiliation(s)
- Ramanujan Srinath
  - Department of Neurobiology and Neuroscience Institute, The University of Chicago, Chicago, IL, 60637, USA
- Amy M Ni
  - Department of Neurobiology and Neuroscience Institute, The University of Chicago, Chicago, IL, 60637, USA
  - Department of Psychology, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Claire Marucci
  - Department of Psychology, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Marlene R Cohen
  - Department of Neurobiology and Neuroscience Institute, The University of Chicago, Chicago, IL, 60637, USA
- David H Brainard
  - Department of Psychology, University of Pennsylvania, Philadelphia, PA, 19104, USA
3
Watanabe T, Sasaki Y, Ogawa D, Shibata K. Unsupervised learning as a computational principle works in visual learning of natural scenes, but not of artificial stimuli. bioRxiv 2024:2024.07.31.605957. [PMID: 39211147; PMCID: PMC11361125; DOI: 10.1101/2024.07.31.605957] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/04/2024]
Abstract
The question of whether we learn exposed visual features remains a subject of controversy. A prevalent computational model suggests that visual features frequently exposed to observers in natural environments are likely to be learned. However, this unsupervised learning model appears to be contradicted by the significant body of experimental results with human participants that indicates visual perceptual learning (VPL) of visible task-irrelevant features does not occur with frequent exposure. Here, we demonstrate a resolution to this controversy with a new finding: Exposure to a dominant global orientation as task-irrelevant leads to VPL of the orientation, particularly when the orientation is derived from natural scene images, whereas VPL did not occur with artificial images even with matched distributions of local orientations and spatial frequencies to natural scene images. Further investigation revealed that this disparity arises from the presence of higher-order statistics derived from natural scene images, namely global structures such as correlations between different local orientation and spatial frequency channels. Moreover, behavioral and neuroimaging results indicate that the dominant orientation from these higher-order statistics undergoes less attentional suppression than that from artificial images, which may facilitate VPL. Our results contribute to resolving the controversy by affirming the validity of unsupervised learning models for natural scenes but not for artificial stimuli. They challenge the assumption that VPL occurring in everyday life can be predicted by laws governing VPL for conventionally used artificial stimuli.
4
Hamano Y, Nagasaka S, Shouno H. Exploring the role of texture features in deep convolutional neural networks: Insights from Portilla-Simoncelli statistics. Neural Netw 2023; 168:300-312. [PMID: 37774515; DOI: 10.1016/j.neunet.2023.09.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2023] [Revised: 09/11/2023] [Accepted: 09/15/2023] [Indexed: 10/01/2023]
Abstract
It is well-understood that the performance of Deep Convolutional Neural Networks (DCNNs) in image recognition tasks is influenced not only by shape but also by texture information. Despite this, understanding the internal representations of DCNNs remains a challenging task. This study employs a simplified version of the Portilla-Simoncelli Statistics, termed "minPS," to explore how texture information is represented in a pre-trained VGG network. Using minPS features extracted from texture images, we perform a sparse regression on the activations across various channels in VGG layers. Our findings reveal that channels in the early to middle layers of the VGG network can be effectively described by minPS features. Additionally, we observe that the explanatory power of minPS sub-groups evolves as one ascends the network hierarchy. Specifically, sub-groups termed Linear Cross Scale (LCS) and Energy Cross Scale (ECS) exhibit weak explanatory power for VGG channels. To investigate the relationship further, we compare the original texture images with their synthesized counterparts, generated using VGG, in terms of minPS features. Our results indicate that the absence of certain minPS features suggests their non-utilization in VGG's internal representations.
Affiliation(s)
- Yusuke Hamano
  - NEC Corporation, Shiba 5-7-1, Minato-ku, Tokyo, Japan
- Shoko Nagasaka
  - The University of Electro-Communications, Chofu, Tokyo, Japan
- Hayaru Shouno
  - The University of Electro-Communications, Chofu, Tokyo, Japan
5
Henderson MM, Tarr MJ, Wehbe L. A Texture Statistics Encoding Model Reveals Hierarchical Feature Selectivity across Human Visual Cortex. J Neurosci 2023; 43:4144-4161. [PMID: 37127366; PMCID: PMC10255092; DOI: 10.1523/jneurosci.1822-22.2023] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/23/2022] [Revised: 03/21/2023] [Accepted: 03/26/2023] [Indexed: 05/03/2023]
Abstract
Midlevel features, such as contour and texture, provide a computational link between low- and high-level visual representations. Although the nature of midlevel representations in the brain is not fully understood, past work has suggested a texture statistics model, called the P-S model (Portilla and Simoncelli, 2000), is a candidate for predicting neural responses in areas V1-V4 as well as human behavioral data. However, it is not currently known how well this model accounts for the responses of higher visual cortex to natural scene images. To examine this, we constructed single-voxel encoding models based on P-S statistics and fit the models to fMRI data from human subjects (both sexes) from the Natural Scenes Dataset (Allen et al., 2022). We demonstrate that the texture statistics encoding model can predict the held-out responses of individual voxels in early retinotopic areas and higher-level category-selective areas. The ability of the model to reliably predict signal in higher visual cortex suggests that the representation of texture statistics features is widespread throughout the brain. Furthermore, using variance partitioning analyses, we identify which features are most uniquely predictive of brain responses and show that the contributions of higher-order texture features increase from early areas to higher areas on the ventral and lateral surfaces. We also demonstrate that patterns of sensitivity to texture statistics can be used to recover broad organizational axes within visual cortex, including dimensions that capture semantic image content. These results provide a key step forward in characterizing how midlevel feature representations emerge hierarchically across the visual system.
Significance Statement
Intermediate visual features, like texture, play an important role in cortical computations and may contribute to tasks like object and scene recognition. Here, we used a texture model proposed in past work to construct encoding models that predict the responses of neural populations in human visual cortex (measured with fMRI) to natural scene stimuli. We show that responses of neural populations at multiple levels of the visual system can be predicted by this model, and that the model is able to reveal an increase in the complexity of feature representations from early retinotopic cortex to higher areas of ventral and lateral visual cortex. These results support the idea that texture-like representations may play a broad underlying role in visual processing.
Affiliation(s)
- Margaret M Henderson
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
  - Department of Psychology
  - Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
- Michael J Tarr
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
  - Department of Psychology
  - Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
- Leila Wehbe
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
  - Department of Psychology
  - Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
6
Takemura H, Rosa MGP. Understanding structure-function relationships in the mammalian visual system: part two. Brain Struct Funct 2022; 227:1167-1170. [PMID: 35419751; DOI: 10.1007/s00429-022-02495-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Affiliation(s)
- Hiromasa Takemura
  - Division of Sensory and Cognitive Brain Mapping, Department of System Neuroscience, National Institute for Physiological Sciences, Okazaki, Japan
  - Department of Physiological Sciences, School of Life Science, SOKENDAI (The Graduate University for Advanced Studies), Hayama, Japan
  - Center for Information and Neural Networks (CiNet), Advanced ICT Research Institute, National Institute of Information and Communications Technology, Suita, Japan
- Marcello G P Rosa
  - Biomedicine Discovery Institute, Neuroscience Program, Monash University, Clayton, VIC, 3800, Australia
  - Department of Physiology, Monash University, Clayton, VIC, 3800, Australia
  - Australian Research Council, Centre of Excellence for Integrative Brain Function, Monash University Node, Melbourne, VIC, 3800, Australia