1. Yu L, Dugan P, Doyle W, Devinsky O, Friedman D, Flinker A. A left-lateralized dorsolateral prefrontal network for naming. Cell Rep 2025; 44:115677. PMID: 40347472. DOI: 10.1016/j.celrep.2025.115677.
Abstract
The ability to connect the form and meaning of a concept, known as word retrieval, is fundamental to human communication. While various input modalities can lead to identical word retrieval, the exact neural dynamics that support this process during everyday auditory discourse remain poorly understood. Here, we recorded neurosurgical electrocorticography (ECoG) data from 48 patients and dissociated two key language networks critical for word retrieval that overlap substantially in time and space. Using unsupervised temporal clustering techniques, we found a semantic processing network located in the middle and inferior frontal gyri. This network was distinct from an articulatory planning network in the inferior frontal and precentral gyri, which was invariant to input modality. Functionally, we confirmed that the semantic processing network encodes word surprisal during sentence perception. These findings elucidate the neurophysiological mechanisms underlying the processing of semantic auditory inputs, ranging from passive language comprehension to conversational speech.
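The unsupervised temporal clustering step can be illustrated with a short sketch. The Python snippet below is a hypothetical illustration, not the authors' pipeline: array shapes and data are random placeholders, and only the generic logic (cluster z-scored high-gamma time courses, then inspect when each cluster's response peaks) reflects the method described above.

```python
# Hypothetical sketch (not the authors' pipeline): k-means on z-scored
# high-gamma time courses to separate electrodes whose responses peak at
# different times. All data here are random placeholders.
import numpy as np
from scipy.stats import zscore
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
high_gamma = rng.standard_normal((960, 200))   # electrodes x time samples

X = zscore(high_gamma, axis=1)                 # normalize each electrode's time course
km = KMeans(n_clusters=2, n_init=25, random_state=0).fit(X)

for k in range(km.n_clusters):
    n = int((km.labels_ == k).sum())
    peak = int(km.cluster_centers_[k].argmax())
    print(f"cluster {k}: {n} electrodes, centroid peak at sample {peak}")
```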
Affiliations
- Leyao Yu: Department of Biomedical Engineering, NYU Tandon School of Engineering, New York, NY 10016, USA
- Patricia Dugan: Department of Neurology, NYU Grossman School of Medicine, New York, NY 10016, USA
- Werner Doyle: Department of Neurosurgery, NYU Grossman School of Medicine, New York, NY 10016, USA
- Orrin Devinsky: Department of Neurology, NYU Grossman School of Medicine, New York, NY 10016, USA
- Daniel Friedman: Department of Neurology, NYU Grossman School of Medicine, New York, NY 10016, USA
- Adeen Flinker: Department of Biomedical Engineering, NYU Tandon School of Engineering, New York, NY 10016, USA; Department of Neurology, NYU Grossman School of Medicine, New York, NY 10016, USA
2. Steel A, Prasad D, Garcia BD, Robertson CE. Relating scene memory and perception activity to functional properties, networks, and landmarks of posterior cerebral cortex - a probabilistic atlas. bioRxiv 2025:2025.01.06.631538. PMID: 39829755. PMCID: PMC11741410. DOI: 10.1101/2025.01.06.631538.
Abstract
Adaptive behavior in complex environments requires integrating visual perception with memory of our spatial environment. Recent work has implicated three brain areas in posterior cerebral cortex - the place memory areas (PMAs) that are anterior to the three visual scene perception areas (SPAs) - in this function. However, PMAs' relationship to the broader cortical hierarchy remains unclear due to limited group-level characterization. Here, we examined the PMA and SPA locations across three fMRI datasets (44 participants, 29 female). SPAs were identified using a standard visual localizer where participants viewed scenes versus faces. PMAs were identified by contrasting activity when participants recalled personally familiar places versus familiar faces (Datasets 1-2) or places versus multiple categories (familiar faces, bodies, and objects, and famous faces; Dataset 3). Across datasets, the PMAs were located anterior to the SPAs on the ventral and lateral cortical surfaces. The anterior displacement between PMAs and SPAs was highly reproducible. Compared to public atlases, the PMAs fell at the boundary between externally-oriented networks (dorsal attention) and internally-oriented networks (default mode). Additionally, while SPAs overlapped with retinotopic maps, the PMAs were consistently located anterior to mapped visual cortex. These results establish the anatomical position of the PMAs at inflection points along the cortical hierarchy between unimodal sensory and transmodal, apical regions, which informs broader theories of how the brain integrates perception and memory for scenes. We have released probabilistic parcels of these regions to facilitate future research into their roles in spatial cognition.
Affiliations
- Adam Steel: Department of Psychology, University of Illinois; Beckman Institute for Advanced Science and Technology, University of Illinois
- Brenda D. Garcia: University of California San Diego Medical School, University of California San Diego
3. Jang G, Kragel PA. Understanding human amygdala function with artificial neural networks. J Neurosci 2025; 45:e1436242025. PMID: 40086868. PMCID: PMC12044042. DOI: 10.1523/jneurosci.1436-24.2025.
Abstract
The amygdala is a cluster of subcortical nuclei that receives diverse sensory inputs and projects to the cortex, midbrain, and other subcortical structures. Numerous accounts of amygdalar contributions to social and emotional behavior have been offered, yet an overarching description of amygdala function remains elusive. Here we adopt a computationally explicit framework that aims to model amygdala function based on the types of sensory inputs it receives, rather than on individual constructs such as threat, arousal, or valence. Characterizing human fMRI signal acquired as male and female participants viewed a full-length film, we developed encoding models that predict both patterns of amygdala activity and self-reported valence evoked by naturalistic images. We use deep image synthesis to generate artificial stimuli that distinctly engage encoding models of amygdala subregions, which systematically differ from one another in their low-level visual properties. These findings characterize how the amygdala compresses high-dimensional sensory inputs into low-dimensional representations relevant for behavior. Significance Statement: The amygdala is a cluster of subcortical nuclei critical for motivation, emotion, and social behavior. Characterizing the contribution of the amygdala to behavior has been challenging due to its structural complexity, broad connectivity, and functional heterogeneity. Here we use a combination of human neuroimaging and computational modeling to investigate how visual inputs relate to low-dimensional representations encoded in the amygdala. We find that the amygdala encodes an array of visual features, which vary systematically across specific nuclei and relate to the affective properties of the sensory environment.
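The encoding-model logic described here (predicting regional responses from deep-network image features) can be sketched as follows. All arrays are simulated stand-ins; the paper's actual features, data, and fitting procedure are not reproduced.

```python
# A minimal encoding-model sketch, assuming hypothetical deep-network image
# features and regional fMRI responses (both simulated here).
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
features = rng.standard_normal((500, 128))                     # images x ANN features
responses = features @ rng.standard_normal((128, 10)) \
            + rng.standard_normal((500, 10))                   # images x voxels

Xtr, Xte, ytr, yte = train_test_split(features, responses, test_size=0.2, random_state=0)
enc = RidgeCV(alphas=np.logspace(-2, 4, 13)).fit(Xtr, ytr)     # one linear map per voxel

pred = enc.predict(Xte)
r = [np.corrcoef(pred[:, v], yte[:, v])[0, 1] for v in range(yte.shape[1])]
print("median held-out prediction r:", round(float(np.median(r)), 2))
```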
4. Mononen R, Saarela T, Vallinoja J, Olkkonen M, Henriksson L. Cortical Encoding of Spatial Structure and Semantic Content in 3D Natural Scenes. J Neurosci 2025; 45:e2157232024. PMID: 39788741. PMCID: PMC11866997. DOI: 10.1523/jneurosci.2157-23.2024.
Abstract
Our visual system enables us to effortlessly navigate and recognize real-world visual environments. Functional magnetic resonance imaging (fMRI) studies suggest a network of scene-responsive cortical visual areas, but much less is known about the temporal order in which different scene properties are analyzed by the human visual system. In this study, we selected a set of 36 full-color natural scenes that varied in spatial structure and semantic content, which our male and female human participants viewed both in 2D and 3D while we recorded magnetoencephalography (MEG) data. MEG enables tracking of cortical activity in humans at millisecond timescales. We compared the representational geometry in the MEG responses with predictions based on the scene stimuli using the representational similarity analysis framework. The representational structure first reflected the spatial structure of the scenes in the 90-125 ms time window, followed by the semantic content in the 140-175 ms window after stimulus onset. Stereoscopic 3D viewing of the scenes affected the responses relatively late, from ∼140 ms after stimulus onset. Taken together, our results indicate that the human visual system rapidly encodes a scene's spatial structure and suggest that this information is based on monocular rather than binocular depth cues.
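A minimal sketch of time-resolved RSA of this kind is given below, with simulated MEG patterns and model RDMs; the study's actual stimuli, sensors, and model construction differ.

```python
# Illustrative time-resolved RSA on simulated data: correlate the neural RDM
# at each time point with two model RDMs (spatial structure, semantic content).
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
meg = rng.standard_normal((36, 204, 120))                 # stimuli x sensors x time
model_rdms = {"spatial structure": pdist(rng.standard_normal((36, 5))),
              "semantic content": pdist(rng.standard_normal((36, 5)))}

for name, model in model_rdms.items():
    profile = [spearmanr(pdist(meg[:, :, t]), model)[0]   # neural vs. model RDM
               for t in range(meg.shape[2])]
    print(f"{name}: correlation peaks at time index {int(np.argmax(profile))}")
```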
Affiliations
- Riikka Mononen: Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo FI-00076, Finland; MEG Core, Aalto NeuroImaging, Aalto University, Espoo FI-00076, Finland
- Toni Saarela: Department of Psychology, University of Helsinki, Helsinki FI-00014, Finland
- Jaakko Vallinoja: Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo FI-00076, Finland; MEG Core, Aalto NeuroImaging, Aalto University, Espoo FI-00076, Finland
- Maria Olkkonen: Department of Psychology, University of Helsinki, Helsinki FI-00014, Finland
- Linda Henriksson: Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo FI-00076, Finland; MEG Core, Aalto NeuroImaging, Aalto University, Espoo FI-00076, Finland
5. Li A, Chen H, Naya Y. Mnemonically modulated perceptual processing to represent allocentric space in macaque inferotemporal cortex. Prog Neurobiol 2024; 241:102670. PMID: 39366505. DOI: 10.1016/j.pneurobio.2024.102670.
Abstract
To encode the allocentric spatial information of a viewed object, perceptual information in the first-person perspective must be related to a representation of the entire scene constructed beforehand. A substantial number of studies have investigated such constructed scene information (e.g., the cognitive map), but only a few have focused on its influence on perceptual processing. We therefore designed a visually guided saccade task requiring monkeys to gaze at objects in different locations on different backgrounds clipped from large, self-designed mosaic pictures (parental pictures). In each trial, we presented moving backgrounds prior to object presentation, indicating the frame position of the background image on a parental picture. We recorded single-unit activity from 377 neurons in the posterior inferotemporal (PIT) cortex of two macaques. Equivalent numbers of neurons carried space-related (119 of 377) and object-related (125 of 377) information. The space-related neurons coded gaze locations and background images jointly rather than separately, suggesting that PIT neurons represent a particular location within a particular background image. Interestingly, the frame positions of background images on parental pictures modulated the space-related responses in a parental-picture-dependent manner. As the frame positions could be acquired only from preceding visual experience, these results provide neuronal evidence of a mnemonic effect on current perception, which might represent allocentric object location in a scene beyond the current view.
Affiliations
- Ao Li: School of Psychological and Cognitive Sciences, Peking University, Beijing 100871, China
- He Chen: School of Psychological and Cognitive Sciences, Peking University, Beijing 100871, China; Department of Biological Structure, University of Washington, Seattle, WA 98195, United States; Washington National Primate Research Center, University of Washington, Seattle, WA 98195, United States
- Yuji Naya: School of Psychological and Cognitive Sciences, Peking University, Beijing 100871, China; IDG/McGovern Institute for Brain Research, Peking University, Beijing 100871, China; Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing 100871, China
6. Lin R, Naselaris T, Kay K, Wehbe L. Stacked regressions and structured variance partitioning for interpretable brain maps. Neuroimage 2024; 298:120772. PMID: 39117095. DOI: 10.1016/j.neuroimage.2024.120772.
Abstract
Relating brain activity associated with a complex stimulus to different properties of that stimulus is a powerful approach for constructing functional brain maps. However, when stimuli are naturalistic, their properties are often correlated (e.g., visual and semantic features of natural images, or different layers of a convolutional neural network used as features of images). Correlated properties can act as confounders for each other, complicate the interpretability of brain maps, and impact the robustness of statistical estimators. Here, we present an approach for brain mapping based on two proposed methods: stacking different encoding models and structured variance partitioning. Our stacking algorithm combines encoding models, each of which takes as input a feature space describing a different stimulus attribute. The algorithm learns to predict the activity of a voxel as a linear combination of the outputs of the different encoding models. We show that the resulting combined model can predict held-out brain activity better than, or at least as well as, the individual encoding models. Further, the weights of the linear combination are readily interpretable; they show the importance of each feature space for predicting a voxel. We then build on our stacking models to introduce structured variance partitioning, a new type of variance partitioning that takes into account the known relationships between features. Our approach constrains the size of the hypothesis space and allows us to ask targeted questions about the similarity between feature spaces and brain regions even in the presence of correlations between the feature spaces. We validate our approach in simulation, showcase its brain-mapping potential on fMRI data, and release a Python package. Our methods can be useful for researchers interested in aligning brain activity with different layers of a neural network, or with other types of correlated feature spaces.
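As a rough illustration of per-voxel stacking, the snippet below learns a convex combination of three feature-space predictions with non-negative least squares on synthetic data; the published method additionally fits the weights with cross-validation and handles many voxels and feature spaces.

```python
# Minimal per-voxel stacking sketch on synthetic data: combine three models'
# predictions into one convex combination that best matches the voxel.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(3)
y = rng.standard_normal(300)                              # one voxel's activity
preds = np.stack([y + s * rng.standard_normal(300)        # three models' predictions,
                  for s in (0.5, 1.0, 2.0)], axis=1)      # increasingly noisy

w, _ = nnls(preds, y)                                     # non-negative weights
w = w / w.sum()                                           # normalize to a convex combination
stacked = preds @ w

print("stacking weights:", np.round(w, 2))
print("stacked prediction r:", round(float(np.corrcoef(stacked, y)[0, 1]), 2))
```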
Affiliations
- Ruogu Lin: Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, United States of America
- Thomas Naselaris: Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455, United States of America; Center for Magnetic Resonance Research (CMRR), Department of Radiology, University of Minnesota, Minneapolis, MN 55455, United States of America
- Kendrick Kay: Center for Magnetic Resonance Research (CMRR), Department of Radiology, University of Minnesota, Minneapolis, MN 55455, United States of America
- Leila Wehbe: Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, United States of America; Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213, United States of America
7. Kang J, Park S. Combined representation of visual features in the scene-selective cortex. Behav Brain Res 2024; 471:115110. PMID: 38871131. PMCID: PMC11375617. DOI: 10.1016/j.bbr.2024.115110.
Abstract
Visual features of separable dimensions conjoin to represent an integrated entity. We investigated how visual features bind to form a complex visual scene, focusing on features important for visually guided navigation: direction and distance. Previous work has shown that the directions and distances of navigable paths are coded in the occipital place area (OPA). Using functional magnetic resonance imaging (fMRI), we tested how these separate features are concurrently represented in the OPA. Participants saw eight types of scenes, four containing one path and four containing two paths. In single-path scenes, the path direction was either to the left or to the right. In double-path scenes, both directions were present. A glass wall was placed in some paths to restrict navigational distance. To test how the OPA represents path directions and distances, we took three approaches. First, the independent-features approach examined whether the OPA codes each direction and distance. Second, the integrated-features approach explored how directions and distances are integrated into path units, as compared to pooled features, using double-path scenes. Finally, the integrated-paths approach asked how separate paths are combined into a scene. Using multi-voxel pattern similarity analysis, we found that the OPA's representations of single-path scenes were similar to other single-path scenes of either the same direction or the same distance. Representations of double-path scenes were similar to the combination of the two constituent single paths, as combined units of direction and distance rather than as a pooled representation of all features. These results show that the OPA combines the two features to form path units, which are then used to build multiple-path scenes. Altogether, these results suggest that visually guided navigation may be supported by an OPA that automatically and efficiently combines multiple features relevant for navigation and represents a navigation file.
Affiliations
- Jisu Kang: Department of Psychology, Yonsei University, 50, Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea
- Soojin Park: Department of Psychology, Yonsei University, 50, Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea
8. Scrivener CL, Zamboni E, Morland AB, Silson EH. Retinotopy drives the variation in scene responses across visual field map divisions of the occipital place area. J Vis 2024; 24:10. PMID: 39167394. PMCID: PMC11343012. DOI: 10.1167/jov.24.8.10.
Abstract
The occipital place area (OPA) is a scene-selective region on the lateral surface of human occipitotemporal cortex that spatially overlaps multiple visual field maps, as well as portions of cortex that are not currently defined as retinotopic. Here we combined population receptive field modeling and responses to scenes in a representational similarity analysis (RSA) framework to test the prediction that the OPA's visual field map divisions contribute uniquely to the overall pattern of scene selectivity within the OPA. Consistent with this prediction, the patterns of response to a set of complex scenes were heterogeneous between maps. To explain this heterogeneity, we tested the explanatory power of seven candidate models using RSA. These models spanned different scene dimensions (Content, Expanse, Distance), low- and high-level visual features, and navigational affordances. None of the tested models could account for the variation in scene response observed between the OPA's visual field maps. However, the heterogeneity in scene response was correlated with the differences in retinotopic profiles across maps. These data highlight the need to carefully examine the relationship between regions defined as category-selective and the underlying retinotopy, and they suggest that, in the case of the OPA, it may not be appropriate to conceptualize it as a single scene-selective region.
Affiliations
- Elisa Zamboni: Department of Psychology, University of York, York, UK; School of Psychology, University of Nottingham, University Park, Nottingham, UK
- Antony B Morland: Department of Psychology, University of York, York, UK; York Biomedical Research Institute, University of York, York, UK; York Neuroimaging Centre, Department of Psychology, University of York, York, UK
- Edward H Silson: Department of Psychology, University of Edinburgh, Edinburgh, UK
9. Wu 吴奕忱 Y, Li 李晟 S. Complexity Matters: Normalization to Prototypical Viewpoint Induces Memory Distortion along the Vertical Axis of Scenes. J Neurosci 2024; 44:e1175232024. PMID: 38777600. PMCID: PMC11223457. DOI: 10.1523/jneurosci.1175-23.2024.
Abstract
Scene memory is prone to systematic distortions potentially arising from experience with the external world. Boundary transformation, a well-known memory distortion along the near-far axis of three-dimensional space, reflects an observer's erroneous recall of a scene's viewing distance. Researchers have argued that normalization toward a prototypical viewpoint with a high-probability viewing distance underlies this phenomenon. Here, we hypothesized that a prototypical viewpoint also exists in the vertical angle of view (AOV) dimension and could cause memory distortion along a scene's vertical axis. Human subjects of both sexes were recruited to test this hypothesis in two behavioral experiments, which revealed a systematic memory distortion in the vertical AOV in both a forced-choice (n = 79) and a free-adjustment (n = 30) task. Regression analysis implied that the complexity-information asymmetry along a scene's vertical axis and independent subjective AOV ratings from a large set of online participants (n = 1,208) jointly predict AOV biases. Furthermore, in a functional magnetic resonance imaging experiment (n = 24), we demonstrated the involvement of areas in the ventral visual pathway (V3/V4, PPA, and OPA) in AOV bias judgments. In a magnetoencephalography experiment (n = 20), we could significantly decode the subjects' AOV bias judgments ∼140 ms after scene onset, and the low-level visual complexity information within a similar temporal interval. These findings suggest that the AOV bias is driven by a normalization process and is associated with neural activity in the early stage of scene processing.
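The time-resolved MEG decoding analysis can be sketched as follows; trials, sensor counts, and labels are simulated placeholders, and a cross-validated linear classifier is trained at each time sample, analogous to decoding the AOV bias judgments over time.

```python
# Hedged sketch of time-resolved MEG decoding on simulated trials.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
trials = rng.standard_normal((200, 102, 60))   # trials x sensors x time samples
labels = rng.integers(0, 2, size=200)          # binary bias judgment per trial

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
accuracy = [cross_val_score(clf, trials[:, :, t], labels, cv=5).mean()
            for t in range(trials.shape[2])]
print("peak decoding accuracy:", round(max(accuracy), 2))
```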
Affiliations
- Yichen Wu 吴奕忱: School of Psychological and Cognitive Sciences, Peking University, Beijing 100871, China; Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing 100871, China; PKU-IDG/McGovern Institute for Brain Research, Peking University, Beijing 100871, China; National Key Laboratory of General Artificial Intelligence, Peking University, Beijing 100871, China
- Sheng Li 李晟: School of Psychological and Cognitive Sciences, Peking University, Beijing 100871, China; Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing 100871, China; PKU-IDG/McGovern Institute for Brain Research, Peking University, Beijing 100871, China; National Key Laboratory of General Artificial Intelligence, Peking University, Beijing 100871, China
10. Dado T, Papale P, Lozano A, Le L, Wang F, van Gerven M, Roelfsema P, Güçlütürk Y, Güçlü U. Brain2GAN: Feature-disentangled neural encoding and decoding of visual perception in the primate brain. PLoS Comput Biol 2024; 20:e1012058. PMID: 38709818. PMCID: PMC11098503. DOI: 10.1371/journal.pcbi.1012058.
Abstract
A challenging goal of neural coding is to characterize the neural representations underlying visual perception. To this end, multi-unit activity (MUA) of macaque visual cortex was recorded in a passive fixation task upon presentation of faces and natural images. We analyzed the relationship between MUA and latent representations of state-of-the-art deep generative models, including the conventional and feature-disentangled representations of generative adversarial networks (GANs) (i.e., z- and w-latents of StyleGAN, respectively) and language-contrastive representations of latent diffusion networks (i.e., CLIP-latents of Stable Diffusion). A mass univariate neural encoding analysis of the latent representations showed that feature-disentangled w representations outperform both z and CLIP representations in explaining neural responses. Further, w-latent features were found to be positioned at the higher end of the complexity gradient, which indicates that they capture visual information relevant to high-level neural activity. Subsequently, a multivariate neural decoding analysis of the feature-disentangled representations resulted in state-of-the-art spatiotemporal reconstructions of visual perception. Taken together, our results not only highlight the important role of feature disentanglement in shaping high-level neural representations underlying visual perception but also serve as an important benchmark for the future of neural coding.
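The decoding step (a linear map from multi-unit activity to w-latents) can be sketched as below on simulated data; in the actual work, the decoded latents are fed to a pretrained generator to reconstruct the stimulus, which is represented here only by a placeholder comment.

```python
# Sketch of the linear decoding step only, with simulated stand-ins for the
# recorded multi-unit activity and the feature-disentangled w-latents.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(5)
mua = rng.standard_normal((400, 960))          # trials x recording sites
w_latents = rng.standard_normal((400, 512))    # matching w-latent per trial

dec = Ridge(alpha=100.0).fit(mua[:300], w_latents[:300])
w_hat = dec.predict(mua[300:])                 # decoded latents, held-out trials
# image_hat = generator.synthesize(w_hat)      # placeholder: pretrained GAN generator
print("decoded latent block shape:", w_hat.shape)
```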
Affiliations
- Thirza Dado: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Paolo Papale: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Antonio Lozano: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Lynn Le: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Feng Wang: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Marcel van Gerven: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Pieter Roelfsema: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands; Laboratory of Visual Brain Therapy, Sorbonne University, Paris, France; Department of Integrative Neurophysiology, VU Amsterdam, Amsterdam, Netherlands; Department of Psychiatry, Amsterdam UMC, Amsterdam, Netherlands
- Yağmur Güçlütürk: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Umut Güçlü: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
11. Jiang C, Chen Z, Wolfe JM. Toward viewing behavior for aerial scene categorization. Cogn Res Princ Implic 2024; 9:17. PMID: 38530617. PMCID: PMC10965882. DOI: 10.1186/s41235-024-00541-1.
Abstract
Previous work has demonstrated similarities and differences between aerial and terrestrial image viewing. Aerial scene categorization, a pivotal visual processing task for gathering geoinformation, depends heavily on rotation-invariant information. Aerial-image-centered research has revealed effects of low-level features on performance in various aerial image interpretation tasks. However, there are fewer studies of viewing behavior during aerial scene categorization, or of the higher-level factors that might influence it. In this paper, experienced subjects' eye movements were recorded while they categorized aerial scenes. A typical center bias in viewing was observed, and eye movement patterns varied among categories. We explored the relationship of nine image statistics to observers' eye movements. Results showed that if images were less homogeneous, and/or contained fewer or no salient diagnostic objects, viewing behavior became more exploratory. Higher- and object-level image statistics were predictive at both the image and scene-category levels. Scanpaths were generally organized, and small differences in scanpath randomness could be roughly captured by critical-object saliency; participants tended to fixate on critical objects. The image statistics included in this study showed rotational invariance. The results support our hypothesis that the availability of diagnostic objects strongly influences eye movements in this task, and provide supporting evidence for Loschky et al.'s (Journal of Vision, 15(6), 11, 2015) speculation that aerial scenes are categorized on the basis of image parts and individual objects. The findings are discussed in relation to theories of scene perception and their implications for automation development.
Affiliations
- Chenxi Jiang: School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, Hubei, China
- Zhenzhong Chen: School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, Hubei, China; Hubei Luojia Laboratory, Wuhan, Hubei, China
- Jeremy M Wolfe: Harvard Medical School, Boston, MA, USA; Brigham & Women's Hospital, Boston, MA, USA
12. Kennedy B, Malladi SN, Tootell RBH, Nasr S. A previously undescribed scene-selective site is the key to encoding ego-motion in naturalistic environments. eLife 2024; 13:RP91601. PMID: 38506719. PMCID: PMC10954307. DOI: 10.7554/elife.91601.
Abstract
Current models of scene processing in the human brain include three scene-selective areas: the parahippocampal place area (or the temporal place areas), the retrosplenial cortex (or the medial place area), and the transverse occipital sulcus (or the occipital place area). Here, we challenged this model by showing that at least one other scene-selective site can be detected within the human posterior intraparietal gyrus. Despite its smaller size compared to the other scene-selective areas, the posterior intraparietal gyrus scene-selective (PIGS) site was detected consistently in a large pool of subjects (n = 59; 33 females). The reproducibility of this finding was tested against multiple criteria, including comparing results across sessions, scanners (3T and 7T), and stimulus sets. Furthermore, we found that this site (but not the other three scene-selective areas) is significantly sensitive to ego-motion in scenes, distinguishing the role of PIGS in scene perception relative to the other scene-selective areas. These results highlight the importance of including finer-scale scene-selective sites in models of scene processing - a crucial step toward a more comprehensive understanding of how scenes are encoded under dynamic conditions.
Affiliations
- Bryan Kennedy: Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, United States
- Sarala N Malladi: Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, United States
- Roger BH Tootell: Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, United States; Department of Radiology, Harvard Medical School, Boston, United States
- Shahin Nasr: Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, United States; Department of Radiology, Harvard Medical School, Boston, United States
13. Dwivedi K, Sadiya S, Balode MP, Roig G, Cichy RM. Visual features are processed before navigational affordances in the human brain. Sci Rep 2024; 14:5573. PMID: 38448446. PMCID: PMC10917749. DOI: 10.1038/s41598-024-55652-y.
Abstract
To navigate through their immediate environment, humans process scene information rapidly. How does the cascade of neural processing elicited by scene viewing unfold over time to facilitate navigational planning? To investigate, we recorded human brain responses to visual scenes with electroencephalography and related them to computational models that operationalize three aspects of scene processing (2D, 3D, and semantic information), as well as to a behavioral model capturing navigational affordances. We found a temporal processing hierarchy: navigational affordance is processed later than the other scene features (2D, 3D, and semantic) investigated. This reveals the temporal order in which the human brain computes complex scene information and suggests that the brain leverages these pieces of information to plan navigation.
Affiliations
- Kshitij Dwivedi: Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany; Department of Computer Science, Goethe University Frankfurt, Frankfurt, Germany
- Sari Sadiya: Department of Computer Science, Goethe University Frankfurt, Frankfurt, Germany; Frankfurt Institute for Advanced Studies (FIAS), Frankfurt, Germany
- Marta P Balode: Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany; Institute of Neuroinformatics, ETH Zurich and University of Zurich, Zurich, Switzerland
- Gemma Roig: Department of Computer Science, Goethe University Frankfurt, Frankfurt, Germany; The Hessian Center for Artificial Intelligence (hessian.AI), Darmstadt, Germany
- Radoslaw M Cichy: Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany
14. Jung Y, Hsu D, Dilks DD. "Walking selectivity" in the occipital place area in 8-year-olds, not 5-year-olds. Cereb Cortex 2024; 34:bhae101. PMID: 38494889. PMCID: PMC10945045. DOI: 10.1093/cercor/bhae101.
Abstract
A recent neuroimaging study in adults found that the occipital place area (OPA) - a cortical region involved in "visually guided navigation" (i.e., moving about the immediately visible environment, avoiding boundaries and obstacles) - represents visual information about walking, not crawling, suggesting that OPA is late developing, emerging only once children are walking, not beforehand. But when precisely does this "walking selectivity" in OPA emerge - when children first begin to walk in early childhood, or, perhaps counterintuitively, much later in childhood, around 8 years of age, when children walk like adults? To directly test these two hypotheses, using functional magnetic resonance imaging (fMRI) in two groups of children, 5- and 8-year-olds, we measured the responses in OPA to first-person perspective videos through scenes from a "walking" perspective, as well as three control perspectives ("crawling," "flying," and "scrambled"). We found that the OPA in 8-year-olds - like adults - exhibited walking selectivity (i.e., responding significantly more to the walking videos than to any of the others, with no significant differences across the crawling, flying, and scrambled videos), while the OPA in 5-year-olds exhibited no walking selectivity. These findings reveal that OPA undergoes protracted development, with walking selectivity only emerging around 8 years of age.
Affiliations
- Yaelan Jung: Department of Psychology, Emory University, Atlanta, GA 30322, USA
- Debbie Hsu: Department of Psychology, Emory University, Atlanta, GA 30322, USA
- Daniel D Dilks: Department of Psychology, Emory University, Atlanta, GA 30322, USA
15. Kennedy B, Malladi SN, Tootell RBH, Nasr S. A previously undescribed scene-selective site is the key to encoding ego-motion in naturalistic environments. Research Square 2024:rs.3.rs-3378081 (version 2). PMID: 38260553. PMCID: PMC10802707. DOI: 10.21203/rs.3.rs-3378081/v2.
Abstract
Current models of scene processing in the human brain include three scene-selective areas: the parahippocampal place area (or the temporal place areas; PPA/TPA), the retrosplenial cortex (or the medial place area; RSC/MPA), and the transverse occipital sulcus (or the occipital place area; TOS/OPA). Here, we challenged this model by showing that at least one other scene-selective site can be detected within the human posterior intraparietal gyrus. Despite its smaller size compared to the other scene-selective areas, the posterior intraparietal gyrus scene-selective (PIGS) site was detected consistently in a large pool of subjects (n = 59; 33 females). The reproducibility of this finding was tested against multiple criteria, including comparing results across sessions, scanners (3T and 7T), and stimulus sets. Furthermore, we found that this site (but not the other three scene-selective areas) is significantly sensitive to ego-motion in scenes, distinguishing the role of PIGS in scene perception relative to the other scene-selective areas. These results highlight the importance of including finer-scale scene-selective sites in models of scene processing - a crucial step toward a more comprehensive understanding of how scenes are encoded under dynamic conditions.
Affiliations
- Bryan Kennedy: Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, United States
- Sarala N. Malladi: Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, United States
- Roger B. H. Tootell: Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, United States; Department of Radiology, Harvard Medical School, Boston, MA, United States
- Shahin Nasr: Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, United States; Department of Radiology, Harvard Medical School, Boston, MA, United States
16. Kennedy B, Malladi SN, Tootell RBH, Nasr S. A previously undescribed scene-selective site is the key to encoding ego-motion in naturalistic environments. Research Square 2024:rs.3.rs-3378081 (version 3). PMID: 38260553. PMCID: PMC10802707. DOI: 10.21203/rs.3.rs-3378081/v3.
Abstract
Current models of scene processing in the human brain include three scene-selective areas: the parahippocampal place area (or the temporal place areas; PPA/TPA), the retrosplenial cortex (or the medial place area; RSC/MPA), and the transverse occipital sulcus (or the occipital place area; TOS/OPA). Here, we challenged this model by showing that at least one other scene-selective site can be detected within the human posterior intraparietal gyrus. Despite its smaller size compared to the other scene-selective areas, the posterior intraparietal gyrus scene-selective (PIGS) site was detected consistently in a large pool of subjects (n = 59; 33 females). The reproducibility of this finding was tested against multiple criteria, including comparing results across sessions, scanners (3T and 7T), and stimulus sets. Furthermore, we found that this site (but not the other three scene-selective areas) is significantly sensitive to ego-motion in scenes, distinguishing the role of PIGS in scene perception relative to the other scene-selective areas. These results highlight the importance of including finer-scale scene-selective sites in models of scene processing - a crucial step toward a more comprehensive understanding of how scenes are encoded under dynamic conditions.
Affiliations
- Bryan Kennedy: Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, United States
- Sarala N. Malladi: Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, United States
- Roger B. H. Tootell: Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, United States; Department of Radiology, Harvard Medical School, Boston, MA, United States
- Shahin Nasr: Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, United States; Department of Radiology, Harvard Medical School, Boston, MA, United States
17. Hopp FR, Amir O, Fisher JT, Grafton S, Sinnott-Armstrong W, Weber R. Moral foundations elicit shared and dissociable cortical activation modulated by political ideology. Nat Hum Behav 2023; 7:2182-2198. PMID: 37679440. DOI: 10.1038/s41562-023-01693-8.
Abstract
Moral foundations theory (MFT) holds that moral judgements are driven by modular and ideologically variable moral foundations, but where and how these foundations are represented in the brain, and how they are shaped by political beliefs, remain open questions. Using a moral vignette judgement task (n = 64), we probed the neural (dis)unity of moral foundations. Univariate analyses revealed that moral judgement of moral foundations, versus conventional norms, reliably recruits core areas implicated in theory of mind. Yet, multivariate pattern analysis demonstrated that each moral foundation elicits dissociable neural representations distributed throughout the cortex. As predicted by MFT, individuals' liberal or conservative orientation modulated neural responses to moral foundations. Our results confirm that each moral foundation recruits domain-general mechanisms of social cognition but also has a dissociable neural signature malleable by sociomoral experience. We discuss these findings in view of unified versus dissociable accounts of morality and their neurological support for MFT.
Affiliations
- Frederic R Hopp: Amsterdam School of Communication Research, University of Amsterdam, Amsterdam, the Netherlands
- Ori Amir: Pomona College, Claremont, CA, USA
- Jacob T Fisher: Department of Communication, Michigan State University, Lansing, MI, USA
- Scott Grafton: Department of Psychological & Brain Sciences, University of California, Santa Barbara, CA, USA
- René Weber: Department of Psychological & Brain Sciences, University of California, Santa Barbara, CA, USA; Department of Communication, Media Neuroscience Lab, University of California, Santa Barbara, CA, USA; School of Communication and Media, Ewha Womans University, Seoul, South Korea
18. Li C, Ficco L, Trapp S, Rostalski SM, Korn L, Kovács G. The effect of context congruency on fMRI repetition suppression for objects. Neuropsychologia 2023; 188:108603. PMID: 37270029. DOI: 10.1016/j.neuropsychologia.2023.108603.
Abstract
The recognition of objects is strongly facilitated when they are presented in the context of other objects (Biederman, 1972). Such contexts facilitate perception and induce expectations of context-congruent objects (Trapp and Bar, 2015). The neural mechanisms underlying these facilitatory effects of context on object processing, however, are not yet fully understood. In the present study, we investigated how context-induced expectations affect subsequent object processing, using functional magnetic resonance imaging and measuring repetition suppression (RS) as a proxy for prediction error processing. Participants viewed pairs of alternating or repeated object images preceded by context-congruent, context-incongruent, or neutral cues. We found stronger repetition suppression after congruent cues than after incongruent or neutral cues in the object-sensitive lateral occipital cortex. Interestingly, this stronger effect was driven by enhanced responses to alternating stimulus pairs in congruent contexts, rather than by suppressed responses to repeated stimulus pairs, which emphasizes the contribution of surprise-related response enhancement to the contextual modulation of RS when expectations are violated. In addition, in the congruent condition, we found significant functional connectivity between object-responsive and frontal cortical regions, as well as between object-responsive regions and the fusiform gyrus. Our findings indicate that prediction errors, reflected in enhanced brain responses to violated contextual expectations, underlie the facilitating effect of context during object perception.
Affiliations
- Chenglin Li: School of Psychology, Zhejiang Normal University, China; Department of Biological Psychology and Cognitive Neurosciences, Institute of Psychology, Friedrich-Schiller-Universität Jena, Germany
- Linda Ficco: Department of General Psychology and Cognitive Neuroscience, Institute of Psychology, Friedrich-Schiller-Universität Jena, Germany; Department of Linguistics and Cultural Evolution, International Max Planck Research School for the Science of Human History, Jena, Germany
- Sabrina Trapp: Macromedia University of Applied Sciences, Munich, Germany
- Sophie-Marie Rostalski: Department of Biological Psychology and Cognitive Neurosciences, Institute of Psychology, Friedrich-Schiller-Universität Jena, Germany
- Lukas Korn: Department of Biological Psychology and Cognitive Neurosciences, Institute of Psychology, Friedrich-Schiller-Universität Jena, Germany
- Gyula Kovács: Department of Biological Psychology and Cognitive Neurosciences, Institute of Psychology, Friedrich-Schiller-Universität Jena, Germany
19. Sagar V, Shanahan LK, Zelano CM, Gottfried JA, Kahnt T. High-precision mapping reveals the structure of odor coding in the human brain. Nat Neurosci 2023; 26:1595-1602. PMID: 37620443. PMCID: PMC10726579. DOI: 10.1038/s41593-023-01414-4.
Abstract
Odor perception is inherently subjective. Previous work has shown that odorous molecules evoke distributed activity patterns in olfactory cortices, but how these patterns map on to subjective odor percepts remains unclear. In the present study, we collected neuroimaging responses to 160 odors from 3 individual subjects (18 h per subject) to probe the neural coding scheme underlying idiosyncratic odor perception. We found that activity in the orbitofrontal cortex (OFC) represents the fine-grained perceptual identity of odors over and above coarsely defined percepts, whereas this difference is less pronounced in the piriform cortex (PirC) and amygdala. Furthermore, the implementation of perceptual encoding models enabled us to predict olfactory functional magnetic resonance imaging responses to new odors, revealing that the dimensionality of the encoded perceptual spaces increases from the PirC to the OFC. Whereas encoding of lower-order dimensions generalizes across subjects, encoding of higher-order dimensions is idiosyncratic. These results provide new insights into cortical mechanisms of odor coding and suggest that subjective olfactory percepts reside in the OFC.
Affiliations
- Vivek Sagar: Department of Neurology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Christina M Zelano: Department of Neurology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Jay A Gottfried: Department of Neurology, University of Pennsylvania, Philadelphia, PA, USA; Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- Thorsten Kahnt: National Institute on Drug Abuse Intramural Research Program, Baltimore, MD, USA
20. Emonds AMX, Srinath R, Nielsen KJ, Connor CE. Object representation in a gravitational reference frame. eLife 2023; 12:e81701. PMID: 37561119. PMCID: PMC10414968. DOI: 10.7554/elife.81701.
Abstract
When your head tilts laterally, as in sports, reaching, and resting, your eyes counterrotate by less than 20%, and thus the images on your retinas rotate over a total range of about 180°. Yet the world appears stable and vision remains normal. We discovered a neural strategy for rotational stability in anterior inferotemporal cortex (IT), the final stage of object vision in primates. We measured the object orientation tuning of IT neurons in macaque monkeys tilted +25° and -25° laterally, producing an ~40° difference in retinal image orientation. Among IT neurons with consistent object orientation tuning, 63% remained stable with respect to gravity across tilts. Gravitational tuning depended on vestibular/somatosensory but also visual cues, consistent with previous evidence that IT processes scene cues for the orientation of gravity. In addition to stability across image rotations, an internal gravitational reference frame is important for physical understanding of a world in which object position, posture, structure, shape, movement, and behavior interact critically with gravity.
Affiliations
- Alexandriya MX Emonds: Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, United States; Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, United States
- Ramanujan Srinath: Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, United States; Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, United States
- Kristina J Nielsen: Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, United States; Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, United States
- Charles E Connor: Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, United States; Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, United States
21. Steel A, Garcia BD, Goyal K, Mynick A, Robertson CE. Scene Perception and Visuospatial Memory Converge at the Anterior Edge of Visually Responsive Cortex. J Neurosci 2023; 43:5723-5737. PMID: 37474310. PMCID: PMC10401646. DOI: 10.1523/jneurosci.2043-22.2023.
Abstract
To fluidly engage with the world, our brains must simultaneously represent both the scene in front of us and our memory of the immediate surrounding environment (i.e., local visuospatial context). How does the brain's functional architecture enable sensory and mnemonic representations to closely interface while also avoiding sensory-mnemonic interference? Here, we asked this question using first-person, head-mounted virtual reality and fMRI. Using virtual reality, human participants of both sexes learned a set of immersive, real-world visuospatial environments in which we systematically manipulated the extent of visuospatial context associated with a scene image in memory across three learning conditions, spanning from a single FOV to a city street. We used individualized, within-subject fMRI to determine which brain areas support memory of the visuospatial context associated with a scene during recall (Experiment 1) and recognition (Experiment 2). Across the whole brain, activity in three patches of cortex was modulated by the amount of known visuospatial context, each located immediately anterior to one of the three scene perception areas of high-level visual cortex. Individual subject analyses revealed that these anterior patches corresponded to three functionally defined place memory areas, which selectively respond when visually recalling personally familiar places. In addition to showing activity levels that were modulated by the amount of visuospatial context, multivariate analyses showed that these anterior areas represented the identity of the specific environment being recalled. Together, these results suggest a convergence zone for scene perception and memory of the local visuospatial context at the anterior edge of high-level visual cortex. Significance Statement: As we move through the world, the visual scene around us is integrated with our memory of the wider visuospatial context. Here, we sought to understand how the functional architecture of the brain enables coexisting representations of the current visual scene and memory of the surrounding environment. Using a combination of immersive virtual reality and fMRI, we show that memory of visuospatial context outside the current FOV is represented in a distinct set of brain areas immediately anterior and adjacent to the perceptually oriented scene-selective areas of high-level visual cortex. This functional architecture would allow efficient interaction between immediately adjacent mnemonic and perceptual areas while also minimizing interference between mnemonic and perceptual representations.
Affiliations
- Adam Steel: Department of Psychological & Brain Sciences, Dartmouth College, Hanover, New Hampshire 03755
- Brenda D Garcia: Department of Psychological & Brain Sciences, Dartmouth College, Hanover, New Hampshire 03755
- Kala Goyal: Department of Psychological & Brain Sciences, Dartmouth College, Hanover, New Hampshire 03755
- Anna Mynick: Department of Psychological & Brain Sciences, Dartmouth College, Hanover, New Hampshire 03755
- Caroline E Robertson: Department of Psychological & Brain Sciences, Dartmouth College, Hanover, New Hampshire 03755
22. Meschke EX, Castello MVDO, la Tour TD, Gallant JL. Model connectivity: leveraging the power of encoding models to overcome the limitations of functional connectivity. bioRxiv 2023:2023.07.17.549356. PMID: 37503232. PMCID: PMC10370105. DOI: 10.1101/2023.07.17.549356.
Abstract
Functional connectivity (FC) is the most popular method for recovering functional networks of brain areas with fMRI. However, because FC is defined as temporal correlations in brain activity, FC networks are confounded by noise and lack a precise functional role. To overcome these limitations, we developed model connectivity (MC). MC is defined as similarities in encoding model weights, which quantify reliable functional activity in terms of interpretable stimulus- or task-related features. To compare FC and MC, both methods were applied to a naturalistic story listening dataset. FC recovered spatially broad networks that are confounded by noise, and that lack a clear role during natural language comprehension. By contrast, MC recovered spatially localized networks that are robust to noise, and that represent distinct categories of semantic concepts. Thus, MC is a powerful data-driven approach for recovering and interpreting the functional networks that support complex cognitive processes.
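A toy numpy contrast between the two definitions, assuming a simulated feature matrix and voxel time series (shapes invented; the paper's encoding models are far richer):

import numpy as np

rng = np.random.default_rng(0)
n_time, n_voxels, n_features = 300, 50, 20

X = rng.standard_normal((n_time, n_features))   # stimulus/task features
Y = rng.standard_normal((n_time, n_voxels))     # voxel responses

# Functional connectivity: temporal correlation between voxel pairs.
fc = np.corrcoef(Y.T)                           # (n_voxels, n_voxels)

# Model connectivity: fit a ridge encoding model per voxel, then
# correlate the learned weight vectors between voxel pairs.
lam = 10.0
W = np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ Y)
mc = np.corrcoef(W.T)                           # similarity of model weights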
Collapse
|
23
|
Henderson MM, Tarr MJ, Wehbe L. A Texture Statistics Encoding Model Reveals Hierarchical Feature Selectivity across Human Visual Cortex. J Neurosci 2023; 43:4144-4161. [PMID: 37127366 PMCID: PMC10255092 DOI: 10.1523/jneurosci.1822-22.2023] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2022] [Revised: 03/21/2023] [Accepted: 03/26/2023] [Indexed: 05/03/2023] Open
Abstract
Midlevel features, such as contour and texture, provide a computational link between low- and high-level visual representations. Although the nature of midlevel representations in the brain is not fully understood, past work has suggested that a texture statistics model, called the P-S model (Portilla and Simoncelli, 2000), is a candidate for predicting neural responses in areas V1-V4 as well as human behavioral data. However, it is not currently known how well this model accounts for the responses of higher visual cortex to natural scene images. To examine this, we constructed single-voxel encoding models based on P-S statistics and fit the models to fMRI data from human subjects (both sexes) from the Natural Scenes Dataset (Allen et al., 2022). We demonstrate that the texture statistics encoding model can predict the held-out responses of individual voxels in early retinotopic areas and higher-level category-selective areas. The ability of the model to reliably predict signal in higher visual cortex suggests that the representation of texture statistics features is widespread throughout the brain. Furthermore, using variance partitioning analyses, we identify which features are most uniquely predictive of brain responses and show that the contributions of higher-order texture features increase from early areas to higher areas on the ventral and lateral surfaces. We also demonstrate that patterns of sensitivity to texture statistics can be used to recover broad organizational axes within visual cortex, including dimensions that capture semantic image content. These results provide a key step forward in characterizing how midlevel feature representations emerge hierarchically across the visual system.

SIGNIFICANCE STATEMENT Intermediate visual features, like texture, play an important role in cortical computations and may contribute to tasks like object and scene recognition. Here, we used a texture model proposed in past work to construct encoding models that predict the responses of neural populations in human visual cortex (measured with fMRI) to natural scene stimuli. We show that responses of neural populations at multiple levels of the visual system can be predicted by this model, and that the model is able to reveal an increase in the complexity of feature representations from early retinotopic cortex to higher areas of ventral and lateral visual cortex. These results support the idea that texture-like representations may play a broad underlying role in visual processing.
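A minimal sketch of the variance partitioning logic, assuming two hypothetical feature sets and cross-validated R² as the variance-explained measure (all data simulated):

import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import r2_score

rng = np.random.default_rng(1)
n, p_low, p_high = 400, 10, 30            # hypothetical sample/feature counts

F_low = rng.standard_normal((n, p_low))   # lower-order features
F_high = rng.standard_normal((n, p_high)) # higher-order texture statistics
y = (F_low @ rng.standard_normal(p_low)
     + 0.5 * (F_high @ rng.standard_normal(p_high))
     + rng.standard_normal(n))            # simulated voxel response

def cv_r2(F, y):
    # Cross-validated prediction accuracy for one feature set.
    pred = cross_val_predict(RidgeCV(alphas=np.logspace(-2, 4, 10)), F, y, cv=5)
    return r2_score(y, pred)

r2_full = cv_r2(np.hstack([F_low, F_high]), y)
r2_low = cv_r2(F_low, y)
unique_high = r2_full - r2_low  # variance uniquely explained by higher-order features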
Collapse
Affiliation(s)
- Margaret M Henderson
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
- Department of Psychology
- Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
| | - Michael J Tarr
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
- Department of Psychology
- Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
| | - Leila Wehbe
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
- Department of Psychology
- Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
| |
Collapse
|
24
|
Ho JK, Horikawa T, Majima K, Cheng F, Kamitani Y. Inter-individual deep image reconstruction via hierarchical neural code conversion. Neuroimage 2023; 271:120007. [PMID: 36914105 DOI: 10.1016/j.neuroimage.2023.120007] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2022] [Revised: 02/26/2023] [Accepted: 03/07/2023] [Indexed: 03/13/2023] Open
Abstract
The sensory cortex is characterized by general organizational principles such as topography and hierarchy. However, measured brain activity given identical input exhibits substantially different patterns across individuals. Although anatomical and functional alignment methods have been proposed in functional magnetic resonance imaging (fMRI) studies, it remains unclear whether and how hierarchical and fine-grained representations can be converted between individuals while preserving the encoded perceptual content. In this study, we trained a functional alignment method called a neural code converter, which predicts a target subject's brain activity pattern from a source subject's pattern given the same stimulus, and analyzed the converted patterns by decoding hierarchical visual features and reconstructing perceived images. The converters were trained on fMRI responses to identical sets of natural images presented to pairs of individuals, using voxels in visual cortex spanning V1 through the ventral object areas, without explicit labels of the visual areas. We decoded the converted brain activity patterns into the hierarchical visual features of a deep neural network using decoders pre-trained on the target subject and then reconstructed images via the decoded features. Without explicit information about the visual cortical hierarchy, the converters automatically learned the correspondence between visual areas of the same levels. Deep neural network feature decoding at each layer showed higher decoding accuracies from corresponding levels of visual areas, indicating that hierarchical representations were preserved after conversion. The visual images were reconstructed with recognizable silhouettes of objects even with relatively small amounts of converter training data. The decoders trained on pooled data from multiple individuals through conversions led to a slight improvement over those trained on a single individual. These results demonstrate that hierarchical and fine-grained representations can be converted by functional alignment while preserving sufficient visual information to enable inter-individual visual image reconstruction.
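A minimal sketch of the converter's core step, a pairwise ridge regression from source-subject voxels to target-subject voxels over shared stimuli (all shapes simulated; the decoding and reconstruction stages are omitted):

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
n_stim, v_src, v_tgt = 500, 100, 120  # shared stimuli, voxel counts (hypothetical)

Y_src = rng.standard_normal((n_stim, v_src))        # source subject responses
Y_tgt = (Y_src @ rng.standard_normal((v_src, v_tgt))) * 0.1 \
        + rng.standard_normal((n_stim, v_tgt))      # simulated target responses

# Converter: predict every target voxel from the full source pattern.
converter = Ridge(alpha=100.0).fit(Y_src[:400], Y_tgt[:400])

# Converted patterns for held-out stimuli can then be fed to the
# target subject's pre-trained feature decoders.
Y_converted = converter.predict(Y_src[400:])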
Collapse
Affiliation(s)
- Jun Kai Ho
- Graduate School of Informatics, Kyoto University, Yoshida-honmachi, Sakyo-ku, Kyoto, 606-8501, Japan.
| | - Tomoyasu Horikawa
- Department of Neuroinformatics, ATR Computational Neuroscience Laboratories, Hikaridai, Seika, Soraku, Kyoto, 619-0288, Japan
| | - Kei Majima
- Graduate School of Informatics, Kyoto University, Yoshida-honmachi, Sakyo-ku, Kyoto, 606-8501, Japan
| | - Fan Cheng
- Graduate School of Informatics, Kyoto University, Yoshida-honmachi, Sakyo-ku, Kyoto, 606-8501, Japan; Department of Neuroinformatics, ATR Computational Neuroscience Laboratories, Hikaridai, Seika, Soraku, Kyoto, 619-0288, Japan
| | - Yukiyasu Kamitani
- Graduate School of Informatics, Kyoto University, Yoshida-honmachi, Sakyo-ku, Kyoto, 606-8501, Japan; Department of Neuroinformatics, ATR Computational Neuroscience Laboratories, Hikaridai, Seika, Soraku, Kyoto, 619-0288, Japan.
| |
Collapse
|
25
|
Lin R, Naselaris T, Kay K, Wehbe L. Stacked regressions and structured variance partitioning for interpretable brain maps. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.04.23.537988. [PMID: 37163111 PMCID: PMC10168225 DOI: 10.1101/2023.04.23.537988] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/11/2023]
Abstract
Relating brain activity associated with a complex stimulus to different properties of that stimulus is a powerful approach for constructing functional brain maps. However, when stimuli are naturalistic, their properties are often correlated (e.g., visual and semantic features of natural images, or different layers of a convolutional neural network that are used as features of images). Correlated properties can act as confounders for each other and complicate the interpretability of brain maps, and can impact the robustness of statistical estimators. Here, we present an approach for brain mapping based on two proposed methods: stacking different encoding models and structured variance partitioning. Our stacking algorithm combines encoding models that each use as input a feature space that describes a different stimulus attribute. The algorithm learns to predict the activity of a voxel as a linear combination of the outputs of different encoding models. We show that the resulting combined model can predict held-out brain activity better or at least as well as the individual encoding models. Further, the weights of the linear combination are readily interpretable; they show the importance of each feature space for predicting a voxel. We then build on our stacking models to introduce structured variance partitioning, a new type of variance partitioning that takes into account the known relationships between features. Our approach constrains the size of the hypothesis space and allows us to ask targeted questions about the similarity between feature spaces and brain regions even in the presence of correlations between the feature spaces. We validate our approach in simulation, showcase its brain mapping potential on fMRI data, and release a Python package. Our methods can be useful for researchers interested in aligning brain activity with different layers of a neural network, or with other types of correlated feature spaces.
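A rough sketch of the stacking idea for one voxel; the paper fits simplex-constrained combination weights, which non-negative least squares plus renormalization only approximates (data simulated):

import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(3)
n, K = 200, 3
y = rng.standard_normal(n)  # held-out voxel activity

# Held-out predictions from K encoding models, each built on a different
# feature space (simulated with increasing noise levels).
preds = np.stack([y + rng.standard_normal(n) * s for s in (0.5, 1.0, 2.0)], axis=1)

# Non-negative combination weights, renormalized to sum to 1; the weights
# index how important each feature space is for predicting this voxel.
w, _ = nnls(preds, y)
w = w / w.sum() if w.sum() > 0 else np.full(K, 1.0 / K)

stacked = preds @ w  # combined prediction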
Collapse
Affiliation(s)
- Ruogu Lin
- Computational Biology Department, Carnegie Mellon University
| | - Thomas Naselaris
- Department of Neuroscience, University of Minnesota
- Center for Magnetic Resonance Research (CMRR), Department of Radiology, University of Minnesota
| | - Kendrick Kay
- Center for Magnetic Resonance Research (CMRR), Department of Radiology, University of Minnesota
| | - Leila Wehbe
- Neuroscience Institute, Carnegie Mellon University
- Machine Learning Department, Carnegie Mellon University
| |
Collapse
|
26
|
Noah S, Meyyappan S, Ding M, Mangun GR. Time Courses of Attended and Ignored Object Representations. J Cogn Neurosci 2023; 35:645-658. [PMID: 36735619 PMCID: PMC10024573 DOI: 10.1162/jocn_a_01972] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]
Abstract
Selective attention prioritizes information that is relevant to behavioral goals. Previous studies have shown that attended visual information is processed and represented more efficiently, but distracting visual information is not fully suppressed, and may also continue to be represented in the brain. In natural vision, to-be-attended and to-be-ignored objects may be present simultaneously in the scene. Understanding precisely how each is represented in the visual system, and how these neural representations evolve over time, remains a key goal in cognitive neuroscience. In this study, we recorded EEG while participants performed a cued object-based attention task that involved attending to target objects and ignoring simultaneously presented and spatially overlapping distractor objects. We performed support vector machine classification on the stimulus-evoked EEG data to separately track the temporal dynamics of target and distractor representations. We found that (1) both target and distractor objects were decodable during the early phase of object processing (∼100 msec to ∼200 msec after target onset), and (2) the representations of both objects were sustained over time, remaining decodable above chance until ∼1000-msec latency. However, (3) the distractor object information faded significantly beginning after about 300-msec latency. These findings provide information about the fate of attended and ignored visual information in complex scene perception.
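A minimal sketch of time-resolved decoding of this kind, training a separate classifier at every time point; the trials x channels x time EEG array is simulated, not the study's data:

import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
n_trials, n_channels, n_times = 120, 64, 250  # hypothetical EEG dimensions
X = rng.standard_normal((n_trials, n_channels, n_times))
y = rng.integers(0, 2, n_trials)              # e.g., target object category

# Decode separately at each time point to trace the representation's
# time course; above-chance accuracy indicates decodable information.
accuracy = np.array([
    cross_val_score(LinearSVC(max_iter=5000), X[:, :, t], y, cv=5).mean()
    for t in range(n_times)
])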
Collapse
Affiliation(s)
- Sean Noah
- University of California, Davis; University of California, Berkeley
| | | | | | | |
Collapse
|
27
|
Cheng A, Chen Z, Dilks DD. A stimulus-driven approach reveals vertical luminance gradient as a stimulus feature that drives human cortical scene selectivity. Neuroimage 2023; 269:119935. [PMID: 36764369 PMCID: PMC10044493 DOI: 10.1016/j.neuroimage.2023.119935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2022] [Revised: 01/19/2023] [Accepted: 02/07/2023] [Indexed: 02/11/2023] Open
Abstract
Human neuroimaging studies have revealed a dedicated cortical system for visual scene processing. But what is a "scene"? Here, we use a stimulus-driven approach to identify a stimulus feature that selectively drives cortical scene processing. Specifically, using fMRI data from BOLD5000, we examined the images that elicited the greatest response in the cortical scene processing system, and found that there is a common "vertical luminance gradient" (VLG), with the top half of a scene image brighter than the bottom half; moreover, across the entire set of images, VLG systematically increases with the neural response in the scene-selective regions (Study 1). Thus, we hypothesized that VLG is a stimulus feature that selectively engages cortical scene processing, and directly tested the role of VLG in driving cortical scene selectivity using tightly controlled VLG stimuli (Study 2). Consistent with our hypothesis, we found that the scene-selective cortical regions, but not an object-selective region or early visual cortex, responded significantly more to images of VLG than to control stimuli with minimal VLG. Interestingly, such selectivity was also found for images with an "inverted" VLG, resembling the luminance gradient in night scenes. Finally, we also tested the behavioral relevance of VLG for visual scene recognition (Study 3); we found that participants even categorized tightly controlled stimuli of both upright and inverted VLG to be a place more than an object, indicating that VLG is also used for behavioral scene recognition. Taken together, these results reveal that VLG is a stimulus feature that selectively engages cortical scene processing, and provide evidence for a recent proposal that visual scenes can be characterized by a set of common and unique visual features.
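One plausible operationalization of VLG, using grayscale pixel intensity as the luminance measure; the function and the simple top/bottom split are illustrative, not the authors' exact definition:

import numpy as np
from PIL import Image

def vertical_luminance_gradient(path):
    # Mean luminance of the top half minus the bottom half of an image;
    # positive values mean the top half is brighter.
    img = np.asarray(Image.open(path).convert("L"), dtype=float)
    top, bottom = np.array_split(img, 2, axis=0)
    return top.mean() - bottom.mean()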
Collapse
Affiliation(s)
- Annie Cheng
- Department of Psychology, Emory University, Atlanta, GA, USA; Department of Psychiatry, Yale School of Medicine, New Haven, CT, USA
| | - Zirui Chen
- Department of Psychology, Emory University, Atlanta, GA, USA; Department of Cognitive Science, Johns Hopkins University, Baltimore, MD, USA
| | - Daniel D Dilks
- Department of Psychology, Emory University, Atlanta, GA, USA.
| |
Collapse
|
28
|
Vannuscorps G, Galaburda A, Caramazza A. From intermediate shape-centered representations to the perception of oriented shapes: response to commentaries. Cogn Neuropsychol 2023; 40:71-94. [PMID: 37642330 DOI: 10.1080/02643294.2023.2250511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2023] [Revised: 07/14/2023] [Accepted: 07/31/2023] [Indexed: 08/31/2023]
Abstract
In this response paper, we start by addressing the main points made by the commentators on the target article's main theoretical conclusions: the existence and characteristics of the intermediate shape-centered representations (ISCRs) in the visual system, their emergence from edge detection mechanisms operating on different types of visual properties, and how they are eventually reunited in higher order frames of reference underlying conscious visual perception. We also address the much-commented issue of the possible neural mechanisms of the ISCRs. In the final section, we address more specific and general comments, questions, and suggestions which, albeit very interesting, were less directly focused on the main conclusions of the target paper.
Collapse
Affiliation(s)
- Gilles Vannuscorps
- Department of Psychology, Harvard University, Cambridge, MA, USA
- Institute of Psychological Sciences, Université catholique de Louvain, Louvain-la-Neuve, Belgium
- Institute of Neuroscience, Université catholique de Louvain, Louvain-la-Neuve, Belgium
- Louvain Bionics, Université catholique de Louvain, Louvain-la-Neuve, Belgium
| | - Albert Galaburda
- Department of Neurology, Harvard Medical School and Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Alfonso Caramazza
- Department of Psychology, Harvard University, Cambridge, MA, USA
- Center for Mind/Brain Sciences (CIMeC), Università degli Studi di Trento, Rovereto, Italy
| |
Collapse
|
29
|
Mapping of Orthopaedic Fractures for Optimal Surgical Guidance. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2023; 1392:43-59. [PMID: 36460845 DOI: 10.1007/978-3-031-13021-2_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/04/2022]
Abstract
Orthopaedic fractures may be difficult to treat surgically if accurate information on the fracture propagation and its exit points is not known. Even with two-dimensional (2D) radiographic images, it is difficult to be completely certain of the exact location of the fracture site, the fracture propagation pattern and the exit points of the fracture. Three-dimensional (3D) computerised tomographic models are better in providing surgeons with the extent of bone fractures, but they may still not be sufficient to allow surgeons to plan open reduction and internal fixation (ORIF) surgery.

Fracture patterns and fracture maps are developed to be visual tools in 2D and 3D. These tools can be developed using fractured bones either before or after fracture reduction. Aside from being beneficial to surgeons during pre-surgical planning, these maps aid bioengineers who design fracture fixation plates and implants for these fractures, and they also serve to represent fracture classifications.

Fracture maps can be created either ex silico or in silico. Ex silico models are created using 3D printed bone models, onto which fracture patterns are marked. In silico fracture models are created by tracing the fracture lines from a fractured bone to a healthy bone template on a computer. The points of interest in both of these representations are the path of fracture propagation on the bone's surface and the exit zones, which eventually determine the surgeon's choice of plate and fracture reduction. Both ex silico and in silico fracture maps are used for pre-surgical planning, allowing surgeons to plan the best way to reduce the fracture as well as template various implants in a low-risk environment before performing the surgery.

Recently, fracture maps have been further digitised into heat maps. These heat maps provide visual representations of critical regions of fractures propagating through the bone and identify the weaker zones in the bone structure. They can allow engineers to develop optimal surgical plates to fix an array of fracture patterns propagating through the bone. Correlating fractured regions with mechanism of injury, age, gender, and other factors may improve fracture predictability in the future and optimise the intervention, while also helping to ensure that surgeons do not miss fractures of the bone that may otherwise be hidden from plain sight.
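A toy sketch of how in silico fracture lines might be aggregated into a heat map on a common template, assuming traced fracture coordinates are already registered to the template and flattened to a 2D grid (all coordinates invented):

import numpy as np
from scipy.ndimage import gaussian_filter

template = np.zeros((200, 200))  # flattened template surface (hypothetical)

# Each traced fracture is a list of (row, col) points on the template.
fracture_lines = [
    [(50, 10), (60, 40), (70, 90)],
    [(55, 15), (62, 45), (80, 95)],
]

for line in fracture_lines:
    for r, c in line:
        template[r, c] += 1  # accumulate fracture passage counts

# Smooth the counts to obtain a heat map of commonly fractured zones.
heatmap = gaussian_filter(template, sigma=3.0)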
Collapse
|
30
|
Steel A, Garcia BD, Silson EH, Robertson CE. Evaluating the efficacy of multi-echo ICA denoising on model-based fMRI. Neuroimage 2022; 264:119723. [PMID: 36328274 DOI: 10.1016/j.neuroimage.2022.119723] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2022] [Revised: 09/30/2022] [Accepted: 10/30/2022] [Indexed: 11/05/2022] Open
Abstract
fMRI is an indispensable tool for neuroscience investigation, but this technique is limited by multiple sources of physiological and measurement noise. These noise sources are particularly problematic for analysis techniques that require high signal-to-noise ratio for stable model fitting, such as voxel-wise modeling. Multi-echo data acquisition in combination with echo-time dependent ICA denoising (ME-ICA) represents one promising strategy to mitigate physiological and hardware-related noise sources as well as motion-related artifacts. However, most studies employing ME-ICA to date are resting-state fMRI studies, and therefore we have a limited understanding of the impact of ME-ICA on complex task or model-based fMRI paradigms. Here, we addressed this knowledge gap by comparing data quality and model fitting performance of data acquired during a visual population receptive field (pRF) mapping experiment (N = 13 participants) after applying one of three preprocessing procedures: ME-ICA, optimally combined multi-echo data without ICA-denoising, and typical single echo processing. As expected, multi-echo fMRI improved temporal signal-to-noise compared to single echo fMRI, with ME-ICA amplifying the improvement compared to optimal combination alone. However, unexpectedly, this boost in temporal signal-to-noise did not directly translate to improved model fitting performance: compared to single echo acquisition, model fitting was only improved after ICA-denoising. Specifically, compared to single echo acquisition, ME-ICA resulted in improved variance explained by our pRF model throughout the visual system, including anterior regions of the temporal and parietal lobes where SNR is typically low, while optimal combination without ICA did not. ME-ICA also improved reliability of parameter estimates compared to single echo and optimally combined multi-echo data without ICA-denoising. Collectively, these results suggest that ME-ICA is effective for denoising task-based fMRI data for modeling analyses and maintains the integrity of the original data. Therefore, ME-ICA may be beneficial for complex fMRI experiments, including voxel-wise modeling and naturalistic paradigms.
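Temporal SNR, the quantity compared across pipelines above, is just the voxelwise temporal mean divided by the temporal standard deviation; a minimal sketch with simulated arrays (shapes and noise levels invented):

import numpy as np

def tsnr(ts):
    # ts: (n_timepoints, n_voxels) array; tSNR = temporal mean / temporal SD.
    return ts.mean(axis=0) / ts.std(axis=0)

rng = np.random.default_rng(5)
signal = rng.standard_normal(100) * 0.5 + 10  # shared "true" time course
single_echo = signal[:, None] + rng.standard_normal((100, 1000)) * 2.0
denoised = signal[:, None] + rng.standard_normal((100, 1000)) * 1.0

print(tsnr(single_echo).mean(), tsnr(denoised).mean())  # denoised tSNR is higher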
Collapse
Affiliation(s)
- Adam Steel
- Department of Psychology and Brain Sciences, Dartmouth College, 3 Maynard Street, Hanover, NH 03755, US.
| | - Brenda D Garcia
- Department of Psychology and Brain Sciences, Dartmouth College, 3 Maynard Street, Hanover, NH 03755, US
| | - Edward H Silson
- Psychology, School of Philosophy, Psychology, and Language Sciences, University of Edinburgh, Edinburgh EH8 9JZ, UK
| | - Caroline E Robertson
- Department of Psychology and Brain Sciences, Dartmouth College, 3 Maynard Street, Hanover, NH 03755, US
| |
Collapse
|
31
|
Feature-space selection with banded ridge regression. Neuroimage 2022; 264:119728. [PMID: 36334814 PMCID: PMC9807218 DOI: 10.1016/j.neuroimage.2022.119728] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2022] [Revised: 10/05/2022] [Accepted: 10/31/2022] [Indexed: 11/09/2022] Open
Abstract
Encoding models provide a powerful framework to identify the information represented in brain recordings. In this framework, a stimulus representation is expressed within a feature space and is used in a regularized linear regression to predict brain activity. To account for a potential complementarity of different feature spaces, a joint model is fit on multiple feature spaces simultaneously. To adapt regularization strength to each feature space, ridge regression is extended to banded ridge regression, which optimizes a different regularization hyperparameter per feature space. The present paper proposes a method to decompose over feature spaces the variance explained by a banded ridge regression model. It also describes how banded ridge regression performs a feature-space selection, effectively ignoring non-predictive and redundant feature spaces. This feature-space selection leads to better prediction accuracy and to better interpretability. Banded ridge regression is then mathematically linked to a number of other regression methods with similar feature-space selection mechanisms. Finally, several methods are proposed to address the computational challenge of fitting banded ridge regressions on large numbers of voxels and feature spaces. All implementations are released in an open-source Python package called Himalaya.
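As a conceptual sketch, the banded ridge solution for fixed per-feature-space regularization values can be written in closed form; in practice the per-space hyperparameters are tuned (e.g., by cross-validation), and the Himalaya package released with the paper provides scalable solvers:

import numpy as np

def banded_ridge(Xs, y, lams):
    # Xs: list of (n, p_i) feature-space matrices; lams: one lambda per space.
    # Each feature space receives its own ridge penalty on the diagonal.
    X = np.hstack(Xs)
    penalties = np.concatenate(
        [np.full(Xi.shape[1], lam) for Xi, lam in zip(Xs, lams)]
    )
    return np.linalg.solve(X.T @ X + np.diag(penalties), X.T @ y)

rng = np.random.default_rng(9)
X1, X2 = rng.standard_normal((100, 5)), rng.standard_normal((100, 8))
y = rng.standard_normal(100)
w = banded_ridge([X1, X2], y, lams=[10.0, 1000.0])  # heavier penalty on space 2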
Collapse
|
32
|
Park J, Josephs E, Konkle T. Ramp-shaped neural tuning supports graded population-level representation of the object-to-scene continuum. Sci Rep 2022; 12:18081. [PMID: 36302932 PMCID: PMC9613906 DOI: 10.1038/s41598-022-21768-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2022] [Accepted: 09/30/2022] [Indexed: 01/24/2023] Open
Abstract
We can easily perceive the spatial scale depicted in a picture, regardless of whether it is a small space (e.g., a close-up view of a chair) or a much larger space (e.g., an entire classroom). How does the human visual system encode this continuous dimension? Here, we investigated the underlying neural coding of depicted spatial scale, by examining the voxel tuning and topographic organization of brain responses. We created naturalistic yet carefully-controlled stimuli by constructing virtual indoor environments, and rendered a series of snapshots to smoothly sample between a close-up view of the central object and a far-scale view of the full environment (object-to-scene continuum). Human brain responses were measured to each position using functional magnetic resonance imaging. We did not find evidence for a smooth topographic mapping for the object-to-scene continuum on the cortex. Instead, we observed large swaths of cortex with opposing ramp-shaped profiles, with highest responses to one end of the object-to-scene continuum or the other, and a small region showing a weak tuning to intermediate scale views. However, when we considered the population code of the entire ventral occipito-temporal cortex, we found smooth and linear representation of the object-to-scene continuum. Our results together suggest that depicted spatial scale information is encoded parametrically in large-scale population codes across the entire ventral occipito-temporal cortex.
Collapse
Affiliation(s)
- Jeongho Park
- Department of Psychology, Harvard University, Cambridge, USA.
| | - Emilie Josephs
- Computer Science & Artificial Intelligence Lab, Massachusetts Institute of Technology, Cambridge, USA
| | - Talia Konkle
- Department of Psychology, Harvard University, Cambridge, USA
| |
Collapse
|
33
|
Abstract
Humans are exquisitely sensitive to the spatial arrangement of visual features in objects and scenes, but not in visual textures. Category-selective regions in the visual cortex are widely believed to underlie object perception, suggesting such regions should distinguish natural images of objects from synthesized images containing similar visual features in scrambled arrangements. Contrarily, we demonstrate that representations in category-selective cortex do not discriminate natural images from feature-matched scrambles but can discriminate images of different categories, suggesting a texture-like encoding. We find similar insensitivity to feature arrangement in Imagenet-trained deep convolutional neural networks. This suggests the need to reconceptualize the role of category-selective cortex as representing a basis set of complex texture-like features, useful for a myriad of behaviors.

The human visual ability to recognize objects and scenes is widely thought to rely on representations in category-selective regions of the visual cortex. These representations could support object vision by specifically representing objects, or, more simply, by representing complex visual features regardless of the particular spatial arrangement needed to constitute real-world objects, that is, by representing visual textures. To discriminate between these hypotheses, we leveraged an image synthesis approach that, unlike previous methods, provides independent control over the complexity and spatial arrangement of visual features. We found that human observers could easily detect a natural object among synthetic images with similar complex features that were spatially scrambled. However, observer models built from BOLD responses from category-selective regions, as well as a model of macaque inferotemporal cortex and Imagenet-trained deep convolutional neural networks, were all unable to identify the real object. This inability was not due to a lack of signal to noise, as all observer models could predict human performance in image categorization tasks. How then might these texture-like representations in category-selective regions support object perception? An image-specific readout from category-selective cortex yielded a representation that was more selective for natural feature arrangement, showing that the information necessary for natural object discrimination is available. Thus, our results suggest that the role of the human category-selective visual cortex is not to explicitly encode objects but rather to provide a basis set of texture-like features that can be infinitely reconfigured to flexibly learn and identify new object categories.
Collapse
|
34
|
Häusler CO, Eickhoff SB, Hanke M. Processing of visual and non-visual naturalistic spatial information in the "parahippocampal place area". Sci Data 2022; 9:147. [PMID: 35365659 PMCID: PMC8975992 DOI: 10.1038/s41597-022-01250-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2021] [Accepted: 02/14/2022] [Indexed: 11/09/2022] Open
Abstract
The "parahippocampal place area" (PPA) in the human ventral visual stream exhibits increased hemodynamic activity correlated with the perception of landscape photos compared to faces or objects. Here, we investigate the perception of scene-related, spatial information embedded in two naturalistic stimuli. The same 14 participants were watching a Hollywood movie and listening to its audio-description as part of the open-data resource studyforrest.org. We model hemodynamic activity based on annotations of selected stimulus features, and compare results to a block-design visual localizer. On a group level, increased activation correlating with visual spatial information occurring in the movie is overlapping with a traditionally localized PPA. Activation correlating with semantic spatial information occurring in the audio-description is more restricted to the anterior PPA. On an individual level, we find significant bilateral activity in the PPA of nine individuals and unilateral activity in one individual. Results suggest that activation in the PPA generalizes to spatial information embedded in a movie and an auditory narrative, and may call for considering a functional subdivision of the PPA.
Collapse
Affiliation(s)
- Christian O Häusler
- Psychoinformatics Lab, Institute of Neuroscience and Medicine, Brain & Behaviour (INM-7), Research Centre Jülich, Jülich, Germany; Institute of Systems Neuroscience, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany.
| | - Simon B Eickhoff
- Psychoinformatics Lab, Institute of Neuroscience and Medicine, Brain & Behaviour (INM-7), Research Centre Jülich, Jülich, Germany; Institute of Systems Neuroscience, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
| | - Michael Hanke
- Psychoinformatics Lab, Institute of Neuroscience and Medicine, Brain & Behaviour (INM-7), Research Centre Jülich, Jülich, Germany; Institute of Systems Neuroscience, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
| |
Collapse
|
35
|
Silson EH, Morland AB. The search for shape-centered representations. Cogn Neuropsychol 2022; 39:85-87. [PMID: 35337256 DOI: 10.1080/02643294.2022.2052718] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/03/2022]
Affiliation(s)
- Edward H Silson
- Department of Psychology, School of Philosophy, Psychology and Language Sciences, The University of Edinburgh, Edinburgh, UK
| | - Antony B Morland
- Department of Psychology, University of York, Heslington, UK; York NeuroImaging Centre, The Biocentre, York Science Park, Heslington, UK
| |
Collapse
|
36
|
Wilder J, Rezanejad M, Dickinson S, Siddiqi K, Jepson A, Walther DB. Neural correlates of local parallelism during naturalistic vision. PLoS One 2022; 17:e0260266. [PMID: 35061699 PMCID: PMC8782314 DOI: 10.1371/journal.pone.0260266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2021] [Accepted: 11/07/2021] [Indexed: 11/18/2022] Open
Abstract
Human observers can rapidly perceive complex real-world scenes. Grouping visual elements into meaningful units is an integral part of this process. Yet, so far, the neural underpinnings of perceptual grouping have only been studied with simple lab stimuli. We here uncover the neural mechanisms of one important perceptual grouping cue, local parallelism. Using a new, image-computable algorithm for detecting local symmetry in line drawings and photographs, we manipulated the local parallelism content of real-world scenes. We decoded scene categories from patterns of brain activity obtained via functional magnetic resonance imaging (fMRI) in 38 human observers while they viewed the manipulated scenes. Decoding was significantly more accurate for scenes containing strong local parallelism compared to weak local parallelism in the parahippocampal place area (PPA), indicating a central role of parallelism in scene perception. To investigate the origin of the parallelism signal we performed a model-based fMRI analysis of the public BOLD5000 dataset, looking for voxels whose activation time course matches that of the locally parallel content of the 4916 photographs viewed by the participants in the experiment. We found a strong relationship with average local symmetry in visual areas V1-4, PPA, and retrosplenial cortex (RSC). Notably, the parallelism-related signal peaked first in V4, suggesting V4 as the site for extracting parallelism from the visual input. We conclude that local parallelism is a perceptual grouping cue that influences neuronal activity throughout the visual hierarchy, presumably starting at V4. Parallelism plays a key role in the representation of scene categories in PPA.
Collapse
Affiliation(s)
| | - Morteza Rezanejad
- University of Toronto, Toronto, Canada
- McGill University, Montreal, Canada
| | - Sven Dickinson
- University of Toronto, Toronto, Canada
- Samsung Toronto AI Research Center, Toronto, Canada
- Vector Institute, Toronto, Canada
| | | | - Allan Jepson
- University of Toronto, Toronto, Canada
- Samsung Toronto AI Research Center, Toronto, Canada
| | | |
Collapse
|
37
|
Harel A, Nador JD, Bonner MF, Epstein RA. Early Electrophysiological Markers of Navigational Affordances in Scenes. J Cogn Neurosci 2021; 34:397-410. [PMID: 35015877 DOI: 10.1162/jocn_a_01810] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Scene perception and spatial navigation are interdependent cognitive functions, and there is increasing evidence that cortical areas that process perceptual scene properties also carry information about the potential for navigation in the environment (navigational affordances). However, the temporal stages by which visual information is transformed into navigationally relevant information are not yet known. We hypothesized that navigational affordances are encoded during perceptual processing and therefore should modulate early visually evoked ERPs, especially the scene-selective P2 component. To test this idea, we recorded ERPs from participants while they passively viewed computer-generated room scenes matched in visual complexity. By simply changing the number of doors (no doors, 1 door, 2 doors, 3 doors), we were able to systematically vary the number of pathways that afford movement in the local environment, while keeping the overall size and shape of the environment constant. We found that rooms with no doors evoked a higher P2 response than rooms with three doors, consistent with prior research reporting higher P2 amplitude to closed relative to open scenes. Moreover, we found P2 amplitude scaled linearly with the number of doors in the scenes. Navigability effects on the ERP waveform were also observed in a multivariate analysis, which showed significant decoding of the number of doors and their location at earlier time windows. Together, our results suggest that navigational affordances are represented in the early stages of scene perception. This complements research showing that the occipital place area automatically encodes the structure of navigable space and strengthens the link between scene perception and navigation.
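As a rough illustration of the reported linear scaling of P2 amplitude with the number of doors (the amplitudes below are invented, not the study's data):

import numpy as np
from scipy.stats import linregress

# Hypothetical mean P2 amplitudes (µV) per condition: 0, 1, 2, 3 doors.
doors = np.array([0, 1, 2, 3])
p2_amplitude = np.array([5.1, 4.6, 4.2, 3.9])

fit = linregress(doors, p2_amplitude)  # slope tests the linear trend
print(f"slope = {fit.slope:.2f} µV/door, r = {fit.rvalue:.2f}")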
Collapse
|
38
|
Keles U, Lin C, Adolphs R. A Cautionary Note on Predicting Social Judgments from Faces with Deep Neural Networks. AFFECTIVE SCIENCE 2021; 2:438-454. [PMID: 34966898 PMCID: PMC8664800 DOI: 10.1007/s42761-021-00075-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/22/2021] [Accepted: 08/23/2021] [Indexed: 11/27/2022]
Abstract
People spontaneously infer other people's psychology from faces, encompassing inferences of their affective states, cognitive states, and stable traits such as personality. These judgments are known to be often invalid, but nonetheless bias many social decisions. Their importance and ubiquity have made them popular targets for automated prediction using deep convolutional neural networks (DCNNs). Here, we investigated the applicability of this approach: how well does it generalize, and what biases does it introduce? We compared three distinct sets of features (from a face identification DCNN, an object recognition DCNN, and using facial geometry), and tested their prediction across multiple out-of-sample datasets. Across judgments and datasets, features from both pre-trained DCNNs provided better predictions than did facial geometry. However, predictions using object recognition DCNN features were not robust to superficial cues (e.g., color and hair style). Importantly, predictions using face identification DCNN features were not specific: models trained to predict one social judgment (e.g., trustworthiness) also significantly predicted other social judgments (e.g., femininity and criminal), and at an even higher accuracy in some cases than predicting the judgment of interest (e.g., trustworthiness). Models trained to predict affective states (e.g., happy) also significantly predicted judgments of stable traits (e.g., sociable), and vice versa. Our analysis pipeline not only provides a flexible and efficient framework for predicting affective and social judgments from faces but also highlights the dangers of such automated predictions: correlated but unintended judgments can drive the predictions of the intended judgments.
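A sketch of the specificity test described above, assuming hypothetical DCNN embeddings and simulated, deliberately correlated ratings (all names and data invented):

import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(6)
n_faces, n_feat = 300, 128
feats = rng.standard_normal((n_faces, n_feat))       # e.g., face-DCNN embeddings
shared = feats @ rng.standard_normal(n_feat)         # latent shared dimension
trustworthy = shared + rng.standard_normal(n_faces)  # simulated correlated ratings
femininity = 0.8 * shared + rng.standard_normal(n_faces)

model = RidgeCV(alphas=np.logspace(-2, 4, 10)).fit(feats[:200], trustworthy[:200])

# Specificity check: does a trustworthiness model also predict another judgment?
r_intended = np.corrcoef(model.predict(feats[200:]), trustworthy[200:])[0, 1]
r_unintended = np.corrcoef(model.predict(feats[200:]), femininity[200:])[0, 1]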
Collapse
Affiliation(s)
- Umit Keles
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, CA USA
| | - Chujun Lin
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH USA
| | - Ralph Adolphs
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, CA USA
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA USA
| |
Collapse
|
39
|
Foster JJ, Ling S. Normalizing population receptive fields. Proc Natl Acad Sci U S A 2021; 118:e2118367118. [PMID: 34789580 PMCID: PMC8617414 DOI: 10.1073/pnas.2118367118] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 10/20/2021] [Indexed: 11/18/2022] Open
Affiliation(s)
- Joshua J Foster
- Department of Psychological and Brain Sciences, Boston University, Boston, MA 02215
- Center for Systems Neuroscience, Boston University, Boston, MA 02215
| | - Sam Ling
- Department of Psychological and Brain Sciences, Boston University, Boston, MA 02215;
- Center for Systems Neuroscience, Boston University, Boston, MA 02215
| |
Collapse
|
40
|
Direct comparison of contralateral bias and face/scene selectivity in human occipitotemporal cortex. Brain Struct Funct 2021; 227:1405-1421. [PMID: 34727232 PMCID: PMC9046350 DOI: 10.1007/s00429-021-02411-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2021] [Accepted: 10/08/2021] [Indexed: 10/27/2022]
Abstract
Human visual cortex is organised broadly according to two major principles: retinotopy (the spatial mapping of the retina in cortex) and category-selectivity (preferential responses to specific categories of stimuli). Historically, these principles were considered anatomically separate, with retinotopy restricted to the occipital cortex and category-selectivity emerging in the lateral-occipital and ventral-temporal cortex. However, recent studies show that category-selective regions exhibit systematic retinotopic biases, for example exhibiting stronger activation for stimuli presented in the contra- compared to the ipsilateral visual field. It is unclear, however, whether responses within category-selective regions are more strongly driven by retinotopic location or by category preference, and if there are systematic differences between category-selective regions in the relative strengths of these preferences. Here, we directly compare contralateral and category preferences by measuring fMRI responses to scene and face stimuli presented in the left or right visual field and computing two bias indices: a contralateral bias (response to the contralateral minus ipsilateral visual field) and a face/scene bias (preferred response to scenes compared to faces, or vice versa). We compare these biases within and between scene- and face-selective regions and across the lateral and ventral surfaces of the visual cortex more broadly. We find an interaction between surface and bias: lateral surface regions show a stronger contralateral than face/scene bias, whilst ventral surface regions show the opposite. These effects are robust across and within subjects, and appear to reflect large-scale, smoothly varying gradients. Together, these findings support distinct functional roles for the lateral and ventral visual cortex in terms of the relative importance of the spatial location of stimuli during visual information processing.
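The two indices are simple condition arithmetic; a toy numpy illustration with invented per-vertex responses, assuming responses are already averaged within condition:

import numpy as np

# Hypothetical condition-wise responses; rows index vertices/voxels.
resp = {
    "scene_contra": np.array([1.2, 0.9, 1.5]),
    "scene_ipsi":   np.array([0.7, 0.6, 0.9]),
    "face_contra":  np.array([0.4, 0.8, 0.3]),
    "face_ipsi":    np.array([0.2, 0.5, 0.2]),
}

# Contralateral bias: contralateral minus ipsilateral, averaged over category.
contra_bias = ((resp["scene_contra"] - resp["scene_ipsi"])
               + (resp["face_contra"] - resp["face_ipsi"])) / 2

# Face/scene bias: scene minus face, averaged over visual field
# (positive values indicate scene-preferring vertices).
scene_face_bias = ((resp["scene_contra"] - resp["face_contra"])
                   + (resp["scene_ipsi"] - resp["face_ipsi"])) / 2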
Collapse
|
41
|
Abstract
During natural vision, our brains are constantly exposed to complex, but regularly structured environments. Real-world scenes are defined by typical part-whole relationships, where the meaning of the whole scene emerges from configurations of localized information present in individual parts of the scene. Such typical part-whole relationships suggest that information from individual scene parts is not processed independently, but that there are mutual influences between the parts and the whole during scene analysis. Here, we review recent research that used a straightforward, but effective approach to study such mutual influences: By dissecting scenes into multiple arbitrary pieces, these studies provide new insights into how the processing of whole scenes is shaped by their constituent parts and, conversely, how the processing of individual parts is determined by their role within the whole scene. We highlight three facets of this research: First, we discuss studies demonstrating that the spatial configuration of multiple scene parts has a profound impact on the neural processing of the whole scene. Second, we review work showing that cortical responses to individual scene parts are shaped by the context in which these parts typically appear within the environment. Third, we discuss studies demonstrating that missing scene parts are interpolated from the surrounding scene context. Bridging these findings, we argue that efficient scene processing relies on an active use of the scene's part-whole structure, where the visual brain matches scene inputs with internal models of what the world should look like.
Collapse
Affiliation(s)
- Daniel Kaiser
- Justus-Liebig-Universität Gießen, Germany; Philipps-Universität Marburg, Germany; University of York, United Kingdom
| | - Radoslaw M Cichy
- Freie Universität Berlin, Germany; Humboldt-Universität zu Berlin, Germany; Bernstein Centre for Computational Neuroscience Berlin, Germany
| |
Collapse
|
42
|
Daube C, Xu T, Zhan J, Webb A, Ince RA, Garrod OG, Schyns PG. Grounding deep neural network predictions of human categorization behavior in understandable functional features: The case of face identity. PATTERNS (NEW YORK, N.Y.) 2021; 2:100348. [PMID: 34693374 PMCID: PMC8515012 DOI: 10.1016/j.patter.2021.100348] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/14/2020] [Revised: 11/30/2020] [Accepted: 08/20/2021] [Indexed: 01/24/2023]
Abstract
Deep neural networks (DNNs) can resolve real-world categorization tasks with apparent human-level performance. However, true equivalence of behavioral performance between humans and their DNN models requires that their internal mechanisms process equivalent features of the stimulus. To develop such feature equivalence, our methodology leveraged an interpretable and experimentally controlled generative model of the stimuli (realistic three-dimensional textured faces). Humans rated the similarity of randomly generated faces to four familiar identities. We predicted these similarity ratings from the activations of five DNNs trained with different optimization objectives. Using information theoretic redundancy, reverse correlation, and the testing of generalization gradients, we show that DNN predictions of human behavior improve because their shape and texture features overlap with those that subsume human behavior. Thus, we must equate the functional features that subsume the behavioral performances of the brain and its models before comparing where, when, and how these features are processed.
Collapse
Affiliation(s)
- Christoph Daube
- Institute of Neuroscience and Psychology, University of Glasgow, 62 Hillhead Street, Glasgow G12 8QB, Scotland, UK
| | - Tian Xu
- Department of Computer Science and Technology, University of Cambridge, 15 JJ Thomson Avenue, Cambridge CB3 0FD, England, UK
| | - Jiayu Zhan
- Institute of Neuroscience and Psychology, University of Glasgow, 62 Hillhead Street, Glasgow G12 8QB, Scotland, UK
| | - Andrew Webb
- Institute of Neuroscience and Psychology, University of Glasgow, 62 Hillhead Street, Glasgow G12 8QB, Scotland, UK
| | - Robin A.A. Ince
- Institute of Neuroscience and Psychology, University of Glasgow, 62 Hillhead Street, Glasgow G12 8QB, Scotland, UK
| | - Oliver G.B. Garrod
- Institute of Neuroscience and Psychology, University of Glasgow, 62 Hillhead Street, Glasgow G12 8QB, Scotland, UK
| | - Philippe G. Schyns
- Institute of Neuroscience and Psychology, University of Glasgow, 62 Hillhead Street, Glasgow G12 8QB, Scotland, UK
| |
Collapse
|
43
|
Pezzulo G, Zorzi M, Corbetta M. The secret life of predictive brains: what's spontaneous activity for? Trends Cogn Sci 2021; 25:730-743. [PMID: 34144895 PMCID: PMC8363551 DOI: 10.1016/j.tics.2021.05.007] [Citation(s) in RCA: 84] [Impact Index Per Article: 21.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/26/2020] [Revised: 05/14/2021] [Accepted: 05/19/2021] [Indexed: 01/23/2023]
Abstract
Brains at rest generate dynamical activity that is highly structured in space and time. We suggest that spontaneous activity, as in rest or dreaming, underlies top-down dynamics of generative models. During active tasks, generative models provide top-down predictive signals for perception, cognition, and action. When the brain is at rest and stimuli are weak or absent, top-down dynamics optimize the generative models for future interactions by maximizing the entropy of explanations and minimizing model complexity. Spontaneous fluctuations of correlated activity within and across brain regions may reflect transitions between 'generic priors' of the generative model: low dimensional latent variables and connectivity patterns of the most common perceptual, motor, cognitive, and interoceptive states. Even at rest, brains are proactive and predictive.
Collapse
Affiliation(s)
- Giovanni Pezzulo
- Institute of Cognitive Sciences and Technologies, National Research Council, Roma, Italy.
| | - Marco Zorzi
- Department of General Psychology and Padova Neuroscience Center (PNC), University of Padova, Padova, Italy; IRCCS San Camillo Hospital, Venice, Italy
| | - Maurizio Corbetta
- Department of Neuroscience and Padova Neuroscience Center (PNC), University of Padova, Padova, Italy; Venetian Institute of Molecular Medicine (VIMM), Fondazione Biomedica, Padova, Italy
| |
Collapse
|
44
|
Çelik E, Keles U, Kiremitçi İ, Gallant JL, Çukur T. Cortical networks of dynamic scene category representation in the human brain. Cortex 2021; 143:127-147. [PMID: 34411847 DOI: 10.1016/j.cortex.2021.07.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2020] [Revised: 06/28/2021] [Accepted: 07/14/2021] [Indexed: 10/20/2022]
Abstract
Humans have an impressive ability to rapidly process global information in natural scenes to infer their category. Yet, it remains unclear whether and how scene categories observed dynamically in the natural world are represented in cerebral cortex beyond few canonical scene-selective areas. To address this question, here we examined the representation of dynamic visual scenes by recording whole-brain blood oxygenation level-dependent (BOLD) responses while subjects viewed natural movies. We fit voxelwise encoding models to estimate tuning for scene categories that reflect statistical ensembles of objects and actions in the natural world. We find that this scene-category model explains a significant portion of the response variance broadly across cerebral cortex. Cluster analysis of scene-category tuning profiles across cortex reveals nine spatially-segregated networks of brain regions consistently across subjects. These networks show heterogeneous tuning for a diverse set of dynamic scene categories related to navigation, human activity, social interaction, civilization, natural environment, non-human animals, motion-energy, and texture, suggesting that the organization of scene category representation is quite complex.
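A sketch of the clustering step, assuming voxelwise category tuning vectors from an encoding model (simulated here) and using the nine-network solution reported above as the cluster count:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize

rng = np.random.default_rng(7)
n_voxels, n_categories = 5000, 30
tuning = rng.standard_normal((n_voxels, n_categories))  # encoding-model weights

# L2-normalize each voxel's tuning profile so clusters reflect tuning
# shape rather than overall response gain.
profiles = normalize(tuning)
labels = KMeans(n_clusters=9, n_init=10, random_state=0).fit_predict(profiles)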
Collapse
Affiliation(s)
- Emin Çelik
- Neuroscience Program, Sabuncu Brain Research Center, Bilkent University, Ankara, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara, Turkey.
| | - Umit Keles
- National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara, Turkey; Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA
| | - İbrahim Kiremitçi
- Neuroscience Program, Sabuncu Brain Research Center, Bilkent University, Ankara, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara, Turkey
| | - Jack L Gallant
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, USA; Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA; Department of Psychology, University of California, Berkeley, CA, USA
| | - Tolga Çukur
- Neuroscience Program, Sabuncu Brain Research Center, Bilkent University, Ankara, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara, Turkey; Department of Electrical and Electronics Engineering, Bilkent University, Ankara, Turkey
| |
Collapse
|
45
|
Dwivedi K, Bonner MF, Cichy RM, Roig G. Unveiling functions of the visual cortex using task-specific deep neural networks. PLoS Comput Biol 2021; 17:e1009267. [PMID: 34388161 PMCID: PMC8407579 DOI: 10.1371/journal.pcbi.1009267] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2021] [Revised: 08/31/2021] [Accepted: 07/11/2021] [Indexed: 11/20/2022] Open
Abstract
The human visual cortex enables visual perception through a cascade of hierarchical computations in cortical regions with distinct functionalities. Here, we introduce an AI-driven approach to discover the functional mapping of the visual cortex. We related human brain responses to scene images measured with functional MRI (fMRI) systematically to a diverse set of deep neural networks (DNNs) optimized to perform different scene perception tasks. We found a structured mapping between DNN tasks and brain regions along the ventral and dorsal visual streams. Low-level visual tasks mapped onto early brain regions, 3-dimensional scene perception tasks mapped onto the dorsal stream, and semantic tasks mapped onto the ventral stream. This mapping was of high fidelity, with more than 60% of the explainable variance in nine key regions being explained. Together, our results provide a novel functional mapping of the human visual cortex and demonstrate the power of the computational approach.
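A sketch of one way to realize such a task-to-region mapping: score each task-optimized network's features by cross-validated variance explained in a region, then assign the region the best-predicting task (feature names and data are invented, and the authors' actual pipeline may differ):

import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import r2_score

rng = np.random.default_rng(8)
n_images, n_voxels = 500, 200
brain = rng.standard_normal((n_images, n_voxels))  # ROI responses to scenes

# Features from DNNs optimized for different tasks (hypothetical activations).
task_features = {
    "edges_2d": rng.standard_normal((n_images, 64)),
    "depth_3d": rng.standard_normal((n_images, 64)),
    "semantic": rng.standard_normal((n_images, 64)),
}

scores = {}
for task, F in task_features.items():
    pred = cross_val_predict(RidgeCV(alphas=np.logspace(-2, 4, 7)), F, brain, cv=5)
    scores[task] = r2_score(brain, pred, multioutput="uniform_average")

best_task = max(scores, key=scores.get)  # the region's best-matching task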
Collapse
Affiliation(s)
- Kshitij Dwivedi
- Department of Education and Psychology, Freie Universität Berlin, Germany
- Department of Computer Science, Goethe University, Frankfurt am Main, Germany
| | - Michael F. Bonner
- Department of Cognitive Science, Johns Hopkins University, Baltimore, Maryland, United States of America
| | | | - Gemma Roig
- Department of Computer Science, Goethe University, Frankfurt am Main, Germany
| |
Collapse
|
47
|
Wehbe L, Blank IA, Shain C, Futrell R, Levy R, von der Malsburg T, Smith N, Gibson E, Fedorenko E. Incremental Language Comprehension Difficulty Predicts Activity in the Language Network but Not the Multiple Demand Network. Cereb Cortex 2021; 31:4006-4023. [PMID: 33895807 PMCID: PMC8328211 DOI: 10.1093/cercor/bhab065] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2020] [Revised: 01/15/2021] [Accepted: 02/21/2021] [Indexed: 12/28/2022] Open
Abstract
What role do domain-general executive functions play in human language comprehension? To address this question, we examine the relationship between behavioral measures of comprehension and neural activity in the domain-general "multiple demand" (MD) network, which has been linked to constructs like attention, working memory, inhibitory control, and selection, and implicated in diverse goal-directed behaviors. Specifically, functional magnetic resonance imaging data collected during naturalistic story listening are compared with theory-neutral measures of online comprehension difficulty and incremental processing load (reading times and eye-fixation durations). Critically, to ensure that variance in these measures is driven by features of the linguistic stimulus rather than reflecting participant- or trial-level variability, the neuroimaging and behavioral datasets were collected in nonoverlapping samples. We find no behavioral-neural link in functionally localized MD regions; instead, this link is found in the domain-specific, fronto-temporal "core language network," in both left-hemispheric areas and their right hemispheric homotopic areas. These results argue against strong involvement of domain-general executive circuits in language comprehension.
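The central analysis asks whether a stimulus-locked difficulty regressor, built from behavioral reading times and convolved with a hemodynamic response function, predicts BOLD activity in language versus MD regions. The sketch below simulates that regressor construction and ROI comparison end to end; the HRF parameters, word onsets, and effect sizes are illustrative assumptions, not the paper's values.

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(1)

TR = 2.0          # seconds per fMRI volume
n_trs = 300       # length of the scan
dt = 0.1          # fine temporal grid for the stimulus stream

# Hypothetical per-word difficulty signal (e.g., reading times collected in
# an independent behavioral sample), placed at word onsets on the fine grid.
fine_len = round(n_trs * TR / dt)
difficulty = np.zeros(fine_len)
onsets = rng.choice(fine_len, size=800, replace=False)
difficulty[onsets] = rng.lognormal(mean=0.0, sigma=0.3, size=800)

def double_gamma_hrf(dt, duration=32.0):
    """Canonical double-gamma hemodynamic response function."""
    t = np.arange(0, duration, dt)
    h = gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0   # peak minus undershoot
    return h / h.max()

# Convolve the difficulty signal with the HRF, then sample at each TR.
pred = np.convolve(difficulty, double_gamma_hrf(dt))[:fine_len]
pred_tr = pred[::round(TR / dt)][:n_trs]

# Hypothetical ROI time courses: a "language" ROI tracking difficulty and an
# "MD" ROI that does not, mirroring the reported dissociation.
lang_roi = 0.8 * pred_tr + rng.standard_normal(n_trs)
md_roi = rng.standard_normal(n_trs)

for name, ts in [("language", lang_roi), ("MD", md_roi)]:
    r = np.corrcoef(pred_tr, ts)[0, 1]
    print(f"{name} ROI: r(difficulty regressor, BOLD) = {r:.2f}")
```

The design point the paper stresses, collecting the behavioral and neural data in nonoverlapping samples, corresponds here to the difficulty signal being fixed before the ROI time courses are ever examined.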
Collapse
Affiliation(s)
- Leila Wehbe
- Carnegie Mellon University, Machine Learning Department, PA 15213, USA
| | - Idan Asher Blank
- Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, MA 02139, USA
- University of California Los Angeles, Department of Psychology, CA 90095, USA
| | - Cory Shain
- Ohio State University, Department of Linguistics, OH 43210, USA
| | - Richard Futrell
- Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, MA 02139, USA
- University of California Irvine, Department of Linguistics, CA 92697, USA
| | - Roger Levy
- Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, MA 02139, USA
- University of California San Diego, Department of Linguistics, CA 92161, USA
| | - Titus von der Malsburg
- Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, MA 02139, USA
- University of Stuttgart, Institute of Linguistics, 70049 Stuttgart, Germany
| | - Nathaniel Smith
- University of California San Diego, Department of Linguistics, CA 92161, USA
| | - Edward Gibson
- Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, MA 02139, USA
| | - Evelina Fedorenko
- Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, MA 02139, USA
- Massachusetts Institute of Technology, McGovern Institute for Brain Research, MA 02139, USA
| |
Collapse
|
48
|
Yeatman JD, White AL. Reading: The Confluence of Vision and Language. Annu Rev Vis Sci 2021; 7.
Abstract
The scientific study of reading has a rich history that spans disciplines from vision science to linguistics, psychology, cognitive neuroscience, neurology, and education. The study of reading can elucidate important general mechanisms in spatial vision, attentional control, object recognition, and perceptual learning, as well as the principles of plasticity and cortical topography. However, literacy also prompts the development of specific neural circuits to process a unique and artificial stimulus. In this review, we describe the sequence of operations that transforms visual features into language, how the key neural circuits are sculpted by experience during development, and what goes awry in children for whom learning to read is a struggle. Expected final online publication date for the Annual Review of Vision Science, Volume 7 is September 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Collapse
Affiliation(s)
- Jason D Yeatman
- Graduate School of Education, Stanford University, Stanford, California 94305, USA
- Division of Developmental-Behavioral Pediatrics, Stanford University School of Medicine, Stanford, California 94305, USA
- Department of Psychology, Stanford University, Stanford, California 94305, USA
| | - Alex L White
- Graduate School of Education, Stanford University, Stanford, California 94305, USA
- Division of Developmental-Behavioral Pediatrics, Stanford University School of Medicine, Stanford, California 94305, USA
- Department of Neuroscience and Behavior, Barnard College, New York, New York 10027, USA
| |
Collapse
|
49
|
Steel A, Billings MM, Silson EH, Robertson CE. A network linking scene perception and spatial memory systems in posterior cerebral cortex. Nat Commun 2021; 12:2632. [PMID: 33976141 PMCID: PMC8113503 DOI: 10.1038/s41467-021-22848-z] [Citation(s) in RCA: 56] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2020] [Accepted: 04/05/2021] [Indexed: 02/03/2023] Open
Abstract
The neural systems supporting scene-perception and spatial-memory systems of the human brain are well-described. But how do these neural systems interact? Here, using fine-grained individual-subject fMRI, we report three cortical areas of the human brain, each lying immediately anterior to a region of the scene perception network in posterior cerebral cortex, that selectively activate when recalling familiar real-world locations. Despite their close proximity to the scene-perception areas, network analyses show that these regions constitute a distinct functional network that interfaces with spatial memory systems during naturalistic scene understanding. These "place-memory areas" offer a new framework for understanding how the brain implements memory-guided visual behaviors, including navigation.
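The anterior-shift claim can be illustrated with a per-vertex contrast between recall and perception responses along a posterior-to-anterior axis. The following numpy/scipy sketch simulates that situation; the surface coordinates, Gaussian activation profiles, and paired t-test are illustrative choices, not the paper's individual-subject GLM and network analyses.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical surface data: 1,000 vertices on one hemisphere with a
# posterior-to-anterior coordinate, plus per-trial responses for two tasks.
n_vertices, n_trials = 1000, 40
ant_coord = np.linspace(0.0, 1.0, n_vertices)   # 0 = posterior, 1 = anterior

def simulate_roi(center, width):
    """Gaussian activation profile along the posterior-anterior axis."""
    amp = np.exp(-((ant_coord - center) ** 2) / (2 * width ** 2))
    return amp[:, None] * 2.0 + rng.standard_normal((n_vertices, n_trials))

perception = simulate_roi(center=0.4, width=0.05)  # scene-perception response
recall = simulate_roi(center=0.5, width=0.05)      # place-memory response

# Per-vertex paired t-test for the recall > perception contrast, as a simple
# stand-in for a GLM contrast.
t, p = stats.ttest_rel(recall.T, perception.T)

peak_perception = ant_coord[perception.mean(axis=1).argmax()]
peak_recall = ant_coord[t.argmax()]
print(f"perception peak at {peak_perception:.2f}, "
      f"recall>perception peak at {peak_recall:.2f} "
      f"(anterior shift = {peak_recall - peak_perception:+.2f})")
```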
Collapse
Affiliation(s)
- Adam Steel
- Department of Psychology and Brain Sciences, Dartmouth College, Hanover, NH, USA
| | - Madeleine M. Billings
- Department of Psychology and Brain Sciences, Dartmouth College, Hanover, NH, USA
| | - Edward H. Silson
- Psychology, School of Philosophy, Psychology, and Language Sciences, University of Edinburgh, Edinburgh, EH8 9JZ, UK
| | - Caroline E. Robertson
- Department of Psychology and Brain Sciences, Dartmouth College, Hanover, NH, USA
| |
Collapse
|
50
|
Kaiser N, Butler E. Introducing Social Breathing: A Model of Engaging in Relational Systems. Front Psychol 2021; 12:571298. [PMID: 33897512 PMCID: PMC8060442 DOI: 10.3389/fpsyg.2021.571298] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2020] [Accepted: 03/15/2021] [Indexed: 11/13/2022] Open
Abstract
We address what it means to "engage in a relationship" and suggest Social Breathing as a model of immersing ourselves in the metaphorical social air around us, which is necessary for shared intention and joint action. We emphasize how emergent properties of social systems arise, such as the shared culture of groups, which cannot be reduced to the individuals involved. We argue that the processes involved in Social Breathing are: (1) automatic, (2) implicit, (3) temporal, (4) in the form of mutual bi-directional interwoven exchanges between social partners and (5) embodied in the coordination of the brains and behaviors of social partners. We summarize cross-disciplinary evidence suggesting that these processes involve a multi-person whole-brain-body network which is critical for the development of both we-ness and relational skills. We propose that Social Breathing depends on each individual's ability to sustain multimodal interwovenness, thus providing a theoretical link between social neuroscience and relational/multi-person psychology. We discuss how the model could guide research on autism, relationships, and psychotherapy.
Collapse
Affiliation(s)
- Niclas Kaiser
- Department of Psychology, Faculty of Social Sciences, Umeå University, Umeå, Sweden
| | - Emily Butler
- Family Studies and Human Development, University of Arizona, Tucson, AZ, United States
| |
Collapse
|