1. Dado T, Papale P, Lozano A, Le L, Wang F, van Gerven M, Roelfsema P, Güçlütürk Y, Güçlü U. Brain2GAN: Feature-disentangled neural encoding and decoding of visual perception in the primate brain. PLoS Comput Biol 2024; 20:e1012058. [PMID: 38709818] [PMCID: PMC11098503] [DOI: 10.1371/journal.pcbi.1012058]
Abstract
A challenging goal of neural coding is to characterize the neural representations underlying visual perception. To this end, multi-unit activity (MUA) of macaque visual cortex was recorded in a passive fixation task upon presentation of faces and natural images. We analyzed the relationship between MUA and latent representations of state-of-the-art deep generative models, including the conventional and feature-disentangled representations of generative adversarial networks (GANs) (i.e., z- and w-latents of StyleGAN, respectively) and language-contrastive representations of latent diffusion networks (i.e., CLIP-latents of Stable Diffusion). A mass univariate neural encoding analysis of the latent representations showed that feature-disentangled w representations outperform both z and CLIP representations in explaining neural responses. Further, w-latent features were found to be positioned at the higher end of the complexity gradient which indicates that they capture visual information relevant to high-level neural activity. Subsequently, a multivariate neural decoding analysis of the feature-disentangled representations resulted in state-of-the-art spatiotemporal reconstructions of visual perception. Taken together, our results not only highlight the important role of feature-disentanglement in shaping high-level neural representations underlying visual perception but also serve as an important benchmark for the future of neural coding.
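As a rough illustration of the mass univariate encoding analysis described above, the sketch below fits one ridge regression per recording site from a candidate feature space (such as a GAN's w-latents) and scores the feature space by held-out prediction accuracy. All array shapes and the data themselves are synthetic stand-ins, not the study's recordings.

```python
# Sketch of a mass-univariate encoding analysis: fit one ridge regression per
# recording site from a candidate feature space (e.g., GAN w-latents) and score
# the feature space by held-out prediction accuracy. All arrays are synthetic.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_stimuli, n_features, n_sites = 1000, 512, 128
latents = rng.standard_normal((n_stimuli, n_features))       # stimulus features (e.g., w-latents)
mua = latents @ rng.standard_normal((n_features, n_sites))   # fake multi-unit activity
mua += rng.standard_normal(mua.shape)                         # measurement noise

X_tr, X_te, Y_tr, Y_te = train_test_split(latents, mua, test_size=0.2, random_state=0)

model = Ridge(alpha=1.0).fit(X_tr, Y_tr)    # one linear readout per site, fit jointly
pred = model.predict(X_te)

# Encoding performance: Pearson r between predicted and observed activity per site.
r = np.array([np.corrcoef(pred[:, i], Y_te[:, i])[0, 1] for i in range(n_sites)])
print(f"median held-out correlation across sites: {np.median(r):.2f}")
```

Running the same loop with different feature spaces (e.g., z-, w-, and CLIP-latents) and comparing the resulting correlation profiles is the comparison the abstract describes.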
Affiliation(s)
- Thirza Dado: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Paolo Papale: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Antonio Lozano: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Lynn Le: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Feng Wang: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Marcel van Gerven: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Pieter Roelfsema: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands; Laboratory of Visual Brain Therapy, Sorbonne University, Paris, France; Department of Integrative Neurophysiology, VU Amsterdam, Amsterdam, Netherlands; Department of Psychiatry, Amsterdam UMC, Amsterdam, Netherlands
- Yağmur Güçlütürk: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Umut Güçlü: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
2. Brady TF, Störmer VS. Comparing memory capacity across stimuli requires maximally dissimilar foils: Using deep convolutional neural networks to understand visual working memory capacity for real-world objects. Mem Cognit 2024; 52:595-609. [PMID: 37973770] [DOI: 10.3758/s13421-023-01485-5]
Abstract
The capacity of visual working and visual long-term memory plays a critical role in theories of cognitive architecture and the relationship between memory and other cognitive systems. Here, we argue that before asking the question of how capacity varies across different stimuli or what the upper bound of capacity is for a given memory system, it is necessary to establish a methodology that allows a fair comparison between distinct stimulus sets and conditions. One of the most important factors determining performance in a memory task is target/foil dissimilarity. We argue that only by maximizing the dissimilarity of the target and foil in each stimulus set can we provide a fair basis for memory comparisons between stimuli. In the current work we focus on a way to pick such foils objectively for complex, meaningful real-world objects by using deep convolutional neural networks, and we validate this using both memory tests and similarity metrics. Using this method, we then provide evidence that there is a greater capacity for real-world objects relative to simple colors in visual working memory; critically, we also show that this difference can be reduced or eliminated when non-comparable foils are used, potentially explaining why previous work has not always found such a difference. Our study thus demonstrates that working memory capacity depends on the type of information that is remembered and that assessing capacity depends critically on foil dissimilarity, especially when comparing memory performance and other cognitive systems across different stimulus sets.
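The foil-selection idea sketched in this abstract can be illustrated in a few lines: given CNN feature vectors for a set of images, pick for each target the candidate with the lowest cosine similarity as its maximally dissimilar foil. The feature matrix below is a random placeholder for real network activations, and the function name is purely illustrative.

```python
# Sketch of objective foil selection: given CNN feature vectors for a stimulus set,
# pick, for each target image, the candidate that is least similar (minimum cosine
# similarity) as its memory-test foil. Feature extraction itself is assumed done.
import numpy as np

def pick_max_dissimilar_foils(features: np.ndarray) -> np.ndarray:
    """features: (n_images, n_dims) array of CNN activations (e.g., a late fc layer)."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T                  # pairwise cosine similarity
    np.fill_diagonal(sim, np.inf)            # never pair an image with itself
    return sim.argmin(axis=1)                # index of the most dissimilar image

rng = np.random.default_rng(1)
feats = rng.standard_normal((50, 2048))      # stand-in for real-object CNN features
foil_for = pick_max_dissimilar_foils(feats)
print("target 0 is tested against foil", foil_for[0])
```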
Affiliation(s)
- Timothy F Brady: Department of Psychology, University of California San Diego, La Jolla, CA, 92093, USA
- Viola S Störmer: Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
3. Li W, Li J, Chu C, Cao D, Shi W, Zhang Y, Jiang T. Common Sequential Organization of Face Processing in the Human Brain and Convolutional Neural Networks. Neuroscience 2024; 541:1-13. [PMID: 38266906] [DOI: 10.1016/j.neuroscience.2024.01.015]
Abstract
Face processing includes two crucial processing levels - face detection and face recognition. However, it remains unclear how human brains organize the two processing levels sequentially. While some studies found that faces are recognized as fast as they are detected, others have reported that faces are detected first, followed by recognition. We discriminated the two processing levels on a fine time scale by combining human intracranial EEG (two females, three males, and three subjects without reported sex information) and representation similarity analysis. Our results demonstrate that the human brain exhibits a "detection-first, recognition-later" pattern during face processing. In addition, we used convolutional neural networks to test the hypothesis that the sequential organization of the two face processing levels in the brain reflects computational optimization. Our findings showed that the networks trained on face recognition also exhibited the "detection-first, recognition-later" pattern. Moreover, this sequential organization mechanism developed gradually during the training of the networks and was observed only for correctly predicted images. These findings collectively support the computational account as to why the brain organizes them in this way.
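A minimal sketch of the representational similarity analysis (RSA) used in studies like this one: build a representational dissimilarity matrix (RDM) for the neural data and for a CNN layer, then correlate their condensed forms. The data are random stand-ins and the metric choices are common defaults, not necessarily the authors' exact settings.

```python
# Sketch of representational similarity analysis (RSA): build representational
# dissimilarity matrices (RDMs) for neural data and for a CNN layer, then compare
# their condensed upper triangles with a Spearman correlation.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
n_stimuli = 40
neural = rng.standard_normal((n_stimuli, 96))        # e.g., iEEG responses in one time window
cnn_layer = rng.standard_normal((n_stimuli, 4096))   # CNN activations for the same stimuli

rdm_neural = pdist(neural, metric="correlation")     # condensed RDM (1 - Pearson r)
rdm_cnn = pdist(cnn_layer, metric="correlation")

rho, p = spearmanr(rdm_neural, rdm_cnn)
print(f"RDM similarity (Spearman rho): {rho:.2f}, p = {p:.3f}")
```

Repeating this comparison across time windows and layers is what yields the temporal ordering of detection- and recognition-related information.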
Affiliation(s)
- Wenlu Li: Brainnetome Center, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
- Jin Li: School of Psychology, Capital Normal University, Beijing 100048, China
- Congying Chu: Brainnetome Center, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
- Dan Cao: Brainnetome Center, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
- Weiyang Shi: Brainnetome Center, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
- Yu Zhang: Research Center for Augmented Intelligence, Zhejiang Lab, Hangzhou 311100, China
- Tianzi Jiang: Brainnetome Center, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China; Research Center for Augmented Intelligence, Zhejiang Lab, Hangzhou 311100, China; Xiaoxiang Institute for Brain Health and Yongzhou Central Hospital, Yongzhou 425000, Hunan Province, China
4. Liu P, Bo K, Ding M, Fang R. Emergence of Emotion Selectivity in Deep Neural Networks Trained to Recognize Visual Objects. bioRxiv 2024:2023.04.16.537079. [PMID: 37163104] [PMCID: PMC10168209] [DOI: 10.1101/2023.04.16.537079]
Abstract
Recent neuroimaging studies have shown that the visual cortex plays an important role in representing the affective significance of visual input. The origin of these affect-specific visual representations is debated: they are intrinsic to the visual system versus they arise through reentry from frontal emotion processing structures such as the amygdala. We examined this problem by combining convolutional neural network (CNN) models of the human ventral visual cortex pre-trained on ImageNet with two datasets of affective images. Our results show that (1) in all layers of the CNN models, there were artificial neurons that responded consistently and selectively to neutral, pleasant, or unpleasant images and (2) lesioning these neurons by setting their output to 0 or enhancing these neurons by increasing their gain led to decreased or increased emotion recognition performance respectively. These results support the idea that the visual system may have the intrinsic ability to represent the affective significance of visual input and suggest that CNNs offer a fruitful platform for testing neuroscientific theories.
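The lesioning manipulation described above (setting selected units' output to zero) can be prototyped with a forward hook on a CNN, as in the hedged sketch below. The layer index and channel indices are arbitrary illustrations, and the network is loaded with random weights to keep the example self-contained and download-free.

```python
# Sketch of the lesioning idea: zero out a chosen set of units in one layer of a
# CNN via a forward hook and compare predictions before and after. The layer and
# unit indices are illustrative, not those identified in the paper.
import torch
import torchvision.models as models

model = models.vgg16(weights=None).eval()    # weights=None: random init for the sketch
lesion_channels = [3, 17, 42]                # hypothetical "emotion-selective" units

def lesion_hook(module, inputs, output):
    output[:, lesion_channels] = 0.0         # silence the selected feature maps
    return output

# Attach the hook to one convolutional layer (index chosen arbitrarily here).
handle = model.features[10].register_forward_hook(lesion_hook)

x = torch.randn(1, 3, 224, 224)              # stand-in image batch
with torch.no_grad():
    lesioned_logits = model(x)
handle.remove()                              # restore the intact network
with torch.no_grad():
    intact_logits = model(x)
print("max |delta logit| after lesioning:", (lesioned_logits - intact_logits).abs().max().item())
```

The complementary gain-enhancement manipulation would multiply, rather than zero, the selected channels inside the same hook.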
Affiliation(s)
- Peng Liu: J. Crayton Pruitt Family Department of Biomedical Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, FL, USA; Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- Ke Bo: Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- Mingzhou Ding: J. Crayton Pruitt Family Department of Biomedical Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, FL, USA
- Ruogu Fang: J. Crayton Pruitt Family Department of Biomedical Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, FL, USA; Center for Cognitive Aging and Memory, McKnight Brain Institute, University of Florida, Gainesville, FL, USA
5. Loke J, Seijdel N, Snoek L, Sörensen LKA, van de Klundert R, van der Meer M, Quispel E, Cappaert N, Scholte HS. Human Visual Cortex and Deep Convolutional Neural Network Care Deeply about Object Background. J Cogn Neurosci 2024; 36:551-566. [PMID: 38165735] [DOI: 10.1162/jocn_a_02098]
Abstract
Deep convolutional neural networks (DCNNs) are able to partially predict brain activity during object categorization tasks, but factors contributing to this predictive power are not fully understood. Our study aimed to investigate the factors contributing to the predictive power of DCNNs in object categorization tasks. We compared the activity of four DCNN architectures with EEG recordings obtained from 62 human participants during an object categorization task. Previous physiological studies on object categorization have highlighted the importance of figure-ground segregation-the ability to distinguish objects from their backgrounds. Therefore, we investigated whether figure-ground segregation could explain the predictive power of DCNNs. Using a stimulus set consisting of identical target objects embedded in different backgrounds, we examined the influence of object background versus object category within both EEG and DCNN activity. Crucially, the recombination of naturalistic objects and experimentally controlled backgrounds creates a challenging and naturalistic task, while retaining experimental control. Our results showed that early EEG activity (< 100 msec) and early DCNN layers represent object background rather than object category. We also found that the ability of DCNNs to predict EEG activity is primarily influenced by how both systems process object backgrounds, rather than object categories. We demonstrated the role of figure-ground segregation as a potential prerequisite for recognition of object features, by contrasting the activations of trained and untrained (i.e., random weights) DCNNs. These findings suggest that both human visual cortex and DCNNs prioritize the segregation of object backgrounds and target objects to perform object categorization. Altogether, our study provides new insights into the mechanisms underlying object categorization as we demonstrated that both human visual cortex and DCNNs care deeply about object background.
6. Liu P, Bo K, Ding M, Fang R. Emergence of Emotion Selectivity in Deep Neural Networks Trained to Recognize Visual Objects. PLoS Comput Biol 2024; 20:e1011943. [PMID: 38547053] [PMCID: PMC10977720] [DOI: 10.1371/journal.pcbi.1011943]
Abstract
Recent neuroimaging studies have shown that the visual cortex plays an important role in representing the affective significance of visual input. The origin of these affect-specific visual representations is debated: they are intrinsic to the visual system versus they arise through reentry from frontal emotion processing structures such as the amygdala. We examined this problem by combining convolutional neural network (CNN) models of the human ventral visual cortex pre-trained on ImageNet with two datasets of affective images. Our results show that in all layers of the CNN models, there were artificial neurons that responded consistently and selectively to neutral, pleasant, or unpleasant images and lesioning these neurons by setting their output to zero or enhancing these neurons by increasing their gain led to decreased or increased emotion recognition performance respectively. These results support the idea that the visual system may have the intrinsic ability to represent the affective significance of visual input and suggest that CNNs offer a fruitful platform for testing neuroscientific theories.
Affiliation(s)
- Peng Liu: J. Crayton Pruitt Family Department of Biomedical Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, Florida, United States of America; Department of Psychological and Brain Sciences, Dartmouth College, Hanover, New Hampshire, United States of America
- Ke Bo: Department of Psychological and Brain Sciences, Dartmouth College, Hanover, New Hampshire, United States of America
- Mingzhou Ding: J. Crayton Pruitt Family Department of Biomedical Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, Florida, United States of America
- Ruogu Fang: J. Crayton Pruitt Family Department of Biomedical Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, Florida, United States of America; Center for Cognitive Aging and Memory, McKnight Brain Institute, University of Florida, Gainesville, Florida, United States of America
7. Anderson TM, Hepler SA, Holdo RM, Donaldson JE, Erhardt RJ, Hopcraft JGC, Hutchinson MC, Huebner SE, Morrison TA, Muday J, Munuo IN, Palmer MS, Pansu J, Pringle RM, Sketch R, Packer C. Interplay of competition and facilitation in grazing succession by migrant Serengeti herbivores. Science 2024; 383:782-788. [PMID: 38359113] [DOI: 10.1126/science.adg0744]
Abstract
Competition, facilitation, and predation offer alternative explanations for successional patterns of migratory herbivores. However, these interactions are difficult to measure, leaving uncertainty about the mechanisms underlying body-size-dependent grazing-and even whether succession occurs at all. We used data from an 8-year camera-trap survey, GPS-collared herbivores, and fecal DNA metabarcoding to analyze the timing, arrival order, and interactions among migratory grazers in Serengeti National Park. Temporal grazing succession is characterized by a "push-pull" dynamic: Competitive grazing nudges zebra ahead of co-migrating wildebeest, whereas grass consumption by these large-bodied migrants attracts trailing, small-bodied gazelle that benefit from facilitation. "Natural experiments" involving intense wildfires and rainfall respectively disrupted and strengthened these effects. Our results highlight a balance between facilitative and competitive forces in co-regulating large-scale ungulate migrations.
Affiliation(s)
- T Michael Anderson: Department of Biology, Wake Forest University, Winston-Salem, NC 27109, USA
- Staci A Hepler: Department of Statistical Sciences, Wake Forest University, Winston-Salem, NC 27109, USA
- Ricardo M Holdo: Odum School of Ecology, University of Georgia, Athens, GA 30602, USA
- Jason E Donaldson: Odum School of Ecology, University of Georgia, Athens, GA 30602, USA
- Robert J Erhardt: Department of Statistical Sciences, Wake Forest University, Winston-Salem, NC 27109, USA
- J Grant C Hopcraft: School of Biodiversity, One Health and Veterinary Medicine, University of Glasgow, Glasgow G61 1QH, UK
- Matthew C Hutchinson: Department of Life & Environmental Sciences, University of California Merced, Merced, CA 95343, USA
- Sarah E Huebner: Department of Ecology, Evolution and Behavior, University of Minnesota, St. Paul, MN 55108, USA
- Thomas A Morrison: School of Biodiversity, One Health and Veterinary Medicine, University of Glasgow, Glasgow G61 1QH, UK
- Jeffry Muday: Department of Biology, Wake Forest University, Winston-Salem, NC 27109, USA
- Issack N Munuo: Serengeti Wildlife Research Centre, 2113 Lemara, Arusha, Tanzania
- Meredith S Palmer: Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA
- Johan Pansu: Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA
- Robert M Pringle: Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA
- Robert Sketch: Department of Statistical Sciences, Wake Forest University, Winston-Salem, NC 27109, USA
- Craig Packer: Department of Ecology, Evolution and Behavior, University of Minnesota, St. Paul, MN 55108, USA
8. Tuckute G, Feather J, Boebinger D, McDermott JH. Many but not all deep neural network audio models capture brain responses and exhibit correspondence between model stages and brain regions. PLoS Biol 2023; 21:e3002366. [PMID: 38091351] [PMCID: PMC10718467] [DOI: 10.1371/journal.pbio.3002366]
Abstract
Models that predict brain responses to stimuli provide one measure of understanding of a sensory system and have many potential applications in science and engineering. Deep artificial neural networks have emerged as the leading such predictive models of the visual system but are less explored in audition. Prior work provided examples of audio-trained neural networks that produced good predictions of auditory cortical fMRI responses and exhibited correspondence between model stages and brain regions, but left it unclear whether these results generalize to other neural network models and, thus, how to further improve models in this domain. We evaluated model-brain correspondence for publicly available audio neural network models along with in-house models trained on 4 different tasks. Most tested models outpredicted standard spectrotemporal filter-bank models of auditory cortex and exhibited systematic model-brain correspondence: Middle stages best predicted primary auditory cortex, while deep stages best predicted non-primary cortex. However, some state-of-the-art models produced substantially worse brain predictions. Models trained to recognize speech in background noise produced better brain predictions than models trained to recognize speech in quiet, potentially because hearing in noise imposes constraints on biological auditory representations. The training task influenced the prediction quality for specific cortical tuning properties, with best overall predictions resulting from models trained on multiple tasks. The results generally support the promise of deep neural networks as models of audition, though they also indicate that current models do not explain auditory cortical responses in their entirety.
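A schematic version of the stage-to-region comparison described here: regress voxel responses on each model stage's activations with cross-validated ridge regression, then record which stage predicts each voxel best. All arrays below are synthetic placeholders; the real analysis used fMRI responses to natural sounds and actual network activations.

```python
# Sketch of the stage-to-region mapping logic: regress voxel responses on the
# activations of each model stage (ridge, cross-validated), then ask which stage
# best predicts each voxel.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n_sounds, n_voxels = 160, 300
stages = {f"stage_{k}": rng.standard_normal((n_sounds, 256)) for k in range(1, 6)}
voxels = rng.standard_normal((n_sounds, n_voxels))            # fMRI responses to the sounds

scores = np.zeros((len(stages), n_voxels))
for s, (name, X) in enumerate(stages.items()):
    X_tr, X_te, Y_tr, Y_te = train_test_split(X, voxels, test_size=0.25, random_state=0)
    pred = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X_tr, Y_tr).predict(X_te)
    scores[s] = [np.corrcoef(pred[:, v], Y_te[:, v])[0, 1] for v in range(n_voxels)]

best_stage = scores.argmax(axis=0)            # preferred model stage per voxel
print("voxels best predicted by each stage:", np.bincount(best_stage, minlength=len(stages)))
```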
Affiliation(s)
- Greta Tuckute: Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, MIT, Cambridge, Massachusetts, United States of America; Center for Brains, Minds, and Machines, MIT, Cambridge, Massachusetts, United States of America
- Jenelle Feather: Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, MIT, Cambridge, Massachusetts, United States of America; Center for Brains, Minds, and Machines, MIT, Cambridge, Massachusetts, United States of America
- Dana Boebinger: Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, MIT, Cambridge, Massachusetts, United States of America; Center for Brains, Minds, and Machines, MIT, Cambridge, Massachusetts, United States of America; Program in Speech and Hearing Biosciences and Technology, Harvard, Cambridge, Massachusetts, United States of America; University of Rochester Medical Center, Rochester, New York, United States of America
- Josh H. McDermott: Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, MIT, Cambridge, Massachusetts, United States of America; Center for Brains, Minds, and Machines, MIT, Cambridge, Massachusetts, United States of America; Program in Speech and Hearing Biosciences and Technology, Harvard, Cambridge, Massachusetts, United States of America
9. Rastegarnia S, St-Laurent M, DuPre E, Pinsard B, Bellec P. Brain decoding of the Human Connectome Project tasks in a dense individual fMRI dataset. Neuroimage 2023; 283:120395. [PMID: 37832707] [DOI: 10.1016/j.neuroimage.2023.120395]
Abstract
Brain decoding aims to infer cognitive states from patterns of brain activity. Substantial inter-individual variations in functional brain organization challenge accurate decoding performed at the group level. In this paper, we tested whether accurate brain decoding models can be trained entirely at the individual level. We trained several classifiers on a dense individual functional magnetic resonance imaging (fMRI) dataset for which six participants completed the entire Human Connectome Project (HCP) task battery >13 times over ten separate fMRI sessions. We evaluated nine decoding methods, from Support Vector Machines (SVM) and Multi-Layer Perceptron (MLP) to Graph Convolutional Neural Networks (GCN). All decoders were trained to classify single fMRI volumes into 21 experimental conditions simultaneously, using ∼7 h of fMRI data per participant. The best prediction accuracies were achieved with GCN and MLP models, whose performance (57-67% accuracy) approached state-of-the-art accuracy (76%) with models trained at the group level on >1 K hours of data from the original HCP sample. Our SVM model also performed very well (54-62% accuracy). Feature importance maps derived from the MLP (our best-performing model) revealed informative features in regions relevant to particular cognitive domains, notably in the motor cortex. We also observed that inter-subject classification achieved substantially lower accuracy than subject-specific models, indicating that our decoders learned individual-specific features. This work demonstrates that densely-sampled neuroimaging datasets can be used to train accurate brain decoding models at the individual level. We expect this work to become a useful benchmark for techniques that improve model generalization across multiple subjects and acquisition conditions.
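A minimal sketch of a within-subject decoder of the kind evaluated above: a linear SVM classifying single fMRI volumes into task conditions under cross-validation. Volume counts, parcel counts, and labels are simulated stand-ins, not the HCP-derived data.

```python
# Sketch of a within-subject brain decoder: classify single fMRI volumes into task
# conditions with a linear SVM and cross-validation.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
n_volumes, n_parcels, n_conditions = 4200, 444, 21
X = rng.standard_normal((n_volumes, n_parcels))        # parcel-wise activity per volume
y = rng.integers(0, n_conditions, size=n_volumes)      # condition label per volume

decoder = make_pipeline(StandardScaler(), LinearSVC(C=1.0, max_iter=5000))
acc = cross_val_score(decoder, X, y, cv=5)
print(f"decoding accuracy: {acc.mean():.2%} (chance = {1 / n_conditions:.2%})")
```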
Affiliation(s)
- Shima Rastegarnia: Université de Montréal, Montréal, QC, Canada; Centre de Recherche de L'Institut Universitaire de Gériatrie de Montréal, Montréal, Canada
- Marie St-Laurent: Centre de Recherche de L'Institut Universitaire de Gériatrie de Montréal, Montréal, Canada
- Basile Pinsard: Centre de Recherche de L'Institut Universitaire de Gériatrie de Montréal, Montréal, Canada
- Pierre Bellec: Université de Montréal, Montréal, QC, Canada; Centre de Recherche de L'Institut Universitaire de Gériatrie de Montréal, Montréal, Canada
10. Favila SE, Aly M. Hippocampal mechanisms resolve competition in memory and perception. bioRxiv 2023:2023.10.09.561548. [PMID: 37873400] [PMCID: PMC10592663] [DOI: 10.1101/2023.10.09.561548]
Abstract
Behaving adaptively requires selection of relevant memories and sensations and suppression of competing ones. We hypothesized that these mechanisms are linked, such that hippocampal computations that resolve competition in memory also shape the precision of sensory representations to guide selective attention. We leveraged fMRI-based pattern similarity, receptive field modeling, and eye tracking to test this hypothesis in humans performing a memory-dependent visual search task. In the hippocampus, differentiation of competing memories predicted the precision of memory-guided eye movements. In visual cortex, preparatory coding of remembered target locations predicted search successes, whereas preparatory coding of competing locations predicted search failures due to interference. These effects were linked: stronger hippocampal memory differentiation was associated with lower competitor activation in visual cortex, yielding more precise preparatory representations. These results demonstrate a role for memory differentiation in shaping the precision of sensory representations, highlighting links between mechanisms that overcome competition in memory and perception.
Affiliation(s)
- Serra E Favila: Department of Psychology, Columbia University, New York, NY, 10027
- Mariam Aly: Department of Psychology, Columbia University, New York, NY, 10027
11. Karapetian A, Boyanova A, Pandaram M, Obermayer K, Kietzmann TC, Cichy RM. Empirically Identifying and Computationally Modeling the Brain-Behavior Relationship for Human Scene Categorization. J Cogn Neurosci 2023; 35:1879-1897. [PMID: 37590093] [PMCID: PMC10586810] [DOI: 10.1162/jocn_a_02043]
Abstract
Humans effortlessly make quick and accurate perceptual decisions about the nature of their immediate visual environment, such as the category of the scene they face. Previous research has revealed a rich set of cortical representations potentially underlying this feat. However, it remains unknown which of these representations are suitably formatted for decision-making. Here, we approached this question empirically and computationally, using neuroimaging and computational modeling. For the empirical part, we collected EEG data and RTs from human participants during a scene categorization task (natural vs. man-made). We then related EEG data to behavior using a multivariate extension of signal detection theory. We observed a correlation between neural data and behavior specifically between ∼100 msec and ∼200 msec after stimulus onset, suggesting that the neural scene representations in this time period are suitably formatted for decision-making. For the computational part, we evaluated a recurrent convolutional neural network (RCNN) as a model of brain and behavior. Unifying our previous observations in an image-computable model, the RCNN predicted well the neural representations, the behavioral scene categorization data, as well as the relationship between them. Our results identify and computationally characterize the neural and behavioral correlates of scene categorization in humans.
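One common way to implement an EEG-to-RT link of this kind is a distance-to-bound style analysis, shown below as a hedged sketch (not necessarily the authors' exact multivariate signal detection formulation): train a linear decoder at each time point and correlate held-out decision values with reaction times. All signals are simulated.

```python
# Sketch of a distance-to-bound style EEG-behavior analysis: at each time point,
# train a linear decoder (natural vs. man-made), then correlate the magnitude of
# held-out decision values with reaction times.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(5)
n_trials, n_channels, n_times = 600, 64, 50
eeg = rng.standard_normal((n_trials, n_channels, n_times))
labels = rng.integers(0, 2, size=n_trials)               # natural vs. man-made
rts = rng.gamma(shape=5.0, scale=100.0, size=n_trials)   # simulated reaction times (ms)

corr_t = np.zeros(n_times)
for t in range(n_times):
    dvals = cross_val_predict(LinearSVC(max_iter=5000), eeg[:, :, t], labels,
                              cv=5, method="decision_function")
    corr_t[t] = np.corrcoef(np.abs(dvals), rts)[0, 1]    # more evidence ~ faster RT (negative r)

print("time point with strongest EEG-RT association:", corr_t.argmin())
```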
Affiliation(s)
- Agnessa Karapetian: Freie Universität Berlin, Germany; Charité - Universitätsmedizin Berlin, Einstein Center for Neurosciences Berlin, Germany; Bernstein Center for Computational Neuroscience Berlin, Germany
- Klaus Obermayer: Charité - Universitätsmedizin Berlin, Einstein Center for Neurosciences Berlin, Germany; Bernstein Center for Computational Neuroscience Berlin, Germany; Technische Universität Berlin, Germany; Humboldt-Universität zu Berlin, Germany
- Radoslaw M Cichy: Freie Universität Berlin, Germany; Charité - Universitätsmedizin Berlin, Einstein Center for Neurosciences Berlin, Germany; Bernstein Center for Computational Neuroscience Berlin, Germany; Humboldt-Universität zu Berlin, Germany
12. Pham TQ, Matsui T, Chikazoe J. Evaluation of the Hierarchical Correspondence between the Human Brain and Artificial Neural Networks: A Review. Biology (Basel) 2023; 12:1330. [PMID: 37887040] [PMCID: PMC10604784] [DOI: 10.3390/biology12101330]
Abstract
Artificial neural networks (ANNs) that are heavily inspired by the human brain now achieve human-level performance across multiple task domains. ANNs have thus drawn attention in neuroscience, raising the possibility of providing a framework for understanding the information encoded in the human brain. However, the correspondence between ANNs and the brain cannot be measured directly. They differ in outputs and substrates, neurons vastly outnumber their ANN analogs (i.e., nodes), and the key algorithm responsible for most of modern ANN training (i.e., backpropagation) is likely absent from the brain. Neuroscientists have thus taken a variety of approaches to examine the similarity between the brain and ANNs at multiple levels of their information hierarchy. This review provides an overview of the currently available approaches and their limitations for evaluating brain-ANN correspondence.
Affiliation(s)
- Teppei Matsui: Graduate School of Brain Science, Doshisha University, Kyoto 610-0321, Japan
13. Celeghin A, Borriero A, Orsenigo D, Diano M, Méndez Guerrero CA, Perotti A, Petri G, Tamietto M. Convolutional neural networks for vision neuroscience: significance, developments, and outstanding issues. Front Comput Neurosci 2023; 17:1153572. [PMID: 37485400] [PMCID: PMC10359983] [DOI: 10.3389/fncom.2023.1153572]
Abstract
Convolutional Neural Networks (CNN) are a class of machine learning models predominantly used in computer vision tasks and can achieve human-like performance through learning from experience. Their striking similarities to the structural and functional principles of the primate visual system allow for comparisons between these artificial networks and their biological counterparts, enabling exploration of how visual functions and neural representations may emerge in the real brain from a limited set of computational principles. After considering the basic features of CNNs, we discuss the opportunities and challenges of endorsing CNNs as in silico models of the primate visual system. Specifically, we highlight several emerging notions about the anatomical and physiological properties of the visual system that still need to be systematically integrated into current CNN models. These tenets include the implementation of parallel processing pathways from the early stages of retinal input and the reconsideration of several assumptions concerning the serial progression of information flow. We suggest design choices and architectural constraints that could facilitate a closer alignment with biology and provide causal evidence of the predictive link between the artificial and biological visual systems. Adopting this principled perspective could potentially lead to new research questions and applications of CNNs beyond modeling object recognition.
Affiliation(s)
- Davide Orsenigo: Department of Psychology, University of Torino, Turin, Italy
- Matteo Diano: Department of Psychology, University of Torino, Turin, Italy
- Marco Tamietto: Department of Psychology, University of Torino, Turin, Italy; Department of Medical and Clinical Psychology, and CoRPS–Center of Research on Psychology in Somatic Diseases–Tilburg University, Tilburg, Netherlands
14. Kay K, Bonnen K, Denison RN, Arcaro MJ, Barack DL. Tasks and their role in visual neuroscience. Neuron 2023; 111:1697-1713. [PMID: 37040765] [DOI: 10.1016/j.neuron.2023.03.022]
Abstract
Vision is widely used as a model system to gain insights into how sensory inputs are processed and interpreted by the brain. Historically, careful quantification and control of visual stimuli have served as the backbone of visual neuroscience. There has been less emphasis, however, on how an observer's task influences the processing of sensory inputs. Motivated by diverse observations of task-dependent activity in the visual system, we propose a framework for thinking about tasks, their role in sensory processing, and how we might formally incorporate tasks into our models of vision.
Affiliation(s)
- Kendrick Kay: Center for Magnetic Resonance Research, Department of Radiology, University of Minnesota, Minneapolis, MN 55455, USA
- Kathryn Bonnen: School of Optometry, Indiana University, Bloomington, IN 47405, USA
- Rachel N Denison: Department of Psychological and Brain Sciences, Boston University, Boston, MA 02215, USA
- Mike J Arcaro: Department of Psychology, University of Pennsylvania, Philadelphia, PA 19146, USA
- David L Barack: Departments of Neuroscience and Philosophy, University of Pennsylvania, Philadelphia, PA 19146, USA
15. St-Yves G, Allen EJ, Wu Y, Kay K, Naselaris T. Brain-optimized deep neural network models of human visual areas learn non-hierarchical representations. Nat Commun 2023; 14:3329. [PMID: 37286563] [DOI: 10.1038/s41467-023-38674-4]
Abstract
Deep neural networks (DNNs) optimized for visual tasks learn representations that align layer depth with the hierarchy of visual areas in the primate brain. One interpretation of this finding is that hierarchical representations are necessary to accurately predict brain activity in the primate visual system. To test this interpretation, we optimized DNNs to directly predict brain activity measured with fMRI in human visual areas V1-V4. We trained a single-branch DNN to predict activity in all four visual areas jointly, and a multi-branch DNN to predict each visual area independently. Although it was possible for the multi-branch DNN to learn hierarchical representations, only the single-branch DNN did so. This result shows that hierarchical representations are not necessary to accurately predict human brain activity in V1-V4, and that DNNs that encode brain-like visual representations may differ widely in their architecture, ranging from strict serial hierarchies to multiple independent branches.
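The architectural contrast at the heart of this study can be sketched directly: a single-branch network in which every area's readout shares one trunk, versus a multi-branch network with an independent pathway per area. Layer sizes, depths, and voxel counts below are arbitrary placeholders, not the trained models from the paper.

```python
# Sketch of the two architectural hypotheses: one shared trunk feeding readouts for
# all visual areas (single-branch) versus a separate pathway per area (multi-branch).
import torch
import torch.nn as nn

N_VOXELS = {"V1": 100, "V2": 100, "V3": 80, "V4": 60}    # placeholder voxel counts

def stage(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, stride=2, padding=1), nn.ReLU())

class SingleBranch(nn.Module):
    """One shared stack of stages; every area reads out from the same trunk."""
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(stage(3, 16), stage(16, 32), stage(32, 64), stage(64, 64))
        self.readouts = nn.ModuleDict({a: nn.Linear(64 * 14 * 14, n) for a, n in N_VOXELS.items()})

    def forward(self, x):                                 # x: (batch, 3, 224, 224)
        feats = self.trunk(x).flatten(1)
        return {a: head(feats) for a, head in self.readouts.items()}

class MultiBranch(nn.Module):
    """An independent stack of stages per area; nothing is shared."""
    def __init__(self):
        super().__init__()
        self.branches = nn.ModuleDict({
            a: nn.Sequential(stage(3, 16), stage(16, 32), stage(32, 64), stage(64, 64),
                             nn.Flatten(), nn.Linear(64 * 14 * 14, n))
            for a, n in N_VOXELS.items()})

    def forward(self, x):
        return {a: branch(x) for a, branch in self.branches.items()}

x = torch.randn(2, 3, 224, 224)
print({area: out.shape for area, out in SingleBranch()(x).items()})
```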
Affiliation(s)
- Ghislain St-Yves: Department of Neuroscience, University of Minnesota, Minneapolis, MN, 55455, USA; Center for Magnetic Resonance Research, University of Minnesota, Minneapolis, MN, 55455, USA
- Emily J Allen: Department of Psychology, University of Minnesota, Minneapolis, MN, 55455, USA
- Yihan Wu: Graduate Program in Cognitive Science, University of Minnesota, Minneapolis, MN, 55455, USA
- Kendrick Kay: Center for Magnetic Resonance Research, University of Minnesota, Minneapolis, MN, 55455, USA; Department of Radiology, University of Minnesota, Minneapolis, MN, 55455, USA
- Thomas Naselaris: Department of Neuroscience, University of Minnesota, Minneapolis, MN, 55455, USA; Center for Magnetic Resonance Research, University of Minnesota, Minneapolis, MN, 55455, USA
16. Doerig A, Sommers RP, Seeliger K, Richards B, Ismael J, Lindsay GW, Kording KP, Konkle T, van Gerven MAJ, Kriegeskorte N, Kietzmann TC. The neuroconnectionist research programme. Nat Rev Neurosci 2023:10.1038/s41583-023-00705-w. [PMID: 37253949] [DOI: 10.1038/s41583-023-00705-w]
Abstract
Artificial neural networks (ANNs) inspired by biology are beginning to be widely used to model behavioural and neural data, an approach we call 'neuroconnectionism'. ANNs have been not only lauded as the current best models of information processing in the brain but also criticized for failing to account for basic cognitive functions. In this Perspective article, we propose that arguing about the successes and failures of a restricted set of current ANNs is the wrong approach to assess the promise of neuroconnectionism for brain science. Instead, we take inspiration from the philosophy of science, and in particular from Lakatos, who showed that the core of a scientific research programme is often not directly falsifiable but should be assessed by its capacity to generate novel insights. Following this view, we present neuroconnectionism as a general research programme centred around ANNs as a computational language for expressing falsifiable theories about brain computation. We describe the core of the programme, the underlying computational framework and its tools for testing specific neuroscientific hypotheses and deriving novel understanding. Taking a longitudinal view, we review past and present neuroconnectionist projects and their responses to challenges and argue that the research programme is highly progressive, generating new and otherwise unreachable insights into the workings of the brain.
Affiliation(s)
- Adrien Doerig: Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany; Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- Rowan P Sommers: Department of Neurobiology of Language, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Katja Seeliger: Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Blake Richards: Department of Neurology and Neurosurgery, McGill University, Montréal, QC, Canada; School of Computer Science, McGill University, Montréal, QC, Canada; Mila, Montréal, QC, Canada; Montréal Neurological Institute, Montréal, QC, Canada; Learning in Machines and Brains Program, CIFAR, Toronto, ON, Canada
- Konrad P Kording: Learning in Machines and Brains Program, CIFAR, Toronto, ON, Canada; Bioengineering, Neuroscience, University of Pennsylvania, Pennsylvania, PA, USA
- Tim C Kietzmann: Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany
17. Taylor J, Xu Y. Comparing the Dominance of Color and Form Information across the Human Ventral Visual Pathway and Convolutional Neural Networks. J Cogn Neurosci 2023; 35:816-840. [PMID: 36877074] [DOI: 10.1162/jocn_a_01979]
Abstract
Color and form information can be decoded in every region of the human ventral visual hierarchy, and at every layer of many convolutional neural networks (CNNs) trained to recognize objects, but how does the coding strength of these features vary over processing? Here, we characterize for these features both their absolute coding strength-how strongly each feature is represented independent of the other feature-and their relative coding strength-how strongly each feature is encoded relative to the other, which could constrain how well a feature can be read out by downstream regions across variation in the other feature. To quantify relative coding strength, we define a measure called the form dominance index that compares the relative influence of color and form on the representational geometry at each processing stage. We analyze brain and CNN responses to stimuli varying based on color and either a simple form feature, orientation, or a more complex form feature, curvature. We find that while the brain and CNNs largely differ in how the absolute coding strength of color and form vary over processing, comparing them in terms of their relative emphasis of these features reveals a striking similarity: For both the brain and for CNNs trained for object recognition (but not for untrained CNNs), orientation information is increasingly de-emphasized, and curvature information is increasingly emphasized, relative to color information over processing, with corresponding processing stages showing largely similar values of the form dominance index.
18. Akbarinia A, Morgenstern Y, Gegenfurtner KR. Contrast sensitivity function in deep networks. Neural Netw 2023; 164:228-244. [PMID: 37156217] [DOI: 10.1016/j.neunet.2023.04.032]
Abstract
The contrast sensitivity function (CSF) is a fundamental signature of the visual system that has been measured extensively in several species. It is defined by the visibility threshold for sinusoidal gratings at all spatial frequencies. Here, we investigated the CSF in deep neural networks using the same 2AFC contrast detection paradigm as in human psychophysics. We examined 240 networks pretrained on several tasks. To obtain their corresponding CSFs, we trained a linear classifier on top of the extracted features from frozen pretrained networks. The linear classifier is exclusively trained on a contrast discrimination task with natural images. It has to find which of the two input images has higher contrast. The network's CSF is measured by detecting which one of two images contains a sinusoidal grating of varying orientation and spatial frequency. Our results demonstrate that characteristics of the human CSF are manifested in deep networks both in the luminance channel (a band-limited inverted U-shaped function) and in the chromatic channels (two low-pass functions of similar properties). The exact shape of the networks' CSF appears to be task-dependent. The human CSF is better captured by networks trained on low-level visual tasks such as image-denoising or autoencoding. However, human-like CSF also emerges in mid- and high-level tasks such as edge detection and object recognition. Our analysis shows that human-like CSF appears in all architectures but at different depths of processing, some at early layers, while others in intermediate and final layers. Overall, these results suggest that (i) deep networks model the human CSF faithfully, making them suitable candidates for applications of image quality and compression, (ii) efficient/purposeful processing of the natural world drives the CSF shape, and (iii) visual representations from all levels of the visual hierarchy contribute to the tuning curve of the CSF, in turn implying that a function which we intuitively think of as modulated by low-level visual features may arise as a consequence of pooling from a larger set of neurons at all levels of the visual system.
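A toy version of the 2AFC detection logic used to trace a network CSF: generate sinusoidal gratings at a given spatial frequency and contrast and ask a linear probe on frozen features to tell grating from blank; the threshold contrast as a function of spatial frequency then traces the CSF. Here raw pixels stand in for frozen network activations, so the numbers are illustrative only.

```python
# Sketch of the grating-detection probe behind a network CSF: make sinusoidal
# gratings at a given spatial frequency and contrast, then ask a linear classifier
# on (frozen) features to distinguish grating from blank under cross-validation.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

def grating(size, cycles, contrast, phase=0.0):
    x = np.linspace(0, 2 * np.pi * cycles, size)
    img = 0.5 + 0.5 * contrast * np.sin(x + phase)       # luminance in [0, 1]
    return np.tile(img, (size, 1))                        # vertical grating

def detection_accuracy(cycles, contrast, n=200, size=64, noise=0.05, seed=0):
    rng = np.random.default_rng(seed)
    stims, labels = [], []
    for i in range(n):
        present = i % 2
        img = grating(size, cycles, contrast, rng.uniform(0, 2 * np.pi)) if present \
              else np.full((size, size), 0.5)
        stims.append((img + rng.normal(0, noise, (size, size))).ravel())
        labels.append(present)
    return cross_val_score(LinearSVC(max_iter=5000), np.array(stims), labels, cv=5).mean()

for sf in (2, 8, 32):                                     # cycles per image
    print(f"{sf:>2} cyc/img, 5% contrast -> accuracy {detection_accuracy(sf, 0.05):.2f}")
```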
Affiliation(s)
- Arash Akbarinia: Department of Experimental Psychology, University of Giessen, Germany
- Yaniv Morgenstern: Department of Experimental Psychology, University of Giessen, Germany; Faculty of Psychology and Educational Sciences, KU Leuven, Belgium
19. Lin R, Naselaris T, Kay K, Wehbe L. Stacked regressions and structured variance partitioning for interpretable brain maps. bioRxiv 2023:2023.04.23.537988. [PMID: 37163111] [PMCID: PMC10168225] [DOI: 10.1101/2023.04.23.537988]
Abstract
Relating brain activity associated with a complex stimulus to different properties of that stimulus is a powerful approach for constructing functional brain maps. However, when stimuli are naturalistic, their properties are often correlated (e.g., visual and semantic features of natural images, or different layers of a convolutional neural network that are used as features of images). Correlated properties can act as confounders for each other and complicate the interpretability of brain maps, and can impact the robustness of statistical estimators. Here, we present an approach for brain mapping based on two proposed methods: stacking different encoding models and structured variance partitioning. Our stacking algorithm combines encoding models that each use as input a feature space that describes a different stimulus attribute. The algorithm learns to predict the activity of a voxel as a linear combination of the outputs of different encoding models. We show that the resulting combined model can predict held-out brain activity better or at least as well as the individual encoding models. Further, the weights of the linear combination are readily interpretable; they show the importance of each feature space for predicting a voxel. We then build on our stacking models to introduce structured variance partitioning, a new type of variance partitioning that takes into account the known relationships between features. Our approach constrains the size of the hypothesis space and allows us to ask targeted questions about the similarity between feature spaces and brain regions even in the presence of correlations between the feature spaces. We validate our approach in simulation, showcase its brain mapping potential on fMRI data, and release a Python package. Our methods can be useful for researchers interested in aligning brain activity with different layers of a neural network, or with other types of correlated feature spaces.
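A simplified stand-in for the stacking idea described above: fit one ridge encoding model per feature space, then learn non-negative weights on held-out data that combine the models' predictions for each voxel; the weights indicate each feature space's importance for that voxel. This is a sketch under those assumptions, not the authors' released package.

```python
# Sketch of stacked encoding models: one ridge model per feature space, then a
# non-negative combination of their held-out predictions learned per voxel.
import numpy as np
from scipy.optimize import nnls
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(6)
n_stim, n_voxels = 500, 20
spaces = {"visual": rng.standard_normal((n_stim, 100)),
          "semantic": rng.standard_normal((n_stim, 50))}
Y = spaces["visual"] @ rng.standard_normal((100, n_voxels)) \
    + 0.3 * rng.standard_normal((n_stim, n_voxels))

idx_tr, idx_va = train_test_split(np.arange(n_stim), test_size=0.3, random_state=0)
# Held-out predictions of each single-feature-space encoder.
preds = {name: Ridge(alpha=10.0).fit(X[idx_tr], Y[idx_tr]).predict(X[idx_va])
         for name, X in spaces.items()}

stack_weights = np.zeros((n_voxels, len(spaces)))
for v in range(n_voxels):
    P = np.column_stack([preds[name][:, v] for name in spaces])   # (n_val, n_spaces)
    stack_weights[v], _ = nnls(P, Y[idx_va][:, v])                # non-negative combination

print("mean weight per feature space:", dict(zip(spaces, stack_weights.mean(axis=0).round(2))))
```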
Affiliation(s)
- Ruogu Lin: Computational Biology Department, Carnegie Mellon University
- Thomas Naselaris: Department of Neuroscience, University of Minnesota; Center for Magnetic Resonance Research (CMRR), Department of Radiology, University of Minnesota
- Kendrick Kay: Center for Magnetic Resonance Research (CMRR), Department of Radiology, University of Minnesota
- Leila Wehbe: Neuroscience Institute, Carnegie Mellon University; Machine Learning Department, Carnegie Mellon University
20. Beguš G, Zhou A, Zhao TC. Encoding of speech in convolutional layers and the brain stem based on language experience. Sci Rep 2023; 13:6480. [PMID: 37081119] [PMCID: PMC10119295] [DOI: 10.1038/s41598-023-33384-9]
Abstract
Comparing artificial neural networks with outputs of neuroimaging techniques has recently seen substantial advances in (computer) vision and text-based language models. Here, we propose a framework to compare biological and artificial neural computations of spoken language representations and propose several new challenges to this paradigm. The proposed technique is based on a similar principle that underlies electroencephalography (EEG): averaging of neural (artificial or biological) activity across neurons in the time domain, and allows comparison of the encoding of any acoustic property in the brain and in intermediate convolutional layers of an artificial neural network. Our approach allows a direct comparison of responses to a phonetic property in the brain and in deep neural networks that requires no linear transformations between the signals. We argue that the brain stem response (cABR) and the response in intermediate convolutional layers to the exact same stimulus are highly similar without applying any transformations, and we quantify this observation. The proposed technique not only reveals similarities, but also allows for analysis of the encoding of actual acoustic properties in the two signals: we compare peak latency (i) in cABR relative to the stimulus in the brain stem and (ii) in intermediate convolutional layers relative to the input/output in deep convolutional networks. We also examine and compare the effect of prior language exposure on the peak latency in cABR and in intermediate convolutional layers. Substantial similarities in peak latency encoding between the human brain and intermediate convolutional networks emerge based on results from eight trained networks (including a replication experiment). The proposed technique can be used to compare encoding between the human brain and intermediate convolutional layers for any acoustic property and for other neuroimaging techniques.
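The core averaging principle can be shown in a few lines: collapse a convolutional layer's activations across units to obtain a single response-over-time trace (analogous to averaged neural activity in EEG/cABR) and read off its peak latency. The activation tensor below is a random placeholder with an injected response; the sample rate and layer size are illustrative.

```python
# Sketch of the averaging principle: average a convolutional layer's activations
# across units, keep the time axis, and measure the peak latency of the resulting
# trace relative to stimulus onset.
import numpy as np

rng = np.random.default_rng(7)
sr = 16000                                            # audio sample rate (Hz)
n_channels, n_samples = 128, 8000                     # one layer's (channels, time) output
acts = rng.standard_normal((n_channels, n_samples))
acts[:, 2400:2600] += 2.0                             # inject a response ~150 ms after onset

trace = acts.mean(axis=0)                             # average over units, keep time axis
smooth = np.convolve(trace, np.ones(80) / 80, mode="same")
peak_latency_ms = 1000 * smooth.argmax() / sr
print(f"peak latency of the layer-average response: {peak_latency_ms:.1f} ms")
```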
Affiliation(s)
- Gašper Beguš: Department of Linguistics, University of California, Berkeley, USA
- Alan Zhou: Department of Cognitive Science, Johns Hopkins University, Baltimore, USA
- T Christina Zhao: Institute for Learning and Brain Sciences, University of Washington, Seattle, USA; Department of Speech and Hearing Sciences, University of Washington, Seattle, USA
21. Mocz V, Jeong SK, Chun M, Xu Y. Representing Multiple Visual Objects in the Human Brain and Convolutional Neural Networks. bioRxiv 2023:2023.02.28.530472. [PMID: 36909506] [PMCID: PMC10002658] [DOI: 10.1101/2023.02.28.530472]
Abstract
Objects in the real world often appear with other objects. To recover the identity of an object whether or not other objects are encoded concurrently, in primate object-processing regions, neural responses to an object pair have been shown to be well approximated by the average responses to each constituent object shown alone, indicating the whole is equal to the average of its parts. This is present at the single unit level in the slope of response amplitudes of macaque IT neurons to paired and single objects, and at the population level in response patterns of fMRI voxels in human ventral object processing regions (e.g., LO). Here we show that averaging exists in both single fMRI voxels and voxel population responses in human LO, with better averaging in single voxels leading to better averaging in fMRI response patterns, demonstrating a close correspondence of averaging at the fMRI unit and population levels. To understand if a similar averaging mechanism exists in convolutional neural networks (CNNs) pretrained for object classification, we examined five CNNs with varying architecture, depth and the presence/absence of recurrent processing. We observed averaging at the CNN unit level but rarely at the population level, with the CNN unit response distributions in most cases not resembling human LO or macaque IT responses. The whole is thus not equal to the average of its parts in CNNs, potentially rendering the individual objects in a pair less accessible in CNNs during visual processing than they are in the human brain.
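The averaging test described above can be sketched as a regression: for each unit, regress its responses to object pairs on the average of its responses to the constituent objects shown alone; a slope near 1 indicates response averaging. Responses below are simulated, not the fMRI or CNN measurements.

```python
# Sketch of the averaging test: per unit, regress pair responses on the average of
# the two single-object responses across all pairs; slope ~ 1 indicates averaging.
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(8)
n_objects, n_units = 20, 100
single = rng.standard_normal((n_objects, n_units))                   # responses to objects alone
pairs = [(i, j) for i in range(n_objects) for j in range(i + 1, n_objects)]

avg_of_singles = np.array([(single[i] + single[j]) / 2 for i, j in pairs])    # predicted pair response
pair_resp = avg_of_singles + 0.2 * rng.standard_normal(avg_of_singles.shape)  # simulated pair response

slopes = [linregress(avg_of_singles[:, u], pair_resp[:, u]).slope for u in range(n_units)]
print(f"mean slope across units: {np.mean(slopes):.2f} (1.0 = perfect averaging)")
```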
Affiliation(s)
- Viola Mocz: Visual Cognitive Neuroscience Lab, Department of Psychology, Yale University, New Haven, CT 06520, USA
- Su Keun Jeong: Department of Psychology, Chungbuk National University, South Korea
- Marvin Chun: Visual Cognitive Neuroscience Lab, Department of Psychology, Yale University, New Haven, CT 06520, USA; Department of Neuroscience, Yale School of Medicine, New Haven, CT 06520, USA
- Yaoda Xu: Visual Cognitive Neuroscience Lab, Department of Psychology, Yale University, New Haven, CT 06520, USA
22. Lee J, Jung M, Lustig N, Lee J. Neural representations of the perception of handwritten digits and visual objects from a convolutional neural network compared to humans. Hum Brain Mapp 2023; 44:2018-2038. [PMID: 36637109] [PMCID: PMC9980894] [DOI: 10.1002/hbm.26189]
Abstract
We investigated neural representations for visual perception of 10 handwritten digits and six visual objects from a convolutional neural network (CNN) and humans using functional magnetic resonance imaging (fMRI). Once our CNN model was fine-tuned using a pre-trained VGG16 model to recognize the visual stimuli from the digit and object categories, representational similarity analysis (RSA) was conducted using neural activations from fMRI and feature representations from the CNN model across all 16 classes. The encoded neural representation of the CNN model exhibited the hierarchical topography mapping of the human visual system. The feature representations in the lower convolutional (Conv) layers showed greater similarity with the neural representations in the early visual areas and parietal cortices, including the posterior cingulate cortex. The feature representations in the higher Conv layers were encoded in the higher-order visual areas, including the ventral/medial/dorsal stream and middle temporal complex. The neural representations in the classification layers were observed mainly in the ventral stream visual cortex (including the inferior temporal cortex), superior parietal cortex, and prefrontal cortex. There was a surprising similarity between the neural representations from the CNN model and the neural representations for human visual perception in the context of the perception of digits versus objects, particularly in the primary visual and associated areas. This study also illustrates the uniqueness of human visual perception. Unlike the CNN model, the neural representation of digits and objects for humans is more widely distributed across the whole brain, including the frontal and temporal areas.
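Representational similarity analysis of the kind used here can be sketched in a few lines: build a representational dissimilarity matrix (RDM) for each system and rank-correlate their upper triangles. The example below uses simulated activations as stand-ins for the CNN layer features and fMRI patterns (dimensions and variable names are assumptions for illustration):

import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
n_conditions = 16                                      # e.g., 10 digit + 6 object classes
cnn_layer_acts = rng.normal(size=(n_conditions, 512))  # hypothetical CNN layer features
roi_betas = rng.normal(size=(n_conditions, 200))       # hypothetical fMRI ROI response patterns

def rdm(patterns):
    # Condition-by-condition representational dissimilarity matrix (1 - Pearson correlation).
    return squareform(pdist(patterns, metric="correlation"))

# Compare the two RDMs on their upper triangles with a rank correlation.
iu = np.triu_indices(n_conditions, k=1)
rho, p = spearmanr(rdm(cnn_layer_acts)[iu], rdm(roi_betas)[iu])
print(f"RSA similarity (Spearman rho) = {rho:.3f}, p = {p:.3f}")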
Collapse
Affiliation(s)
- Juhyeon Lee
- Department of Brain and Cognitive Engineering, Korea University, Seoul, Republic of Korea
| | - Minyoung Jung
- Department of Brain and Cognitive Engineering, Korea University, Seoul, Republic of Korea
| | - Niv Lustig
- Department of Brain and Cognitive Engineering, Korea University, Seoul, Republic of Korea
| | - Jong‐Hwan Lee
- Department of Brain and Cognitive Engineering, Korea University, Seoul, Republic of Korea
| |
Collapse
|
23
|
Xie S, Hoehl S, Moeskops M, Kayhan E, Kliesch C, Turtleton B, Köster M, Cichy RM. Visual category representations in the infant brain. Curr Biol 2022; 32:5422-5432.e6. [PMID: 36455560 PMCID: PMC9796816 DOI: 10.1016/j.cub.2022.11.016] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2022] [Revised: 09/22/2022] [Accepted: 11/07/2022] [Indexed: 12/05/2022]
Abstract
Visual categorization is a human core cognitive capacity [1,2] that depends on the development of visual category representations in the infant brain [3,4,5,6,7]. However, the exact nature of infant visual category representations and their relationship to the corresponding adult form remains unknown [8]. Our results clarify the nature of visual category representations from electroencephalography (EEG) data in 6- to 8-month-old infants and their developmental trajectory toward adult maturity in the key characteristics of temporal dynamics [2,9], representational format [10,11,12], and spectral properties [13,14]. Temporal dynamics change from slowly emerging, developing representations in infants to quickly emerging, complex representations in adults. Despite those differences, infants and adults already partly share visual category representations. The format of infants' representations is visual features of low to intermediate complexity, whereas adults' representations also encode high-complexity features. Theta band activity contributes to visual category representations in infants, and these representations are shifted to the alpha/beta band in adults. Together, we reveal the developmental neural basis of visual categorization in humans, show how information transmission channels change in development, and demonstrate the power of advanced multivariate analysis techniques in infant EEG research for theory building in developmental cognitive science.
Collapse
Affiliation(s)
- Siying Xie
- Department of Education and Psychology, Freie Universität Berlin, Habelschwerdter Allee, Berlin 14195, Germany; Corresponding author
| | - Stefanie Hoehl
- Faculty of Psychology, Department of Developmental and Educational Psychology, University of Vienna, Liebiggasse, Wien 1010, Austria; Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße, 04103 Leipzig, Germany
| | - Merle Moeskops
- Department of Education and Psychology, Freie Universität Berlin, Habelschwerdter Allee, Berlin 14195, Germany
| | - Ezgi Kayhan
- Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße, 04103 Leipzig, Germany; Department of Developmental Psychology, University of Potsdam, Karl-Liebknecht-Straße, 14476 Potsdam, Germany
| | - Christian Kliesch
- Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße, 04103 Leipzig, Germany; Department of Developmental Psychology, University of Potsdam, Karl-Liebknecht-Straße, 14476 Potsdam, Germany
| | - Bert Turtleton
- Department of Education and Psychology, Freie Universität Berlin, Habelschwerdter Allee, Berlin 14195, Germany
| | - Moritz Köster
- Department of Education and Psychology, Freie Universität Berlin, Habelschwerdter Allee, Berlin 14195, Germany; Institute of Psychology, University of Regensburg, Universitätsstraße, 93053 Regensburg, Germany
| | - Radoslaw M. Cichy
- Department of Education and Psychology, Freie Universität Berlin, Habelschwerdter Allee, Berlin 14195, Germany; Berlin School of Mind and Brain, Humboldt-Universität zu Berlin, Unter den Linden, 10099 Berlin, Germany; Einstein Center for Neurosciences Berlin, Charité-Universitätsmedizin Berlin, Charitéplatz, 10117 Berlin, Germany; Bernstein Center for Computational Neuroscience Berlin, Humboldt-Universität zu Berlin, Unter den Linden, 10099 Berlin, Germany; Corresponding author
| |
Collapse
|
24
|
Chen Y, Wei Z, Gou H, Liu H, Gao L, He X, Zhang X. How far is brain-inspired artificial intelligence away from brain? Front Neurosci 2022; 16:1096737. [PMID: 36570836 PMCID: PMC9783913 DOI: 10.3389/fnins.2022.1096737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2022] [Accepted: 11/24/2022] [Indexed: 12/13/2022] Open
Abstract
Fueled by the development of neuroscience and artificial intelligence (AI), recent advances in brain-inspired AI have marked a tipping point in the collaboration between the two fields. AI began with inspiration from neuroscience but has evolved to achieve remarkable performance with little dependence upon neuroscience. However, in a recent collaboration, research into the neurobiological explainability of AI models found that these highly accurate models may resemble the neurobiological representation of the same computational processes in the brain, even though the models were developed in the absence of such neuroscientific references. In this perspective, we review the cooperation and separation between neuroscience and AI, and emphasize the current advance, that is, a new form of cooperation: the neurobiological explainability of AI. Given the intertwined development of the two fields, we propose a practical framework to evaluate the brain-likeness of AI models, paving the way for their further improvement.
Collapse
Affiliation(s)
- Yucan Chen
- Hefei National Research Center for Physical Sciences at the Microscale, and Department of Radiology, the First Affiliated Hospital of USTC, Division of Life Science and Medicine, University of Science & Technology of China, Hefei, China
| | - Zhengde Wei
- Department of Psychology, School of Humanities and Social Sciences, University of Science and Technology of China, Hefei, Anhui, China
| | - Huixing Gou
- Division of Life Sciences and Medicine, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China
| | - Haiyi Liu
- State Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
| | - Li Gao
- SILC Business School, Shanghai University, Shanghai, China; Correspondence: Li Gao
| | - Xiaosong He
- Department of Psychology, School of Humanities and Social Sciences, University of Science and Technology of China, Hefei, Anhui, China
| | - Xiaochu Zhang
- Hefei National Research Center for Physical Sciences at the Microscale, and Department of Radiology, the First Affiliated Hospital of USTC, Division of Life Science and Medicine, University of Science & Technology of China, Hefei, China; Department of Psychology, School of Humanities and Social Sciences, University of Science and Technology of China, Hefei, Anhui, China; Application Technology Center of Physical Therapy to Brain Disorders, Institute of Advanced Technology, University of Science and Technology of China, Hefei, China; Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, China
| |
Collapse
|
25
|
Ding X, Zhang H. Dissociation and hierarchy of human visual pathways for simultaneously coding facial identity and expression. Neuroimage 2022; 264:119769. [PMID: 36435341 DOI: 10.1016/j.neuroimage.2022.119769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2022] [Revised: 11/14/2022] [Accepted: 11/22/2022] [Indexed: 11/25/2022] Open
Abstract
Humans have an extraordinary ability to recognize facial expression and identity from a single face simultaneously and effortlessly; however, the underlying neural computation is not well understood. Here, we optimized a multi-task deep neural network to classify facial expression and identity simultaneously. Under various optimization training strategies, the best-performing model consistently showed a 'share-separate' organization. The two separate branches of the best-performing model also exhibited distinct abilities to categorize facial expression and identity, and these abilities increased along the facial expression or identity branches toward high layers. By comparing the representational similarities between the best-performing model and functional magnetic resonance imaging (fMRI) responses in the human visual cortex to the same face stimuli, the face-selective posterior superior temporal sulcus (pSTS) in the dorsal visual cortex was significantly correlated with layers in the expression branch of the model, and the anterior inferotemporal cortex (aIT) and anterior fusiform face area (aFFA) in the ventral visual cortex were significantly correlated with layers in the identity branch of the model. In addition, the aFFA and aIT better matched the high layers of the model, while the posterior FFA (pFFA) and occipital face area (OFA) better matched the middle and early layers of the model, respectively. Overall, our study provides a task-optimized computational model for understanding the neural mechanisms underlying face recognition, and it suggests that, like the best-performing model, the human visual system exhibits both dissociated and hierarchical neuroanatomical organization when simultaneously coding facial identity and expression.
Collapse
|
26
|
Dupré la Tour T, Eickenberg M, Nunez-Elizalde AO, Gallant JL. Feature-space selection with banded ridge regression. Neuroimage 2022; 264:119728. [PMID: 36334814 DOI: 10.1016/j.neuroimage.2022.119728] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2022] [Revised: 10/05/2022] [Accepted: 10/31/2022] [Indexed: 11/09/2022] Open
Abstract
Encoding models provide a powerful framework to identify the information represented in brain recordings. In this framework, a stimulus representation is expressed within a feature space and is used in a regularized linear regression to predict brain activity. To account for a potential complementarity of different feature spaces, a joint model is fit on multiple feature spaces simultaneously. To adapt regularization strength to each feature space, ridge regression is extended to banded ridge regression, which optimizes a different regularization hyperparameter per feature space. The present paper proposes a method to decompose over feature spaces the variance explained by a banded ridge regression model. It also describes how banded ridge regression performs a feature-space selection, effectively ignoring non-predictive and redundant feature spaces. This feature-space selection leads to better prediction accuracy and to better interpretability. Banded ridge regression is then mathematically linked to a number of other regression methods with similar feature-space selection mechanisms. Finally, several methods are proposed to address the computational challenge of fitting banded ridge regressions on large numbers of voxels and feature spaces. All implementations are released in an open-source Python package called Himalaya.
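The core idea of banded ridge regression, one regularization strength per feature space, can be written as a small closed-form toy example. The sketch below is not the authors' optimized Himalaya implementation; the feature spaces and responses are simulated:

import numpy as np

rng = np.random.default_rng(2)
n_samples = 300
X1 = rng.normal(size=(n_samples, 40))   # feature space 1 (e.g., low-level visual features)
X2 = rng.normal(size=(n_samples, 60))   # feature space 2 (e.g., semantic features)
y = X1 @ rng.normal(size=40) + rng.normal(0, 1.0, n_samples)   # only space 1 is predictive

def banded_ridge(X_list, y, alphas):
    # Solve (X'X + D)^(-1) X'y with a block-diagonal penalty D: one alpha per feature space.
    X = np.hstack(X_list)
    penalties = np.concatenate([np.full(Xb.shape[1], a) for Xb, a in zip(X_list, alphas)])
    return np.linalg.solve(X.T @ X + np.diag(penalties), X.T @ y)

# A large penalty on the non-predictive space effectively selects it out of the model.
w = banded_ridge([X1, X2], y, alphas=[1.0, 1e4])
print(f"mean |weight| space 1: {np.abs(w[:40]).mean():.3f}, space 2: {np.abs(w[40:]).mean():.3f}")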
Collapse
|
27
|
Mocz V, Vaziri-Pashkam M, Chun M, Xu Y. Predicting Identity-Preserving Object Transformations in Human Posterior Parietal Cortex and Convolutional Neural Networks. J Cogn Neurosci 2022; 34:2406-2435. [PMID: 36122358 PMCID: PMC9988239 DOI: 10.1162/jocn_a_01916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Previous research shows that, within human occipito-temporal cortex (OTC), we can use a general linear mapping function to link visual object responses across nonidentity feature changes, including Euclidean features (e.g., position and size) and non-Euclidean features (e.g., image statistics and spatial frequency). Although the learned mapping is capable of predicting responses of objects not included in training, these predictions are better for categories included than those not included in training. These findings demonstrate a near-orthogonal representation of object identity and nonidentity features throughout human OTC. Here, we extended these findings to examine the mapping across both Euclidean and non-Euclidean feature changes in human posterior parietal cortex (PPC), including functionally defined regions in inferior and superior intraparietal sulcus. We additionally examined responses in five convolutional neural networks (CNNs) pretrained with object classification, as CNNs are considered as the current best model of the primate ventral visual system. We separately compared results from PPC and CNNs with those of OTC. We found that a linear mapping function could successfully link object responses in different states of nonidentity transformations in human PPC and CNNs for both Euclidean and non-Euclidean features. Overall, we found that object identity and nonidentity features are represented in a near-orthogonal, rather than complete-orthogonal, manner in PPC and CNNs, just like they do in OTC. Meanwhile, some differences existed among OTC, PPC, and CNNs. These results demonstrate the similarities and differences in how visual object information across an identity-preserving image transformation may be represented in OTC, PPC, and CNNs.
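The cross-transformation linear mapping described above can be approximated with an off-the-shelf ridge regression: learn a mapping from multivoxel patterns in one stimulus state to patterns in another, then test it on held-out objects. The following sketch uses simulated patterns (the sizes, split, and noise levels are illustrative assumptions):

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
n_objects, n_voxels = 60, 120
resp_state_a = rng.normal(size=(n_objects, n_voxels))               # e.g., objects at one size
mixing = np.eye(n_voxels) + 0.1 * rng.normal(size=(n_voxels, n_voxels))
resp_state_b = resp_state_a @ mixing + rng.normal(0, 0.3, (n_objects, n_voxels))  # other size

train, test = np.arange(0, 45), np.arange(45, 60)                   # held-out objects
mapper = Ridge(alpha=1.0).fit(resp_state_a[train], resp_state_b[train])
pred = mapper.predict(resp_state_a[test])

# Correlate predicted and observed multivoxel patterns for each held-out object.
r = [np.corrcoef(pred[i], resp_state_b[test][i])[0, 1] for i in range(len(test))]
print(f"mean pattern correlation on held-out objects: {np.mean(r):.2f}")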
Collapse
|
28
|
Xiao W, Li J, Zhang C, Wang L, Chen P, Yu Z, Tong L, Yan B. High-Level Visual Encoding Model Framework with Hierarchical Ventral Stream-Optimized Neural Networks. Brain Sci 2022; 12:1101. [PMID: 36009164 DOI: 10.3390/brainsci12081101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2022] [Revised: 07/26/2022] [Accepted: 08/18/2022] [Indexed: 11/20/2022] Open
Abstract
Visual encoding models based on deep neural networks (DNNs) show good performance in predicting brain activity in low-level visual areas. However, because the amount of available neural data is limited, DNN-based visual encoding models are difficult to fit for high-level visual areas, resulting in insufficient encoding performance. The organization of the ventral stream suggests that higher visual areas receive information from lower visual areas, which is not fully reflected in current encoding models. In the present study, we propose a novel visual encoding model framework which uses the hierarchy of representations in the ventral stream to improve the model's performance in high-level visual areas. Under this framework, we propose two categories of hierarchical encoding models, from the voxel and the feature perspectives, to realize the hierarchical representations. From the voxel perspective, we first construct an encoding model for a low-level visual area (V1 or V2) and extract the voxel space predicted by the model. We then use the extracted voxel space of the low-level visual area to predict the voxel space of a high-level visual area (V4 or LO) by constructing a voxel-to-voxel model. From the feature perspective, the feature space of the first model is extracted to predict the voxel space of the high-level visual area. The experimental results show that both categories of hierarchical encoding models effectively improve the encoding performance in V4 and LO. In addition, the proportion of best-encoded voxels for the different models in V4 and LO shows that our proposed models have a clear advantage in prediction accuracy. We find that the hierarchy of representations in the ventral stream has a positive effect on improving the performance of existing models in high-level visual areas.
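A minimal version of the voxel-perspective hierarchical model, two chained regressions where the high-level area is predicted from the predicted low-level voxel space, might look like the toy sketch below (all data are simulated placeholders, and plain ridge regression stands in for whatever estimator the authors used):

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(4)
n_stim, n_feat, n_v1, n_v4 = 200, 50, 80, 40
features = rng.normal(size=(n_stim, n_feat))                        # e.g., DNN layer features
v1 = features @ rng.normal(size=(n_feat, n_v1)) + rng.normal(0, 0.5, (n_stim, n_v1))
v4 = 0.1 * (v1 @ rng.normal(size=(n_v1, n_v4))) + rng.normal(0, 0.5, (n_stim, n_v4))

train, test = slice(0, 150), slice(150, 200)
stage1 = Ridge(alpha=1.0).fit(features[train], v1[train])                   # features -> V1 voxels
stage2 = Ridge(alpha=1.0).fit(stage1.predict(features[train]), v4[train])   # predicted V1 -> V4 voxels

pred_v4 = stage2.predict(stage1.predict(features[test]))
r = [np.corrcoef(pred_v4[:, v], v4[test][:, v])[0, 1] for v in range(n_v4)]
print(f"mean V4 voxel prediction r: {np.mean(r):.2f}")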
Collapse
|
29
|
Dillmann R, Rönnau A. Biomorphic robot controls: event driven model free deep SNNs for complex visuomotor tasks. Artif Life Robotics 2022. [DOI: 10.1007/s10015-022-00769-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
30
|
Xu H, Liu M, Zhang D. How does the brain represent the semantic content of an image? Neural Netw 2022; 154:31-42. [DOI: 10.1016/j.neunet.2022.06.034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2021] [Revised: 04/13/2022] [Accepted: 06/28/2022] [Indexed: 11/24/2022]
|
31
|
Gaziv G, Beliy R, Granot N, Hoogi A, Strappini F, Golan T, Irani M. Self-supervised Natural Image Reconstruction and Large-scale Semantic Classification from Brain Activity. Neuroimage 2022; 254:119121. [PMID: 35342004 PMCID: PMC9133799 DOI: 10.1016/j.neuroimage.2022.119121] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2021] [Revised: 01/19/2022] [Accepted: 03/19/2022] [Indexed: 11/24/2022] Open
Abstract
Reconstructing natural images and decoding their semantic category from fMRI brain recordings is challenging. Acquiring sufficient pairs of images and their corresponding fMRI responses, which span the huge space of natural images, is prohibitive. We present a novel self-supervised approach that goes well beyond the scarce paired data, achieving both (i) state-of-the-art fMRI-to-image reconstruction and (ii) the first large-scale semantic classification from fMRI responses. By imposing cycle consistency between a pair of deep neural networks (from image-to-fMRI and from fMRI-to-image), we train our image reconstruction network on a large number of "unpaired" natural images (images without fMRI recordings) from many novel semantic categories. This enables us to adapt our reconstruction network to a very rich semantic coverage without requiring any explicit semantic supervision. Specifically, we find that combining our self-supervised training with high-level perceptual losses gives rise to new reconstruction and classification capabilities. In particular, this perceptual training enables accurate classification of fMRIs from never-before-seen semantic classes, without requiring any class labels during training. This gives rise to: (i) unprecedented image reconstruction from fMRI of never-before-seen images (evaluated by image metrics and human testing), and (ii) large-scale semantic classification of categories that were never seen during network training. Such large-scale (1000-way) semantic classification from fMRI recordings has never been demonstrated before. Finally, we provide evidence for the biological consistency of our learned model.
Collapse
Affiliation(s)
- Guy Gaziv
- Dept. of Computer Science and Applied Math, Weizmann Institute of Science, Rehovot, Israel.
| | - Roman Beliy
- Dept. of Computer Science and Applied Math, Weizmann Institute of Science, Rehovot, Israel
| | - Niv Granot
- Dept. of Computer Science and Applied Math, Weizmann Institute of Science, Rehovot, Israel
| | - Assaf Hoogi
- Dept. of Computer Science and Applied Math, Weizmann Institute of Science, Rehovot, Israel
| | | | - Tal Golan
- Zuckerman Institute, Columbia University, New York, NY USA
| | - Michal Irani
- Dept. of Computer Science and Applied Math, Weizmann Institute of Science, Rehovot, Israel.
| |
Collapse
|
32
|
Feldotto B, Lengenfelder H, Röhrbein F, Knoll AC. Network Layer Analysis for a RL-Based Robotic Reaching Task. Front Robot AI 2022; 9:799644. [PMID: 35813855 PMCID: PMC9260386 DOI: 10.3389/frobt.2022.799644] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2021] [Accepted: 05/09/2022] [Indexed: 11/13/2022] Open
Abstract
Recent experiments indicate that pretraining of end-to-end reinforcement learning neural networks on general tasks can speed up the training process for specific robotic applications. However, it remains open if these networks form general feature extractors and a hierarchical organization that can be reused as in, for example, convolutional neural networks. In this study, we analyze the intrinsic neuron activation in networks trained for target reaching of robot manipulators with increasing joint number and analyze the individual neuron activation distribution within the network. We introduce a pruning algorithm to increase network information density and depict correlations of neuron activation patterns. Finally, we search for projections of neuron activation among networks trained for robot kinematics of different complexity. As a result, we show that the input and output network layers entail more distinct neuron activation in contrast to inner layers. Our pruning algorithm reduces the network size significantly and increases the distance of neuron activation while keeping a high performance in training and evaluation. Our results demonstrate that robots with small difference in joint number show higher layer-wise projection accuracy, whereas more distinct robot kinematics reveal dominant projections to the first layer.
Collapse
Affiliation(s)
- Benedikt Feldotto
- Robotics, Artificial Intelligence and Real-Time Systems, Department of Computer Science, Technical University of Munich, Munich, Germany
- *Correspondence: Benedikt Feldotto,
| | - Heiko Lengenfelder
- Robotics, Artificial Intelligence and Real-Time Systems, Department of Computer Science, Technical University of Munich, Munich, Germany
| | - Florian Röhrbein
- Neurorobotics, Department of Computer Science, Chemnitz University of Technology, Chemnitz, Germany
| | - Alois C. Knoll
- Robotics, Artificial Intelligence and Real-Time Systems, Department of Computer Science, Technical University of Munich, Munich, Germany
| |
Collapse
|
33
|
Fiser J, Lengyel G. Statistical Learning in Vision. Annu Rev Vis Sci 2022; 8:265-290. [PMID: 35727961 DOI: 10.1146/annurev-vision-100720-103343] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Vision and learning have long been considered to be two areas of research linked only distantly. However, recent developments in vision research have changed the conceptual definition of vision from a signal-evaluating process to a goal-oriented interpreting process, and this shift binds learning, together with the resulting internal representations, intimately to vision. In this review, we consider various types of learning (perceptual, statistical, and rule/abstract) associated with vision in the past decades and argue that they represent differently specialized versions of the fundamental learning process, which must be captured in its entirety when applied to complex visual processes. We show why the generalized version of statistical learning can provide the appropriate setup for such a unified treatment of learning in vision, what computational framework best accommodates this kind of statistical learning, and what plausible neural scheme could feasibly implement this framework. Finally, we list the challenges that the field of statistical learning faces in fulfilling the promise of being the right vehicle for advancing our understanding of vision in its entirety.
Collapse
Affiliation(s)
- József Fiser
- Department of Cognitive Science, Center for Cognitive Computation, Central European University, Vienna 1100, Austria;
| | - Gábor Lengyel
- Department of Brain and Cognitive Sciences, University of Rochester, Rochester, New York 14627, USA
| |
Collapse
|
34
|
Liu M, Amey RC, Backer RA, Simon JP, Forbes CE. Behavioral Studies Using Large-Scale Brain Networks – Methods and Validations. Front Hum Neurosci 2022; 16:875201. [PMID: 35782044 PMCID: PMC9244405 DOI: 10.3389/fnhum.2022.875201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2022] [Accepted: 05/17/2022] [Indexed: 11/13/2022] Open
Abstract
Mapping human behaviors to brain activity has become a key focus in modern cognitive neuroscience. As methods such as functional MRI (fMRI) advance, cognitive scientists show an increasing interest in investigating neural activity in terms of functional connectivity and brain networks, rather than activation in a single brain region. Due to the noisy nature of neural activity, it is not well established how behaviors should be associated with specific neural signals. Previous research has suggested graph theory techniques as a solution. Graph theory provides an opportunity to interpret human behaviors in terms of the topological organization of brain network architecture. Graph theory-based approaches, however, only scratch the surface of what neural connections relate to human behavior. Recently, the development of data-driven methods, e.g., machine learning and deep learning approaches, provides a new perspective for studying the relationship between brain networks and human behaviors across the whole brain, expanding upon the past literature. In this review, we revisit these data-driven approaches to facilitate our understanding of neural mechanisms and the building of models of human behaviors. We start with the popular graph theory approach and then discuss other data-driven approaches such as connectome-based predictive modeling, multivariate pattern analysis, network dynamic modeling, and deep learning techniques that quantify meaningful networks and connectivity related to cognition and behaviors. Importantly, for each topic, we discuss the pros and cons of the methods in addition to providing examples using our own data for each technique to describe how these methods can be applied to real-world neuroimaging data.
Collapse
Affiliation(s)
- Mengting Liu
- School of Biomedical Engineering, Sun Yat-sen University, Shenzhen, China
- Mengting Liu,
| | - Rachel C. Amey
- Department of Psychological and Brain Sciences, University of Delaware, Newark, DE, United States
- *Correspondence: Rachel C. Amey,
| | - Robert A. Backer
- Department of Psychological and Brain Sciences, University of Delaware, Newark, DE, United States
| | - Julia P. Simon
- Keck School of Medicine, University of Southern California, Los Angeles, CA, United States
| | - Chad E. Forbes
- Department of Psychology, Florida Atlantic University, Boca Raton, FL, United States
| |
Collapse
|
35
|
Nguyen H, Nguyen T, Tran V, Dao T. A Deep Learning Approach for Predicting Subject-Specific Human Skull Shape from Head Toward a Decision Support System for Home-Based Facial Rehabilitation. Ing Rech Biomed 2022. [DOI: 10.1016/j.irbm.2022.05.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
|
36
|
Millidge B, Tschantz A, Buckley CL. Predictive Coding Approximates Backprop along Arbitrary Computation Graphs. Neural Comput 2022; 34:1329-1368. [PMID: 35534010 DOI: 10.1162/neco_a_01497] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2021] [Accepted: 11/10/2021] [Indexed: 11/04/2022]
Abstract
Backpropagation of error (backprop) is a powerful algorithm for training machine learning architectures through end-to-end differentiation. Recently it has been shown that backprop in multilayer perceptrons (MLPs) can be approximated using predictive coding, a biologically plausible process theory of cortical computation that relies solely on local and Hebbian updates. The power of backprop, however, lies not in its instantiation in MLPs but in the concept of automatic differentiation, which allows for the optimization of any differentiable program expressed as a computation graph. Here, we demonstrate that predictive coding converges asymptotically (and in practice, rapidly) to exact backprop gradients on arbitrary computation graphs using only local learning rules. We apply this result to develop a straightforward strategy to translate core machine learning architectures into their predictive coding equivalents. We construct predictive coding convolutional neural networks, recurrent neural networks, and the more complex long short-term memory, which include a nonlayer-like branching internal graph structure and multiplicative interactions. Our models perform equivalently to backprop on challenging machine learning benchmarks while using only local and (mostly) Hebbian plasticity. Our method raises the potential that standard machine learning algorithms could in principle be directly implemented in neural circuitry and may also contribute to the development of completely distributed neuromorphic architectures.
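A minimal predictive-coding network for a single hidden layer, in the spirit of the scheme described above, can be written with only local error signals: hidden activities are relaxed to minimize prediction errors while the output is clamped to the target, and weights are then updated from the converged local errors. The learning rates, layer sizes, and toy task below are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(5)
n_in, n_hid, n_out = 4, 8, 2
W1 = rng.normal(0, 0.5, (n_hid, n_in))
W2 = rng.normal(0, 0.5, (n_out, n_hid))

def train_step(W1, W2, x0, y, lr_x=0.1, lr_w=0.01, n_inference=50):
    # Initialize hidden value nodes at the feedforward prediction; clamp the output to y.
    x1 = np.tanh(W1 @ x0)
    x2 = y
    for _ in range(n_inference):                  # relax x1 to reduce local prediction errors
        e1 = x1 - np.tanh(W1 @ x0)
        e2 = x2 - W2 @ x1
        x1 = x1 - lr_x * (e1 - W2.T @ e2)
    e1 = x1 - np.tanh(W1 @ x0)                    # converged local errors
    e2 = x2 - W2 @ x1
    # Weight updates use only locally available errors and activities (Hebbian-like).
    W1 = W1 + lr_w * np.outer(e1 * (1 - np.tanh(W1 @ x0) ** 2), x0)
    W2 = W2 + lr_w * np.outer(e2, x1)
    return W1, W2, float(e2 @ e2)

x0, y = rng.normal(size=n_in), np.array([1.0, -1.0])
for _ in range(200):
    W1, W2, err = train_step(W1, W2, x0, y)
print(f"squared output error after training: {err:.4f}")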
Collapse
Affiliation(s)
- Beren Millidge
- School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, U.K.
| | - Alexander Tschantz
- Sackler Center for Consciousness Science, School of Engineering and Informatics, University of Sussex, Brighton BN1 9QJ, U.K.
| | - Christopher L Buckley
- Evolutionary and Adaptive Systems Research Group, School of Engineering and Informatics, University of Sussex, Brighton BN1 9QJ, U.K.
| |
Collapse
|
37
|
Sörensen LKA, Bohté SM, Slagter HA, Scholte HS. Arousal state affects perceptual decision-making by modulating hierarchical sensory processing in a large-scale visual system model. PLoS Comput Biol 2022; 18:e1009976. [PMID: 35377876 PMCID: PMC9009767 DOI: 10.1371/journal.pcbi.1009976] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2021] [Revised: 04/14/2022] [Accepted: 02/26/2022] [Indexed: 11/18/2022] Open
Abstract
Arousal levels strongly affect task performance. Yet, what arousal level is optimal for a task depends on its difficulty. Easy task performance peaks at higher arousal levels, whereas performance on difficult tasks displays an inverted U-shape relationship with arousal, peaking at medium arousal levels, an observation first made by Yerkes and Dodson in 1908. It is commonly proposed that the noradrenergic locus coeruleus system regulates these effects on performance through a widespread release of noradrenaline resulting in changes of cortical gain. This account, however, does not explain why performance decays with high arousal levels only in difficult, but not in simple tasks. Here, we present a mechanistic model that revisits the Yerkes-Dodson effect from a sensory perspective: a deep convolutional neural network augmented with a global gain mechanism reproduced the same interaction between arousal state and task difficulty in its performance. Investigating this model revealed that global gain states differentially modulated sensory information encoding across the processing hierarchy, which explained their differential effects on performance on simple versus difficult tasks. These findings offer a novel hierarchical sensory processing account of how, and why, arousal state affects task performance.
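The global gain mechanism at the heart of this account can be illustrated on a single saturating unit: a multiplicative gain on the input changes how well nearby ("hard") versus well-separated ("easy") drives are distinguished. The numbers below are toy values, not the model's parameters; with these values, the easy-pair separation grows and then saturates while the hard-pair separation peaks at an intermediate gain:

import numpy as np

def response(drive, gain):
    # Saturating (sigmoid) response with a multiplicative global gain on the drive.
    return 1.0 / (1.0 + np.exp(-gain * drive))

gains = [0.5, 1, 2, 4, 8, 16, 32]
easy_pair = (-1.0, 1.0)   # well-separated inputs (an "easy" discrimination)
hard_pair = (0.1, 0.3)    # similar inputs (a "hard" discrimination)

for g in gains:
    d_easy = response(easy_pair[1], g) - response(easy_pair[0], g)
    d_hard = response(hard_pair[1], g) - response(hard_pair[0], g)
    print(f"gain={g:>4}: easy separation={d_easy:.3f}, hard separation={d_hard:.3f}")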
Collapse
Affiliation(s)
- Lynn K. A. Sörensen
- Department of Psychology, University of Amsterdam, Amsterdam, Netherlands
- Amsterdam Brain & Cognition (ABC), University of Amsterdam, Amsterdam, Netherlands
- * E-mail: (LKAS); (HSS)
| | - Sander M. Bohté
- Machine Learning Group, Centrum Wiskunde & Informatica, Amsterdam, Netherlands
- Swammerdam Institute of Life Sciences (SILS), University of Amsterdam, Amsterdam, Netherlands
- Bernoulli Institute, Rijksuniversiteit Groningen, Groningen, Netherlands
| | - Heleen A. Slagter
- Department of Experimental and Applied Psychology, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
- Institute of Brain and Behaviour Amsterdam, Vrije Universiteit Amsterdam, Netherlands
| | - H. Steven Scholte
- Department of Psychology, University of Amsterdam, Amsterdam, Netherlands
- Amsterdam Brain & Cognition (ABC), University of Amsterdam, Amsterdam, Netherlands
- * E-mail: (LKAS); (HSS)
| |
Collapse
|
38
|
Abstract
Deep learning algorithms trained to predict masked words from large amounts of text have recently been shown to generate activations similar to those of the human brain. However, what drives this similarity remains currently unknown. Here, we systematically compare a variety of deep language models to identify the computational principles that lead them to generate brain-like representations of sentences. Specifically, we analyze the brain responses to 400 isolated sentences in a large cohort of 102 subjects, each recorded for two hours with functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG). We then test where and when each of these algorithms maps onto the brain responses. Finally, we estimate how the architecture, training, and performance of these models independently account for the generation of brain-like representations. Our analyses reveal two main findings. First, the similarity between the algorithms and the brain primarily depends on their ability to predict words from context. Second, this similarity reveals the rise and maintenance of perceptual, lexical, and compositional representations within each cortical region. Overall, this study shows that modern language algorithms partially converge towards brain-like solutions, and thus delineates a promising path to unravel the foundations of natural language processing. Charlotte Caucheteux and Jean-Rémi King examine the ability of transformer neural networks trained on word prediction tasks to fit representations in the human brain measured with fMRI and MEG. Their results provide further insight into the workings of transformer language models and their relevance to brain responses.
Collapse
Affiliation(s)
- Charlotte Caucheteux
- Facebook AI Research, Paris, France; Université Paris-Saclay, Inria, CEA, Palaiseau, France
| | - Jean-Rémi King
- Facebook AI Research, Paris, France; École normale supérieure, PSL University, CNRS, Paris, France
| |
Collapse
|
39
|
|
40
|
Abstract
In recent years, deep learning models have demonstrated an inherently better ability to tackle non-linear classification tasks, due to advances in deep learning architectures. However, much remains to be achieved, especially in designing deep convolutional neural network (CNN) configurations. The number of hyper-parameters that need to be optimized to achieve accuracy in classification problems increases with every layer used, and the selection of kernels in each CNN layer has an impact on the overall CNN performance in the training stage, as well as in the classification process. When a popular classifier fails to perform acceptably in practical applications, it may be due to deficiencies in the algorithm and data processing. Thus, understanding the feature extraction process provides insights to help optimize pre-trained architectures, better generalize the models, and obtain the context of each layer’s features. In this work, we aim to improve feature extraction through the use of a texture amortization map (TAM). An algorithm was developed to obtain characteristics from the filters amortizing the filter’s effect depending on the texture of the neighboring pixels. From the initial algorithm, a novel geometric classification score (GCS) was developed, in order to obtain a measure that indicates the effect of one class on another in a classification problem, in terms of the complexity of the learnability in every layer of the deep learning architecture. For this, we assume that all the data transformations in the inner layers still belong to a Euclidean space. In this scenario, we can evaluate which layers provide the best transformations in a CNN, allowing us to reduce the weights of the deep learning architecture using the geometric hypothesis.
Collapse
|
41
|
Du C, Du C, Huang L, Wang H, He H. Structured Neural Decoding With Multitask Transfer Learning of Deep Neural Network Representations. IEEE Trans Neural Netw Learn Syst 2022; 33:600-614. [PMID: 33074832 DOI: 10.1109/tnnls.2020.3028167] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
The reconstruction of visual information from human brain activity is a very important research topic in brain decoding. Existing methods ignore the structural information underlying the brain activities and the visual features, which severely limits their performance and interpretability. Here, we propose a hierarchically structured neural decoding framework by using multitask transfer learning of deep neural network (DNN) representations and a matrix-variate Gaussian prior. Our framework consists of two stages, Voxel2Unit and Unit2Pixel. In Voxel2Unit, we decode the functional magnetic resonance imaging (fMRI) data to the intermediate features of a pretrained convolutional neural network (CNN). In Unit2Pixel, we further invert the predicted CNN features back to the visual images. Matrix-variate Gaussian prior allows us to take into account the structures between feature dimensions and between regression tasks, which are useful for improving decoding effectiveness and interpretability. This is in contrast with the existing single-output regression models that usually ignore these structures. We conduct extensive experiments on two real-world fMRI data sets, and the results show that our method can predict CNN features more accurately and reconstruct the perceived natural images and faces with higher quality.
Collapse
|
42
|
Francl A, McDermott JH. Deep neural network models of sound localization reveal how perception is adapted to real-world environments. Nat Hum Behav 2022; 6:111-33. [PMID: 35087192 DOI: 10.1038/s41562-021-01244-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2020] [Accepted: 10/29/2021] [Indexed: 11/15/2022]
Abstract
Mammals localize sounds using information from their two ears. Localization in real-world conditions is challenging, as echoes provide erroneous information, and noises mask parts of target sounds. To better understand real-world localization we equipped a deep neural network with human ears and trained it to localize sounds in a virtual environment. The resulting model localized accurately in realistic conditions with noise and reverberation. In simulated experiments, the model exhibited many features of human spatial hearing: sensitivity to monaural spectral cues and interaural time and level differences, integration across frequency, biases for sound onsets, and limits on localization of concurrent sources. But when trained in unnatural environments without either reverberation, noise, or natural sounds, these performance characteristics deviated from those of humans. The results show how biological hearing is adapted to the challenges of real-world environments and illustrate how artificial neural networks can reveal the real-world constraints that shape perception.
Collapse
|
43
|
Abstract
Anterior regions of the ventral visual stream encode substantial information about object categories. Are top-down category-level forces critical for arriving at this representation, or can this representation be formed purely through domain-general learning of natural image structure? Here we present a fully self-supervised model which learns to represent individual images, rather than categories, such that views of the same image are embedded nearby in a low-dimensional feature space, distinctly from other recently encountered views. We find that category information implicitly emerges in the local similarity structure of this feature space. Further, these models learn hierarchical features which capture the structure of brain responses across the human ventral visual stream, on par with category-supervised models. These results provide computational support for a domain-general framework guiding the formation of visual representation, where the proximate goal is not explicitly about category information, but is instead to learn unique, compressed descriptions of the visual world.
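An instance-level contrastive objective of the general kind described above can be sketched as a softmax over similarities between two augmented views of each image. This is a generic simplification, not the authors' exact architecture or loss; the embeddings below are simulated:

import numpy as np

rng = np.random.default_rng(6)
n_images, dim, temperature = 8, 16, 0.1

def normalize(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Hypothetical embeddings of two augmented views of the same n_images images.
base = rng.normal(size=(n_images, dim))
view_a = normalize(base + 0.1 * rng.normal(size=(n_images, dim)))
view_b = normalize(base + 0.1 * rng.normal(size=(n_images, dim)))

# For each view-a embedding, its matching view-b embedding is the positive; all other
# view-b embeddings are negatives (softmax over temperature-scaled cosine similarities).
sims = view_a @ view_b.T / temperature
log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
loss = -np.mean(np.diag(log_probs))
print(f"instance-discrimination loss: {loss:.3f}")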
Collapse
Affiliation(s)
- Talia Konkle
- Department of Psychology & Center for Brain Science, Harvard University, Cambridge, MA, USA.
| | - George A Alvarez
- Department of Psychology & Center for Brain Science, Harvard University, Cambridge, MA, USA.
| |
Collapse
|
44
|
Sörensen LKA, Zambrano D, Slagter HA, Bohté SM, Scholte HS. Leveraging Spiking Deep Neural Networks to Understand the Neural Mechanisms Underlying Selective Attention. J Cogn Neurosci 2022; 34:655-674. [DOI: 10.1162/jocn_a_01819] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Spatial attention enhances sensory processing of goal-relevant information and improves perceptual sensitivity. Yet, the specific neural mechanisms underlying the effects of spatial attention on performance are still contested. Here, we examine different attention mechanisms in spiking deep convolutional neural networks. We directly contrast effects of precision (internal noise suppression) and two different gain modulation mechanisms on performance on a visual search task with complex real-world images. Unlike standard artificial neurons, biological neurons have saturating activation functions, permitting implementation of attentional gain as gain on a neuron's input or on its outgoing connection. We show that modulating the connection is most effective in selectively enhancing information processing by redistributing spiking activity and by introducing additional task-relevant information, as shown by representational similarity analyses. Precision only produced minor attentional effects in performance. Our results, which mirror empirical findings, show that it is possible to adjudicate between attention mechanisms using more biologically realistic models and natural stimuli.
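The two gain placements contrasted in this study can be illustrated on a single saturating unit: gain applied to the input (inside the nonlinearity) versus gain applied to the outgoing connection (after it). The values below are illustrative, not taken from the spiking model:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

drive = np.linspace(-2, 6, 5)   # a range of feedforward drives to one unit
gain = 2.0

input_gain_resp = sigmoid(gain * drive)    # gain before the nonlinearity: shifts where the unit saturates
output_gain_resp = gain * sigmoid(drive)   # gain on the outgoing connection: scales the saturated output

for d, ri, ro in zip(drive, input_gain_resp, output_gain_resp):
    print(f"drive={d:+.1f}  input-gain response={ri:.3f}  connection-gain response={ro:.3f}")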
Collapse
Affiliation(s)
| | - Davide Zambrano
- Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
- École Polytechnique Fédérale de Lausanne, Switzerland
| | | | - Sander M. Bohté
- University of Amsterdam, The Netherlands
- Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
- Rijksuniversiteit Groningen, The Netherlands
| | | |
Collapse
|
45
|
|
46
|
Wammes J, Norman KA, Turk-Browne N. Increasing stimulus similarity drives nonmonotonic representational change in hippocampus. eLife 2022; 11:e68344. [PMID: 34989336 PMCID: PMC8735866 DOI: 10.7554/elife.68344] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2021] [Accepted: 08/09/2021] [Indexed: 12/16/2022] Open
Abstract
Studies of hippocampal learning have obtained seemingly contradictory results, with manipulations that increase coactivation of memories sometimes leading to differentiation of these memories, but sometimes not. These results could potentially be reconciled using the nonmonotonic plasticity hypothesis, which posits that representational change (memories moving apart or together) is a U-shaped function of the coactivation of these memories during learning. Testing this hypothesis requires manipulating coactivation over a wide enough range to reveal the full U-shape. To accomplish this, we used a novel neural network image synthesis procedure to create pairs of stimuli that varied parametrically in their similarity in high-level visual regions that provide input to the hippocampus. Sequences of these pairs were shown to human participants during high-resolution fMRI. As predicted, learning changed the representations of paired images in the dentate gyrus as a U-shaped function of image similarity, with neural differentiation occurring only for moderately similar images.
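The nonmonotonic plasticity hypothesis being tested can be caricatured as a simple U-shaped function of coactivation. The parameterization below is a made-up stand-in used only to show the three regimes (no change, differentiation, integration), not the curve estimated from the data:

import numpy as np

def representational_change(coactivation, zero_crossing=0.7, scale=5.0):
    # Negative values = memories differentiate (move apart); positive = they integrate.
    return scale * coactivation * (coactivation - zero_crossing)

for c in np.linspace(0.0, 1.0, 6):
    delta = representational_change(c)
    regime = "no change" if abs(delta) < 1e-9 else ("differentiation" if delta < 0 else "integration")
    print(f"coactivation={c:.1f}  change={delta:+.2f}  ({regime})")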
Collapse
Affiliation(s)
- Jeffrey Wammes
- Department of Psychology, Yale University, New Haven, United States
- Department of Psychology, Queen’s University, Kingston, Canada
| | - Kenneth A Norman
- Department of Psychology, Princeton University, Princeton, United States
- Princeton Neuroscience Institute, Princeton University, Princeton, United States
| | | |
Collapse
|
47
|
Abstract
Functions in higher-order brain regions are the source of extensive debate. Although past trends have been to describe the brain-especially posterior cortical areas-in terms of a set of functional modules, a new emerging paradigm focuses on the integration of proximal functions. In this review, we synthesize emerging evidence that a variety of novel functions in the higher-order brain regions are due to convergence: convergence of macroscale gradients brings feature-rich representations into close proximity, presenting an opportunity for novel functions to arise. Using the TPJ as an example, we demonstrate that convergence is enabled via three properties of the brain: (1) hierarchical organization, (2) abstraction, and (3) equidistance. As gradients travel from primary sensory cortices to higher-order brain regions, information becomes abstracted and hierarchical, and eventually, gradients meet at a point maximally and equally distant from their sensory origins. This convergence, which produces multifaceted combinations, such as mentalizing another person's thought or projecting into a future space, parallels evolutionary and developmental characteristics in such regions, resulting in new cognitive and affective faculties.
Collapse
Affiliation(s)
- Heejung Jung
- University of Colorado Boulder; Dartmouth College
| | - Tor D Wager
- University of Colorado Boulder; Dartmouth College
| | | |
Collapse
|
48
|
Rakhimberdina Z, Jodelet Q, Liu X, Murata T. Natural Image Reconstruction From fMRI Using Deep Learning: A Survey. Front Neurosci 2021; 15:795488. [PMID: 34987359 PMCID: PMC8722107 DOI: 10.3389/fnins.2021.795488] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2021] [Accepted: 11/23/2021] [Indexed: 11/17/2022] Open
Abstract
With the advent of brain imaging techniques and machine learning tools, much effort has been devoted to building computational models to capture the encoding of visual information in the human brain. One of the most challenging brain decoding tasks is the accurate reconstruction of the perceived natural images from brain activities measured by functional magnetic resonance imaging (fMRI). In this work, we survey the most recent deep learning methods for natural image reconstruction from fMRI. We examine these methods in terms of architectural design, benchmark datasets, and evaluation metrics and present a fair performance evaluation across standardized evaluation metrics. Finally, we discuss the strengths and limitations of existing studies and present potential future directions.
Collapse
Affiliation(s)
- Zarina Rakhimberdina
- Department of Computer Science, Tokyo Institute of Technology, Tokyo, Japan
- AIST-Tokyo Tech Real World Big-Data Computation Open Innovation Laboratory, Tokyo, Japan
| | - Quentin Jodelet
- Department of Computer Science, Tokyo Institute of Technology, Tokyo, Japan
- AIST-Tokyo Tech Real World Big-Data Computation Open Innovation Laboratory, Tokyo, Japan
| | - Xin Liu
- AIST-Tokyo Tech Real World Big-Data Computation Open Innovation Laboratory, Tokyo, Japan
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Digital Architecture Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
| | - Tsuyoshi Murata
- Department of Computer Science, Tokyo Institute of Technology, Tokyo, Japan
- AIST-Tokyo Tech Real World Big-Data Computation Open Innovation Laboratory, Tokyo, Japan
| |
Collapse
|
49
|
Wang H, Huang L, Du C, Li D, Wang B, He H. Neural Encoding for Human Visual Cortex With Deep Neural Networks Learning “What” and “Where”. IEEE Trans Cogn Dev Syst 2021. [DOI: 10.1109/tcds.2020.3007761] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
|
50
|
Kupers ER, Edadan A, Benson NC, Zuiderbaan W, de Jong MC, Dumoulin SO, Winawer J. A population receptive field model of the magnetoencephalography response. Neuroimage 2021; 244:118554. [PMID: 34509622 PMCID: PMC8631249 DOI: 10.1016/j.neuroimage.2021.118554] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2020] [Revised: 07/16/2021] [Accepted: 09/02/2021] [Indexed: 12/23/2022] Open
Abstract
Computational models which predict the neurophysiological response from experimental stimuli have played an important role in human neuroimaging. One type of computational model, the population receptive field (pRF), has been used to describe cortical responses at the millimeter scale using functional magnetic resonance imaging (fMRI) and electrocorticography (ECoG). However, pRF models are not widely used for non-invasive electromagnetic field measurements (EEG/MEG), because individual sensors pool responses originating from several centimeter of cortex, containing neural populations with widely varying spatial tuning. Here, we introduce a forward-modeling approach in which pRFs estimated from fMRI data are used to predict MEG sensor responses. Subjects viewed contrast-reversing bar stimuli sweeping across the visual field in separate fMRI and MEG sessions. Individual subject's pRFs were modeled on the cortical surface at the millimeter scale using the fMRI data. We then predicted cortical time series and projected these predictions to MEG sensors using a biophysical MEG forward model, accounting for the pooling across cortex. We compared the predicted MEG responses to observed visually evoked steady-state responses measured in the MEG session. We found that pRF parameters estimated by fMRI could explain a substantial fraction of the variance in steady-state MEG sensor responses (up to 60% in individual sensors). Control analyses in which we artificially perturbed either pRF size or pRF position reduced MEG prediction accuracy, indicating that MEG data are sensitive to pRF properties derived from fMRI. Our model provides a quantitative approach to link fMRI and MEG measurements, thereby enabling advances in our understanding of spatiotemporal dynamics in human visual field maps.
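The forward-modeling pipeline described above, pRF responses on the cortex pooled into sensors through a gain matrix, can be sketched schematically. Everything below (pRF parameters, stimulus aperture, and the gain/leadfield matrix) is simulated, standing in for the fMRI-derived pRFs and the biophysical MEG forward model:

import numpy as np

rng = np.random.default_rng(7)
grid = np.linspace(-10, 10, 41)                    # visual field positions in degrees
xx, yy = np.meshgrid(grid, grid)

n_sources, n_sensors = 50, 10
prf_x = rng.uniform(-8, 8, n_sources)              # pRF centers (would come from fMRI fits)
prf_y = rng.uniform(-8, 8, n_sources)
prf_sigma = rng.uniform(0.5, 3.0, n_sources)       # pRF sizes

# A vertical bar aperture at one position of the sweep (1 inside the bar, 0 elsewhere).
stimulus = ((xx > -2) & (xx < 2)).astype(float)

# Predicted response of each cortical source: overlap of the stimulus with its Gaussian pRF.
prfs = np.exp(-((xx[None] - prf_x[:, None, None]) ** 2 +
                (yy[None] - prf_y[:, None, None]) ** 2) / (2 * prf_sigma[:, None, None] ** 2))
source_resp = (prfs * stimulus[None]).sum(axis=(1, 2))

# Pool cortical responses into sensors with a gain matrix (random here; in the study it is
# the biophysical MEG forward model).
gain = rng.normal(0, 1.0, (n_sensors, n_sources))
sensor_resp = gain @ source_resp
print("predicted sensor responses:", np.round(sensor_resp, 2))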
Collapse
Affiliation(s)
- Eline R Kupers
- Department of Psychology, New York University, New York, NY 10003, United States; Center for Neural Science, New York University, New York, NY 10003, United States; Department of Psychology, Stanford University, Stanford, CA 94305, United States.
| | - Akhil Edadan
- Spinoza Center for Neuroimaging, Amsterdam 1105 BK, the Netherlands; Department of Experimental Psychology, Utrecht University, Utrecht 3584 CS, the Netherlands
| | - Noah C Benson
- Department of Psychology, New York University, New York, NY 10003, United States; Center for Neural Science, New York University, New York, NY 10003, United States; Sciences Institute, University of Washington, Seattle, WA 98195, United States
| | | | - Maartje C de Jong
- Spinoza Center for Neuroimaging, Amsterdam 1105 BK, the Netherlands; Department of Psychology, University of Amsterdam, Amsterdam 1001 NK, the Netherlands; Amsterdam Brain and Cognition (ABC), University of Amsterdam, Amsterdam 1001 NK, the Netherlands
| | - Serge O Dumoulin
- Spinoza Center for Neuroimaging, Amsterdam 1105 BK, the Netherlands; Department of Experimental Psychology, Utrecht University, Utrecht 3584 CS, the Netherlands; Department of Experimental and Applied Psychology, VU University, Amsterdam 1081 BT, the Netherlands
| | - Jonathan Winawer
- Department of Psychology, New York University, New York, NY 10003, United States; Center for Neural Science, New York University, New York, NY 10003, United States
| |
Collapse
|