1
Gamal M, Eldawlatly S. High-level visual processing in the lateral geniculate nucleus revealed using goal-driven deep learning. J Neurosci Methods 2025;418:110429. PMID: 40122470. DOI: 10.1016/j.jneumeth.2025.110429.
Abstract
BACKGROUND The Lateral Geniculate Nucleus (LGN) is an essential contributor to high-level visual processing despite being an early subcortical area in the visual system. Current LGN computational models focus on its basic properties, with less emphasis on its role in high-level vision. NEW METHOD We propose a high-level approach for encoding mouse LGN neural responses to natural scenes. This approach employs two deep neural networks (DNNs), VGG16 and ResNet50, as goal-driven models, which we use as tools to better understand the visual features encoded in the LGN. RESULTS Early layers of the DNNs represent the best LGN models. We also demonstrate that numerosity, a high-level visual feature, is encoded in LGN neural activity along with other visual features. Intermediate layers represent numerosity better than early layers: early layers are better at predicting simple visual features, while intermediate layers are better at predicting more complex ones. Finally, we show that an ensemble model of an early and an intermediate layer achieves both high neural prediction accuracy and strong numerosity representation. COMPARISON WITH EXISTING METHOD(S) Our approach emphasizes analyzing the inner workings of DNNs to demonstrate that a high-level feature such as numerosity is represented in the LGN, in contrast to the common belief in the simplicity of the LGN. CONCLUSIONS We demonstrate that goal-driven DNNs can be used as high-level vision models of the LGN for neural prediction and as an exploration tool to better understand the role of the LGN.
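To make the goal-driven encoding recipe concrete, here is a minimal sketch of the general approach (not the authors' code): activations from a chosen depth of an ImageNet-pretrained VGG16 feed a cross-validated ridge regression that predicts per-neuron responses, and depths are compared by prediction correlation. The images, responses, and layer depths below are placeholders.

```python
import numpy as np
import torch
import torchvision.models as models
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()

def layer_features(images, depth):
    """Flattened activations after the first `depth` modules of vgg.features."""
    with torch.no_grad():
        return vgg.features[:depth](images).flatten(1).numpy()

images = torch.rand(100, 3, 64, 64)     # placeholder natural-scene stimuli
responses = np.random.rand(100, 30)     # placeholder responses of 30 LGN cells

for depth in (5, 10, 17):               # early vs. intermediate depths
    X = layer_features(images, depth)
    pred = cross_val_predict(RidgeCV(alphas=np.logspace(-2, 4, 7)),
                             X, responses, cv=5)
    r = np.mean([np.corrcoef(pred[:, i], responses[:, i])[0, 1]
                 for i in range(responses.shape[1])])
    print(f"depth {depth}: mean neural prediction r = {r:.3f}")
```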
Affiliation(s)
- Mai Gamal
- Computer Science and Engineering Department, German University in Cairo, Cairo 11835, Egypt.
- Seif Eldawlatly
- Computer and Systems Engineering Department, Ain Shams University, Cairo 11517, Egypt; Computer Science and Engineering Department, The American University in Cairo, Cairo 11835, Egypt.
2
Banga K, Benson J, Bhagat J, Biderman D, Birman D, Bonacchi N, Bruijns SA, Buchanan K, Campbell RAA, Carandini M, Chapuis GA, Churchland AK, Davatolhagh MF, Lee HD, Faulkner M, Gerçek B, Hu F, Huntenburg J, Hurwitz CL, Khanal A, Krasniak C, Lau P, Langfield C, Mackenzie N, Meijer GT, Miska NJ, Mohammadi Z, Noel JP, Paninski L, Pan-Vazquez A, Rossant C, Roth N, Schartner M, Socha KZ, Steinmetz NA, Svoboda K, Taheri M, Urai AE, Wang S, Wells M, West SJ, Whiteway MR, Winter O, Witten IB, Zhang Y. Reproducibility of in vivo electrophysiological measurements in mice. eLife 2025;13:RP100840. PMID: 40354112; PMCID: PMC12068871. DOI: 10.7554/elife.100840.
Abstract
Understanding brain function relies on the collective work of many labs generating reproducible results. However, reproducibility has not been systematically assessed within the context of electrophysiological recordings during cognitive behaviors. To address this, we formed a multi-lab collaboration using a shared, open-source behavioral task and experimental apparatus. Experimenters in 10 laboratories repeatedly targeted Neuropixels probes to the same location (spanning secondary visual areas, hippocampus, and thalamus) in mice making decisions; this generated a total of 121 experimental replicates, a unique dataset for evaluating reproducibility of electrophysiology experiments. Despite standardizing both behavioral and electrophysiological procedures, some experimental outcomes were highly variable. A closer analysis uncovered that variability in electrode targeting hindered reproducibility, as did the limited statistical power of some routinely used electrophysiological analyses, such as single-neuron tests of modulation by individual task parameters. Reproducibility was enhanced by histological and electrophysiological quality-control criteria. Our observations suggest that data from systems neuroscience is vulnerable to a lack of reproducibility, but that across-lab standardization, including metrics we propose, can serve to mitigate this.
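As a toy illustration of the statistical-power point above (not IBL analysis code; effect size, trial counts, and neuron counts are arbitrary assumptions): even when every lab records neurons with an identical true task modulation, per-neuron significance tests on modest trial counts label noticeably different fractions of neurons as "modulated" in each lab.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
effect, n_trials, n_neurons = 0.3, 40, 80   # same true effect in every lab

for lab in range(5):
    frac = np.mean([
        wilcoxon(rng.normal(effect, 1.0, n_trials)).pvalue < 0.05
        for _ in range(n_neurons)            # one test per recorded neuron
    ])
    print(f"lab {lab}: {frac:.0%} of neurons called task-modulated")
```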
Affiliation(s)
- Kush Banga
- University College London, London, United Kingdom
- Jai Bhagat
- University College London, London, United Kingdom
- Daniel Birman
- Department of Neurobiology and Biophysics, University of Washington, Seattle, United States
- Niccolò Bonacchi
- William James Center for Research, ISPA - Instituto Universitário, Lisbon, Portugal
- Berk Gerçek
- University of Geneva, Geneva, Switzerland
- Fei Hu
- University of California, Berkeley, United States
- Anup Khanal
- University of California, Los Angeles, United States
- Petrina Lau
- University College London, London, United Kingdom
- Nancy Mackenzie
- Department of Neurobiology and Biophysics, University of Washington, Seattle, United States
- Noam Roth
- Department of Neurobiology and Biophysics, University of Washington, Seattle, United States
- Nicholas A Steinmetz
- Department of Neurobiology and Biophysics, University of Washington, Seattle, United States
- Karel Svoboda
- Allen Institute for Neural Dynamics, Seattle, WA, United States
- Marsa Taheri
- University of California, Los Angeles, United States
- Shuqi Wang
- School of Computer and Communication Sciences, EPFL, Lausanne, Switzerland
- Miles Wells
- University College London, London, United Kingdom
- Yizi Zhang
- Columbia University, New York, United States
3
Hernández-Cámara P, Vila-Tomás J, Laparra V, Malo J. Dissecting the effectiveness of deep features as metric of perceptual image quality. Neural Netw 2025;185:107189. PMID: 39874824. DOI: 10.1016/j.neunet.2025.107189.
Abstract
There is an open debate on the role of artificial networks in understanding the visual brain. Internal representations of images in artificial networks develop human-like properties. In particular, evaluating distortions using differences between internal features correlates with human perception of distortion. However, the origins of this correlation are not well understood. Here, we dissect the different factors involved in the emergence of human-like behavior: function, architecture, and environment. To do so, we evaluate the aforementioned human-network correlation at different depths of 46 pre-trained model configurations that include no psycho-visual information. The results show that most of the models correlate better with human opinion than SSIM (a de facto standard in subjective image quality). Moreover, some models are better than state-of-the-art networks specifically tuned for the application (LPIPS, DISTS). Regarding function, supervised classification leads to nets that correlate better with humans than the explored self-supervised and unsupervised models, although better performance on the task does not imply more human-like behavior. Regarding architecture, simpler models correlate better with humans than very deep nets, and the highest correlation is generally not achieved in the last layer. Finally, regarding environment, training on large natural datasets leads to higher correlations than training on smaller databases with restricted content, as expected. We also found that the best classification models are not the best at predicting human distances. In the general debate about understanding human vision, our empirical findings imply that explanations should not focus on a single abstraction level: function, architecture, and environment are all relevant.
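A minimal sketch of this kind of evaluation, with synthetic distortions and random placeholder opinion scores standing in for a real image-quality database: a Euclidean distance between deep features at several depths is correlated with the human scores, alongside an SSIM baseline.

```python
import numpy as np
import torch
import torchvision.models as models
from scipy.stats import spearmanr
from skimage.metrics import structural_similarity

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval().features

def feature_distance(ref, dist, depth):
    """Euclidean distance between activations at a given depth of VGG16."""
    with torch.no_grad():
        return torch.dist(vgg[:depth](ref).flatten(),
                          vgg[:depth](dist).flatten()).item()

refs = torch.rand(50, 1, 3, 64, 64)
dists = (refs + 0.2 * torch.rand_like(refs)).clamp(0, 1)  # synthetic distortions
human = np.random.rand(50)                                # placeholder opinion scores

for depth in (4, 9, 16, 23):
    d = [feature_distance(r, x, depth) for r, x in zip(refs, dists)]
    print(f"depth {depth:2d}: Spearman vs. human = {spearmanr(d, human).correlation:.2f}")

s = [structural_similarity(r[0].permute(1, 2, 0).numpy(),
                           x[0].permute(1, 2, 0).numpy(),
                           channel_axis=2, data_range=1.0)
     for r, x in zip(refs, dists)]
print(f"SSIM baseline: Spearman vs. human = {spearmanr(s, human).correlation:.2f}")
```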
Affiliation(s)
- Jorge Vila-Tomás
- Image Processing Lab., Universitat de València, 46980 Paterna, Spain.
- Valero Laparra
- Image Processing Lab., Universitat de València, 46980 Paterna, Spain.
- Jesús Malo
- Image Processing Lab., Universitat de València, 46980 Paterna, Spain.
4
Cooray GK, Cooray V, Friston KJ. Cortical dynamics of neural-connectivity fields. J Comput Neurosci 2025. PMID: 40208381. DOI: 10.1007/s10827-025-00903-8.
Abstract
Macroscopic studies of cortical tissue reveal a prevalence of oscillatory activity that reflects a fine-tuning of neural interactions. This research extends neural field theories by incorporating generalized oscillatory dynamics into previous work on conservative and semi-conservative neural field dynamics. Prior studies have largely assumed isotropic connections among neural units; this study demonstrates that a broad range of anisotropic and fluctuating connections can still sustain oscillations. Using Lagrangian field methods, we examine different types of connectivity, their dynamics, and their potential interactions with neural fields. From this theoretical foundation, we derive a framework that incorporates Hebbian and non-Hebbian learning, i.e., plasticity, into the study of neural fields via the concept of a connectivity field.
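The abstract states these ideas only verbally; as a point of reference, a generic (assumed, not the authors' exact) form of an activity field u(x,t) coupled to a dynamic connectivity field w(x,y,t) with Hebbian-type plasticity is:

```latex
\begin{aligned}
\tau_u\,\partial_t u(x,t) &= -\,u(x,t) + \int_{\Omega} w(x,y,t)\, f\big(u(y,t)\big)\,dy + I(x,t),\\
\tau_w\,\partial_t w(x,y,t) &= -\,\lambda\, w(x,y,t) + \eta\, f\big(u(x,t)\big)\, f\big(u(y,t)\big),
\end{aligned}
```

where f is a firing-rate nonlinearity and I an external input. Anisotropy corresponds to w(x,y,t) not depending on |x - y| alone, and the second equation is what the connectivity-field concept promotes to a dynamical variable in its own right.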
Affiliation(s)
- Gerald K Cooray
- Clinical Neuroscience, Karolinska Institutet, Eugeniav, 17177, Stockholm, Sweden.
- Vernon Cooray
- Ångström Laboratory, Uppsala University, Lägerhyddsv. 1, 752 37 Uppsala, Sweden.
- Karl J Friston
- Functional Imaging Laboratory, Queen Square Institute of Neurology, University College London, 12 Queen Square, London WC1N 3AR, UK.
5
Wang EY, Fahey PG, Ding Z, Papadopoulos S, Ponder K, Weis MA, Chang A, Muhammad T, Patel S, Ding Z, Tran D, Fu J, Schneider-Mizell CM, Reid RC, Collman F, da Costa NM, Franke K, Ecker AS, Reimer J, Pitkow X, Sinz FH, Tolias AS. Foundation model of neural activity predicts response to new stimulus types. Nature 2025;640:470-477. PMID: 40205215; PMCID: PMC11981942. DOI: 10.1038/s41586-025-08829-y.
Abstract
The complexity of neural circuits makes it challenging to decipher the brain's algorithms of intelligence. Recent breakthroughs in deep learning have produced models that accurately simulate brain activity, enhancing our understanding of the brain's computational objectives and neural coding. However, it is difficult for such models to generalize beyond their training distribution, limiting their utility. The emergence of foundation models [1] trained on vast datasets has introduced a new artificial intelligence paradigm with remarkable generalization capabilities. Here we collected large amounts of neural activity from visual cortices of multiple mice and trained a foundation model to accurately predict neuronal responses to arbitrary natural videos. This model generalized to new mice with minimal training and successfully predicted responses across various new stimulus domains, such as coherent motion and noise patterns. Beyond neural response prediction, the model also accurately predicted anatomical cell types, dendritic features and neuronal connectivity within the MICrONS functional connectomics dataset [2]. Our work is a crucial step towards building foundation models of the brain. As neuroscience accumulates larger, multimodal datasets, foundation models will reveal statistical regularities, enable rapid adaptation to new tasks and accelerate research.
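A schematic sketch of the core-plus-readout logic described above (an assumption-level illustration, not the published architecture): a shared core extracts stimulus features across animals, each mouse gets its own lightweight readout, and adapting to a new mouse means fitting only a new readout with the core frozen.

```python
import torch
import torch.nn as nn

class SharedCore(nn.Module):
    """Stimulus-feature extractor shared across all mice."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 7, stride=2), nn.ELU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ELU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
    def forward(self, x):
        return self.net(x)                        # (batch, 1024) features

core = SharedCore()
readouts = {"mouse_a": nn.Linear(1024, 800),      # one readout per training mouse
            "mouse_b": nn.Linear(1024, 650)}

# Adapting to a new mouse: freeze the core, fit only its readout.
core.requires_grad_(False)
new_readout = nn.Linear(1024, 500)
opt = torch.optim.Adam(new_readout.parameters(), lr=1e-3)

frames = torch.rand(16, 1, 64, 64)                # placeholder video frames
target = torch.rand(16, 500)                      # placeholder responses
log_rate = new_readout(core(frames))              # readout outputs log firing rates
loss = nn.functional.poisson_nll_loss(log_rate, target)
loss.backward()
opt.step()
```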
Affiliation(s)
- Eric Y Wang
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Paul G Fahey
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Bio-X, Stanford University, Stanford, CA, USA
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA
- Zhuokun Ding
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Bio-X, Stanford University, Stanford, CA, USA
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA
- Stelios Papadopoulos
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Bio-X, Stanford University, Stanford, CA, USA
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA
- Kayla Ponder
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Marissa A Weis
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Göttingen, Germany
- Andersen Chang
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Taliah Muhammad
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Saumil Patel
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Bio-X, Stanford University, Stanford, CA, USA
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA
- Zhiwei Ding
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Dat Tran
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Jiakun Fu
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- R Clay Reid
- Allen Institute for Brain Science, Seattle, WA, USA
- Katrin Franke
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Bio-X, Stanford University, Stanford, CA, USA
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA
- Alexander S Ecker
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Göttingen, Germany
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
- Jacob Reimer
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Xaq Pitkow
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Department of Electrical and Computer Engineering, Rice University, Houston, TX, USA
- Fabian H Sinz
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Göttingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany
- Andreas S Tolias
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA.
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA.
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, USA.
- Stanford Bio-X, Stanford University, Stanford, CA, USA.
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA.
- Department of Electrical Engineering, Stanford University, Stanford, CA, USA.
6
Lin I, Wang T, Gao S, Tang S, Lee TS. Self-Attention-Based Contextual Modulation Improves Neural System Identification. arXiv [Preprint] 2025:arXiv:2406.07843v3. PMID: 40061115; PMCID: PMC11888551.
Abstract
Convolutional neural networks (CNNs) have been shown to be state-of-the-art models for visual cortical neurons. Cortical neurons in the primary visual cortex are sensitive to contextual information mediated by extensive horizontal and feedback connections. Standard CNNs integrate global contextual information to model contextual modulation via two mechanisms: successive convolutions and a fully connected readout layer. In this paper, we find that self-attention (SA), an implementation of non-local network mechanisms, can improve neural response predictions over parameter-matched CNNs in two key metrics: tuning curve correlation and peak tuning. We introduce peak tuning as a metric to evaluate a model's ability to capture a neuron's top feature preference. We factorize networks to assess each context mechanism, revealing that information in the local receptive field is most important for modeling overall tuning, but surround information is critically necessary for characterizing the tuning peak. We find that self-attention can replace posterior spatial-integration convolutions when learned incrementally, and is further enhanced in the presence of a fully connected readout layer, suggesting that the two context mechanisms are complementary. Finally, we find that decomposing receptive field learning and contextual modulation learning in an incremental manner may be an effective and robust mechanism for learning surround-center interactions.
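A minimal sketch of the architectural idea (not the paper's code): self-attention over the spatial positions of a CNN feature map lets each location be modulated by the full image context before the neural readout. The layer sizes and neuron count are placeholders.

```python
import torch
import torch.nn as nn

class SAContextModel(nn.Module):
    def __init__(self, n_neurons=100, d=64):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(3, d, 5, stride=2), nn.ReLU(),
                                  nn.Conv2d(d, d, 3, stride=2), nn.ReLU())
        self.attn = nn.MultiheadAttention(embed_dim=d, num_heads=4,
                                          batch_first=True)
        self.readout = nn.Linear(d, n_neurons)

    def forward(self, x):
        h = self.conv(x)                            # (B, d, H, W) local features
        tokens = h.flatten(2).transpose(1, 2)       # (B, H*W, d) spatial tokens
        ctx, _ = self.attn(tokens, tokens, tokens)  # non-local surround context
        return self.readout(ctx.mean(dim=1))        # predicted neural responses

model = SAContextModel()
pred = model(torch.rand(8, 3, 64, 64))              # (8, 100) predicted responses
```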
Affiliation(s)
- Shang Gao
- Carnegie Mellon University
- Massachusetts Institute of Technology
7
Papale P, Wang F, Self MW, Roelfsema PR. An extensive dataset of spiking activity to reveal the syntax of the ventral stream. Neuron 2025;113:539-553.e5. PMID: 39809277. DOI: 10.1016/j.neuron.2024.12.003.
Abstract
Visual neuroscience benefits from high-quality datasets with neuronal responses to many images. Several neuroimaging datasets have been published in recent years, but no comparable dataset with spiking activity exists. Here, we introduce the THINGS ventral stream spiking dataset (TVSD). We extensively sampled neuronal activity in macaques in response to >25,000 natural images from the THINGS database, using high-channel-count implants in three key cortical regions: primary visual cortex (V1), V4, and the inferotemporal cortex. We showcase the utility of TVSD by using an artificial neural network to visualize the tuning of neurons. We also characterize the correlated fluctuations in activity within and between areas and demonstrate that these noise correlations are strongest between neurons with similar tuning. The TVSD allows researchers to answer many questions about neuronal tuning, analyze the interactions within and between cortical regions, and compare spiking activity in monkeys to human neuroimaging data.
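The signal- versus noise-correlation analysis mentioned above follows a standard recipe, sketched here with placeholder Poisson data rather than the TVSD itself: noise correlations come from trial-to-trial residuals around each neuron's mean image response and are then related to tuning similarity between neuron pairs.

```python
import numpy as np

rng = np.random.default_rng(1)
resp = rng.poisson(3.0, size=(20, 100, 40)).astype(float)  # trials x images x neurons

tuning = resp.mean(axis=0)                   # (images, neurons) mean responses
residual = resp - tuning                     # trial-to-trial fluctuations
sig_corr = np.corrcoef(tuning.T)             # tuning similarity per neuron pair
noise_corr = np.corrcoef(residual.reshape(-1, 40).T)

iu = np.triu_indices(40, k=1)                # unique neuron pairs
r = np.corrcoef(sig_corr[iu], noise_corr[iu])[0, 1]
print(f"correlation between signal and noise correlations: r = {r:.2f}")
```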
Affiliation(s)
- Paolo Papale
- Department of Vision & Cognition, Netherlands Institute for Neuroscience (KNAW), 1105 BA Amsterdam, the Netherlands.
- Feng Wang
- Department of Vision & Cognition, Netherlands Institute for Neuroscience (KNAW), 1105 BA Amsterdam, the Netherlands
- Matthew W Self
- Department of Vision & Cognition, Netherlands Institute for Neuroscience (KNAW), 1105 BA Amsterdam, the Netherlands
- Pieter R Roelfsema
- Department of Vision & Cognition, Netherlands Institute for Neuroscience (KNAW), 1105 BA Amsterdam, the Netherlands; Department of Integrative Neurophysiology, VU University, De Boelelaan 1085, 1081 HV Amsterdam, the Netherlands; Department of Neurosurgery, Academic Medical Centre, Postbus 22660, 1100 DD Amsterdam, the Netherlands; Laboratory of Visual Brain Therapy, Sorbonne Université, INSERM, CNRS, Institut de la Vision, 17 rue Moreau, 75012 Paris, France.
8
Posani L, Wang S, Muscinelli SP, Paninski L, Fusi S. Rarely categorical, always high-dimensional: how the neural code changes along the cortical hierarchy. bioRxiv [Preprint] 2025:2024.11.15.623878. PMID: 39605683; PMCID: PMC11601379. DOI: 10.1101/2024.11.15.623878.
Abstract
A long-standing debate in neuroscience concerns whether individual neurons are organized into functionally distinct populations that encode information differently ("categorical" representations [1-3]) and the implications for neural computation. Here, we systematically analyzed how cortical neurons encode cognitive, sensory, and movement variables across 43 cortical regions during a complex task (14,000+ units from the International Brain Laboratory public Brain-wide Map data set [4]) and studied how these properties change across the sensory-cognitive cortical hierarchy [5]. We found that the structure of the neural code was scale-dependent: on a whole-cortex scale, neural selectivity was categorical and organized across regions in a way that reflected their anatomical connectivity. However, within individual regions, categorical representations were rare and limited to primary sensory areas. Remarkably, the degree of categorical clustering of neural selectivity was inversely correlated with the dimensionality of neural representations, suggesting a link between single-neuron selectivity and the computational properties of population codes, which we explain in a mathematical model. Finally, we found that the fraction of linearly separable combinations of experimental conditions ("Shattering Dimensionality" [6]) was near maximal across all areas, indicating a robust and uniform ability for flexible information encoding throughout the cortex. In conclusion, our results provide systematic evidence for a non-categorical, high-dimensional neural code in all but the lower levels of the cortical hierarchy.
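A compact sketch of the shattering-dimensionality analysis (placeholder data, not the Brain-wide Map; the above-chance criterion is an arbitrary assumption): enumerate balanced dichotomies of the experimental conditions, train a linear classifier on each, and report the fraction decoded above chance.

```python
import numpy as np
from itertools import combinations
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_cond, n_trials, n_units = 8, 50, 60
X = rng.normal(size=(n_cond, n_trials, n_units)) + \
    rng.normal(size=(n_cond, 1, n_units)) * 2        # condition-specific means
X = X.reshape(-1, n_units)
cond = np.repeat(np.arange(n_cond), n_trials)

decodable = []
for pos in combinations(range(n_cond), n_cond // 2):  # balanced dichotomies
    y = np.isin(cond, pos)
    acc = cross_val_score(LinearSVC(), X, y, cv=5).mean()
    decodable.append(acc > 0.6)                       # above-chance criterion
print(f"shattering dimensionality: {np.mean(decodable):.0%} of dichotomies decodable")
```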
Affiliation(s)
- Lorenzo Posani
- Zuckerman Institute, Columbia University, New York, NY, USA
- School of Computer and Communication Sciences, EPFL, Lausanne, Switzerland
- Shuqi Wang
- School of Computer and Communication Sciences, EPFL, Lausanne, Switzerland
- Department of Statistics, Columbia University, New York, NY, USA
- Liam Paninski
- Zuckerman Institute, Columbia University, New York, NY, USA
- Department of Statistics, Columbia University, New York, NY, USA
- Co-senior authors
- Stefano Fusi
- Zuckerman Institute, Columbia University, New York, NY, USA
- Co-senior authors
9
Conwell C, Graham D, Boccagno C, Vessel EA. The perceptual primacy of feeling: Affectless visual machines explain a majority of variance in human visually evoked affect. Proc Natl Acad Sci U S A 2025;122:e2306025121. PMID: 39847334; PMCID: PMC11789064. DOI: 10.1073/pnas.2306025121.
Abstract
Looking at the world often involves not just seeing things, but feeling things. Modern feedforward machine vision systems that learn to perceive the world in the absence of active physiology, deliberative thought, or any form of feedback that resembles human affective experience offer tools to demystify the relationship between seeing and feeling, and to assess how much of visually evoked affective experience may be a straightforward function of representation learning over natural image statistics. In this work, we deploy a diverse sample of 180 state-of-the-art deep neural network models trained only on canonical computer vision tasks to predict human ratings of arousal, valence, and beauty for images from multiple categories (objects, faces, landscapes, art) across two datasets. Importantly, we use the features of these models without additional learning, linearly decoding human affective responses from network activity in much the same way neuroscientists decode information from neural recordings. Aggregate analysis across our survey demonstrates that predictions from purely perceptual models explain a majority of the explainable variance in average ratings of arousal, valence, and beauty alike. Finer-grained analysis within our survey (e.g. comparisons between shallower and deeper layers, or between randomly initialized, category-supervised, and self-supervised models) points to rich, preconceptual abstraction (learned from diversity of visual experience) as a key driver of these predictions. Taken together, these results provide further computational evidence for an information-processing account of visually evoked affect linked directly to efficient representation learning over natural image statistics, and hint at a computational locus of affective and aesthetic valuation immediately proximate to perception.
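A sketch of the decoding logic in miniature (random placeholder features and ratings, not the study's models or data): average ratings are linearly decoded from frozen network features, and performance is read against a split-half reliability ceiling on the ratings themselves.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(2)
feats = rng.normal(size=(300, 512))               # frozen DNN features per image
raters = rng.normal(size=(20, 300)) + rng.normal(size=(1, 300)) * 1.5
mean_rating = raters.mean(axis=0)                 # e.g. mean beauty rating

pred = cross_val_predict(RidgeCV(alphas=np.logspace(-1, 5, 7)),
                         feats, mean_rating, cv=5)
r_model = np.corrcoef(pred, mean_rating)[0, 1]

# split-half reliability of raters approximates the ceiling on predictability
half1, half2 = raters[::2].mean(axis=0), raters[1::2].mean(axis=0)
ceiling = np.corrcoef(half1, half2)[0, 1]
print(f"decoding r = {r_model:.2f}, rater-reliability ceiling ~ {ceiling:.2f}")
```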
Affiliation(s)
- Colin Conwell
- Department of Psychology, Harvard University, Cambridge, MA 02139
- Daniel Graham
- Department of Psychological Science, Hobart and William Smith Colleges
- Chelsea Boccagno
- Department of Psychiatry, Massachusetts General Hospital, Boston, MA 02114
- Department of Epidemiology, Harvard T.H. Chan School of Public Health
- Edward A. Vessel
- Department of Psychology, City College, City University of New York, New York, NY 10031
10
Ramirez JG, Vanhoyland M, Ratan Murty NA, Decramer T, Van Paesschen W, Bracci S, Op de Beeck H, Kanwisher N, Janssen P, Theys T. Intracortical recordings reveal the neuronal selectivity for bodies and body parts in the human visual cortex. Proc Natl Acad Sci U S A 2024;121:e2408871121. PMID: 39652751; PMCID: PMC11665852. DOI: 10.1073/pnas.2408871121.
Abstract
Body perception plays a fundamental role in social cognition. Yet, the neural mechanisms underlying this process in humans remain elusive given the spatiotemporal constraints of functional imaging. Here, we present intracortical recordings of single- and multiunit spiking activity in two epilepsy surgery patients in or near the extrastriate body area, a critical region for body perception. Our recordings revealed a strong preference for human bodies over a large range of control stimuli. Notably, body selectivity was driven by a distinct selectivity for body parts. The observed body selectivity generalized to nonphotographic depictions of bodies including silhouettes and stick figures. Overall, our study provides unique neural data that bridge the gap between human neuroimaging and macaque electrophysiology studies, laying a solid foundation for computational models of human body processing.
Affiliation(s)
- Jesus Garcia Ramirez
- Research group Experimental Neurosurgery and Neuroanatomy, Katholieke Universiteit Leuven, and the Leuven Brain Institute, Leuven B-3000, Belgium
- Laboratory for Neuro- and Psychophysiology, Department of Neurosciences, Katholieke Universiteit Leuven and the Leuven Brain Institute, Leuven B-3000, Belgium
- Michael Vanhoyland
- Research group Experimental Neurosurgery and Neuroanatomy, Katholieke Universiteit Leuven, and the Leuven Brain Institute, Leuven B-3000, Belgium
- Laboratory for Neuro- and Psychophysiology, Department of Neurosciences, Katholieke Universiteit Leuven and the Leuven Brain Institute, Leuven B-3000, Belgium
- Department of Neurosurgery, Universitaire Ziekenhuizen Leuven, Katholieke Universiteit Leuven, Leuven B-3000, Belgium
- N. A. Ratan Murty
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
- The Center for Brains, Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139
- Thomas Decramer
- Research group Experimental Neurosurgery and Neuroanatomy, Katholieke Universiteit Leuven, and the Leuven Brain Institute, Leuven B-3000, Belgium
- Department of Neurosurgery, Universitaire Ziekenhuizen Leuven, Katholieke Universiteit Leuven, Leuven B-3000, Belgium
- Wim Van Paesschen
- Laboratory for Epilepsy Research, Katholieke Universiteit Leuven, Leuven B-3000, Belgium
- Stefania Bracci
- Department of Psychology and Cognitive Science, University of Trento, Trento 38068, Italy
- Hans Op de Beeck
- Laboratory for Biological Psychology, Katholieke Universiteit Leuven, Leuven B-3000, Belgium
- Nancy Kanwisher
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
- The Center for Brains, Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139
- Peter Janssen
- Laboratory for Neuro- and Psychophysiology, Department of Neurosciences, Katholieke Universiteit Leuven and the Leuven Brain Institute, Leuven B-3000, Belgium
- Tom Theys
- Research group Experimental Neurosurgery and Neuroanatomy, Katholieke Universiteit Leuven, and the Leuven Brain Institute, Leuven B-3000, Belgium
- Department of Neurosurgery, Universitaire Ziekenhuizen Leuven, Katholieke Universiteit Leuven, Leuven B-3000, Belgium
11
Susan S. Neuroscientific insights about computer vision models: a concise review. Biol Cybern 2024;118:331-348. PMID: 39382577. DOI: 10.1007/s00422-024-00998-9.
Abstract
The development of biologically-inspired computational models has been the focus of study ever since the artificial neuron was introduced by McCulloch and Pitts in 1943. However, a scrutiny of the literature reveals that most attempts to replicate the highly efficient and complex biological visual system have been futile or have met with limited success. Recent state-of-the-art computer vision models, such as pre-trained deep neural networks and vision transformers, may not be biologically inspired per se. Nevertheless, certain aspects of biological vision are still found embedded, knowingly or unknowingly, in the architecture and functioning of these models. This paper explores several principles related to visual neuroscience and the biological visual pathway that resonate, in some manner, in the architectural design and functioning of contemporary computer vision models. The findings of this survey can provide useful insights for building future bio-inspired computer vision models. The survey is conducted from a historical perspective, tracing the biological connections of computer vision models from the basic artificial neuron to modern technologies such as deep convolutional neural networks (CNNs) and spiking neural networks (SNNs). One highlight of the survey is a discussion of biologically plausible neural networks and bio-inspired unsupervised learning mechanisms adapted for computer vision tasks in recent times.
Affiliation(s)
- Seba Susan
- Department of Information Technology, Delhi Technological University, Delhi, India.
12
Papale P, Zuiderbaan W, Teeuwen RRM, Gilhuis A, Self MW, Roelfsema PR, Dumoulin SO. V1 neurons are tuned to perceptual borders in natural scenes. Proc Natl Acad Sci U S A 2024;121:e2221623121. PMID: 39495929; PMCID: PMC11572972. DOI: 10.1073/pnas.2221623121.
Abstract
The visual system needs to identify perceptually relevant borders to segment complex natural scenes. The primary visual cortex (V1) is thought to extract local borders, and higher visual areas are thought to identify the perceptually relevant borders between objects and the background. To test this conjecture, we used natural images that had been annotated by human observers who marked the perceptually relevant borders. We assessed the effect of perceptual relevance on V1 responses using human neuroimaging, macaque electrophysiology, and computational modeling. We report that perceptually relevant borders elicit stronger responses in the early visual cortex than irrelevant ones, even if simple features, such as contrast and the energy of oriented filters, are matched. Moreover, V1 neurons discriminate perceptually relevant borders surprisingly fast, during the early feedforward-driven activity at a latency of ~50 ms, indicating that they are tuned to the features that characterize them. We also revealed a delayed, contextual effect that enhances the V1 responses that are elicited by perceptually relevant borders at a longer latency. Our results reveal multiple mechanisms that allow V1 neurons to infer the layout of objects in natural images.
Affiliation(s)
- Paolo Papale
- Department of Vision and Cognition, Netherlands Institute for Neuroscience (KNAW), Amsterdam 1105 BA, Netherlands
- Momilab Research Unit, Institutions, Markets, Technologies School for Advanced Studies Lucca, Lucca 55100, Italy
- Wietske Zuiderbaan
- Department of Computational Cognitive Neuroscience and Neuroimaging, Netherlands Institute for Neuroscience (Koninklijke Nederlandse Akademie van Wetenschappen), Amsterdam 1105 BA, Netherlands
- Spinoza Centre for Neuroimaging, Amsterdam 1105 BK, Netherlands
- Rob R. M. Teeuwen
- Department of Vision and Cognition, Netherlands Institute for Neuroscience (KNAW), Amsterdam 1105 BA, Netherlands
- Amparo Gilhuis
- Department of Vision and Cognition, Netherlands Institute for Neuroscience (KNAW), Amsterdam 1105 BA, Netherlands
- Matthew W. Self
- Department of Vision and Cognition, Netherlands Institute for Neuroscience (KNAW), Amsterdam 1105 BA, Netherlands
- Pieter R. Roelfsema
- Department of Vision and Cognition, Netherlands Institute for Neuroscience (KNAW), Amsterdam 1105 BA, Netherlands
- Department of Integrative Neurophysiology, Vrije Universiteit, Amsterdam 1081 HV, Netherlands
- Department of Neurosurgery, Academic Medical Centre, Amsterdam 1100 DD, Netherlands
- Laboratory of Visual Brain Therapy, Institut National de la Santé et de la Recherche Médicale, Centre National de la Recherche Scientifique, Institut de la Vision, Sorbonne Université, Paris F-75012, France
- Serge O. Dumoulin
- Department of Computational Cognitive Neuroscience and Neuroimaging, Netherlands Institute for Neuroscience (Koninklijke Nederlandse Akademie van Wetenschappen), Amsterdam 1105 BA, Netherlands
- Spinoza Centre for Neuroimaging, Amsterdam 1105 BK, Netherlands
- Department of Experimental and Applied Psychology, Vrije Universiteit Amsterdam, Amsterdam 1181 BT, Netherlands
- Department of Experimental Psychology, Helmholtz Institute, Utrecht University, Utrecht 3584 CS, Netherlands
13
Peng Y, Gong X, Lu H, Fang F. Human Visual Pathways for Action Recognition versus Deep Convolutional Neural Networks: Representation Correspondence in Late but Not Early Layers. J Cogn Neurosci 2024;36:2458-2480. PMID: 39106158. DOI: 10.1162/jocn_a_02233.
Abstract
Deep convolutional neural networks (DCNNs) have attained human-level performance for object categorization and exhibited representation alignment between network layers and brain regions. Does such representation alignment naturally extend to other visual tasks beyond recognizing objects in static images? In this study, we expanded the exploration to the recognition of human actions from videos and assessed the representation capabilities and alignment of two-stream DCNNs in comparison with brain regions situated along the ventral and dorsal pathways. Using decoding analysis and representational similarity analysis, we show that DCNN models do not exhibit hierarchical representation alignment with the human brain across visual regions when processing action videos. Instead, later layers of DCNN models demonstrate greater representational similarity to the human visual cortex. These findings held for two display formats: photorealistic avatars with full-body information and simplified stimuli in the point-light display. The discrepancies in representation alignment suggest fundamental differences in how DCNNs and the human brain represent dynamic visual information related to actions.
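For reference, representational similarity analysis reduces to a few lines; this sketch uses random placeholder patterns rather than the study's fMRI data or DCNN activations.

```python
import numpy as np
from scipy.stats import spearmanr

def rdm(patterns):
    """1 - Pearson correlation between condition patterns (conditions x features)."""
    return 1.0 - np.corrcoef(patterns)

rng = np.random.default_rng(3)
brain = rng.normal(size=(30, 200))       # 30 action videos x 200 voxels
layer = rng.normal(size=(30, 4096))      # same videos x DCNN layer units

iu = np.triu_indices(30, k=1)            # compare upper triangles only
rho = spearmanr(rdm(brain)[iu], rdm(layer)[iu]).correlation
print(f"brain-model RDM similarity: Spearman rho = {rho:.2f}")
```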
Affiliation(s)
- Yujia Peng
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, People's Republic of China
- Institute for Artificial Intelligence, Peking University, Beijing, People's Republic of China
- National Key Laboratory of General Artificial Intelligence, Beijing Institute for General Artificial Intelligence, Beijing, China
- Department of Psychology, University of California, Los Angeles
- Xizi Gong
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, People's Republic of China
- Hongjing Lu
- Department of Psychology, University of California, Los Angeles
- Department of Statistics, University of California, Los Angeles
- Fang Fang
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, People's Republic of China
- IDG/McGovern Institute for Brain Research, Peking University, Beijing, People's Republic of China
- Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, People's Republic of China
- Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, People's Republic of China
14
Eberle O, Büttner J, el-Hajj H, Montavon G, Müller KR, Valleriani M. Historical insights at scale: A corpus-wide machine learning analysis of early modern astronomic tables. Sci Adv 2024;10:eadj1719. PMID: 39441928; PMCID: PMC11498222. DOI: 10.1126/sciadv.adj1719.
Abstract
Understanding the evolution and dissemination of human knowledge over time faces challenges due to the abundance of historical materials and limited specialist resources. However, the digitization of historical archives presents an opportunity for AI-supported analysis. This study advances historical analysis with an atomization-recomposition method that relies on unsupervised machine learning and explainable AI techniques. Focusing on the "Sacrobosco Collection," consisting of 359 early modern printed editions of astronomy textbooks from European universities (1472-1650), totaling 76,000 pages, our analysis uncovers temporal and geographic patterns in knowledge transformation. We highlight the important role of astronomy textbooks in shaping a unified mathematical culture, driven by competition among educational institutions and market dynamics. This approach deepens our understanding by grounding insights in historical context and integrating with traditional methodologies. Case studies illustrate how communities embraced scientific advancements, reshaping astronomic and geographical views and exploring scientific roots amidst a changing world.
Affiliation(s)
- Oliver Eberle
- Machine Learning Group, Technische Universität Berlin, Marchstr. 23, 10587 Berlin, Germany
- BIFOLD–Berlin Institute for the Foundations of Learning and Data, 10587 Berlin, Germany
- Jochen Büttner
- BIFOLD–Berlin Institute for the Foundations of Learning and Data, 10587 Berlin, Germany
- Max Planck Institute of Geoanthropology, Kahlaische Str. 10, 07745 Jena, Germany
- Hassan el-Hajj
- BIFOLD–Berlin Institute for the Foundations of Learning and Data, 10587 Berlin, Germany
- Max Planck Institute for the History of Science, Boltzmannstr. 22, 14195 Berlin, Germany
- Grégoire Montavon
- Machine Learning Group, Technische Universität Berlin, Marchstr. 23, 10587 Berlin, Germany
- BIFOLD–Berlin Institute for the Foundations of Learning and Data, 10587 Berlin, Germany
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 14, 14195 Berlin, Germany
- Klaus-Robert Müller
- Machine Learning Group, Technische Universität Berlin, Marchstr. 23, 10587 Berlin, Germany
- BIFOLD–Berlin Institute for the Foundations of Learning and Data, 10587 Berlin, Germany
- Department of Artificial Intelligence, Korea University, Seoul 136-713, South Korea
- Max Planck Institute for Informatics, Stuhlsatzenhausweg 4, 66123 Saarbrücken, Germany
- Matteo Valleriani
- BIFOLD–Berlin Institute for the Foundations of Learning and Data, 10587 Berlin, Germany
- Max Planck Institute for the History of Science, Boltzmannstr. 22, 14195 Berlin, Germany
- Institute of History and Philosophy of Science, Technology, and Literature, Faculty I–Humanities and Educational Sciences, Technische Universität Berlin, Straße des 17. Juni 135, 10623 Berlin, Germany
- The Cohn Institute for the History and Philosophy of Science and Ideas, Faculty of Humanities, Tel Aviv University, P.O. Box 39040, Ramat Aviv, Tel Aviv 6139001, Israel
15
Pavuluri A, Kohn A. The representational geometry for naturalistic textures in macaque V1 and V2. bioRxiv [Preprint] 2024:2024.10.18.619102. PMID: 39484570; PMCID: PMC11526966. DOI: 10.1101/2024.10.18.619102.
Abstract
Our understanding of visual cortical processing has relied primarily on studying the selectivity of individual neurons in different areas. A complementary approach is to study how the representational geometry of neuronal populations differs across areas. Though the geometry is derived from individual neuronal selectivity, it can reveal encoding strategies difficult to infer from single neuron responses. In addition, recent theoretical work has begun to relate distinct functional objectives to different representational geometries. To understand how the representational geometry changes across stages of processing, we measured neuronal population responses in primary visual cortex (V1) and area V2 of macaque monkeys to an ensemble of synthetic, naturalistic textures. Responses were lower dimensional in V2 than V1, and there was a better alignment of V2 population responses to different textures. The representational geometry in V2 afforded better discriminability between out-of-sample textures. We performed complementary analyses of standard convolutional network models, which did not replicate the representational geometry of cortex. We conclude that there is a shift in the representational geometry between V1 and V2, with the V2 representation exhibiting features of a low-dimensional, systematic encoding of different textures and of different instantiations of each texture. Our results suggest that comparisons of representational geometry can reveal important transformations that occur across successive stages of visual processing.
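One common way to quantify the dimensionality differences described above is the participation ratio of the response covariance eigenspectrum; this sketch uses placeholder data, and the paper's exact estimator may differ.

```python
import numpy as np

def participation_ratio(responses):
    """responses: (stimuli, neurons). PR = (sum lambda)^2 / sum lambda^2."""
    lam = np.linalg.eigvalsh(np.cov(responses.T))
    return lam.sum() ** 2 / (lam ** 2).sum()

rng = np.random.default_rng(4)
v1 = rng.normal(size=(500, 100))                             # high-dimensional responses
v2 = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 100))   # low-rank responses
print(f"PR(V1-like) = {participation_ratio(v1):.1f}, "
      f"PR(V2-like) = {participation_ratio(v2):.1f}")
```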
16
Bertalmío M, Durán Vizcaíno A, Malo J, Wichmann FA. Plaid masking explained with input-dependent dendritic nonlinearities. Sci Rep 2024;14:24856. PMID: 39438555; PMCID: PMC11496684. DOI: 10.1038/s41598-024-75471-5.
Abstract
A serious obstacle to understanding early spatial vision is the failure of the so-called standard model (SM) to predict the perception of plaid masking. The SM, however, originated from a major oversimplification of single-neuron computations that ignores fundamental properties of dendrites. Here we show that a spatial vision model that includes computations mimicking the input-dependent nature of dendritic nonlinearities, i.e., nonlinear neural summation, has the potential to explain plaid masking data.
Affiliation(s)
- Jesús Malo
- Universitat de València, València, Spain
17
Das S, Mangun GR, Ding M. Perceptual Expertise and Attention: An Exploration using Deep Neural Networks. bioRxiv [Preprint] 2024:2024.10.15.617743. PMID: 39464001; PMCID: PMC11507720. DOI: 10.1101/2024.10.15.617743.
Abstract
Perceptual expertise and attention are two important factors that enable superior object recognition and task performance. While expertise enhances knowledge and provides a holistic understanding of the environment, attention allows us to selectively focus on task-related information and suppress distraction. It has been suggested that attention operates differently in experts and in novices, but much remains unknown. This study investigates the relationship between perceptual expertise and attention using convolutional neural networks (CNNs), which have been shown to be good models of primate visual pathways. Two CNN models were trained to become experts in either face or scene recognition, and the effect of attention on performance was evaluated in tasks involving complex stimuli, such as composite images containing superimposed faces and scenes. The goal was to explore how feature-based attention (FBA) influences recognition within and outside the domain of expertise of the models. We found that each model performed better in its area of expertise, and that FBA further enhanced task performance, but only within the domain of expertise, increasing performance by up to 35% in scene recognition and 15% in face recognition. However, attention had reduced or negative effects when applied outside the models' expertise domain. Neural unit-level analysis revealed that expertise led to stronger tuning towards category-specific features and sharper tuning curves, as reflected in greater representational dissimilarity between targets and distractors, which, in line with the biased competition model of attention, enhances performance by reducing competition. These findings highlight the critical role of neural tuning, at the level of single units as well as the whole network, in distinguishing the effects of attention in experts and in novices, and demonstrate that CNNs can be fruitfully used as computational models for addressing neuroscience questions that are not practical to address with empirical methods.
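A toy sketch of feature-based attention as channel-wise gain modulation (an assumed mechanism for illustration, not the authors' exact implementation): gains derived from each channel's tuning to the attended category scale the feature maps of a trained network before the classification readout.

```python
import torch
import torch.nn as nn

conv = nn.Sequential(nn.Conv2d(3, 16, 5), nn.ReLU())   # stand-in trained layer
classifier = nn.Linear(16, 2)                           # face vs. scene scores

def attend(x, gains):
    h = conv(x) * gains.view(1, -1, 1, 1)   # amplify channels tuned to target
    return classifier(h.mean(dim=(2, 3)))   # category scores per image

x = torch.rand(4, 3, 64, 64)                  # e.g. superimposed face/scene images
tuning = torch.rand(16)                       # channel selectivity for faces
gains = 1.0 + 0.5 * (tuning - tuning.mean())  # boost face channels, suppress rest
scores = attend(x, gains)
```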
Affiliation(s)
- Soukhin Das
- Center for Mind and Brain, University of California, Davis
- Department of Psychology, University of California, Davis
- G R Mangun
- Center for Mind and Brain, University of California, Davis
- Department of Psychology, University of California, Davis
- Department of Neurology, University of California, Davis
- Mingzhou Ding
- Department of Neurology, University of California, Davis
18
Höfling L, Szatko KP, Behrens C, Deng Y, Qiu Y, Klindt DA, Jessen Z, Schwartz GW, Bethge M, Berens P, Franke K, Ecker AS, Euler T. A chromatic feature detector in the retina signals visual context changes. eLife 2024;13:e86860. PMID: 39365730; PMCID: PMC11452179. DOI: 10.7554/elife.86860.
Abstract
The retina transforms patterns of light into visual feature representations supporting behaviour. These representations are distributed across various types of retinal ganglion cells (RGCs), whose spatial and temporal tuning properties have been studied extensively in many model organisms, including the mouse. However, it has been difficult to link the potentially nonlinear retinal transformations of natural visual inputs to specific ethological purposes. Here, we discover a nonlinear selectivity to chromatic contrast in an RGC type that allows the detection of changes in visual context. We trained a convolutional neural network (CNN) model on large-scale functional recordings of RGC responses to natural mouse movies, and then used this model to search in silico for stimuli that maximally excite distinct types of RGCs. This procedure predicted centre colour opponency in transient suppressed-by-contrast (tSbC) RGCs, a cell type whose function is being debated. We confirmed experimentally that these cells indeed responded very selectively to Green-OFF, UV-ON contrasts. This type of chromatic contrast was characteristic of transitions from ground to sky in the visual scene, as might be elicited by head or eye movements across the horizon. Because tSbC cells performed best among all RGC types at reliably detecting these transitions, we suggest a role for this RGC type in providing contextual information (i.e. sky or ground) necessary for the selection of appropriate behavioural responses to other stimuli, such as looming objects. Our work showcases how a combination of experiments with natural stimuli and computational modelling allows discovering novel types of stimulus selectivity and identifying their potential ethological relevance.
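The in silico search for maximally exciting inputs (MEIs) mentioned above is, at its core, gradient ascent on the stimulus through a trained response model. This condensed sketch uses an untrained stand-in model; in the study, a CNN fit to RGC recordings plays this role.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(2, 16, 9), nn.ELU(), nn.AdaptiveAvgPool2d(1),
                      nn.Flatten(), nn.Linear(16, 1))   # 2 chromatic channels -> 1 cell

stim = torch.zeros(1, 2, 36, 36, requires_grad=True)    # green + UV channels
opt = torch.optim.Adam([stim], lr=0.1)
for _ in range(200):
    opt.zero_grad()
    loss = -model(stim).mean()          # maximize the modeled cell's response
    loss.backward()
    opt.step()
    with torch.no_grad():
        stim.clamp_(-1, 1)              # keep the stimulus in a valid range
mei = stim.detach()                     # e.g. Green-OFF / UV-ON for tSbC cells
```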
Affiliation(s)
- Larissa Höfling
- Institute for Ophthalmic Research, University of Tübingen, Tübingen, Germany
- Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
- Klaudia P Szatko
- Institute for Ophthalmic Research, University of Tübingen, Tübingen, Germany
- Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
- Christian Behrens
- Institute for Ophthalmic Research, University of Tübingen, Tübingen, Germany
- Yuyao Deng
- Institute for Ophthalmic Research, University of Tübingen, Tübingen, Germany
- Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
- Yongrong Qiu
- Institute for Ophthalmic Research, University of Tübingen, Tübingen, Germany
- Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
- Zachary Jessen
- Feinberg School of Medicine, Department of Ophthalmology, Northwestern University, Chicago, United States
- Gregory W Schwartz
- Feinberg School of Medicine, Department of Ophthalmology, Northwestern University, Chicago, United States
- Matthias Bethge
- Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
- Tübingen AI Center, University of Tübingen, Tübingen, Germany
- Philipp Berens
- Institute for Ophthalmic Research, University of Tübingen, Tübingen, Germany
- Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
- Tübingen AI Center, University of Tübingen, Tübingen, Germany
- Hertie Institute for AI in Brain Health, Tübingen, Germany
- Katrin Franke
- Institute for Ophthalmic Research, University of Tübingen, Tübingen, Germany
- Alexander S Ecker
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Göttingen, Germany
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
- Thomas Euler
- Institute for Ophthalmic Research, University of Tübingen, Tübingen, Germany
- Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
19
Layton OW, Steinmetz ST. Accuracy optimized neural networks do not effectively model optic flow tuning in brain area MSTd. Front Neurosci 2024;18:1441285. PMID: 39286477; PMCID: PMC11403719. DOI: 10.3389/fnins.2024.1441285.
Abstract
Accuracy-optimized convolutional neural networks (CNNs) have emerged as highly effective models for predicting neural responses in brain areas along the primate ventral stream, but it is largely unknown whether they effectively model neurons in the complementary primate dorsal stream. We explored how well CNNs model the optic flow tuning properties of neurons in dorsal area MSTd, and we compared our results with the Non-Negative Matrix Factorization (NNMF) model, which successfully models many tuning properties of MSTd neurons. To better understand which computational properties of the NNMF model give rise to optic flow tuning that resembles that of MSTd neurons, we created additional CNN model variants that implement key NNMF constraints: non-negative weights and sparse coding of optic flow. While the CNNs and NNMF models both accurately estimate the observer's self-motion from purely translational or rotational optic flow, NNMF and the CNNs with non-negative weights yield substantially less accurate estimates than the other CNNs when tested on more complex optic flow that combines observer translation and rotation. Despite its poor accuracy, NNMF gives rise to tuning properties that align more closely with those observed in primate MSTd than any of the accuracy-optimized CNNs. This work offers a step toward a deeper understanding of the computational properties and constraints that describe the optic flow tuning of primate area MSTd.
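A brief sketch of the NNMF ingredient (assumptions for illustration, not the original model, which operates on non-negative motion-energy responses): optic flow is rectified into non-negative direction channels and factorized, yielding sparse basis flow fields whose activations could be compared with MSTd tuning.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(5)
flow = rng.normal(size=(500, 2, 15, 15))          # (samples, vx/vy, flow grid)

# split each velocity component into positive and negative half-wave channels
chans = np.concatenate([np.clip(flow, 0, None),
                        np.clip(-flow, 0, None)], axis=1)
X = chans.reshape(500, -1)                         # non-negative input matrix

nnmf = NMF(n_components=24, init="nndsvda", max_iter=500)
H = nnmf.fit_transform(X)                          # per-sample activations
basis = nnmf.components_                           # sparse basis flow fields
print(basis.shape)                                 # (24, 4*15*15)
```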
Collapse
Affiliation(s)
- Oliver W Layton
- Department of Computer Science, Colby College, Waterville, ME, United States
| | - Scott T Steinmetz
- Center for Computing Research, Sandia National Labs, Albuquerque, NM, United States
| |
Collapse
|
20
|
Kar K, DiCarlo JJ. The Quest for an Integrated Set of Neural Mechanisms Underlying Object Recognition in Primates. Annu Rev Vis Sci 2024; 10:91-121. [PMID: 38950431 DOI: 10.1146/annurev-vision-112823-030616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/03/2024]
Abstract
Inferences made about objects via vision, such as rapid and accurate categorization, are core to primate cognition despite the algorithmic challenge posed by varying viewpoints and scenes. Until recently, the brain mechanisms that support these capabilities were deeply mysterious. However, over the past decade, this scientific mystery has been illuminated by the discovery and development of brain-inspired, image-computable, artificial neural network (ANN) systems that rival primates in these behavioral feats. Apart from fundamentally changing the landscape of artificial intelligence, modified versions of these ANN systems are the current leading scientific hypotheses of an integrated set of mechanisms in the primate ventral visual stream that support core object recognition. What separates brain-mapped versions of these systems from prior conceptual models is that they are sensory computable, mechanistic, anatomically referenced, and testable (SMART). In this article, we review and provide perspective on the brain mechanisms addressed by the current leading SMART models. We review their empirical brain and behavioral alignment successes and failures, discuss the next frontiers for an even more accurate mechanistic understanding, and outline the likely applications.
Collapse
Affiliation(s)
- Kohitij Kar
- Department of Biology, Centre for Vision Research, and Centre for Integrative and Applied Neuroscience, York University, Toronto, Ontario, Canada;
| | - James J DiCarlo
- Department of Brain and Cognitive Sciences, MIT Quest for Intelligence, and McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA;
| |
Collapse
|
21
|
Wang EY, Fahey PG, Ding Z, Papadopoulos S, Ponder K, Weis MA, Chang A, Muhammad T, Patel S, Ding Z, Tran D, Fu J, Schneider-Mizell CM, Reid RC, Collman F, da Costa NM, Franke K, Ecker AS, Reimer J, Pitkow X, Sinz FH, Tolias AS. Foundation model of neural activity predicts response to new stimulus types and anatomy. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.03.21.533548. [PMID: 36993435 PMCID: PMC10055288 DOI: 10.1101/2023.03.21.533548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
The complexity of neural circuits makes it challenging to decipher the brain's algorithms of intelligence. Recent breakthroughs in deep learning have produced models that accurately simulate brain activity, enhancing our understanding of the brain's computational objectives and neural coding. However, these models struggle to generalize beyond their training distribution, limiting their utility. The emergence of foundation models, trained on vast datasets, has introduced a new AI paradigm with remarkable generalization capabilities. We collected large amounts of neural activity from visual cortices of multiple mice and trained a foundation model to accurately predict neuronal responses to arbitrary natural videos. This model generalized to new mice with minimal training and successfully predicted responses across various new stimulus domains, such as coherent motion and noise patterns. It could also be adapted to new tasks beyond neural prediction, accurately predicting anatomical cell types, dendritic features, and neuronal connectivity within the MICrONS functional connectomics dataset. Our work is a crucial step toward building foundation brain models. As neuroscience accumulates larger, multi-modal datasets, foundation models will uncover statistical regularities, enabling rapid adaptation to new tasks and accelerating research.
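A shared core with swappable per-animal readouts is the design implied by "generalized to new mice with minimal training". The sketch below is an illustrative assumption (the layer sizes, the Poisson loss, and names like `SharedCore` are invented for exposition), not the paper's architecture.

```python
import torch
import torch.nn as nn

# Hypothetical shared-core + per-mouse-readout layout, for illustration.
class SharedCore(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=(5, 7, 7), padding=(2, 3, 3)),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d((1, 8, 8)),   # collapse time, keep space
        )

    def forward(self, video):                  # video: (B, 1, T, H, W)
        return self.features(video).flatten(1)

class MouseReadout(nn.Module):
    def __init__(self, in_dim, n_neurons):
        super().__init__()
        self.linear = nn.Linear(in_dim, n_neurons)

    def forward(self, z):
        return nn.functional.softplus(self.linear(z))  # non-negative rates

core = SharedCore()
readout = MouseReadout(16 * 8 * 8, n_neurons=100)  # new mouse: new readout
video = torch.randn(4, 1, 16, 36, 64)
rates = readout(core(video))
loss = nn.PoissonNLLLoss(log_input=False)(rates, torch.rand(4, 100))
```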
Collapse
Affiliation(s)
- Eric Y Wang
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, USA
| | - Paul G Fahey
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, USA
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, US
- Stanford Bio-X, Stanford University, Stanford, CA, US
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, US
| | - Zhuokun Ding
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, USA
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, US
- Stanford Bio-X, Stanford University, Stanford, CA, US
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, US
| | - Stelios Papadopoulos
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, USA
| | - Kayla Ponder
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, USA
| | - Marissa A Weis
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Germany
| | - Andersen Chang
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, USA
| | - Taliah Muhammad
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, USA
| | - Saumil Patel
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, USA
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, US
- Stanford Bio-X, Stanford University, Stanford, CA, US
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, US
| | - Zhiwei Ding
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, USA
| | - Dat Tran
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, USA
| | - Jiakun Fu
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, USA
| | | | - R Clay Reid
- Allen Institute for Brain Science, Seattle, WA, USA
| | | | | | - Katrin Franke
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, USA
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, US
- Stanford Bio-X, Stanford University, Stanford, CA, US
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, US
| | - Alexander S Ecker
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Germany
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
| | - Jacob Reimer
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, USA
| | - Xaq Pitkow
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, USA
- Department of Electrical and Computer Engineering, Rice University, Houston, TX, USA
| | - Fabian H Sinz
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, USA
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tübingen, Germany
| | - Andreas S Tolias
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, USA
- Department of Neuroscience, Baylor College of Medicine, Houston, USA
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, US
- Stanford Bio-X, Stanford University, Stanford, CA, US
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, US
- Department of Electrical Engineering, Stanford University, Stanford, CA, US
| |
Collapse
|
22
|
Zhang J, Huang L, Ma Z, Zhou H. Predicting the temporal-dynamic trajectories of cortical neuronal responses in non-human primates based on deep spiking neural network. Cogn Neurodyn 2024; 18:1977-1988. [PMID: 39104695 PMCID: PMC11297849 DOI: 10.1007/s11571-023-09989-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2023] [Revised: 05/25/2023] [Accepted: 06/21/2023] [Indexed: 08/07/2024] Open
Abstract
Deep convolutional neural networks (CNNs) are commonly used as computational models for the primate ventral stream, whereas deep spiking neural networks (SNNs), which incorporate both temporal and spatial spiking information, remain underexplored. We compared the performance of SNNs and CNNs in predicting visual responses to naturalistic stimuli in area V4, inferior temporal (IT) cortex, and orbitofrontal cortex (OFC). Prediction accuracies were significantly higher for the SNN than for the CNN when predicting both the temporal-dynamic trajectory and the averaged firing rate of visual responses in V4 and IT. The SNN captured temporal dynamics for neurons with diverse temporal profiles and category selectivities, most sensitively around the time of peak responses in each brain region. Consistently, SNN activities showed significantly stronger correlations with IT, V4, and OFC responses. In the SNN, correlations with neural activities were stronger for later time-step features than for early time-step features. Temporal-dynamic prediction was also significantly improved by considering preceding neural activities during prediction. Our study thus demonstrates that SNNs are powerful temporal-dynamic models of cortical responses to complex naturalistic stimuli.
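For readers unfamiliar with what separates an SNN from a CNN at the unit level, here is a minimal leaky integrate-and-fire neuron in plain Python; this is generic textbook dynamics, not the network evaluated in the paper.

```python
import numpy as np

# Minimal leaky integrate-and-fire (LIF) unit, the basic building block
# of deep SNNs (illustrative dynamics only).
def lif(inputs, tau=0.9, v_th=1.0):
    v, spikes = 0.0, []
    for x in inputs:
        v = tau * v + x            # leaky integration of input current
        s = float(v >= v_th)       # spike when threshold is crossed
        v = v * (1.0 - s)          # hard reset after a spike
        spikes.append(s)
    return np.array(spikes)

t = np.arange(100)
drive = 0.25 + 0.2 * np.sin(2 * np.pi * t / 25)   # time-varying input
print(lif(drive).sum(), "spikes in 100 steps")
```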
Collapse
Affiliation(s)
- Jie Zhang
- The Brain Cognition and Brain Disease Institute, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055 China
- University of Chinese Academy of Sciences, Beijing, 100049 China
- Peng Cheng Laboratory, Shenzhen, 518000 China
| | - Liwei Huang
- Peng Cheng Laboratory, Shenzhen, 518000 China
- Peking University, Beijing, 100871 China
| | - Zhengyu Ma
- Peng Cheng Laboratory, Shenzhen, 518000 China
| | - Huihui Zhou
- The Brain Cognition and Brain Disease Institute, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055 China
- Peng Cheng Laboratory, Shenzhen, 518000 China
| |
Collapse
|
23
|
Wang T, Lee TS, Yao H, Hong J, Li Y, Jiang H, Andolina IM, Tang S. Large-scale calcium imaging reveals a systematic V4 map for encoding natural scenes. Nat Commun 2024; 15:6401. [PMID: 39080309 PMCID: PMC11289446 DOI: 10.1038/s41467-024-50821-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2023] [Accepted: 07/22/2024] [Indexed: 08/02/2024] Open
Abstract
Biological visual systems have evolved to process natural scenes. A full understanding of visual cortical functions requires a comprehensive characterization of how neuronal populations in each visual area encode natural scenes. Here, we utilized widefield calcium imaging to record V4 cortical responses to tens of thousands of natural images in male macaques. Using this large dataset, we developed a deep-learning digital twin of V4 that allowed us to map the natural image preferences of the neural population at 100-µm scale. This detailed map revealed a diverse set of functional domains in V4, each encoding distinct natural image features. We validated these model predictions using additional widefield imaging and single-cell resolution two-photon imaging. Feature attribution analysis revealed that these domains lie along a continuum from preferring spatially localized shape features to preferring spatially dispersed surface features. These results provide insights into the organizing principles that govern natural scene encoding in V4.
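Digital twins of this kind are commonly interrogated with gradient-based feature attribution. A hedged sketch follows, with a randomly initialized torchvision CNN standing in for the trained V4 model; the choice of unit is hypothetical.

```python
import torch
from torchvision import models

# Gradient-based attribution on a stand-in model: a randomly initialized
# torchvision CNN plays the role of the trained "digital twin" here.
model = models.resnet18(weights=None).eval()
img = torch.randn(1, 3, 224, 224, requires_grad=True)

unit = model(img)[0, 0]                  # hypothetical model unit / "domain"
unit.backward()                          # gradients w.r.t. input pixels
saliency = img.grad.abs().max(dim=1).values
print(saliency.shape)                    # torch.Size([1, 224, 224])
```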
Collapse
Affiliation(s)
- Tianye Wang
- Peking University School of Life Sciences, Beijing, 100871, China
- Peking-Tsinghua Center for Life Sciences, Beijing, 100871, China
- IDG/McGovern Institute for Brain Research at Peking University, Beijing, 100871, China
- Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, 100871, China
| | - Tai Sing Lee
- Computer Science Department and Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
| | - Haoxuan Yao
- Peking University School of Life Sciences, Beijing, 100871, China
- Peking-Tsinghua Center for Life Sciences, Beijing, 100871, China
- IDG/McGovern Institute for Brain Research at Peking University, Beijing, 100871, China
- Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, 100871, China
| | - Jiayi Hong
- Peking University School of Life Sciences, Beijing, 100871, China
| | - Yang Li
- Peking University School of Life Sciences, Beijing, 100871, China
- Peking-Tsinghua Center for Life Sciences, Beijing, 100871, China
- IDG/McGovern Institute for Brain Research at Peking University, Beijing, 100871, China
- Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, 100871, China
| | - Hongfei Jiang
- Peking University School of Life Sciences, Beijing, 100871, China
- Peking-Tsinghua Center for Life Sciences, Beijing, 100871, China
- IDG/McGovern Institute for Brain Research at Peking University, Beijing, 100871, China
- Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, 100871, China
| | - Ian Max Andolina
- The Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Key Laboratory of Primate Neurobiology, Institute of Neuroscience, Chinese Academy of Sciences, Shanghai, 200031, China
| | - Shiming Tang
- Peking University School of Life Sciences, Beijing, 100871, China.
- Peking-Tsinghua Center for Life Sciences, Beijing, 100871, China.
- IDG/McGovern Institute for Brain Research at Peking University, Beijing, 100871, China.
- Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, 100871, China.
| |
Collapse
|
24
|
Margalit E, Lee H, Finzi D, DiCarlo JJ, Grill-Spector K, Yamins DLK. A unifying framework for functional organization in early and higher ventral visual cortex. Neuron 2024; 112:2435-2451.e7. [PMID: 38733985 PMCID: PMC11257790 DOI: 10.1016/j.neuron.2024.04.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2023] [Revised: 12/08/2023] [Accepted: 04/15/2024] [Indexed: 05/13/2024]
Abstract
A key feature of cortical systems is functional organization: the arrangement of functionally distinct neurons in characteristic spatial patterns. However, the principles underlying the emergence of functional organization in the cortex are poorly understood. Here, we develop the topographic deep artificial neural network (TDANN), the first model to predict several aspects of the functional organization of multiple cortical areas in the primate visual system. We analyze the factors driving the TDANN's success and find that it balances two objectives: learning a task-general sensory representation and maximizing the spatial smoothness of responses according to a metric that scales with cortical surface area. In turn, the representations learned by the TDANN are more brain-like than in spatially unconstrained models. Finally, we provide evidence that the TDANN's functional organization balances performance with between-area connection length. Our results offer a unified principle for understanding the functional organization of the primate ventral visual system.
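The TDANN's spatial-smoothness objective can be caricatured as a penalty that pulls the response correlation of two units toward a decaying function of their simulated cortical distance. The form below is a simplified assumption for illustration, not the paper's exact loss.

```python
import torch

# Toy spatial-smoothness penalty: units get 2D "cortical" positions, and
# nearby units are pushed toward correlated responses (assumed form).
def smoothness_loss(responses, positions, sigma=1.0):
    # responses: (n_stimuli, n_units); positions: (n_units, 2)
    r = responses - responses.mean(0, keepdim=True)
    r = r / (r.norm(dim=0, keepdim=True) + 1e-8)
    corr = r.T @ r                              # unit-by-unit correlation
    d = torch.cdist(positions, positions)       # pairwise cortical distance
    target = torch.exp(-d / sigma)              # nearby -> high target corr
    return ((corr - target) ** 2).mean()

resp = torch.randn(64, 100)
pos = torch.rand(100, 2) * 10
print(smoothness_loss(resp, pos).item())
```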
Collapse
Affiliation(s)
- Eshed Margalit
- Neurosciences Graduate Program, Stanford University, Stanford, CA 94305, USA.
| | - Hyodong Lee
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Dawn Finzi
- Department of Psychology, Stanford University, Stanford, CA 94305, USA; Department of Computer Science, Stanford University, Stanford, CA 94305, USA
| | - James J DiCarlo
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Center for Brains Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Kalanit Grill-Spector
- Department of Psychology, Stanford University, Stanford, CA 94305, USA; Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA 94305, USA
| | - Daniel L K Yamins
- Department of Psychology, Stanford University, Stanford, CA 94305, USA; Department of Computer Science, Stanford University, Stanford, CA 94305, USA; Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA 94305, USA
| |
Collapse
|
25
|
Idrees S, Manookin MB, Rieke F, Field GD, Zylberberg J. Biophysical neural adaptation mechanisms enable artificial neural networks to capture dynamic retinal computation. Nat Commun 2024; 15:5957. [PMID: 39009568 PMCID: PMC11251147 DOI: 10.1038/s41467-024-50114-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2023] [Accepted: 06/28/2024] [Indexed: 07/17/2024] Open
Abstract
Adaptation is a universal aspect of neural systems that changes circuit computations to match prevailing inputs. These changes facilitate efficient encoding of sensory inputs while avoiding saturation. Conventional artificial neural networks (ANNs) have limited adaptive capabilities, hindering their ability to reliably predict neural output under dynamic input conditions. Can embedding neural adaptive mechanisms in ANNs improve their performance? To answer this question, we develop a new deep learning model of the retina that incorporates the biophysics of photoreceptor adaptation at the front-end of conventional convolutional neural networks (CNNs). These conventional CNNs build on 'Deep Retina,' a previously developed model of retinal ganglion cell (RGC) activity. CNNs that include this new photoreceptor layer outperform conventional CNN models at predicting male and female primate and rat RGC responses to naturalistic stimuli that include dynamic local intensity changes and large changes in the ambient illumination. These improved predictions result directly from adaptation within the phototransduction cascade. This research underscores the potential of embedding models of neural adaptation in ANNs and using them to determine how neural circuits manage the complexities of encoding natural inputs that are dynamic and span a large range of light levels.
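The core intuition of an adaptive front-end, a slowly updating gain that divides the input, fits in a few lines; this toy dynamics is a stand-in, not the paper's phototransduction cascade.

```python
import numpy as np

# Toy adaptation front-end: a running estimate of input intensity divides
# the signal, so responses to a luminance step decay toward baseline
# (simplified illustration, not the paper's biophysical model).
def adapt(stimulus, tau=0.95, eps=0.1):
    gain_state, out = 0.0, []
    for s in stimulus:
        gain_state = tau * gain_state + (1 - tau) * s  # running mean intensity
        out.append(s / (eps + gain_state))             # divisive gain control
    return np.array(out)

step = np.concatenate([np.ones(50) * 0.1, np.ones(100) * 1.0])
resp = adapt(step)
print(resp[51], ">", resp[-1])   # onset transient exceeds adapted response
```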
Collapse
Affiliation(s)
- Saad Idrees
- Department of Physics and Astronomy, York University, Toronto, ON, Canada.
- Centre for Vision Research, York University, Toronto, ON, Canada.
| | | | - Fred Rieke
- Department of Physiology and Biophysics, University of Washington, Seattle, WA, USA
| | - Greg D Field
- Stein Eye Institute, Department of Ophthalmology, University of California, Los Angeles, CA, USA
| | - Joel Zylberberg
- Department of Physics and Astronomy, York University, Toronto, ON, Canada.
- Centre for Vision Research, York University, Toronto, ON, Canada.
- Learning in Machines and Brains Program, Canadian Institute for Advanced Research, Toronto, ON, Canada.
| |
Collapse
|
26
|
Turishcheva P, Fahey PG, Vystrčilová M, Hansel L, Froebe R, Ponder K, Qiu Y, Willeke KF, Bashiri M, Baikulov R, Zhu Y, Ma L, Yu S, Huang T, Li BM, Wulf WD, Kudryashova N, Hennig MH, Rochefort NL, Onken A, Wang E, Ding Z, Tolias AS, Sinz FH, Ecker AS. Retrospective for the Dynamic Sensorium Competition for predicting large-scale mouse primary visual cortex activity from videos. ARXIV 2024:arXiv:2407.09100v1. [PMID: 39040641 PMCID: PMC11261979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 07/24/2024]
Abstract
Understanding how biological visual systems process information is challenging because of the nonlinear relationship between visual input and neuronal responses. Artificial neural networks allow computational neuroscientists to create predictive models that connect biological and machine vision. Machine learning has benefited tremendously from benchmarks that compare different models on the same task under standardized conditions. However, there was no standardized benchmark to identify state-of-the-art dynamic models of the mouse visual system. To address this gap, we established the SENSORIUM 2023 Benchmark Competition with dynamic input, featuring a new large-scale dataset from the primary visual cortex of ten mice. This dataset includes responses from 78,853 neurons to 2 hours of dynamic stimuli per neuron, together with behavioral measurements such as running speed, pupil dilation, and eye movements. The competition ranked models in two tracks based on predictive performance for neuronal responses on a held-out test set: one focusing on predicting in-domain natural stimuli and another on out-of-distribution (OOD) stimuli to assess model generalization. As part of the NeurIPS 2023 competition track, we received more than 160 model submissions from 22 teams. Several new architectures for predictive models were proposed, and the winning teams improved the previous state-of-the-art model by 50%. Access to the dataset as well as the benchmarking infrastructure will remain online at www.sensorium-competition.net.
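Benchmarks of this kind typically score submissions by the per-neuron correlation between predicted and recorded responses on held-out stimuli. Below is a sketch of that style of metric (an assumed form; see the competition site for the exact definition).

```python
import numpy as np

# Per-neuron correlation between predicted and recorded responses, the
# style of metric used to rank models on held-out stimuli.
def per_neuron_corr(pred, true):
    # pred, true: (n_stimuli, n_neurons)
    p = pred - pred.mean(axis=0)
    t = true - true.mean(axis=0)
    num = (p * t).sum(axis=0)
    den = np.sqrt((p ** 2).sum(axis=0) * (t ** 2).sum(axis=0)) + 1e-8
    return num / den

rng = np.random.default_rng(0)
true = rng.poisson(2.0, size=(500, 50)).astype(float)
pred = true + rng.normal(scale=1.0, size=true.shape)
print(per_neuron_corr(pred, true).mean())   # close to 1 for a good model
```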
Collapse
Affiliation(s)
- Polina Turishcheva
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Germany
| | - Paul G. Fahey
- Department of Neuroscience & Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, USA
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, US
- Stanford Bio-X, Stanford University, Stanford, CA, US
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, US
| | - Michaela Vystrčilová
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Germany
| | - Laura Hansel
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Germany
| | - Rachel Froebe
- Department of Neuroscience & Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, USA
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, US
- Stanford Bio-X, Stanford University, Stanford, CA, US
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, US
| | - Kayla Ponder
- Department of Neuroscience & Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, USA
| | - Yongrong Qiu
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Germany
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, US
- Stanford Bio-X, Stanford University, Stanford, CA, US
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, US
| | - Konstantin F. Willeke
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Germany
- International Max Planck Research School for Intelligent Systems, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, Tübingen University, Germany
| | - Mohammad Bashiri
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Germany
- International Max Planck Research School for Intelligent Systems, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, Tübingen University, Germany
| | | | - Yu Zhu
- Institute of Automation, Chinese Academy of Sciences, China
- Beijing Academy of Artificial Intelligence, China
| | - Lei Ma
- Beijing Academy of Artificial Intelligence, China
| | - Shan Yu
- Institute of Automation, Chinese Academy of Sciences, China
| | - Tiejun Huang
- Beijing Academy of Artificial Intelligence, China
| | - Bryan M. Li
- The Alan Turing Institute, UK
- School of Informatics, University of Edinburgh, UK
| | - Wolf De Wulf
- School of Informatics, University of Edinburgh, UK
| | | | | | - Nathalie L. Rochefort
- Centre for Discovery Brain Sciences, University of Edinburgh, UK
- Simons Initiative for the Developing Brain, University of Edinburgh, UK
| | - Arno Onken
- School of Informatics, University of Edinburgh, UK
| | - Eric Wang
- Department of Neuroscience & Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, USA
| | - Zhiwei Ding
- Department of Neuroscience & Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, USA
| | - Andreas S. Tolias
- Department of Neuroscience & Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, USA
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, US
- Stanford Bio-X, Stanford University, Stanford, CA, US
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, US
- Department of Electrical Engineering, Stanford University, Stanford, CA, US
| | - Fabian H. Sinz
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Germany
- Department of Neuroscience & Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, USA
- International Max Planck Research School for Intelligent Systems, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, Tübingen University, Germany
| | - Alexander S Ecker
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Germany
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
| |
Collapse
|
27
|
Turishcheva P, Fahey PG, Vystrčilová M, Hansel L, Froebe R, Ponder K, Qiu Y, Willeke KF, Bashiri M, Wang E, Ding Z, Tolias AS, Sinz FH, Ecker AS. The Dynamic Sensorium competition for predicting large-scale mouse visual cortex activity from videos. ARXIV 2024:arXiv:2305.19654v2. [PMID: 37396602 PMCID: PMC10312815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 07/04/2023]
Abstract
Understanding how biological visual systems process information is challenging due to the complex nonlinear relationship between neuronal responses and high-dimensional visual input. Artificial neural networks have already improved our understanding of this system by allowing computational neuroscientists to create predictive models and bridge biological and machine vision. During the Sensorium 2022 competition, we introduced benchmarks for vision models with static input (i.e. images). However, animals operate and excel in dynamic environments, making it crucial to study and understand how the brain functions under these conditions. Moreover, many biological theories, such as predictive coding, suggest that previous input is crucial for current input processing. Currently, there is no standardized benchmark to identify state-of-the-art dynamic models of the mouse visual system. To address this gap, we propose the Sensorium 2023 Benchmark Competition with dynamic input (https://www.sensorium-competition.net/). This competition includes the collection of a new large-scale dataset from the primary visual cortex of ten mice, containing responses from over 78,000 neurons to over 2 hours of dynamic stimuli per neuron. Participants in the main benchmark track will compete to identify the best predictive models of neuronal responses for dynamic input (i.e. video). We will also host a bonus track in which submission performance will be evaluated on out-of-domain input, using withheld neuronal responses to dynamic input stimuli whose statistics differ from the training set. Both tracks will offer behavioral data along with video stimuli. As before, we will provide code, tutorials, and strong pre-trained baseline models to encourage participation. We hope this competition will continue to strengthen the accompanying Sensorium benchmarks collection as a standard tool to measure progress in large-scale neural system identification models of the entire mouse visual hierarchy and beyond.
Collapse
Affiliation(s)
- Polina Turishcheva
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Germany
| | - Paul G Fahey
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, US
- Stanford Bio-X, Stanford University, Stanford, CA, US
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, US
| | - Michaela Vystrčilová
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Germany
| | - Laura Hansel
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Germany
| | - Rachel Froebe
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, US
- Stanford Bio-X, Stanford University, Stanford, CA, US
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, US
| | - Kayla Ponder
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
| | - Yongrong Qiu
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Germany
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, US
- Stanford Bio-X, Stanford University, Stanford, CA, US
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, US
| | - Konstantin F Willeke
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Germany
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, US
- Stanford Bio-X, Stanford University, Stanford, CA, US
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, US
- International Max Planck Research School for Intelligent Systems, University of Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tübingen, Germany
| | - Mohammad Bashiri
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Germany
- International Max Planck Research School for Intelligent Systems, University of Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tübingen, Germany
| | - Eric Wang
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
| | - Zhiwei Ding
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
| | - Andreas S Tolias
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, US
- Stanford Bio-X, Stanford University, Stanford, CA, US
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, US
- Department of Electrical Engineering, Stanford University, Stanford, CA, US
| | - Fabian H Sinz
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Germany
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
- International Max Planck Research School for Intelligent Systems, University of Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tübingen, Germany
| | - Alexander S Ecker
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Germany
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
| |
Collapse
|
28
|
Chandran KS, Ghosh K. A deep learning based cognitive model to probe the relation between psychophysics and electrophysiology of flicker stimulus. Brain Inform 2024; 11:18. [PMID: 38987386 PMCID: PMC11236830 DOI: 10.1186/s40708-024-00231-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2023] [Accepted: 06/14/2024] [Indexed: 07/12/2024] Open
Abstract
The flicker stimulus is a visual stimulus of intermittent illumination. A flicker stimulus can appear flickering or steady to a human subject, depending on the physical parameters associated with the stimulus. When the flickering light appears steady, flicker fusion is said to have occurred. This work aims to bridge the gap between the psychophysics of flicker fusion and the electrophysiology associated with the flicker stimulus through a deep-learning-based computational model of flicker perception. Convolutional Recurrent Neural Networks (CRNNs) were trained on psychophysics data of flicker stimuli obtained from a human subject. We claim that many of the reported electrophysiological features of the flicker stimulus, including the presence of the stimulus fundamental and its harmonics, can be explained as the result of a temporal convolution operation on the flicker stimulus. We further show that the convolution-layer output of a CRNN trained with psychophysics data is more responsive to specific frequencies, as in the human EEG response to flicker, and that the convolution layer of a trained CRNN can give a nearly sinusoidal output for a 10 Hz flicker stimulus, as reported for some human subjects.
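The claim that a temporal convolution alone carries the stimulus fundamental and its harmonics is easy to check numerically; the exponential kernel and rates below are arbitrary assumptions.

```python
import numpy as np

# A temporal convolution of on/off flicker carries energy at the stimulus
# fundamental and its harmonics, as in flicker EEG; kernel is arbitrary.
fs, f_stim = 1000, 10                     # sample rate (Hz), flicker (Hz)
t = np.arange(0, 2, 1 / fs)
flicker = (np.sin(2 * np.pi * f_stim * t) > 0).astype(float)

kernel = np.exp(-np.arange(100) / 20.0)   # generic low-pass temporal filter
resp = np.convolve(flicker, kernel, mode="same")

spectrum = np.abs(np.fft.rfft(resp - resp.mean()))
freqs = np.fft.rfftfreq(len(resp), 1 / fs)
top = freqs[np.argsort(spectrum)[-3:]]
print(sorted(top))                        # ~[10.0, 30.0, 50.0]: odd harmonics
```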
Collapse
Affiliation(s)
- Keerthi S Chandran
- Center for Soft Computing Research, Indian Statistical Institute, 203 BT Road, Kolkata, West Bengal, 700108, India.
- Machine Intelligence Unit, Indian Statistical Institute, 203 BT Road, Kolkata, West Bengal, 700108, India.
| | - Kuntal Ghosh
- Center for Soft Computing Research, Indian Statistical Institute, 203 BT Road, Kolkata, West Bengal, 700108, India
- Machine Intelligence Unit, Indian Statistical Institute, 203 BT Road, Kolkata, West Bengal, 700108, India
| |
Collapse
|
29
|
Miao HY, Tong F. Convolutional neural network models applied to neuronal responses in macaque V1 reveal limited nonlinear processing. J Vis 2024; 24:1. [PMID: 38829629 PMCID: PMC11156204 DOI: 10.1167/jov.24.6.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2023] [Accepted: 04/03/2024] [Indexed: 06/05/2024] Open
Abstract
Computational models of the primary visual cortex (V1) have suggested that V1 neurons behave like Gabor filters followed by simple nonlinearities. However, recent work employing convolutional neural network (CNN) models has suggested that V1 relies on far more nonlinear computations than previously thought. Specifically, unit responses in an intermediate layer of VGG-19 were found to best predict macaque V1 responses to thousands of natural and synthetic images. Here, we evaluated the hypothesis that the poor performance of lower layer units in VGG-19 might be attributable to their small receptive field size rather than to their lack of complexity per se. We compared VGG-19 with AlexNet, which has much larger receptive fields in its lower layers. Whereas the best-performing layer of VGG-19 occurred after seven nonlinear steps, the first convolutional layer of AlexNet best predicted V1 responses. Although the predictive accuracy of VGG-19 was somewhat better than that of standard AlexNet, we found that a modified version of AlexNet could match the performance of VGG-19 after only a few nonlinear computations. Control analyses revealed that decreasing the size of the input images caused the best-performing layer of VGG-19 to shift to a lower layer, consistent with the hypothesis that the relationship between image size and receptive field size can strongly affect model performance. We conducted additional analyses using a Gabor pyramid model to test for nonlinear contributions of normalization and contrast saturation. Overall, our findings suggest that the feedforward responses of V1 neurons can be well explained by assuming only a few nonlinear processing stages.
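The receptive-field argument rests on the standard recursion in which each layer adds (kernel_size - 1) times the cumulative stride to the RF. The layer lists below are simplified stand-ins for the two architectures.

```python
# Effective receptive field (RF) of stacked conv/pool layers: each layer
# adds (kernel - 1) * cumulative_stride pixels. This is why lower VGG-19
# layers (3x3 kernels) see far less of the image than AlexNet's 11x11,
# stride-4 first layer.
def receptive_field(layers):
    rf, stride = 1, 1
    for kernel, s in layers:
        rf += (kernel - 1) * stride
        stride *= s
    return rf

# Simplified stand-ins: two VGG-style conv-conv-pool blocks vs. AlexNet's
# first convolution.
vgg_two_blocks = [(3, 1), (3, 1), (2, 2), (3, 1), (3, 1), (2, 2)]
alexnet_first = [(11, 4)]
print(receptive_field(vgg_two_blocks), receptive_field(alexnet_first))  # 16 11
```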
Collapse
Affiliation(s)
- Hui-Yuan Miao
- Department of Psychology, Vanderbilt University, Nashville, TN, USA
| | - Frank Tong
- Department of Psychology, Vanderbilt University, Nashville, TN, USA
- Vanderbilt Vision Research Center, Vanderbilt University, Nashville, TN, USA
| |
Collapse
|
30
|
Djambazovska S, Zafer A, Ramezanpour H, Kreiman G, Kar K. The Impact of Scene Context on Visual Object Recognition: Comparing Humans, Monkeys, and Computational Models. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.27.596127. [PMID: 38854011 PMCID: PMC11160639 DOI: 10.1101/2024.05.27.596127] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2024]
Abstract
During natural vision, we rarely see objects in isolation but rather embedded in rich and complex contexts. Understanding how the brain recognizes objects in natural scenes by integrating contextual information remains a key challenge. To elucidate neural mechanisms compatible with human visual processing, we need an animal model that behaves similarly to humans, so that inferred neural mechanisms can provide hypotheses relevant to the human brain. Here we assessed whether rhesus macaques could model human context-driven object recognition by quantifying visual object identification abilities across variations in the amount, quality, and congruency of contextual cues. Behavioral metrics revealed strikingly similar context-dependent patterns between humans and monkeys. However, neural responses in the inferior temporal (IT) cortex of monkeys that were never explicitly trained to discriminate objects in context, as well as current artificial neural network models, could only partially explain this cross-species correspondence. The shared behavioral variance unexplained by context-naive neural data or computational models highlights fundamental knowledge gaps. Our findings demonstrate an intriguing alignment of human and monkey visual object processing that defies full explanation by either brain activity in a key visual region or state-of-the-art models.
Collapse
Affiliation(s)
- Sara Djambazovska
- York University, Department of Biology and Centre for Vision Research, Toronto, Canada
- Children’s Hospital, Harvard Medical School, MA, USA
| | - Anaa Zafer
- York University, Department of Biology and Centre for Vision Research, Toronto, Canada
| | - Hamidreza Ramezanpour
- York University, Department of Biology and Centre for Vision Research, Toronto, Canada
| | | | - Kohitij Kar
- York University, Department of Biology and Centre for Vision Research, Toronto, Canada
| |
Collapse
|
31
|
Vogelsang M, Vogelsang L, Gupta P, Gandhi TK, Shah P, Swami P, Gilad-Gutnick S, Ben-Ami S, Diamond S, Ganesh S, Sinha P. Impact of early visual experience on later usage of color cues. Science 2024; 384:907-912. [PMID: 38781366 DOI: 10.1126/science.adk9587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2023] [Accepted: 03/29/2024] [Indexed: 05/25/2024]
Abstract
Human visual recognition is remarkably robust to chromatic changes. In this work, we provide a potential account of the roots of this resilience based on observations with 10 congenitally blind children who gained sight late in life. Several months or years following their sight-restoring surgeries, the removal of color cues markedly reduced their recognition performance, whereas age-matched normally sighted children showed no such decrement. This finding may be explained by the greater-than-neonatal maturity of the late-sighted children's color system at sight onset, inducing overly strong reliance on chromatic cues. Simulations with deep neural networks corroborate this hypothesis. These findings highlight the adaptive significance of typical developmental trajectories and provide guidelines for enhancing machine vision systems.
Collapse
Affiliation(s)
- Marin Vogelsang
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Institute of Cognitive Science, University of Osnabrueck, 49090 Osnabrueck, Germany
| | - Lukas Vogelsang
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Brain Mind Institute, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Priti Gupta
- Amarnath and Shashi Khosla School of Information Technology, Indian Institute of Technology, New Delhi 110016, India
- Project Prakash, Dr. Shroff's Charity Eye Hospital, New Delhi 110002, India
- Cognitive Science Programme, Dayalbagh Educational Institute, Agra 282005, India
| | - Tapan K Gandhi
- Department of Electrical Engineering, Indian Institute of Technology, New Delhi 110016, India
| | - Pragya Shah
- Project Prakash, Dr. Shroff's Charity Eye Hospital, New Delhi 110002, India
| | - Piyush Swami
- Department of Applied Mathematics and Computer Science, Technical University of Denmark, 2800 Kongens Lyngby, Denmark
- Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital - Amager and Hvidovre, 2650 Hvidovre, Denmark
| | - Sharon Gilad-Gutnick
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Shlomit Ben-Ami
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Sidney Diamond
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Suma Ganesh
- Department of Pediatric Ophthalmology, Dr. Shroff's Charity Eye Hospital, New Delhi 110002, India
| | - Pawan Sinha
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| |
Collapse
|
32
|
Balwani A, Cho S, Choi H. Exploring the Architectural Biases of the Canonical Cortical Microcircuit. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.23.595629. [PMID: 38826320 PMCID: PMC11142214 DOI: 10.1101/2024.05.23.595629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/04/2024]
Abstract
The cortex plays a crucial role in various perceptual and cognitive functions, driven by its basic unit, the canonical cortical microcircuit. Yet, we remain short of a framework that definitively explains the structure-function relationships of this fundamental neuroanatomical motif. To better understand how physical substrates of cortical circuitry facilitate their neuronal dynamics, we employ a computational approach using recurrent neural networks and representational analyses. We examine the differences manifested by the inclusion and exclusion of biologically-motivated inter-areal laminar connections on the computational roles of different neuronal populations in the microcircuit of two hierarchically-related areas, throughout learning. Our findings show that the presence of feedback connections correlates with the functional modularization of cortical populations in different layers, and provides the microcircuit with a natural inductive bias to differentiate expected and unexpected inputs at initialization. Furthermore, when testing the effects of training the microcircuit and its variants with a predictive-coding inspired strategy, we find that doing so helps better encode noisy stimuli in areas of the cortex that receive feedback, all of which combine to suggest evidence for a predictive-coding mechanism serving as an intrinsic operative logic in the cortex.
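The inclusion/exclusion manipulation can be mimicked by masking one block of a recurrent weight matrix. A toy two-area version follows; the sizes and dynamics are illustrative assumptions, not the paper's microcircuit.

```python
import torch
import torch.nn as nn

# Toy structural manipulation: a recurrent weight matrix whose inter-areal
# feedback block can be zeroed out, comparing circuits with and without
# feedback (illustrative, not the paper's architecture).
n = 40                                  # 20 "lower-area" + 20 "higher-area" units
W = nn.Parameter(torch.randn(n, n) * 0.1)

mask = torch.ones(n, n)
mask[:20, 20:] = 0.0                    # remove higher -> lower feedback block

def step(h, x, use_feedback=True):
    m = torch.ones_like(mask) if use_feedback else mask
    return torch.tanh(h @ (W * m).T + x)

h = torch.zeros(1, n)
x = torch.randn(1, n)
print(step(h, x, use_feedback=False).shape)   # (1, 40)
```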
Collapse
Affiliation(s)
- Aishwarya Balwani
- School of Electrical & Computer Engineering, Georgia Institute of Technology
| | - Suhee Cho
- Department of Brain and Cognitive Sciences, Korea Advanced Institute of Science and Technology
| | - Hannah Choi
- School of Mathematics, Georgia Institute of Technology
| |
Collapse
|
33
|
Cadena SA, Willeke KF, Restivo K, Denfield G, Sinz FH, Bethge M, Tolias AS, Ecker AS. Diverse task-driven modeling of macaque V4 reveals functional specialization towards semantic tasks. PLoS Comput Biol 2024; 20:e1012056. [PMID: 38781156 PMCID: PMC11115319 DOI: 10.1371/journal.pcbi.1012056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2022] [Accepted: 04/08/2024] [Indexed: 05/25/2024] Open
Abstract
Responses to natural stimuli in area V4, a mid-level area of the visual ventral stream, are well predicted by features from convolutional neural networks (CNNs) trained on image classification. This result has been taken as evidence for the functional role of V4 in object classification. However, we currently do not know if and to what extent V4 plays a role in solving other computational objectives. Here, we investigated normative accounts of V4 (and V1 for comparison) by predicting macaque single-neuron responses to natural images from the representations extracted by 23 CNNs trained on different computer vision tasks including semantic, geometric, 2D, and 3D types of tasks. We found that V4 was best predicted by semantic classification features and exhibited high task selectivity, while the choice of task was less consequential to V1 performance. Consistent with traditional characterizations of V4 function that show its high-dimensional tuning to various 2D and 3D stimulus directions, we found that diverse non-semantic tasks explained aspects of V4 function that are not captured by individual semantic tasks. Nevertheless, jointly considering the features of a pair of semantic classification tasks was sufficient to yield one of our top V4 models, solidifying V4's main functional role in semantic processing and suggesting that V4's selectivity to 2D or 3D stimulus properties found by electrophysiologists can result from semantic functional goals.
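The underlying pipeline, extracting features from a task-trained CNN and fitting a regularized linear readout to neural responses, looks roughly like the sketch below; random features stand in for CNN activations, and the hyperparameters are placeholders.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Task-driven pipeline: features from a (here, simulated) task-trained CNN
# are mapped to neural responses with a regularized linear readout.
rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 256))              # stimuli x CNN features
true_w = rng.normal(size=(256, 40))                  # hidden neural mapping
responses = features @ true_w + rng.normal(scale=2.0, size=(1000, 40))

X_tr, X_te, y_tr, y_te = train_test_split(features, responses, random_state=0)
readout = Ridge(alpha=10.0).fit(X_tr, y_tr)
print("held-out R^2:", round(readout.score(X_te, y_te), 3))
```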
Collapse
Affiliation(s)
- Santiago A. Cadena
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Göttingen, Germany
- Institute for Theoretical Physics and Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
- Bernstein Center for Computational Neuroscience, Tübingen, Germany
- International Max Planck Research School for Intelligent Systems, Tübingen, Germany
| | - Konstantin F. Willeke
- Bernstein Center for Computational Neuroscience, Tübingen, Germany
- International Max Planck Research School for Intelligent Systems, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany
| | - Kelli Restivo
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Neuroscience, Baylor College of Medicine, Houston, Texas, United States of America
| | - George Denfield
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Neuroscience, Baylor College of Medicine, Houston, Texas, United States of America
| | - Fabian H. Sinz
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Göttingen, Germany
- Bernstein Center for Computational Neuroscience, Tübingen, Germany
- International Max Planck Research School for Intelligent Systems, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany
| | - Matthias Bethge
- Institute for Theoretical Physics and Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
- Bernstein Center for Computational Neuroscience, Tübingen, Germany
| | - Andreas S. Tolias
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Neuroscience, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Electrical and Computer Engineering, Rice University, Houston, Texas, United States of America
| | - Alexander S. Ecker
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Göttingen, Germany
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
| |
Collapse
|
34
|
Nguyen P, Sooriyaarachchi J, Huang Q, Baker CL. Estimating receptive fields of simple and complex cells in early visual cortex: A convolutional neural network model with parameterized rectification. PLoS Comput Biol 2024; 20:e1012127. [PMID: 38820562 PMCID: PMC11168683 DOI: 10.1371/journal.pcbi.1012127] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2023] [Revised: 06/12/2024] [Accepted: 05/01/2024] [Indexed: 06/02/2024] Open
Abstract
Neurons in the primary visual cortex respond selectively to simple features of visual stimuli, such as orientation and spatial frequency. Simple cells, which have phase-sensitive responses, can be modeled by a single receptive field filter in a linear-nonlinear model. However, it is challenging to analyze phase-invariant complex cells, which require more elaborate models having a combination of nonlinear subunits. Estimating parameters of these models is made additionally more difficult by cortical neurons' trial-to-trial response variability. We develop a simple convolutional neural network method to estimate receptive field models for both simple and complex visual cortex cells from their responses to natural images. The model consists of a spatiotemporal filter, a parameterized rectifier unit (PReLU), and a two-dimensional Gaussian "map" of the receptive field envelope. A single model parameter determines the simple vs. complex nature of the receptive field, capturing complex cell responses as a summation of homogeneous subunits, and collapsing to a linear-nonlinear model for simple type cells. The convolutional method predicts simple and complex cell responses to natural image stimuli as well as grating tuning curves. The fitted models yield a continuum of values for the PReLU parameter across the sampled neurons, showing that the simple/complex nature of cells can vary in a continuous manner. We demonstrate that complex-like cells respond less reliably than simple-like cells. However, compensation for this unreliability with noise ceiling analysis reveals predictive performance for complex cells proportionately closer to that for simple cells. Most spatial receptive field structures are well fit by Gabor functions, whose parameters confirm well-known properties of cat A17/18 receptive fields.
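A stripped-down, spatial-only member of this model family (one filter, a PReLU, a Gaussian envelope) might look like the following sketch; the kernel size and envelope width are assumptions. The single PReLU slope spans the continuum: a value near -1 approximates full-wave rectification (complex-like), while 0 collapses to a half-wave, simple-cell-like linear-nonlinear model.

```python
import torch
import torch.nn as nn

# Simplified, spatial-only sketch of the model family described above:
# one spatial filter, a parameterized rectifier (PReLU), and a Gaussian
# "map" pooling the rectified filter output over space.
class SimpleComplexCell(nn.Module):
    def __init__(self, size=32):
        super().__init__()
        self.filter = nn.Conv2d(1, 1, kernel_size=13, padding=6)
        # PReLU negative slope: ~-1 -> full-wave (complex-like), 0 -> half-wave
        self.prelu = nn.PReLU(init=0.0)
        yy, xx = torch.meshgrid(torch.arange(size), torch.arange(size),
                                indexing="ij")
        g = torch.exp(-((xx - size / 2) ** 2 + (yy - size / 2) ** 2) / 50.0)
        self.register_buffer("envelope", g / g.sum())

    def forward(self, img):                            # img: (B, 1, H, W)
        z = self.prelu(self.filter(img))
        return (z * self.envelope).sum(dim=(-2, -1))   # pooled response

cell = SimpleComplexCell()
print(cell(torch.randn(8, 1, 32, 32)).shape)           # torch.Size([8, 1])
```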
Collapse
Affiliation(s)
- Philippe Nguyen
- Department of Biomedical Engineering, McGill University, Montreal, Quebec, Canada
| | | | - Qianyu Huang
- Department of Biology, McGill University, Montreal, Quebec, Canada
| | - Curtis L. Baker
- Department of Ophthalmology and Visual Sciences, McGill University, Montreal, Quebec, Canada
| |
Collapse
|
35
|
Goris RLT, Coen-Cagli R, Miller KD, Priebe NJ, Lengyel M. Response sub-additivity and variability quenching in visual cortex. Nat Rev Neurosci 2024; 25:237-252. [PMID: 38374462 PMCID: PMC11444047 DOI: 10.1038/s41583-024-00795-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 01/24/2024] [Indexed: 02/21/2024]
Abstract
Sub-additivity and variability are ubiquitous response motifs in the primary visual cortex (V1). Response sub-additivity enables the construction of useful interpretations of the visual environment, whereas response variability indicates the factors that limit the precision with which the brain can do this. There is increasing evidence that experimental manipulations that elicit response sub-additivity often also quench response variability. Here, we provide an overview of these phenomena and suggest that they may have common origins. We discuss empirical findings and recent model-based insights into the functional operations, computational objectives and circuit mechanisms underlying V1 activity. These different modelling approaches all predict that response sub-additivity and variability quenching often co-occur. The phenomenology of these two response motifs, as well as many of the insights obtained about them in V1, generalize to other cortical areas. Thus, the connection between response sub-additivity and variability quenching may be a canonical motif across the cortex.
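Divisive normalization is one canonical model of response sub-additivity in this literature; the textbook form below (not taken from this review) shows how responses to combined stimuli fall short of the sum of the parts.

```python
import numpy as np

# Canonical divisive normalization: each unit's drive is divided by the
# pooled drive of all units, so responses to combined stimuli are
# sub-additive (standard textbook form, not this review's notation).
def normalize(drive, sigma=1.0, n=2.0):
    drive = np.asarray(drive, dtype=float)
    return drive ** n / (sigma ** n + (drive ** n).sum())

print(normalize([1.0, 0.0])[0])   # unit A driven alone -> 0.5
print(normalize([1.0, 1.0])[0])   # A with B also driven -> ~0.33, suppressed
```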
Affiliation(s)
- Robbe L T Goris: Center for Perceptual Systems, University of Texas at Austin, Austin, TX, USA
- Ruben Coen-Cagli: Department of Systems and Computational Biology; Dominick P. Purpura Department of Neuroscience; and Department of Ophthalmology and Visual Sciences, Albert Einstein College of Medicine, Bronx, NY, USA
- Kenneth D Miller: Center for Theoretical Neuroscience; Kavli Institute for Brain Science; Department of Neuroscience, College of Physicians and Surgeons; Morton B. Zuckerman Mind Brain Behavior Institute; and Swartz Program in Theoretical Neuroscience, Columbia University, New York, NY, USA
- Nicholas J Priebe: Center for Learning and Memory, University of Texas at Austin, Austin, TX, USA
- Máté Lengyel: Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, UK; Center for Cognitive Computation, Department of Cognitive Science, Central European University, Budapest, Hungary
36
Deng K, Schwendeman PS, Guan Y. Predicting Single Neuron Responses of the Primary Visual Cortex with Deep Learning Model. Adv Sci (Weinh) 2024; 11:e2305626. [PMID: 38350735 PMCID: PMC11022733 DOI: 10.1002/advs.202305626]
Abstract
Modeling neuron responses to stimuli can shed light on next-generation technologies such as brain-chip interfaces. Furthermore, high-performing models can serve to help formulate hypotheses and reveal the mechanisms underlying neural responses. Here, a state-of-the-art computational model is presented for predicting single neuron responses to natural stimuli in the primary visual cortex (V1) of mice. The algorithm incorporates object positions and assembles multiple models trained with different train-validation data, resulting in a 15%-30% improvement over the existing models in cross-subject predictions and ranking first in the SENSORIUM 2022 Challenge, which benchmarks methods for neuron-specific prediction based on thousands of images. Importantly, the model reveals evidence that the spatial organizations of V1 are conserved across mice. This model will serve as an important noninvasive tool for understanding and utilizing the response patterns of primary visual cortex neurons.
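The ensembling step described above can be sketched as follows. The encoding model here is a deliberately simple placeholder (ridge regression), not the authors' deep architecture; it serves only to show the idea of averaging members trained on different train/validation partitions.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

def fit_ensemble(stimuli, responses, n_splits=5):
    """stimuli: (n_trials, n_features); responses: (n_trials, n_neurons)."""
    members = []
    for train_idx, _ in KFold(n_splits, shuffle=True, random_state=0).split(stimuli):
        # Each member sees a different train/validation partition of the data.
        members.append(Ridge(alpha=1.0).fit(stimuli[train_idx], responses[train_idx]))
    return members

def predict_ensemble(members, stimuli):
    # Averaging the members' predictions lets split-specific errors cancel.
    return np.mean([m.predict(stimuli) for m in members], axis=0)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 64))                      # stimulus features
Y = X @ rng.standard_normal((64, 10)) + 0.5 * rng.standard_normal((200, 10))
predictions = predict_ensemble(fit_ensemble(X, Y), X)   # (200, 10)
```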
Affiliation(s)
- Kaiwen Deng: Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48105, USA
- Yuanfang Guan: Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48105, USA
37
Pan X, Coen-Cagli R, Schwartz O. Probing the Structure and Functional Properties of the Dropout-Induced Correlated Variability in Convolutional Neural Networks. Neural Comput 2024; 36:621-644. [PMID: 38457752 PMCID: PMC11164410 DOI: 10.1162/neco_a_01652]
Abstract
Computational neuroscience studies have shown that the structure of neural response variability to a fixed stimulus affects the amount of information encoded. Some artificial deep neural networks, such as those with Monte Carlo dropout layers, also have variable responses when the input is fixed. However, the structure of the trial-by-trial neural covariance in neural networks with dropout has not been studied, and its role in decoding accuracy is unknown. We studied these questions in a convolutional neural network model with dropout in both the training and testing phases. We found that the trial-by-trial correlation between neurons (i.e., noise correlation) is positive and low dimensional. Neurons that are close in a feature map have larger noise correlations. These properties are surprisingly similar to findings in the visual cortex. We further analyzed the alignment of the main axes of the covariance matrix. We found that different images share a common trial-by-trial noise covariance subspace, and that it is aligned with the global signal covariance. This alignment of noise covariance with signal covariance suggests that noise covariance in dropout neural networks reduces network accuracy, which we verified directly with a trial-shuffling procedure commonly used in neuroscience. These findings highlight a previously overlooked aspect of dropout layers that can affect network performance. Such dropout networks could also potentially serve as a computational model of neural variability.
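The basic measurement is straightforward to reproduce on any network: keep dropout stochastic at test time, present the same input repeatedly, and correlate the resulting trial-by-trial activations. The sketch below uses a tiny placeholder network; only the procedure mirrors the paper.

```python
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Conv2d(1, 8, 5), nn.ReLU(),
    nn.Dropout(p=0.5),                    # the stochastic element under study
    nn.Flatten(), nn.Linear(8 * 28 * 28, 16),
)
net.train()  # keep dropout active at "test" time (Monte Carlo dropout)

image = torch.randn(1, 1, 32, 32)         # one fixed input image
with torch.no_grad():
    trials = torch.stack([net(image).squeeze(0) for _ in range(200)])  # (200, 16)

# Trial-by-trial ("noise") correlations between the 16 output units.
centered = trials - trials.mean(dim=0)
cov = centered.T @ centered / (trials.shape[0] - 1)
std = trials.std(dim=0)
noise_corr = cov / torch.outer(std, std)  # (16, 16) noise-correlation matrix
```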
Affiliation(s)
- Xu Pan: Department of Computer Science, University of Miami, Coral Gables, FL 33146, USA
- Ruben Coen-Cagli: Department of Systems and Computational Biology, Dominick Purpura Department of Neuroscience, and Department of Ophthalmology and Visual Sciences, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Odelia Schwartz: Department of Computer Science, University of Miami, Coral Gables, FL 33146, USA
38
Jang H, Tong F. Improved modeling of human vision by incorporating robustness to blur in convolutional neural networks. Nat Commun 2024; 15:1989. [PMID: 38443349 PMCID: PMC10915141 DOI: 10.1038/s41467-024-45679-0]
Abstract
Whenever a visual scene is cast onto the retina, much of it will appear degraded due to poor resolution in the periphery; moreover, optical defocus can cause blur in central vision. However, the pervasiveness of blurry or degraded input is typically overlooked in the training of convolutional neural networks (CNNs). We hypothesized that the absence of blurry training inputs may cause CNNs to rely excessively on high spatial frequency information for object recognition, thereby causing systematic deviations from biological vision. We evaluated this hypothesis by comparing standard CNNs with CNNs trained on a combination of clear and blurry images. We show that blur-trained CNNs outperform standard CNNs at predicting neural responses to objects across a variety of viewing conditions. Moreover, blur-trained CNNs acquire increased sensitivity to shape information and greater robustness to multiple forms of visual noise, leading to improved correspondence with human perception. Our results provide multi-faceted neurocomputational evidence that blurry visual experiences may be critical for conferring robustness to biological visual systems.
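The core training manipulation, mixing clear and blurred images, can be sketched in a few lines with torchvision; the kernel size and sigma range below are illustrative assumptions, not the paper's exact settings.

```python
import torch
from torchvision import transforms

# Random-strength Gaussian blur; a new sigma is sampled from the range per call.
blur = transforms.GaussianBlur(kernel_size=9, sigma=(0.5, 4.0))

def mix_clear_and_blurry(images):
    """images: (batch, C, H, W). Blur a random half of the batch."""
    out = images.clone()
    mask = torch.rand(images.shape[0]) < 0.5
    if mask.any():
        out[mask] = blur(images[mask])
    return out

batch = torch.randn(16, 3, 64, 64)
mixed = mix_clear_and_blurry(batch)  # feed into the usual supervised training loop
```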
Affiliation(s)
- Hojin Jang: Department of Psychology, Vanderbilt Vision Research Center, Vanderbilt University, Nashville, TN, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Brain and Cognitive Engineering, Korea University, Seoul, South Korea
- Frank Tong: Department of Psychology, Vanderbilt Vision Research Center, Vanderbilt University, Nashville, TN, USA
39
Leong F, Rahmani B, Psaltis D, Moser C, Ghezzi D. An actor-model framework for visual sensory encoding. Nat Commun 2024; 15:808. [PMID: 38280912 PMCID: PMC10821921 DOI: 10.1038/s41467-024-45105-5]
Abstract
A fundamental challenge in neuroengineering is determining a proper artificial input to a sensory system that yields the desired perception. In neuroprosthetics, this process is known as artificial sensory encoding, and it plays a crucial role in prosthetic devices restoring sensory perception in individuals with disabilities. For example, in visual prostheses, one key aspect of artificial image encoding is to downsample images captured by a camera to a size matching the number of inputs and the resolution of the prosthesis. Here, we show that downsampling an image using the inherent computation of the retinal network yields better performance compared to learning-free downsampling methods. We have validated a learning-based approach (the actor-model framework) that exploits the signal transformation from photoreceptors to retinal ganglion cells measured in explanted mouse retinas. The actor-model framework generates downsampled images eliciting neuronal responses in-silico and ex-vivo with higher neuronal reliability than those produced by a learning-free approach. During the learning process, the actor network learns to optimize contrast and the kernel's weights. This methodological approach might guide future artificial image encoding strategies for visual prostheses. Ultimately, this framework could also be applicable to encoding strategies in other sensory prostheses, such as cochlear or limb prostheses.
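A heavily simplified sketch of the actor-model training loop appears below. Both networks are placeholders (in the paper, the "model" network is fit to measured retinal responses rather than being an arbitrary frozen CNN), and the image sizes, architectures, and loss are assumptions made only to show the structure of the approach.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Frozen "model" network: a stand-in for the photoreceptor-to-RGC transformation.
retina_model = nn.Sequential(
    nn.Conv2d(1, 8, 7, padding=3), nn.ReLU(),
    nn.AdaptiveAvgPool2d(8), nn.Flatten(),
)
for p in retina_model.parameters():
    p.requires_grad_(False)

# "Actor" network: learns a 64x64 -> 16x16 downsampling of the input image.
actor = nn.Sequential(
    nn.Conv2d(1, 4, 5, padding=2), nn.ReLU(),
    nn.Conv2d(4, 1, 5, padding=2), nn.AdaptiveAvgPool2d(16),
)
opt = torch.optim.Adam(actor.parameters(), lr=1e-3)

for _ in range(100):
    imgs = torch.randn(8, 1, 64, 64)
    target = retina_model(imgs)                 # retinal response to the original
    small = actor(imgs)                         # learned low-resolution image
    upsampled = F.interpolate(small, size=64)   # present at the original scale
    loss = F.mse_loss(retina_model(upsampled), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
```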
Affiliation(s)
- Franklin Leong: Medtronic Chair in Neuroengineering, Center for Neuroprosthetics and Institute of Bioengineering, School of Engineering, École Polytechnique Fédérale de Lausanne, Geneva, Switzerland
- Babak Rahmani: Laboratory of Applied Photonics Devices, Institute of Electrical and Micro Engineering, School of Engineering, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Microsoft Research, Cambridge, UK
- Demetri Psaltis: Optics Laboratory, Institute of Electrical and Micro Engineering, School of Engineering, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Christophe Moser: Laboratory of Applied Photonics Devices, Institute of Electrical and Micro Engineering, School of Engineering, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Diego Ghezzi: Medtronic Chair in Neuroengineering, Center for Neuroprosthetics and Institute of Bioengineering, School of Engineering, École Polytechnique Fédérale de Lausanne, Geneva, Switzerland; Ophthalmic and Neural Technologies Laboratory, Department of Ophthalmology, University of Lausanne, Hôpital ophtalmique Jules-Gonin, Fondation Asile des Aveugles, Lausanne, Switzerland
40
Peters B, DiCarlo JJ, Gureckis T, Haefner R, Isik L, Tenenbaum J, Konkle T, Naselaris T, Stachenfeld K, Tavares Z, Tsao D, Yildirim I, Kriegeskorte N. How does the primate brain combine generative and discriminative computations in vision? arXiv 2024; arXiv:2401.06005v1. [PMID: 38259351 PMCID: PMC10802669]
Abstract
Vision is widely understood as an inference problem. However, two contrasting conceptions of the inference process have each been influential in research on biological vision as well as in the engineering of machine vision. The first emphasizes bottom-up signal flow, describing vision as a largely feedforward, discriminative inference process that filters and transforms the visual information to remove irrelevant variation and represent behaviorally relevant information in a format suitable for downstream functions of cognition and behavioral control. In this conception, vision is driven by the sensory data, and perception is direct because the processing proceeds from the data to the latent variables of interest. The notion of "inference" in this conception is that of the engineering literature on neural networks, where feedforward convolutional neural networks processing images are said to perform inference. The alternative conception is that of vision as an inference process in Helmholtz's sense, where the sensory evidence is evaluated in the context of a generative model of the causal processes that give rise to it. In this conception, vision inverts a generative model through an interrogation of the sensory evidence, in a process often thought to involve top-down predictions of sensory data to evaluate the likelihood of alternative hypotheses. The authors of this piece include scientists rooted, in roughly equal numbers, in each of the two conceptions, all motivated to overcome what might be a false dichotomy between them and to engage the other perspective in the realm of theory and experiment. The primate brain employs an unknown algorithm that may combine the advantages of both conceptions. We explain and clarify the terminology, review the key empirical evidence, and propose an empirical research program that transcends the dichotomy and sets the stage for revealing the mysterious hybrid algorithm of primate vision.
Affiliation(s)
- Benjamin Peters: Zuckerman Mind Brain Behavior Institute, Columbia University; School of Psychology & Neuroscience, University of Glasgow
- James J DiCarlo: Department of Brain and Cognitive Sciences, MIT; McGovern Institute for Brain Research, MIT; NSF Center for Brains, Minds and Machines, MIT; Quest for Intelligence, Schwarzman College of Computing, MIT
- Ralf Haefner: Brain and Cognitive Sciences, University of Rochester; Center for Visual Science, University of Rochester
- Leyla Isik: Department of Cognitive Science, Johns Hopkins University
- Joshua Tenenbaum: Department of Brain and Cognitive Sciences, MIT; NSF Center for Brains, Minds and Machines, MIT; Computer Science and Artificial Intelligence Laboratory, MIT
- Talia Konkle: Department of Psychology, Harvard University; Center for Brain Science, Harvard University; Kempner Institute for Natural and Artificial Intelligence, Harvard University
- Zenna Tavares: Zuckerman Mind Brain Behavior Institute, Columbia University; Data Science Institute, Columbia University
- Doris Tsao: Dept of Molecular & Cell Biology, University of California Berkeley; Howard Hughes Medical Institute
- Ilker Yildirim: Department of Psychology, Yale University; Department of Statistics and Data Science, Yale University
- Nikolaus Kriegeskorte: Zuckerman Mind Brain Behavior Institute, Columbia University; Department of Psychology, Columbia University; Department of Neuroscience, Columbia University; Department of Electrical Engineering, Columbia University
41
Seignette K, Jamann N, Papale P, Terra H, Porneso RO, de Kraker L, van der Togt C, van der Aa M, Neering P, Ruimschotel E, Roelfsema PR, Montijn JS, Self MW, Kole MHP, Levelt CN. Experience shapes chandelier cell function and structure in the visual cortex. eLife 2024; 12:RP91153. [PMID: 38192196 PMCID: PMC10963032 DOI: 10.7554/elife.91153]
Abstract
Detailed characterization of interneuron types in primary visual cortex (V1) has greatly contributed to understanding visual perception, yet the role of chandelier cells (ChCs) in visual processing remains poorly characterized. Using viral tracing we found that V1 ChCs predominantly receive monosynaptic input from local layer 5 pyramidal cells and higher-order cortical regions. Two-photon calcium imaging and convolutional neural network modeling revealed that ChCs are visually responsive but weakly selective for stimulus content. In mice running in a virtual tunnel, ChCs respond strongly to events known to elicit arousal, including locomotion and visuomotor mismatch. Repeated exposure of the mice to the virtual tunnel was accompanied by reduced visual responses of ChCs and structural plasticity of ChC boutons and axon initial segment length. Finally, ChCs only weakly inhibited pyramidal cells. These findings suggest that ChCs provide an arousal-related signal to layer 2/3 pyramidal cells that may modulate their activity and/or gate plasticity of their axon initial segments during behaviorally relevant events.
Affiliation(s)
- Koen Seignette: Department of Molecular Visual Plasticity, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Nora Jamann: Department of Axonal Signaling, Netherlands Institute for Neuroscience, Amsterdam, Netherlands; Cell Biology, Neurobiology and Biophysics, Department of Biology, Faculty of Science, Utrecht University, Utrecht, Netherlands
- Paolo Papale: Department of Vision & Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Huub Terra: Department of Molecular Visual Plasticity, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Ralph O Porneso: Department of Molecular Visual Plasticity, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Leander de Kraker: Department of Molecular Visual Plasticity, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Chris van der Togt: Department of Molecular Visual Plasticity and Department of Vision & Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Maaike van der Aa: Department of Molecular Visual Plasticity, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Paul Neering: Department of Molecular Visual Plasticity and Department of Vision & Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Emma Ruimschotel: Department of Molecular Visual Plasticity, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Pieter R Roelfsema: Department of Vision & Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands; Laboratory of Visual Brain Therapy, Sorbonne Université, Institut National de la Santé et de la Recherche Médicale, Centre National de la Recherche Scientifique, Institut de la Vision, Paris, France; Department of Integrative Neurophysiology, Centre for Neurogenomics and Cognitive Research, VU University, Amsterdam, Netherlands; Department of Psychiatry, Academic Medical Center, University of Amsterdam, Amsterdam, Netherlands
- Jorrit S Montijn: Department of Cortical Structure & Function, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Matthew W Self: Department of Vision & Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Maarten HP Kole: Department of Axonal Signaling, Netherlands Institute for Neuroscience, Amsterdam, Netherlands; Cell Biology, Neurobiology and Biophysics, Department of Biology, Faculty of Science, Utrecht University, Utrecht, Netherlands
- Christiaan N Levelt: Department of Molecular Visual Plasticity, Netherlands Institute for Neuroscience, Amsterdam, Netherlands; Department of Molecular and Cellular Neurobiology, Center for Neurogenomics and Cognitive Research, VU University Amsterdam, Amsterdam, Netherlands
42
Xu A, Hou Y, Niell CM, Beyeler M. Multimodal Deep Learning Model Unveils Behavioral Dynamics of V1 Activity in Freely Moving Mice. Adv Neural Inf Process Syst 2023; 36:15341-15357. [PMID: 39005944 PMCID: PMC11242920]
Abstract
Despite their immense success as a model of macaque visual cortex, deep convolutional neural networks (CNNs) have struggled to predict activity in visual cortex of the mouse, which is thought to be strongly dependent on the animal's behavioral state. Furthermore, most computational models focus on predicting neural responses to static images presented under head fixation, which are dramatically different from the dynamic, continuous visual stimuli that arise during movement in the real world. Consequently, it is still unknown how natural visual input and different behavioral variables may integrate over time to generate responses in primary visual cortex (V1). To address this, we introduce a multimodal recurrent neural network that integrates gaze-contingent visual input with behavioral and temporal dynamics to explain V1 activity in freely moving mice. We show that the model achieves state-of-the-art predictions of V1 activity during free exploration and demonstrate the importance of each component in an extensive ablation study. Analyzing our model using maximally activating stimuli and saliency maps, we reveal new insights into cortical function, including the prevalence of mixed selectivity for behavioral variables in mouse V1. In summary, our model offers a comprehensive deep-learning framework for exploring the computational principles underlying V1 neurons in freely-moving animals engaged in natural behavior.
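The overall architecture can be sketched as a CNN frame encoder whose features are concatenated with behavioral variables and integrated over time by a recurrent layer. The sketch below is a minimal stand-in with placeholder dimensions, not the authors' model.

```python
import torch
import torch.nn as nn

class MultimodalV1Model(nn.Module):
    def __init__(self, n_behav=4, n_neurons=50, hidden=64):
        super().__init__()
        self.vision = nn.Sequential(                 # per-frame visual encoder
            nn.Conv2d(1, 8, 5, stride=2), nn.ReLU(),
            nn.Conv2d(8, 16, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Recurrence integrates visual and behavioral input over time.
        self.rnn = nn.GRU(16 + n_behav, hidden, batch_first=True)
        self.readout = nn.Linear(hidden, n_neurons)

    def forward(self, frames, behavior):
        # frames: (batch, time, 1, H, W); behavior: (batch, time, n_behav)
        b, t = frames.shape[:2]
        feats = self.vision(frames.reshape(b * t, *frames.shape[2:]))
        feats = feats.reshape(b, t, -1)
        hidden, _ = self.rnn(torch.cat([feats, behavior], dim=-1))
        return torch.relu(self.readout(hidden))      # rates: (batch, time, n_neurons)

model = MultimodalV1Model()
rates = model(torch.randn(2, 10, 1, 48, 48), torch.randn(2, 10, 4))
```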
Affiliation(s)
- Aiwen Xu: Department of Computer Science, University of California, Santa Barbara, Santa Barbara, CA 93117
- Yuchen Hou: Department of Computer Science, University of California, Santa Barbara, Santa Barbara, CA 93117
- Cristopher M Niell: Department of Biology, Institute of Neuroscience, University of Oregon, Eugene, OR 97403
- Michael Beyeler: Department of Computer Science and Department of Psychological & Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA 93117
43
Dibot NM, Tieo S, Mendelson TC, Puech W, Renoult JP. Sparsity in an artificial neural network predicts beauty: Towards a model of processing-based aesthetics. PLoS Comput Biol 2023; 19:e1011703. [PMID: 38048323 PMCID: PMC10721202 DOI: 10.1371/journal.pcbi.1011703]
Abstract
Generations of scientists have pursued the goal of defining beauty. While early scientists initially focused on objective criteria of beauty ('feature-based aesthetics'), philosophers and artists alike have since proposed that beauty arises from the interaction between the object and the individual who perceives it. The aesthetic theory of fluency formalizes this idea of interaction by proposing that beauty is determined by the efficiency of information processing in the perceiver's brain ('processing-based aesthetics'), and that efficient processing induces a positive aesthetic experience. The theory is supported by numerous psychological results; however, to date there has been no quantitative predictive model to test it on a large scale. In this work, we propose to leverage the capacity of deep convolutional neural networks (DCNNs) to model the processing of information in the brain by studying the link between beauty and neuronal sparsity, a measure of information-processing efficiency. Whether analyzing pictures of faces or figurative and abstract art paintings, neuronal sparsity explains up to 28% of variance in beauty scores, and up to 47% when combined with a feature-based metric. However, we also found that sparsity can be either positively or negatively correlated with beauty across the multiple layers of the DCNN. Our quantitative model stresses the importance of considering how information is processed, in addition to the content of that information, when predicting beauty, but also suggests an unexpectedly complex relationship between fluency and beauty.
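The sparseness computation at the heart of this approach can be illustrated with the Treves-Rolls population sparseness index, one common choice in the literature; the paper's exact metric may differ, and the stand-in activations below are placeholders.

```python
import numpy as np

def treves_rolls_sparseness(activations):
    """activations: nonnegative unit responses to one image (any shape).
    Returns a value in [0, 1]; higher means sparser (fewer strongly active units)."""
    a = np.asarray(activations, dtype=float).ravel()
    n = a.size
    density = a.mean() ** 2 / (np.mean(a ** 2) + 1e-12)  # Treves-Rolls activity ratio
    return (1.0 - density) / (1.0 - 1.0 / n)             # rescaled sparseness

rng = np.random.default_rng(0)
relu_activations = np.maximum(rng.standard_normal(4096), 0)  # stand-in for one layer
print(treves_rolls_sparseness(relu_activations))
```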
Affiliation(s)
- Nicolas M. Dibot: CEFE, Univ. Montpellier, CNRS, EPHE, IRD, Montpellier, France; LIRMM, Univ. Montpellier, CNRS, Montpellier, France
- Sonia Tieo: CEFE, Univ. Montpellier, CNRS, EPHE, IRD, Montpellier, France
- Tamra C. Mendelson: Department of Biological Sciences, University of Maryland, Baltimore County, Baltimore, Maryland, USA
44
Singer Y, Taylor L, Willmore BDB, King AJ, Harper NS. Hierarchical temporal prediction captures motion processing along the visual pathway. eLife 2023; 12:e52599. [PMID: 37844199 PMCID: PMC10629830 DOI: 10.7554/elife.52599]
Abstract
Visual neurons respond selectively to features that become increasingly complex from the eyes to the cortex. Retinal neurons prefer flashing spots of light, primary visual cortical (V1) neurons prefer moving bars, and those in higher cortical areas favor complex features like moving textures. Previously, we showed that V1 simple cell tuning can be accounted for by a basic model implementing temporal prediction - representing features that predict future sensory input from past input (Singer et al., 2018). Here, we show that hierarchical application of temporal prediction can capture how tuning properties change across at least two levels of the visual system. This suggests that the brain does not efficiently represent all incoming information; instead, it selectively represents sensory inputs that help in predicting the future. When applied hierarchically, temporal prediction extracts time-varying features that depend on increasingly high-level statistics of the sensory input.
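The temporal-prediction objective can be sketched very simply: train a network to predict upcoming stimulus frames from preceding ones, and treat its hidden units as model neurons. A hierarchy is obtained by training a second such predictor on the first one's hidden responses. Everything below (architecture, horizon, data) is a toy stand-in, not the authors' model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

past, future, npix = 5, 2, 16 * 16      # frames of history/horizon, pixels per frame
net = nn.Sequential(                    # maps `past` frames to `future` frames
    nn.Linear(past * npix, 256), nn.ReLU(),   # hidden units = model "neurons"
    nn.Linear(256, future * npix),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

movie = torch.randn(500, npix)          # toy stimulus movie, 500 flattened frames
for step in range(200):
    idx = torch.randint(past, 500 - future, (32,)).tolist()
    x = torch.stack([movie[i - past:i].reshape(-1) for i in idx])    # past frames
    y = torch.stack([movie[i:i + future].reshape(-1) for i in idx])  # future frames
    loss = F.mse_loss(net(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```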
Affiliation(s)
- Yosef Singer: Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Luke Taylor: Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Ben DB Willmore: Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Andrew J King: Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Nicol S Harper: Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
45
Ma G, Yan R, Tang H. Exploiting noise as a resource for computation and learning in spiking neural networks. Patterns (N Y) 2023; 4:100831. [PMID: 37876899 PMCID: PMC10591140 DOI: 10.1016/j.patter.2023.100831]
Abstract
Networks of spiking neurons underpin the extraordinary information-processing capabilities of the brain and have become pillar models in neuromorphic artificial intelligence. Despite extensive research on spiking neural networks (SNNs), most studies are built on deterministic models, overlooking the inherently non-deterministic, noisy nature of neural computations. This study introduces the noisy SNN (NSNN) and the noise-driven learning (NDL) rule, incorporating noisy neuronal dynamics to exploit the computational advantages of noisy neural processing. The NSNN provides a theoretical framework that yields scalable, flexible, and reliable computation and learning. We demonstrate that this framework leads to spiking neural models with competitive performance, improved robustness against challenging perturbations compared with deterministic SNNs, and better reproduction of probabilistic computation in neural coding. Generally, this study offers a powerful and easy-to-use tool for machine learning and neuromorphic intelligence practitioners and for computational neuroscience researchers.
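The basic ingredient, a spiking unit whose membrane dynamics receive noise so that repeated presentations of the same input yield different spike trains, can be sketched as a noisy leaky integrate-and-fire neuron. This shows only the building-block intuition, not the paper's NSNN framework or NDL rule, and all constants are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_lif(input_current, dt=1.0, tau=20.0, v_th=1.0, noise_sd=0.1):
    """Leaky integrate-and-fire neuron with additive membrane noise."""
    v, spikes = 0.0, []
    for i_t in input_current:
        v += dt / tau * (i_t - v) + noise_sd * np.sqrt(dt) * rng.standard_normal()
        if v >= v_th:
            spikes.append(1)
            v = 0.0          # spike and reset
        else:
            spikes.append(0)
    return np.array(spikes)

current = 0.95 * np.ones(500)           # subthreshold drive; noise triggers spikes
print(noisy_lif(current).sum(), noisy_lif(current).sum())  # differs across "trials"
```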
Affiliation(s)
- Gehua Ma: College of Computer Science and Technology, Zhejiang University, Hangzhou, PRC
- Rui Yan: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, PRC
- Huajin Tang: College of Computer Science and Technology, Zhejiang University, Hangzhou, PRC; State Key Lab of Brain-Machine Intelligence, Zhejiang University, Hangzhou, PRC
46
Pham TQ, Matsui T, Chikazoe J. Evaluation of the Hierarchical Correspondence between the Human Brain and Artificial Neural Networks: A Review. Biology (Basel) 2023; 12:1330. [PMID: 37887040 PMCID: PMC10604784 DOI: 10.3390/biology12101330]
Abstract
Artificial neural networks (ANNs) that are heavily inspired by the human brain now achieve human-level performance across multiple task domains. ANNs have thus drawn attention in neuroscience, raising the possibility of providing a framework for understanding the information encoded in the human brain. However, the correspondence between ANNs and the brain cannot be measured directly. They differ in outputs and substrates, neurons vastly outnumber their ANN analogs (i.e., nodes), and the key algorithm responsible for most of modern ANN training (i.e., backpropagation) is likely absent from the brain. Neuroscientists have thus taken a variety of approaches to examine the similarity between the brain and ANNs at multiple levels of their information hierarchy. This review provides an overview of the currently available approaches and their limitations for evaluating brain-ANN correspondence.
Affiliation(s)
- Teppei Matsui: Graduate School of Brain Science, Doshisha University, Kyoto 610-0321, Japan
47
Kobylkov D, Zanon M, Perrino M, Vallortigara G. Neural coding of numerousness. Biosystems 2023; 232:104999. [PMID: 37574182 DOI: 10.1016/j.biosystems.2023.104999]
Abstract
Perception of numerousness, i.e., the number of items in a set, is an important cognitive ability present in several animal taxa. In spite of obvious differences in neuroanatomy, insects, fishes, reptiles, birds, and mammals all possess a "number sense". Furthermore, information regarding numbers can belong to different sensory modalities: animals can estimate a number of visual items, a number of tones, or a number of their own movements. Given the heterogeneity both of the stimuli and of the brains processing them, it is hard to imagine that number cognition can be traced back to the same evolutionarily conserved neural pathway. However, neurons that selectively respond to the number of stimuli have been described in higher-order integration centres of the brain both in primates and in birds, two evolutionarily distant groups. Although most probably not of the same evolutionary origin, these number neurons share remarkable similarities in their response properties. Instead of homology, this similarity might result from computational advantages of the underlying coding mechanism. This means that one might expect numerousness information to undergo similar steps of neural processing even in evolutionarily distant neural pathways. Following this logic, in this review we summarize our current knowledge of how numerousness is processed in the brain, from sensory input to the coding of abstract information in higher-order integration centres. We also propose a list of key open questions that might promote future research on number cognition.
Affiliation(s)
- Dmitry Kobylkov: Centre for Mind/Brain Science, CIMeC, University of Trento, Rovereto, Italy
- Mirko Zanon: Centre for Mind/Brain Science, CIMeC, University of Trento, Rovereto, Italy
- Matilde Perrino: Centre for Mind/Brain Science, CIMeC, University of Trento, Rovereto, Italy
48
Nayebi A, Kong NCL, Zhuang C, Gardner JL, Norcia AM, Yamins DLK. Mouse visual cortex as a limited resource system that self-learns an ecologically-general representation. PLoS Comput Biol 2023; 19:e1011506. [PMID: 37782673 PMCID: PMC10569538 DOI: 10.1371/journal.pcbi.1011506]
Abstract
Studies of the mouse visual system have revealed a variety of visual brain areas that are thought to support a multitude of behavioral capacities, ranging from stimulus-reward associations, to goal-directed navigation, to object-centric discriminations. However, an overall understanding of the mouse's visual cortex, and how it supports a range of behaviors, remains elusive. Here, we take a computational approach to help address these questions, providing a high-fidelity quantitative model of mouse visual cortex and identifying key structural and functional principles underlying that model's success. Structurally, we find that a comparatively shallow network structure with a low-resolution input is optimal for modeling mouse visual cortex. Our main finding is functional: models trained with task-agnostic, self-supervised objective functions based on the concept of contrastive embeddings are much better matches to mouse cortex than models trained on supervised objectives or alternative self-supervised methods. This result is unlike in primates, where prior work showed the two to be roughly equivalent, naturally raising the question of why self-supervised objectives are better matches than supervised ones in mouse. To this end, we show that the self-supervised, contrastive objective builds a general-purpose visual representation that enables the system to achieve better transfer on out-of-distribution visual scene understanding and reward-based navigation tasks. Our results suggest that mouse visual cortex is a low-resolution, shallow network that makes best use of the mouse's limited resources to create a light-weight, general-purpose visual system, in contrast to the deep, high-resolution, and more categorization-dominated visual system of primates.
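The contrastive objective referred to above is typically an InfoNCE/SimCLR-style loss: two augmented views of each image are pulled together in embedding space while different images are pushed apart. A compact sketch follows, with a placeholder encoder and additive noise standing in for real augmentations.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z1, z2, temperature=0.1):
    """z1, z2: (batch, dim) embeddings of two augmented views of the same images."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.T / temperature       # cosine similarities, (batch, batch)
    targets = torch.arange(z1.shape[0])    # matching views lie on the diagonal
    return F.cross_entropy(logits, targets)

encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 128))
imgs = torch.randn(16, 3, 32, 32)
view1 = imgs + 0.1 * torch.randn_like(imgs)   # stand-ins for real augmentations
view2 = imgs + 0.1 * torch.randn_like(imgs)
loss = contrastive_loss(encoder(view1), encoder(view2))
```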
Affiliation(s)
- Aran Nayebi: Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA; Neurosciences Ph.D. Program, Stanford University, Stanford, CA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
- Nathan C. L. Kong: Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA; Department of Psychology, Stanford University, Stanford, CA, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
- Chengxu Zhuang: Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA; Department of Psychology, Stanford University, Stanford, CA, USA
- Justin L. Gardner: Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA; Department of Psychology, Stanford University, Stanford, CA, USA
- Anthony M. Norcia: Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA; Department of Psychology, Stanford University, Stanford, CA, USA
- Daniel L. K. Yamins: Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA; Department of Psychology, Stanford University, Stanford, CA, USA; Department of Computer Science, Stanford University, Stanford, CA, USA
49
Papale P, Wang F, Morgan AT, Chen X, Gilhuis A, Petro LS, Muckli L, Roelfsema PR, Self MW. The representation of occluded image regions in area V1 of monkeys and humans. Curr Biol 2023; 33:3865-3871.e3. [PMID: 37643620 DOI: 10.1016/j.cub.2023.08.010]
Abstract
Neuronal activity in the primary visual cortex (V1) is driven by feedforward input from within the neurons' receptive fields (RFs) and modulated by contextual information in regions surrounding the RF. The effect of contextual information on spiking activity occurs rapidly and is therefore challenging to dissociate from feedforward input. To address this challenge, we recorded the spiking activity of V1 neurons in monkeys viewing either natural scenes or scenes where the information in the RF was occluded, effectively removing the feedforward input. We found that V1 neurons responded rapidly and selectively to occluded scenes. V1 responses elicited by occluded stimuli could be used to decode individual scenes and could be predicted from those elicited by non-occluded images, indicating that there is an overlap between visually driven and contextual responses. We used representational similarity analysis to show that the structure of V1 representations of occluded scenes measured with electrophysiology in monkeys correlates strongly with the representations of the same scenes in humans measured with functional magnetic resonance imaging (fMRI). Our results reveal that contextual influences rapidly alter V1 spiking activity in monkeys over distances of several degrees in the visual field, carry information about individual scenes, and resemble those in human V1.
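The representational similarity analysis step can be sketched in a few lines: build one representational dissimilarity matrix (RDM) per system from its responses to the same scenes, then correlate the two RDMs. The data below are random placeholders, and the distance and correlation choices are common defaults, not necessarily the paper's.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
monkey_spikes = rng.standard_normal((30, 120))  # 30 scenes x 120 neurons (placeholder)
human_fmri = rng.standard_normal((30, 500))     # 30 scenes x 500 voxels (placeholder)

# One condensed RDM per system: pairwise correlation distance between scenes.
rdm_monkey = pdist(monkey_spikes, metric="correlation")
rdm_human = pdist(human_fmri, metric="correlation")

# Second-order (RDM-to-RDM) similarity between the two systems.
rho, p = spearmanr(rdm_monkey, rdm_human)
```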
Affiliation(s)
- Paolo Papale: Department of Vision & Cognition, Netherlands Institute for Neuroscience (KNAW), 1105 BA Amsterdam, the Netherlands
- Feng Wang: Department of Vision & Cognition, Netherlands Institute for Neuroscience (KNAW), 1105 BA Amsterdam, the Netherlands
- A Tyler Morgan: Centre for Cognitive NeuroImaging, School of Psychology and Neuroscience, College of Medical, Veterinary and Life Sciences, University of Glasgow, 62 Hillhead Street, Glasgow G12 8QB, UK; Imaging Centre for Excellence (ICE), College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G51 4LB, UK
- Xing Chen: Department of Ophthalmology, University of Pittsburgh School of Medicine, 203 Lothrop St, Pittsburgh, PA 15213, USA
- Amparo Gilhuis: Department of Vision & Cognition, Netherlands Institute for Neuroscience (KNAW), 1105 BA Amsterdam, the Netherlands
- Lucy S Petro: Centre for Cognitive NeuroImaging, School of Psychology and Neuroscience, College of Medical, Veterinary and Life Sciences, University of Glasgow, 62 Hillhead Street, Glasgow G12 8QB, UK; Imaging Centre for Excellence (ICE), College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G51 4LB, UK
- Lars Muckli: Centre for Cognitive NeuroImaging, School of Psychology and Neuroscience, College of Medical, Veterinary and Life Sciences, University of Glasgow, 62 Hillhead Street, Glasgow G12 8QB, UK; Imaging Centre for Excellence (ICE), College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G51 4LB, UK
- Pieter R Roelfsema: Department of Vision & Cognition, Netherlands Institute for Neuroscience (KNAW), 1105 BA Amsterdam, the Netherlands; Department of Integrative Neurophysiology, VU University, De Boelelaan 1085, 1081 HV Amsterdam, the Netherlands; Department of Neurosurgery, Academic Medical Centre, Postbus 22660, 1100 DD Amsterdam, the Netherlands; Laboratory of Visual Brain Therapy, Sorbonne Université, INSERM, CNRS, Institut de la Vision, 17 rue Moreau, 75012 Paris, France
- Matthew W Self: Department of Vision & Cognition, Netherlands Institute for Neuroscience (KNAW), 1105 BA Amsterdam, the Netherlands
50
Pan X, DeForge A, Schwartz O. Generalizing biological surround suppression based on center surround similarity via deep neural network models. PLoS Comput Biol 2023; 19:e1011486. [PMID: 37738258 PMCID: PMC10550176 DOI: 10.1371/journal.pcbi.1011486]
Abstract
Sensory perception is dramatically influenced by context. Models of contextual neural surround effects in vision have mostly accounted for primary visual cortex (V1) data via nonlinear computations such as divisive normalization. However, surround effects are not well understood within a hierarchy, for neurons with more complex stimulus selectivity beyond V1. We utilized feedforward deep convolutional neural networks and developed a gradient-based technique to visualize the most suppressive and most excitatory surrounds. We found that deep neural networks exhibited a key signature of surround effects in V1, highlighting center stimuli that visually stand out from the surround and suppressing responses when the surround stimulus is similar to the center. We found that in some units, especially in late layers, the most suppressive surround surprisingly followed changes to the center stimulus. Through the visualization approach, we generalized previous understanding of surround effects to more complex stimuli, in ways that have not been revealed in visual cortices. In contrast, the suppression based on center-surround similarity was not observed in an untrained network. We identified further successes and mismatches between the feedforward CNNs and the biology. Our results provide a testable hypothesis about surround effects in higher visual cortices, and the visualization approach could be adopted in future biological experimental designs.
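A rough sketch of such a gradient-based surround visualization follows: hold the center stimulus fixed, treat only the surround pixels as optimizable, and take gradient steps that decrease a target unit's response (ascend instead for the most excitatory surround). The network, unit, and region sizes are placeholders, not the authors' setup.

```python
import torch
import torch.nn as nn

net = nn.Sequential(                    # placeholder feedforward CNN
    nn.Conv2d(1, 8, 9, padding=4), nn.ReLU(),
    nn.Conv2d(8, 8, 9, padding=4), nn.ReLU(),
)
for p in net.parameters():
    p.requires_grad_(False)             # only the surround image is optimized

center_mask = torch.zeros(1, 1, 64, 64)
center_mask[..., 24:40, 24:40] = 1.0    # the center region is held fixed

center = torch.randn(1, 1, 64, 64) * center_mask
surround = torch.zeros(1, 1, 64, 64, requires_grad=True)
opt = torch.optim.Adam([surround], lr=0.05)

for _ in range(100):
    image = center + surround * (1 - center_mask)   # surround lives outside the mask
    response = net(image)[0, 0, 32, 32]             # target unit over the center
    opt.zero_grad()
    response.backward()   # descending the response -> most suppressive surround
    opt.step()
```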
Affiliation(s)
- Xu Pan: Department of Computer Science, University of Miami, Coral Gables, FL, USA
- Annie DeForge: School of Information, University of California, Berkeley, CA, USA; Bentley University, Waltham, MA, USA
- Odelia Schwartz: Department of Computer Science, University of Miami, Coral Gables, FL, USA