201
Hamano Y, Nagasaka S, Shouno H. Exploring the role of texture features in deep convolutional neural networks: Insights from Portilla-Simoncelli statistics. Neural Netw 2023; 168:300-312. PMID: 37774515; DOI: 10.1016/j.neunet.2023.09.028.
Abstract
It is well-understood that the performance of Deep Convolutional Neural Networks (DCNNs) in image recognition tasks is influenced not only by shape but also by texture information. Despite this, understanding the internal representations of DCNNs remains a challenging task. This study employs a simplified version of the Portilla-Simoncelli Statistics, termed "minPS," to explore how texture information is represented in a pre-trained VGG network. Using minPS features extracted from texture images, we perform a sparse regression on the activations across various channels in VGG layers. Our findings reveal that channels in the early to middle layers of the VGG network can be effectively described by minPS features. Additionally, we observe that the explanatory power of minPS sub-groups evolves as one ascends the network hierarchy. Specifically, sub-groups termed Linear Cross Scale (LCS) and Energy Cross Scale (ECS) exhibit weak explanatory power for VGG channels. To investigate the relationship further, we compare the original texture images with their synthesized counterparts, generated using VGG, in terms of minPS features. Our results indicate that the absence of certain minPS features suggests their non-utilization in VGG's internal representations.
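The regression step this abstract describes, sparse regression from texture statistics onto channel activations, can be sketched with synthetic stand-ins. Everything below is simulated placeholder data, not the study's actual minPS features or VGG activations; the L1 penalty simply forces most feature weights to zero so the surviving ones indicate which statistics "explain" a channel.

```python
import numpy as np

# Simulated stand-ins: 200 "texture images" described by 50 features
# (playing the role of minPS statistics) and one "VGG channel" whose
# mean activation depends on only 5 of them.
rng = np.random.default_rng(0)
n_images, n_features = 200, 50
X = rng.standard_normal((n_images, n_features))
true_w = np.zeros(n_features)
true_w[:5] = [1.5, -2.0, 0.8, 1.1, -0.6]
y = X @ true_w + 0.1 * rng.standard_normal(n_images)

def lasso(X, y, alpha, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xw||^2 + alpha*||w||_1."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            resid = y - X @ w + X[:, j] * w[j]   # residual excluding feature j
            rho = X[:, j] @ resid
            # soft-thresholding: small correlations are zeroed out
            w[j] = np.sign(rho) * max(abs(rho) - alpha * n, 0.0) / col_sq[j]
    return w

w = lasso(X, y, alpha=0.1)
n_selected = int(np.sum(np.abs(w) > 1e-8))   # features with nonzero weight
r2 = 1 - np.sum((y - X @ w) ** 2) / np.sum((y - y.mean()) ** 2)
print(n_selected, r2)
```

The fit recovers a sparse weight vector: only the handful of informative features keep nonzero coefficients while still explaining most of the variance, which is the sense in which a channel is "effectively described" by a feature sub-group.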
Affiliation(s)
- Yusuke Hamano
- NEC Corporation, Shiba 5-7-1, Minato-ku, Tokyo, Japan
- Shoko Nagasaka
- The University of Electro-Communications, Chofu, Tokyo, Japan
- Hayaru Shouno
- The University of Electro-Communications, Chofu, Tokyo, Japan
202
Tsuda B, Richmond BJ, Sejnowski TJ. Exploring strategy differences between humans and monkeys with recurrent neural networks. PLoS Comput Biol 2023; 19:e1011618. PMID: 37983250; PMCID: PMC10695363; DOI: 10.1371/journal.pcbi.1011618.
Abstract
Animal models are used to understand principles of human biology. Within cognitive neuroscience, non-human primates are considered the premier model for studying decision-making behaviors in which direct manipulation experiments are still possible. Some prominent studies have brought to light major discrepancies between monkey and human cognition, highlighting problems with unverified extrapolation from monkey to human. Here, we use a parallel model system, artificial neural networks (ANNs), to investigate a well-established discrepancy identified between monkeys and humans with a working memory task, in which monkeys appear to use a recency-based strategy while humans use a target-selective strategy. We find that ANNs trained on the same task exhibit a progression of behavior from random behavior (untrained) to recency-like behavior (partially trained) and finally to selective behavior (further trained), suggesting monkeys and humans may occupy different points in the same overall learning progression. Surprisingly, what appears to be recency-like behavior in the ANN is in fact an emergent non-recency-based property of the organization of the neural network's state space during its development through training. We find that explicit encouragement of recency behavior during training has a dual effect, not only causing an accentuated recency-like behavior but also speeding up the learning process altogether, resulting in an efficient shaping mechanism to achieve the optimal strategy. Our results suggest a new explanation for the discrepancy observed between monkeys and humans and reveal that what can appear to be a recency-based strategy in some cases may not be recency at all.
Affiliation(s)
- Ben Tsuda
- Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, California, United States of America
- Neurosciences Graduate Program, University of California San Diego, La Jolla, California, United States of America
- Medical Scientist Training Program, University of California San Diego, La Jolla, California, United States of America
- Barry J. Richmond
- Section on Neural Coding and Computation, National Institute of Mental Health, Bethesda, Maryland, United States of America
- Terrence J. Sejnowski
- Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, California, United States of America
- Institute for Neural Computation, University of California San Diego, La Jolla, California, United States of America
- Division of Biological Sciences, University of California San Diego, La Jolla, California, United States of America
203
Halvagal MS, Zenke F. The combination of Hebbian and predictive plasticity learns invariant object representations in deep sensory networks. Nat Neurosci 2023; 26:1906-1915. PMID: 37828226; PMCID: PMC10620089; DOI: 10.1038/s41593-023-01460-y.
Abstract
Recognition of objects from sensory stimuli is essential for survival. To that end, sensory networks in the brain must form object representations invariant to stimulus changes, such as size, orientation and context. Although Hebbian plasticity is known to shape sensory networks, it fails to create invariant object representations in computational models, raising the question of how the brain achieves such processing. In the present study, we show that combining Hebbian plasticity with a predictive form of plasticity leads to invariant representations in deep neural network models. We derive a local learning rule that generalizes to spiking neural networks and naturally accounts for several experimentally observed properties of synaptic plasticity, including metaplasticity and spike-timing-dependent plasticity. Finally, our model accurately captures neuronal selectivity changes observed in the primate inferotemporal cortex in response to altered visual experience. Thus, we provide a plausible normative theory emphasizing the importance of predictive plasticity mechanisms for successful representational learning.
Affiliation(s)
- Manu Srinath Halvagal
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
- Faculty of Science, University of Basel, Basel, Switzerland
- Friedemann Zenke
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
- Faculty of Science, University of Basel, Basel, Switzerland
204
Velarde OM, Makse HA, Parra LC. Architecture of the brain's visual system enhances network stability and performance through layers, delays, and feedback. PLoS Comput Biol 2023; 19:e1011078. PMID: 37948463; PMCID: PMC10664920; DOI: 10.1371/journal.pcbi.1011078.
Abstract
In the visual system of primates, image information propagates across successive cortical areas, and there is also local feedback within an area and long-range feedback across areas. Recent findings suggest that the resulting temporal dynamics of neural activity are crucial in several vision tasks. In contrast, artificial neural network models of vision are typically feedforward and do not capitalize on the benefits of temporal dynamics, partly due to concerns about stability and computational costs. In this study, we focus on recurrent networks with feedback connections for visual tasks with static input corresponding to a single fixation. We demonstrate mathematically that a network's dynamics can be stabilized by four key features of biological networks: layer-ordered structure, temporal delays between layers, longer-distance feedback across layers, and nonlinear neuronal responses. Conversely, when feedback has a fixed distance, one can omit delays in feedforward connections to achieve more efficient artificial implementations. We also evaluated the effect of feedback connections on object detection and classification performance using standard benchmarks, specifically the COCO and CIFAR10 datasets. Our findings indicate that feedback connections improved the detection of small objects, and classification performance became more robust to noise. We found that performance increased with the temporal dynamics, not unlike what is observed in the core visual system of primates. These results suggest that delays and layered organization are crucial features for stability and performance in both biological and artificial recurrent neural networks.
Affiliation(s)
- Osvaldo Matias Velarde
- Biomedical Engineering Department, The City College of New York, New York, New York, United States of America
- Hernán A. Makse
- Levich Institute and Physics Department, The City College of New York, New York, New York, United States of America
- Lucas C. Parra
- Biomedical Engineering Department, The City College of New York, New York, New York, United States of America
205
Karapetian A, Boyanova A, Pandaram M, Obermayer K, Kietzmann TC, Cichy RM. Empirically Identifying and Computationally Modeling the Brain-Behavior Relationship for Human Scene Categorization. J Cogn Neurosci 2023; 35:1879-1897. PMID: 37590093; PMCID: PMC10586810; DOI: 10.1162/jocn_a_02043.
Abstract
Humans effortlessly make quick and accurate perceptual decisions about the nature of their immediate visual environment, such as the category of the scene they face. Previous research has revealed a rich set of cortical representations potentially underlying this feat. However, it remains unknown which of these representations are suitably formatted for decision-making. Here, we approached this question empirically and computationally, using neuroimaging and computational modeling. For the empirical part, we collected EEG data and RTs from human participants during a scene categorization task (natural vs. man-made). We then related the EEG data to behavior using a multivariate extension of signal detection theory. We observed a correlation between neural data and behavior specifically between ∼100 msec and ∼200 msec after stimulus onset, suggesting that the neural scene representations in this time period are suitably formatted for decision-making. For the computational part, we evaluated a recurrent convolutional neural network (RCNN) as a model of brain and behavior. Unifying our previous observations in an image-computable model, the RCNN predicted well the neural representations, the behavioral scene categorization data, and the relationship between them. Our results identify and computationally characterize the neural and behavioral correlates of scene categorization in humans.
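The brain-behavior logic in this abstract can be illustrated with a toy simulation: build a multivariate neural "decision variable" and check that trials whose pattern sits farther from the category boundary are decided faster. All data below are simulated, and a simple difference-of-means decoder stands in for the study's multivariate signal detection analysis.

```python
import numpy as np

# Simulated trials: 400 "EEG patterns" over 64 channels, labeled
# 0 = natural and 1 = man-made, with a shared category-diagnostic pattern.
rng = np.random.default_rng(1)
n_trials, n_channels = 400, 64
labels = rng.integers(0, 2, n_trials)
pattern = rng.standard_normal(n_channels)
eeg = np.outer(labels - 0.5, pattern) \
    + 0.8 * rng.standard_normal((n_trials, n_channels))

# Linear decoder: project each trial onto the difference of class means.
w = eeg[labels == 1].mean(axis=0) - eeg[labels == 0].mean(axis=0)
dv = eeg @ w                       # signed multivariate decision variable

# Signal-detection prediction: patterns far from the boundary are decided
# faster, so |dv| should correlate negatively with RT (simulated that way).
rts = 0.6 - 0.003 * np.abs(dv) + 0.03 * rng.standard_normal(n_trials)
r = np.corrcoef(np.abs(dv), rts)[0, 1]
print(r)
```

The negative correlation between distance-to-bound and reaction time is the kind of neural-behavioral link the study tests, here recovered from data constructed to contain it.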
Affiliation(s)
- Agnessa Karapetian
- Freie Universität Berlin, Germany
- Charité - Universitätsmedizin Berlin, Einstein Center for Neurosciences Berlin, Germany
- Bernstein Center for Computational Neuroscience Berlin, Germany
- Klaus Obermayer
- Charité - Universitätsmedizin Berlin, Einstein Center for Neurosciences Berlin, Germany
- Bernstein Center for Computational Neuroscience Berlin, Germany
- Technische Universität Berlin, Germany
- Humboldt-Universität zu Berlin, Germany
- Radoslaw M Cichy
- Freie Universität Berlin, Germany
- Charité - Universitätsmedizin Berlin, Einstein Center for Neurosciences Berlin, Germany
- Bernstein Center for Computational Neuroscience Berlin, Germany
- Humboldt-Universität zu Berlin, Germany
206
Toosi T, Issa EB. Brain-like Flexible Visual Inference by Harnessing Feedback-Feedforward Alignment. arXiv 2023; arXiv:2310.20599v1. PMID: 37961740; PMCID: PMC10635293.
Abstract
In natural vision, feedback connections support versatile visual inference capabilities such as making sense of the occluded or noisy bottom-up sensory information or mediating pure top-down processes such as imagination. However, the mechanisms by which the feedback pathway learns to give rise to these capabilities flexibly are not clear. We propose that top-down effects emerge through alignment between feedforward and feedback pathways, each optimizing its own objectives. To achieve this co-optimization, we introduce Feedback-Feedforward Alignment (FFA), a learning algorithm that leverages feedback and feedforward pathways as mutual credit assignment computational graphs, enabling alignment. In our study, we demonstrate the effectiveness of FFA in co-optimizing classification and reconstruction tasks on widely used MNIST and CIFAR10 datasets. Notably, the alignment mechanism in FFA endows feedback connections with emergent visual inference functions, including denoising, resolving occlusions, hallucination, and imagination. Moreover, FFA offers bio-plausibility compared to traditional back-propagation (BP) methods in implementation. By repurposing the computational graph of credit assignment into a goal-driven feedback pathway, FFA alleviates weight transport problems encountered in BP, enhancing the bio-plausibility of the learning algorithm. Our study presents FFA as a promising proof-of-concept for the mechanisms underlying how feedback connections in the visual cortex support flexible visual functions. This work also contributes to the broader field of visual inference underlying perceptual phenomena and has implications for developing more biologically inspired learning algorithms.
Affiliation(s)
- Tahereh Toosi
- Center for Theoretical Neuroscience, Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY
- Elias B. Issa
- Department of Neuroscience, Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY
207
Nayebi A, Rajalingham R, Jazayeri M, Yang GR. Neural Foundations of Mental Simulation: Future Prediction of Latent Representations on Dynamic Scenes. arXiv 2023; arXiv:2305.11772v2. PMID: 37292459; PMCID: PMC10246064.
Abstract
Humans and animals have a rich and flexible understanding of the physical world, which enables them to infer the underlying dynamical trajectories of objects and events, plausible future states, and use that to plan and anticipate the consequences of actions. However, the neural mechanisms underlying these computations are unclear. We combine a goal-driven modeling approach with dense neurophysiological data and high-throughput human behavioral readouts that contain thousands of comparisons to directly impinge on this question. Specifically, we construct and evaluate several classes of sensory-cognitive networks to predict the future state of rich, ethologically-relevant environments, ranging from self-supervised end-to-end models with pixel-wise or object-slot objectives, to models that future predict in the latent space of purely static image-pretrained or dynamic video-pretrained foundation models. We find that "scale is not all you need", and that many state-of-the-art machine learning models fail to perform well on our neural and behavioral benchmarks for future prediction. In fact, only one class of models matches these data well overall. We find that neural responses are currently best predicted by models trained to predict the future state of their environment in the latent space of pretrained foundation models optimized for dynamic scenes in a self-supervised manner. These models also approach the neurons' ability to predict the environmental state variables that are visually hidden from view, despite not being explicitly trained to do so. Finally, we find that not all foundation model latents are equal. Notably, models that future predict in the latent space of video foundation models that are optimized to support a diverse range of egocentric sensorimotor tasks, reasonably match both human behavioral error patterns and neural dynamics across all environmental scenarios that we were able to test. 
Overall, these findings suggest that the neural mechanisms and behaviors of primate mental simulation have strong inductive biases associated with them, and are thus far most consistent with being optimized to future predict on reusable visual representations that are useful for Embodied AI more generally.
Affiliation(s)
- Aran Nayebi
- McGovern Institute for Brain Research, MIT, Cambridge, MA 02139
- Rishi Rajalingham
- McGovern Institute for Brain Research, MIT, Cambridge, MA 02139
- Reality Labs, Meta, 390 9th Ave, New York, NY 10001
- Mehrdad Jazayeri
- McGovern Institute for Brain Research, MIT, Cambridge, MA 02139
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139
- Guangyu Robert Yang
- McGovern Institute for Brain Research, MIT, Cambridge, MA 02139
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139
- Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA 02139
208
Jiahui G, Feilong M, Visconti di Oleggio Castello M, Nastase SA, Haxby JV, Gobbini MI. Modeling naturalistic face processing in humans with deep convolutional neural networks. Proc Natl Acad Sci U S A 2023; 120:e2304085120. PMID: 37847731; PMCID: PMC10614847; DOI: 10.1073/pnas.2304085120.
Abstract
Deep convolutional neural networks (DCNNs) trained for face identification can rival and even exceed human-level performance. The ways in which the internal face representations in DCNNs relate to human cognitive representations and brain activity are not well understood. Nearly all previous studies focused on static face image processing with rapid display times and ignored the processing of naturalistic, dynamic information. To address this gap, we developed the largest naturalistic dynamic face stimulus set in human neuroimaging research (700+ naturalistic video clips of unfamiliar faces). We used this naturalistic dataset to compare representational geometries estimated from DCNNs, behavioral responses, and brain responses. We found that DCNN representational geometries were consistent across architectures, cognitive representational geometries were consistent across raters in a behavioral arrangement task, and neural representational geometries in face areas were consistent across brains. Representational geometries in late, fully connected DCNN layers, which are optimized for individuation, were much more weakly correlated with cognitive and neural geometries than were geometries in late-intermediate layers. The late-intermediate face-DCNN layers successfully matched cognitive representational geometries, as measured with a behavioral arrangement task that primarily reflected categorical attributes, and correlated with neural representational geometries in known face-selective topographies. Our study suggests that current DCNNs successfully capture neural cognitive processes for categorical attributes of faces but less accurately capture individuation and dynamic features.
Affiliation(s)
- Guo Jiahui
- Center for Cognitive Neuroscience, Dartmouth College, Hanover, NH 03755
- Ma Feilong
- Center for Cognitive Neuroscience, Dartmouth College, Hanover, NH 03755
- Samuel A. Nastase
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544
- James V. Haxby
- Center for Cognitive Neuroscience, Dartmouth College, Hanover, NH 03755
- M. Ida Gobbini
- Department of Medical and Surgical Sciences, University of Bologna, Bologna 40138, Italy
- Istituti di Ricovero e Cura a Carattere Scientifico, Istituto delle Scienze Neurologiche di Bologna, Bologna 40139, Italy
209
Tieo S, Dezeure J, Cryer A, Lepou P, Charpentier MJ, Renoult JP. Social and sexual consequences of facial femininity in a non-human primate. iScience 2023; 26:107901. PMID: 37766996; PMCID: PMC10520438; DOI: 10.1016/j.isci.2023.107901.
Abstract
In humans, femininity shapes women's interactions with both genders, but its influence in animals remains unknown. Using 10 years of data on a wild primate, we developed an artificial intelligence-based method to estimate facial femininity from naturalistic portraits. Our method explains up to 30% of the variance in perceived femininity in humans, competing with classical methods using standardized pictures taken under laboratory conditions. We then showed that femininity estimated on 95 female mandrills significantly correlated with various socio-sexual behaviors. Unexpectedly, less feminine female mandrills were approached and aggressed more frequently by both sexes and received more male copulations, suggesting a positive valuation of masculinity attributes rather than a perception bias. This study contributes to understanding the role of femininity in animal sociality and offers a framework for non-invasive research on visual communication in behavioral ecology.
Affiliation(s)
- Sonia Tieo
- CEFE, University Montpellier, CNRS, EPHE, IRD, Montpellier, France
- Jules Dezeure
- Projet Mandrillus, Fondation Lékédi pour la Biodiversité, Bakoumba BP 52, Gabon
- Anna Cryer
- Projet Mandrillus, Fondation Lékédi pour la Biodiversité, Bakoumba BP 52, Gabon
- Pascal Lepou
- Projet Mandrillus, Fondation Lékédi pour la Biodiversité, Bakoumba BP 52, Gabon
- Marie J.E. Charpentier
- Institut des Sciences de l’Evolution de Montpellier (ISEM), UMR5554 - University of Montpellier/CNRS/IRD/EPHE, Place Eugène Bataillon, 34095 Montpellier Cedex 5, France
210
Singer Y, Taylor L, Willmore BDB, King AJ, Harper NS. Hierarchical temporal prediction captures motion processing along the visual pathway. eLife 2023; 12:e52599. PMID: 37844199; PMCID: PMC10629830; DOI: 10.7554/elife.52599.
Abstract
Visual neurons respond selectively to features that become increasingly complex from the eyes to the cortex. Retinal neurons prefer flashing spots of light, primary visual cortical (V1) neurons prefer moving bars, and those in higher cortical areas favor complex features like moving textures. Previously, we showed that V1 simple cell tuning can be accounted for by a basic model implementing temporal prediction - representing features that predict future sensory input from past input (Singer et al., 2018). Here, we show that hierarchical application of temporal prediction can capture how tuning properties change across at least two levels of the visual system. This suggests that the brain does not efficiently represent all incoming information; instead, it selectively represents sensory inputs that help in predicting the future. When applied hierarchically, temporal prediction extracts time-varying features that depend on increasingly high-level statistics of the sensory input.
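The temporal-prediction principle summarized above is easy to demonstrate on a toy signal: fit weights that predict the next sample of a stimulus from a short window of its past. The sinusoidal stimulus and window length below are illustrative assumptions, not the study's natural-movie inputs or its hierarchical model.

```python
import numpy as np

# A noisy drifting sinusoid: each sample is predictable from its recent past.
rng = np.random.default_rng(4)
t = np.arange(2000)
stimulus = np.sin(0.2 * t) + 0.05 * rng.standard_normal(t.size)

# Build (past window -> next sample) training pairs.
window = 5
X = np.stack([stimulus[i:i + window] for i in range(t.size - window)])
y = stimulus[window:]

# Least-squares fit of "predict the future from the past".
w, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = X @ w
r2 = 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
print(r2)
```

Even this linear one-layer version captures nearly all predictable structure; the paper's point is that stacking such prediction objectives hierarchically yields increasingly high-level time-varying features.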
Affiliation(s)
- Yosef Singer
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Luke Taylor
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Ben DB Willmore
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Andrew J King
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Nicol S Harper
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
211
Ma G, Yan R, Tang H. Exploiting noise as a resource for computation and learning in spiking neural networks. Patterns (N Y) 2023; 4:100831. PMID: 37876899; PMCID: PMC10591140; DOI: 10.1016/j.patter.2023.100831.
Abstract
Networks of spiking neurons underpin the extraordinary information-processing capabilities of the brain and have become pillar models in neuromorphic artificial intelligence. Despite extensive research on spiking neural networks (SNNs), most studies are established on deterministic models, overlooking the inherent non-deterministic, noisy nature of neural computations. This study introduces the noisy SNN (NSNN) and the noise-driven learning (NDL) rule by incorporating noisy neuronal dynamics to exploit the computational advantages of noisy neural processing. The NSNN provides a theoretical framework that yields scalable, flexible, and reliable computation and learning. We demonstrate that this framework leads to spiking neural models with competitive performance, improved robustness against challenging perturbations compared with deterministic SNNs, and better reproducing probabilistic computation in neural coding. Generally, this study offers a powerful and easy-to-use tool for machine learning, neuromorphic intelligence practitioners, and computational neuroscience researchers.
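A minimal sketch of what "noisy neuronal dynamics" can mean in practice: a leaky integrate-and-fire unit with an escape-noise (stochastic-threshold) firing rule, so spiking becomes probabilistic rather than deterministic. This is a generic textbook construction under assumed parameters, not the paper's exact NSNN formulation.

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate(input_current, n_steps=1000, tau=20.0, threshold=1.0, beta=5.0):
    """Count spikes of a noisy LIF unit driven by a constant input."""
    v = 0.0
    spikes = 0
    for _ in range(n_steps):
        v += (-v + input_current) / tau          # leaky integration
        # escape noise: firing probability grows smoothly with v,
        # instead of a hard deterministic threshold crossing
        p_fire = 1.0 / (1.0 + np.exp(-beta * (v - threshold)))
        if rng.random() < p_fire:
            spikes += 1
            v = 0.0                              # reset after a spike
    return spikes

low, high = simulate(0.8), simulate(1.5)
print(low, high)
```

Stronger input still yields more spikes on average, but individual spike trains vary from run to run; that trial-to-trial variability is the "noise as a resource" the paper builds its computation and learning framework on.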
Affiliation(s)
- Gehua Ma
- College of Computer Science and Technology, Zhejiang University, Hangzhou, PRC
- Rui Yan
- College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, PRC
- Huajin Tang
- College of Computer Science and Technology, Zhejiang University, Hangzhou, PRC
- State Key Lab of Brain-Machine Intelligence, Zhejiang University, Hangzhou, PRC
212
Pham TQ, Matsui T, Chikazoe J. Evaluation of the Hierarchical Correspondence between the Human Brain and Artificial Neural Networks: A Review. Biology (Basel) 2023; 12:1330. PMID: 37887040; PMCID: PMC10604784; DOI: 10.3390/biology12101330.
Abstract
Artificial neural networks (ANNs) that are heavily inspired by the human brain now achieve human-level performance across multiple task domains. ANNs have thus drawn attention in neuroscience, raising the possibility of providing a framework for understanding the information encoded in the human brain. However, the correspondence between ANNs and the brain cannot be measured directly. They differ in outputs and substrates, neurons vastly outnumber their ANN analogs (i.e., nodes), and the key algorithm responsible for most of modern ANN training (i.e., backpropagation) is likely absent from the brain. Neuroscientists have thus taken a variety of approaches to examine the similarity between the brain and ANNs at multiple levels of their information hierarchy. This review provides an overview of the currently available approaches and their limitations for evaluating brain-ANN correspondence.
Affiliation(s)
| | - Teppei Matsui
- Graduate School of Brain Science, Doshisha University, Kyoto 610-0321, Japan
213
Vacher J, Launay C, Mamassian P, Coen-Cagli R. Measuring uncertainty in human visual segmentation. arXiv 2023; arXiv:2301.07807v3. PMID: 36824425; PMCID: PMC9949179.
Abstract
Segmenting visual stimuli into distinct groups of features and visual objects is central to visual function. Classical psychophysical methods have helped uncover many rules of human perceptual segmentation, and recent progress in machine learning has produced successful algorithms. Yet, the computational logic of human segmentation remains unclear, partially because we lack well-controlled paradigms to measure perceptual segmentation maps and compare models quantitatively. Here we propose a new, integrated approach: given an image, we measure multiple pixel-based same-different judgments and perform model-based reconstruction of the underlying segmentation map. The reconstruction is robust to several experimental manipulations and captures the variability of individual participants. We demonstrate the validity of the approach on human segmentation of natural images and composite textures. We show that image uncertainty affects measured human variability, and it influences how participants weigh different visual features. Because any putative segmentation algorithm can be inserted to perform the reconstruction, our paradigm affords quantitative tests of theories of perception as well as new benchmarks for segmentation algorithms.
Affiliation(s)
- Jonathan Vacher
- Laboratoire des systèmes perceptifs, Département d’études cognitives, École normale supérieure, PSL University, CNRS, Paris, France
- Claire Launay
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, USA
- Pascal Mamassian
- Laboratoire des systèmes perceptifs, Département d’études cognitives, École normale supérieure, PSL University, CNRS, Paris, France
- Ruben Coen-Cagli
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, USA
- Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, USA
- Department of Ophthalmology and Visual Sciences, Albert Einstein College of Medicine, Bronx, NY, USA
214
Feng M, Xu J. Detection of ASD Children through Deep-Learning Application of fMRI. Children (Basel) 2023; 10:1654. PMID: 37892317; PMCID: PMC10605350; DOI: 10.3390/children10101654.
Abstract
Autism spectrum disorder (ASD) necessitates prompt diagnostic scrutiny to enable immediate, targeted interventions. This study unveils an advanced convolutional-neural-network (CNN) algorithm that was meticulously engineered to examine resting-state functional magnetic resonance imaging (fMRI) for early ASD detection in pediatric cohorts. The CNN architecture amalgamates convolutional, pooling, batch-normalization, dropout, and fully connected layers, optimized for high-dimensional data interpretation. Rigorous preprocessing yielded 22,176 two-dimensional echo planar samples from 126 subjects (56 ASD, 70 controls) who were sourced from the Autism Brain Imaging Data Exchange (ABIDE I) repository. The model, trained on 17,740 samples across 50 epochs, demonstrated unparalleled diagnostic metrics (accuracy of 99.39%, recall of 98.80%, precision of 99.85%, and an F1 score of 99.32%), thereby eclipsing extant computational methodologies. Feature map analyses substantiated the model's hierarchical feature extraction capabilities. This research elucidates a deep learning framework for computer-assisted ASD screening via fMRI, with transformative implications for early diagnosis and intervention.
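The diagnostic metrics quoted above all follow from the confusion-matrix counts; a minimal sketch of how accuracy, precision, recall, and F1 relate (the counts below are hypothetical, not the study's data):

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # a.k.a. sensitivity
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts for illustration only (not the paper's test set).
acc, prec, rec, f1 = classification_metrics(tp=90, fp=2, fn=3, tn=105)
```

Note that F1 is the harmonic mean of precision and recall, so it always lies between the two.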
Affiliation(s)
- Min Feng: Nanjing Rehabilitation Medical Center, The Affiliated Brain Hospital, Nanjing Medical University, Nanjing 210029, China; School of Chinese Language and Literature, Nanjing Normal University, Nanjing 210024, China
- Juncai Xu: School of Engineering, Case Western Reserve University, Cleveland, OH 44106, USA
215
Moro A, Greco M, Cappa SF. Large languages, impossible languages and human brains. Cortex 2023;167:82-85. PMID: 37540953; DOI: 10.1016/j.cortex.2023.07.003.
Abstract
We aim to offer a contribution highlighting the essential differences between Large Language Models (LLMs) and the human language faculty. More explicitly, we claim that the existence of languages that are impossible for humans has no equivalent for LLMs, making them unsuitable models of the human language faculty, especially from a neurobiological point of view. The core part is preceded by two premises bearing on the distinction between machines and humans and the distinction between competence and performance.
Affiliation(s)
- Andrea Moro: Scuola Universitaria Superiore IUSS, Pavia, Italy
- Matteo Greco: Scuola Universitaria Superiore IUSS, Pavia, Italy
- Stefano F Cappa: Scuola Universitaria Superiore IUSS, Pavia, Italy; IRCCS Mondino Foundation, Pavia, Italy
216
van Dyck LE, Gruber WR. Modeling Biological Face Recognition with Deep Convolutional Neural Networks. J Cogn Neurosci 2023;35:1521-1537. PMID: 37584587; DOI: 10.1162/jocn_a_02040.
Abstract
Deep convolutional neural networks (DCNNs) have become the state-of-the-art computational models of biological object recognition. Their remarkable success has helped vision science break new ground, and recent efforts have started to transfer this achievement to research on biological face recognition. In this regard, face detection can be investigated by comparing face-selective biological neurons and brain areas to artificial neurons and model layers. Similarly, face identification can be examined by comparing in vivo and in silico multidimensional "face spaces." In this review, we summarize the first studies that use DCNNs to model biological face recognition. On the basis of a broad spectrum of behavioral and computational evidence, we conclude that DCNNs are useful models that closely resemble the general hierarchical organization of face recognition in the ventral visual pathway and the core face network. In two exemplary spotlights, we emphasize the unique scientific contributions of these models. First, studies on face detection in DCNNs indicate that elementary face selectivity emerges automatically through feedforward processing even in the absence of visual experience. Second, studies on face identification in DCNNs suggest that identity-specific experience and generative mechanisms facilitate this particular challenge. Taken together, as this novel modeling approach enables close control of predisposition (i.e., architecture) and experience (i.e., training data), it may be suited to inform long-standing debates on the substrates of biological face recognition.
217
Farahat A, Effenberger F, Vinck M. A novel feature-scrambling approach reveals the capacity of convolutional neural networks to learn spatial relations. Neural Netw 2023;167:400-414. PMID: 37673027; PMCID: PMC7616855; DOI: 10.1016/j.neunet.2023.08.021.
Abstract
Convolutional neural networks (CNNs) are one of the most successful computer vision systems to solve object recognition. Furthermore, CNNs have major applications in understanding the nature of visual representations in the human brain. Yet it remains poorly understood how CNNs actually make their decisions, what the nature of their internal representations is, and how their recognition strategies differ from humans. Specifically, there is a major debate about the question of whether CNNs primarily rely on surface regularities of objects, or whether they are capable of exploiting the spatial arrangement of features, similar to humans. Here, we develop a novel feature-scrambling approach to explicitly test whether CNNs use the spatial arrangement of features (i.e. object parts) to classify objects. We combine this approach with a systematic manipulation of effective receptive field sizes of CNNs as well as minimal recognizable configurations (MIRCs) analysis. In contrast to much previous literature, we provide evidence that CNNs are in fact capable of using relatively long-range spatial relationships for object classification. Moreover, the extent to which CNNs use spatial relationships depends heavily on the dataset, e.g. texture vs. sketch. In fact, CNNs even use different strategies for different classes within heterogeneous datasets (ImageNet), suggesting CNNs have a continuous spectrum of classification strategies. Finally, we show that CNNs learn the spatial arrangement of features only up to an intermediate level of granularity, which suggests that intermediate rather than global shape features provide the optimal trade-off between sensitivity and specificity in object classification. These results provide novel insights into the nature of CNN representations and the extent to which they rely on the spatial arrangement of features for object classification.
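The core manipulation the abstract describes can be illustrated in miniature: cut an image into patches and shuffle them, which preserves local features while destroying their spatial arrangement. A toy sketch (plain lists stand in for images; the patch size and seed are my illustrative choices, not the paper's protocol):

```python
import random

def scramble_patches(img, patch, seed=0):
    """Shuffle the non-overlapping patch x patch blocks of a 2D grid.

    Content inside each block (the local features) is preserved;
    only the blocks' spatial arrangement is destroyed.
    """
    h, w = len(img), len(img[0])
    assert h % patch == 0 and w % patch == 0
    # Cut the grid into non-overlapping blocks, row-major order.
    blocks = [[row[c:c + patch] for row in img[r:r + patch]]
              for r in range(0, h, patch)
              for c in range(0, w, patch)]
    random.Random(seed).shuffle(blocks)  # destroy the spatial arrangement
    # Reassemble the shuffled blocks into a grid of the same shape.
    per_row = w // patch
    out = [[0] * w for _ in range(h)]
    for i, block in enumerate(blocks):
        r0, c0 = (i // per_row) * patch, (i % per_row) * patch
        for dr in range(patch):
            for dc in range(patch):
                out[r0 + dr][c0 + dc] = block[dr][dc]
    return out

img = [[r * 4 + c for c in range(4)] for r in range(4)]
scrambled = scramble_patches(img, patch=2, seed=1)
```

Varying `patch` plays the role of the granularity manipulation: large patches spare long-range relations, small patches destroy them.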
Affiliation(s)
- Amr Farahat: Ernst Strüngmann Institute for Neuroscience in Cooperation with Max Planck Society, Frankfurt, Germany; Donders Centre for Neuroscience, Department of Neuroinformatics, Radboud University, Nijmegen, The Netherlands
- Felix Effenberger: Ernst Strüngmann Institute for Neuroscience in Cooperation with Max Planck Society, Frankfurt, Germany; Frankfurt Institute for Advanced Studies, Frankfurt, Germany
- Martin Vinck: Ernst Strüngmann Institute for Neuroscience in Cooperation with Max Planck Society, Frankfurt, Germany; Donders Centre for Neuroscience, Department of Neuroinformatics, Radboud University, Nijmegen, The Netherlands
218
Ukita J, Ohki K. Adversarial attacks and defenses using feature-space stochasticity. Neural Netw 2023;167:875-889. PMID: 37722983; DOI: 10.1016/j.neunet.2023.08.022.
Abstract
Recent studies in deep neural networks have shown that injecting random noise in the input layer of the networks contributes to robustness against ℓp-norm-bounded adversarial perturbations. However, to defend against unrestricted adversarial examples, most of which are not ℓp-norm-bounded in the input layer, such input-layer random noise may not be sufficient. In the first part of this study, we generated a novel class of unrestricted adversarial examples termed feature-space adversarial examples. These examples are far from the original data in the input space but adjacent to the original data in a hidden-layer feature space and far again in the output layer. In the second part of this study, we empirically showed that while injecting random noise in the input layer was unable to defend against these feature-space adversarial examples, injecting random noise in the hidden layer did defend against them. These results highlight a novel benefit of stochasticity in higher layers: it is useful for defending against feature-space adversarial examples, a class of unrestricted adversarial examples.
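The defense tested in the second part amounts to adding random noise to a hidden representation rather than to the input. A minimal sketch of the idea (a toy two-layer network in plain Python; the sizes, weights, and Gaussian noise scale are illustrative assumptions, not the paper's configuration):

```python
import random

def forward(x, w1, w2, noise_std=0.0, rng=None):
    """Toy 2-layer net: hidden = ReLU(W1 x) + noise, output = W2 hidden.

    noise_std > 0 injects Gaussian noise in the *hidden layer*, the
    kind of stochasticity the study contrasts with input-layer noise.
    """
    rng = rng or random.Random(0)
    hidden = [max(0.0, sum(wij * xj for wij, xj in zip(wi, x)))
              for wi in w1]
    hidden = [h + rng.gauss(0.0, noise_std) for h in hidden]
    return [sum(wij * hj for wij, hj in zip(wi, hidden)) for wi in w2]

w1 = [[1.0, -1.0], [0.5, 0.5]]  # illustrative weights
w2 = [[1.0, 1.0]]
clean = forward([1.0, 2.0], w1, w2)               # deterministic pass
noisy = forward([1.0, 2.0], w1, w2, noise_std=0.1)  # stochastic hidden layer
```

With `noise_std=0.0` the pass is deterministic; with positive noise, repeated evaluations vary, which is what makes feature-space attacks harder to target.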
Affiliation(s)
- Jumpei Ukita: Department of Physiology, The University of Tokyo School of Medicine, 7-3-1, Hongo, Bunkyo-ku, 113-0033, Tokyo, Japan
- Kenichi Ohki: Department of Physiology, The University of Tokyo School of Medicine, 7-3-1, Hongo, Bunkyo-ku, 113-0033, Tokyo, Japan; International Research Center for Neurointelligence (WPI-IRCN), 7-3-1, Hongo, Bunkyo-ku, 113-0033, Tokyo, Japan; Institute for AI and Beyond, 7-3-1, Hongo, Bunkyo-ku, 113-0033, Tokyo, Japan
219
Nayebi A, Kong NCL, Zhuang C, Gardner JL, Norcia AM, Yamins DLK. Mouse visual cortex as a limited resource system that self-learns an ecologically-general representation. PLoS Comput Biol 2023;19:e1011506. PMID: 37782673; PMCID: PMC10569538; DOI: 10.1371/journal.pcbi.1011506.
Abstract
Studies of the mouse visual system have revealed a variety of visual brain areas that are thought to support a multitude of behavioral capacities, ranging from stimulus-reward associations, to goal-directed navigation, and object-centric discriminations. However, an overall understanding of the mouse's visual cortex, and how it supports a range of behaviors, is still lacking. Here, we take a computational approach to help address these questions, providing a high-fidelity quantitative model of mouse visual cortex and identifying key structural and functional principles underlying that model's success. Structurally, we find that a comparatively shallow network structure with a low-resolution input is optimal for modeling mouse visual cortex. Our main finding is functional: models trained with task-agnostic, self-supervised objective functions based on the concept of contrastive embeddings are much better matches to mouse cortex than models trained on supervised objectives or alternative self-supervised methods. This result is very much unlike in primates, where prior work showed that the two were roughly equivalent, naturally leading us to ask why these self-supervised objectives are better matches than supervised ones in mouse. To this end, we show that the self-supervised, contrastive objective builds a general-purpose visual representation that enables the system to achieve better transfer on out-of-distribution visual scene understanding and reward-based navigation tasks. Our results suggest that mouse visual cortex is a low-resolution, shallow network that makes best use of the mouse's limited resources to create a light-weight, general-purpose visual system, in contrast to the deep, high-resolution, and more categorization-dominated visual system of primates.
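Contrastive-embedding objectives of the kind referred to above score an embedding by pulling two views of the same input together while pushing other inputs away. An InfoNCE-style toy in plain Python (the embeddings and temperature are made up for illustration, not the paper's training setup):

```python
import math

def info_nce(anchor, positive, negatives, temperature=0.5):
    """Contrastive (InfoNCE-style) loss for one anchor embedding.

    Low when the anchor is similar to its positive (another view of
    the same input) and dissimilar to the negatives (other inputs).
    """
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) *
                      math.sqrt(sum(b * b for b in v)))

    # First logit is the positive pair; the rest are negatives.
    logits = [cos(anchor, positive) / temperature] + \
             [cos(anchor, n) / temperature for n in negatives]
    z = sum(math.exp(l) for l in logits)
    return -math.log(math.exp(logits[0]) / z)

loss_good = info_nce([1, 0], [0.9, 0.1], [[-1, 0], [0, 1]])  # aligned pair
loss_bad = info_nce([1, 0], [-1, 0], [[0.9, 0.1], [0, 1]])   # misaligned pair
```

The loss is lower when the positive pair is aligned, which is the pressure that shapes the embedding without any category labels.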
Affiliation(s)
- Aran Nayebi: Wu Tsai Neurosciences Institute, Stanford University, Stanford, California, United States of America; Neurosciences Ph.D. Program, Stanford University, Stanford, California, United States of America; McGovern Institute for Brain Research, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States of America
- Nathan C. L. Kong: Wu Tsai Neurosciences Institute, Stanford University, Stanford, California, United States of America; Department of Psychology, Stanford University, Stanford, California, United States of America; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States of America
- Chengxu Zhuang: Wu Tsai Neurosciences Institute, Stanford University, Stanford, California, United States of America; McGovern Institute for Brain Research, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States of America; Department of Psychology, Stanford University, Stanford, California, United States of America
- Justin L. Gardner: Wu Tsai Neurosciences Institute, Stanford University, Stanford, California, United States of America; Department of Psychology, Stanford University, Stanford, California, United States of America
- Anthony M. Norcia: Wu Tsai Neurosciences Institute, Stanford University, Stanford, California, United States of America; Department of Psychology, Stanford University, Stanford, California, United States of America
- Daniel L. K. Yamins: Wu Tsai Neurosciences Institute, Stanford University, Stanford, California, United States of America; Department of Psychology, Stanford University, Stanford, California, United States of America; Department of Computer Science, Stanford University, Stanford, California, United States of America
220
Li B, Zhang C, Cao L, Chen P, Liu T, Gao H, Wang L, Yan B, Tong L. Brain Functional Representation of Highly Occluded Object Recognition. Brain Sci 2023;13:1387. PMID: 37891756; PMCID: PMC10605645; DOI: 10.3390/brainsci13101387.
Abstract
Recognizing highly occluded objects is believed to arise from the interaction between the brain's vision and cognition-controlling areas, although supporting neuroimaging data are currently limited. To explore the neural mechanism during this activity, we conducted an occluded-object recognition experiment using functional magnetic resonance imaging (fMRI). During magnetic resonance examinations, 66 subjects engaged in object recognition tasks with three different occlusion degrees. Generalized linear model (GLM) analysis showed that the activation degree of the occipital lobe (inferior occipital gyrus, middle occipital gyrus, and occipital fusiform gyrus) and dorsal anterior cingulate cortex (dACC) was related to the occlusion degree of the objects. Multivariate pattern analysis (MVPA) further unearthed a considerable surge in classification precision when dACC activation was incorporated as a feature. This suggested the combined role of dACC and the occipital lobe in occluded object recognition tasks. Moreover, psychophysiological interaction (PPI) analysis disclosed that functional connectivity (FC) between the dACC and the occipital lobe was enhanced with increased occlusion, highlighting the necessity of FC between these two brain regions in effectively identifying exceedingly occluded objects. In conclusion, these findings contribute to understanding the neural mechanisms of highly occluded object recognition, augmenting our appreciation of how the brain manages incomplete visual data.
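In a standard PPI analysis like the one mentioned above, the interaction regressor is the product of the (centered) psychological variable and the seed region's physiological time course. A toy sketch with hypothetical time courses (not the study's data):

```python
def ppi_interaction(psych, physio):
    """Psychophysiological interaction (PPI) regressor.

    psych:  task variable over time (e.g., occlusion level per block)
    physio: seed-region time course (e.g., dACC activity)
    The interaction term is their elementwise product after centering,
    so it captures task-dependent coupling rather than main effects.
    """
    def center(x):
        m = sum(x) / len(x)
        return [v - m for v in x]
    return [p * q for p, q in zip(center(psych), center(physio))]

psych = [0, 0, 1, 1]           # e.g., low- vs high-occlusion blocks
physio = [1.0, 2.0, 3.0, 4.0]  # hypothetical seed time course
ppi = ppi_interaction(psych, physio)
```

The PPI regressor then enters a GLM alongside the psychological and physiological main effects; a significant PPI weight indicates occlusion-dependent coupling.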
Affiliation(s)
- Li Tong: Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force Information Engineering University, Zhengzhou 450001, China
221
Yao M, Wen B, Yang M, Guo J, Jiang H, Feng C, Cao Y, He H, Chang L. High-dimensional topographic organization of visual features in the primate temporal lobe. Nat Commun 2023;14:5931. PMID: 37739988; PMCID: PMC10517140; DOI: 10.1038/s41467-023-41584-0.
Abstract
The inferotemporal cortex supports our supreme object recognition ability. Numerous studies have been conducted to elucidate the functional organization of this brain area, but there are still important questions that remain unanswered, including how this organization differs between humans and non-human primates. Here, we use deep neural networks trained on object categorization to construct a 25-dimensional space of visual features, and systematically measure the spatial organization of feature preference in both male monkey brains and human brains using fMRI. These feature maps allow us to predict the selectivity of a previously unknown region in monkey brains, which is corroborated by additional fMRI and electrophysiology experiments. These maps also enable quantitative analyses of the topographic organization of the temporal lobe, demonstrating the existence of a pair of orthogonal gradients that differ in spatial scale and revealing significant differences in the functional organization of high-level visual areas between monkey and human brains.
Affiliation(s)
- Mengna Yao: Institute of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, 200031, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Bincheng Wen: Institute of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, 200031, China; University of Chinese Academy of Sciences, Beijing, 100049, China; Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Mingpo Yang: Institute of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, 200031, China
- Jiebin Guo: Institute of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, 200031, China
- Haozhou Jiang: Institute of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, 200031, China
- Chao Feng: Institute of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, 200031, China
- Yilei Cao: Institute of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, 200031, China
- Huiguang He: University of Chinese Academy of Sciences, Beijing, 100049, China; Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Le Chang: Institute of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, 200031, China; University of Chinese Academy of Sciences, Beijing, 100049, China
222
Abstract
Deep neural networks (DNNs) are machine learning algorithms that have revolutionized computer vision due to their remarkable successes in tasks like object classification and segmentation. The success of DNNs as computer vision algorithms has led to the suggestion that DNNs may also be good models of human visual perception. In this article, we review evidence regarding current DNNs as adequate behavioral models of human core object recognition. To this end, we argue that it is important to distinguish between statistical tools and computational models and to understand model quality as a multidimensional concept in which clarity about modeling goals is key. Reviewing a large number of psychophysical and computational explorations of core object recognition performance in humans and DNNs, we argue that DNNs are highly valuable scientific tools but that, as of today, DNNs should only be regarded as promising-but not yet adequate-computational models of human core object recognition behavior. On the way, we dispel several myths surrounding DNNs in vision science.
Affiliation(s)
- Felix A Wichmann: Neural Information Processing Group, University of Tübingen, Tübingen, Germany
223
Maheswaranathan N, McIntosh LT, Tanaka H, Grant S, Kastner DB, Melander JB, Nayebi A, Brezovec LE, Wang JH, Ganguli S, Baccus SA. Interpreting the retinal neural code for natural scenes: From computations to neurons. Neuron 2023;111:2742-2755.e4. PMID: 37451264; PMCID: PMC10680974; DOI: 10.1016/j.neuron.2023.06.007.
Abstract
Understanding the circuit mechanisms of the visual code for natural scenes is a central goal of sensory neuroscience. We show that a three-layer network model predicts retinal natural scene responses with an accuracy nearing experimental limits. The model's internal structure is interpretable, as interneurons recorded separately and not modeled directly are highly correlated with model interneurons. Models fitted only to natural scenes reproduce a diverse set of phenomena related to motion encoding, adaptation, and predictive coding, establishing their ethological relevance to natural visual computation. A new approach decomposes the computations of model ganglion cells into the contributions of model interneurons, allowing automatic generation of new hypotheses for how interneurons with different spatiotemporal responses are combined to generate retinal computations, including predictive phenomena currently lacking an explanation. Our results demonstrate a unified and general approach to study the circuit mechanisms of ethological retinal computations under natural visual scenes.
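For a model ganglion cell whose drive is a weighted sum over model interneurons, the decomposition described above assigns each interneuron its weighted share of the total. A minimal sketch (toy weights and responses, and an assumed linear readout rather than the paper's fitted three-layer CNN):

```python
def decompose(weights, interneuron_responses):
    """Split a model ganglion cell's summed input into per-interneuron
    contributions; in the linear case the contributions add back up
    to the cell's total drive."""
    contributions = [w * r for w, r in zip(weights, interneuron_responses)]
    total = sum(contributions)
    return contributions, total

# Hypothetical readout weights and interneuron responses at one time step.
contribs, drive = decompose([0.5, -0.2, 1.0], [2.0, 1.0, 0.3])
```

Ranking `contribs` by magnitude at each time step is the kind of automatic hypothesis generation the abstract refers to: it names which interneuron types drive a given computation.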
Affiliation(s)
- Lane T McIntosh: Neuroscience Program, Stanford University School of Medicine, Stanford, CA, USA
- Hidenori Tanaka: Department of Applied Physics, Stanford University, Stanford, CA, USA; Physics & Informatics Laboratories, NTT Research, Inc., Sunnyvale, CA, USA; Center for Brain Science, Harvard University, Cambridge, MA, USA
- Satchel Grant: Department of Neurobiology, Stanford University, Stanford, CA, USA
- David B Kastner: Neuroscience Program, Stanford University School of Medicine, Stanford, CA, USA
- Joshua B Melander: Neuroscience Program, Stanford University School of Medicine, Stanford, CA, USA
- Aran Nayebi: Neuroscience Program, Stanford University School of Medicine, Stanford, CA, USA
- Luke E Brezovec: Neuroscience Program, Stanford University School of Medicine, Stanford, CA, USA
- Surya Ganguli: Department of Applied Physics, Stanford University, Stanford, CA, USA
- Stephen A Baccus: Department of Neurobiology, Stanford University, Stanford, CA, USA
224
Pan X, DeForge A, Schwartz O. Generalizing biological surround suppression based on center surround similarity via deep neural network models. PLoS Comput Biol 2023;19:e1011486. PMID: 37738258; PMCID: PMC10550176; DOI: 10.1371/journal.pcbi.1011486.
Abstract
Sensory perception is dramatically influenced by context. Models of contextual neural surround effects in vision have mostly accounted for Primary Visual Cortex (V1) data, via nonlinear computations such as divisive normalization. However, surround effects are not well understood within a hierarchy, for neurons with more complex stimulus selectivity beyond V1. We utilized feedforward deep convolutional neural networks and developed a gradient-based technique to visualize the most suppressive and excitatory surround. We found that deep neural networks exhibited a key signature of surround effects in V1, highlighting center stimuli that visually stand out from the surround and suppressing responses when the surround stimulus is similar to the center. We found that in some neurons, especially in late layers, when the center stimulus was altered, the most suppressive surround surprisingly could follow the change. Through the visualization approach, we generalized previous understanding of surround effects to more complex stimuli, in ways that have not been revealed in visual cortices. In contrast, the suppression based on center-surround similarity was not observed in an untrained network. We identified further successes and mismatches of the feedforward CNNs to the biology. Our results provide a testable hypothesis of surround effects in higher visual cortices, and the visualization approach could be adopted in future biological experimental designs.
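The key signature above, stronger suppression the more the surround resembles the center, is often captured with a similarity-weighted divisive normalization. A toy sketch (the functional form and constants are illustrative assumptions of mine, not the paper's fitted model):

```python
def suppressed_response(center_drive, surround_drive, similarity,
                        sigma=1.0, w=1.0):
    """Divisive normalization in which the surround's suppressive
    weight grows with center-surround similarity
    (similarity = 0: dissimilar surround; 1: identical surround)."""
    return center_drive / (sigma + w * similarity * surround_drive)

same = suppressed_response(2.0, 1.5, similarity=1.0)  # matched surround
diff = suppressed_response(2.0, 1.5, similarity=0.1)  # mismatched surround
```

A matched surround divides the center drive more strongly, reproducing the "stand-out" effect: the response is largest when the center differs from its surround.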
Affiliation(s)
- Xu Pan: Department of Computer Science, University of Miami, Coral Gables, FL, United States of America
- Annie DeForge: School of Information, University of California, Berkeley, CA, United States of America; Bentley University, Waltham, MA, United States of America
- Odelia Schwartz: Department of Computer Science, University of Miami, Coral Gables, FL, United States of America
225
Vinken K, Prince JS, Konkle T, Livingstone MS. The neural code for "face cells" is not face-specific. Sci Adv 2023;9:eadg1736. PMID: 37647400; PMCID: PMC10468123; DOI: 10.1126/sciadv.adg1736.
Abstract
Face cells are neurons that respond more to faces than to non-face objects. They are found in clusters in the inferotemporal cortex, thought to process faces specifically, and, hence, studied using faces almost exclusively. Analyzing neural responses in and around macaque face patches to hundreds of objects, we found graded response profiles for non-face objects that predicted the degree of face selectivity and provided information on face-cell tuning beyond that from actual faces. This relationship between non-face and face responses was not predicted by color and simple shape properties but by information encoded in deep neural networks trained on general objects rather than face classification. These findings contradict the long-standing assumption that face versus non-face selectivity emerges from face-specific features and challenge the practice of focusing on only the most effective stimulus. They provide evidence instead that category-selective neurons are best understood by their tuning directions in a domain-general object space.
Affiliation(s)
- Kasper Vinken: Department of Neurobiology, Harvard Medical School, Boston, MA 02115, USA
- Jacob S. Prince: Department of Psychology, Harvard University, Cambridge, MA 02478, USA
- Talia Konkle: Department of Psychology, Harvard University, Cambridge, MA 02478, USA
226
Fan JE, Bainbridge WA, Chamberlain R, Wammes JD. Drawing as a versatile cognitive tool. Nat Rev Psychol 2023;2:556-568. PMID: 39239312; PMCID: PMC11377027; DOI: 10.1038/s44159-023-00212-w.
Abstract
Drawing is a cognitive tool that makes the invisible contents of mental life visible. Humans use this tool to produce a remarkable variety of pictures, from realistic portraits to schematic diagrams. Despite this variety and the prevalence of drawn images, the psychological mechanisms that enable drawings to be so versatile have yet to be fully explored. In this Review, we synthesize contemporary work in multiple areas of psychology, computer science and neuroscience that examines the cognitive processes involved in drawing production and comprehension. This body of findings suggests that the balance of contributions from perception, memory and social inference during drawing production varies depending on the situation, resulting in some drawings that are more realistic and other drawings that are more abstract. We also consider the use of drawings as a research tool for investigating various aspects of cognition, as well as the role that drawing has in facilitating learning and communication. Taken together, information about how drawings are used in different contexts illuminates the central role of visually grounded abstractions in human thought and behaviour.
Affiliation(s)
- Judith E Fan: Department of Psychology, University of California, San Diego, La Jolla, CA, USA; Department of Psychology, Stanford University, Stanford, CA, USA
- Jeffrey D Wammes: Department of Psychology, Centre for Neuroscience Studies, Queen's University, Kingston, Ontario, Canada
227
Taylor J, Kriegeskorte N. Extracting and visualizing hidden activations and computational graphs of PyTorch models with TorchLens. Sci Rep 2023;13:14375. PMID: 37658079; PMCID: PMC10474256; DOI: 10.1038/s41598-023-40807-0.
Abstract
Deep neural network models (DNNs) are essential to modern AI and provide powerful models of information processing in biological neural networks. Researchers in both neuroscience and engineering are pursuing a better understanding of the internal representations and operations that undergird the successes and failures of DNNs. Neuroscientists additionally evaluate DNNs as models of brain computation by comparing their internal representations to those found in brains. It is therefore essential to have a method to easily and exhaustively extract and characterize the results of the internal operations of any DNN. Many models are implemented in PyTorch, the leading framework for building DNN models. Here we introduce TorchLens, a new open-source Python package for extracting and characterizing hidden-layer activations in PyTorch models. Uniquely among existing approaches to this problem, TorchLens has the following features: (1) it exhaustively extracts the results of all intermediate operations, not just those associated with PyTorch module objects, yielding a full record of every step in the model's computational graph; (2) it provides an intuitive visualization of the model's complete computational graph along with metadata about each computational step in a model's forward pass for further analysis; (3) it contains a built-in validation procedure to algorithmically verify the accuracy of all saved hidden-layer activations; and (4) the approach it uses can be automatically applied to any PyTorch model with no modifications, including models with conditional (if-then) logic in their forward pass, recurrent models, branching models where layer outputs are fed into multiple subsequent layers in parallel, and models with internally generated tensors (e.g., injections of noise). Furthermore, using TorchLens requires minimal additional code, making it easy to incorporate into existing pipelines for model development and analysis, and useful as a pedagogical aid when teaching deep learning concepts. We hope this contribution will help researchers in AI and neuroscience understand the internal representations of DNNs.
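The kind of exhaustive record described above can be illustrated with a toy pipeline in plain Python. This is purely illustrative and is not the TorchLens API: TorchLens performs this bookkeeping automatically for arbitrary PyTorch models, whereas here the "model" is just a hand-built list of steps.

```python
# Toy illustration of exhaustive activation logging: every intermediate
# result of a forward pass is recorded, not only named "module" outputs.
# Plain-Python sketch; NOT the TorchLens API.

def run_and_log(steps, x):
    """Apply each (name, fn) step to x, recording every intermediate value."""
    log = {"input": x}
    for name, fn in steps:
        x = fn(x)
        log[name] = x
    return x, log

# A tiny "model": scale, add a bias, then threshold (ReLU-like).
steps = [
    ("scale", lambda v: [2 * t for t in v]),
    ("bias",  lambda v: [t + 1 for t in v]),
    ("relu",  lambda v: [max(0, t) for t in v]),
]

out, log = run_and_log(steps, [-2, 0, 3])
# "log" now holds the result of every computational step, in execution order.
```

The dictionary plays the role of the full computational record; a real extractor must additionally handle branching, recurrence, and internally generated tensors, which is exactly what the paper's package automates.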
Affiliation(s)
- JohnMark Taylor
  - Zuckerman Mind Brain Behavior Institute, Columbia University, 3227 Broadway, New York, NY 10027, USA
- Nikolaus Kriegeskorte
  - Zuckerman Mind Brain Behavior Institute, Columbia University, 3227 Broadway, New York, NY 10027, USA
228
Vacher J, Launay C, Mamassian P, Coen-Cagli R. Measuring uncertainty in human visual segmentation. PLoS Comput Biol 2023; 19:e1011483. [PMID: 37747914 PMCID: PMC10553811 DOI: 10.1371/journal.pcbi.1011483]
Abstract
Segmenting visual stimuli into distinct groups of features and visual objects is central to visual function. Classical psychophysical methods have helped uncover many rules of human perceptual segmentation, and recent progress in machine learning has produced successful algorithms. Yet, the computational logic of human segmentation remains unclear, partially because we lack well-controlled paradigms to measure perceptual segmentation maps and compare models quantitatively. Here we propose a new, integrated approach: given an image, we measure multiple pixel-based same-different judgments and perform model-based reconstruction of the underlying segmentation map. The reconstruction is robust to several experimental manipulations and captures the variability of individual participants. We demonstrate the validity of the approach on human segmentation of natural images and composite textures. We show that image uncertainty affects measured human variability, and it influences how participants weigh different visual features. Because any putative segmentation algorithm can be inserted to perform the reconstruction, our paradigm affords quantitative tests of theories of perception as well as new benchmarks for segmentation algorithms.
Affiliation(s)
- Jonathan Vacher
  - Laboratoire des systèmes perceptifs, Département d’études cognitives, École normale supérieure, PSL University, CNRS, Paris, France
- Claire Launay
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Pascal Mamassian
  - Laboratoire des systèmes perceptifs, Département d’études cognitives, École normale supérieure, PSL University, CNRS, Paris, France
- Ruben Coen-Cagli
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, United States of America
  - Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York, United States of America
  - Department of Ophthalmology and Visual Sciences, Albert Einstein College of Medicine, Bronx, New York, United States of America
229
Peelen MV, Downing PE. Testing cognitive theories with multivariate pattern analysis of neuroimaging data. Nat Hum Behav 2023; 7:1430-1441. [PMID: 37591984 PMCID: PMC7616245 DOI: 10.1038/s41562-023-01680-z]
Abstract
Multivariate pattern analysis (MVPA) has emerged as a powerful method for the analysis of functional magnetic resonance imaging, electroencephalography and magnetoencephalography data. The new approaches to experimental design and hypothesis testing afforded by MVPA have made it possible to address theories that describe cognition at the functional level. Here we review a selection of studies that have used MVPA to test cognitive theories from a range of domains, including perception, attention, memory, navigation, emotion, social cognition and motor control. This broad view reveals properties of MVPA that make it suitable for understanding the 'how' of human cognition, such as the ability to test predictions expressed at the item or event level. It also reveals limitations and points to future directions.
Affiliation(s)
- Marius V Peelen
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
- Paul E Downing
  - Cognitive Neuroscience Institute, Department of Psychology, Bangor University, Bangor, UK
230
Miao HY, Tong F. Convolutional neural network models of neuronal responses in macaque V1 reveal limited non-linear processing. bioRxiv 2023:2023.08.26.554952. [PMID: 37693397 PMCID: PMC10491131 DOI: 10.1101/2023.08.26.554952]
Abstract
Computational models of the primary visual cortex (V1) have suggested that V1 neurons behave like Gabor filters followed by simple non-linearities. However, recent work employing convolutional neural network (CNN) models has suggested that V1 relies on far more non-linear computations than previously thought. Specifically, unit responses in an intermediate layer of VGG-19 were found to best predict macaque V1 responses to thousands of natural and synthetic images. Here, we evaluated the hypothesis that the poor performance of lower-layer units in VGG-19 might be attributable to their small receptive field size rather than to their lack of complexity per se. We compared VGG-19 with AlexNet, which has much larger receptive fields in its lower layers. Whereas the best-performing layer of VGG-19 occurred after seven non-linear steps, the first convolutional layer of AlexNet best predicted V1 responses. Although VGG-19's predictive accuracy was somewhat better than standard AlexNet, we found that a modified version of AlexNet could match VGG-19's performance after only a few non-linear computations. Control analyses revealed that decreasing the size of the input images caused the best-performing layer of VGG-19 to shift to a lower layer, consistent with the hypothesis that the relationship between image size and receptive field size can strongly affect model performance. We conducted additional analyses using a Gabor pyramid model to test for non-linear contributions of normalization and contrast saturation. Overall, our findings suggest that the feedforward responses of V1 neurons can be well explained by assuming only a few non-linear processing stages.
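The classical baseline referenced above, a Gabor filter followed by a simple non-linearity, can be sketched as a 1-D "energy" model: a quadrature pair of Gabor filters whose squared outputs are summed, yielding a phase-invariant, complex-cell-like response. The parameters below are illustrative only, not the paper's fitted models.

```python
import math

# Minimal 1-D Gabor energy model: Gaussian-windowed sinusoid filters in
# quadrature (phases 0 and pi/2), followed by a simple non-linearity
# (squaring and summing). Illustrative parameters, not a fitted V1 model.

def gabor(n, freq, phase, sigma):
    c = (n - 1) / 2.0
    return [math.exp(-((i - c) ** 2) / (2 * sigma ** 2))
            * math.cos(2 * math.pi * freq * (i - c) + phase)
            for i in range(n)]

def energy_response(signal, freq, sigma=3.0):
    even = gabor(len(signal), freq, 0.0, sigma)
    odd = gabor(len(signal), freq, math.pi / 2, sigma)
    re = sum(f * s for f, s in zip(even, signal))
    im = sum(f * s for f, s in zip(odd, signal))
    return re ** 2 + im ** 2  # the "simple non-linearity": squared outputs

# Gratings at the preferred frequency evoke similar responses regardless
# of spatial phase, while an off-frequency grating evokes little response.
n, freq = 21, 0.2
g0 = [math.cos(2 * math.pi * freq * i) for i in range(n)]
g1 = [math.cos(2 * math.pi * freq * i + 1.0) for i in range(n)]
r0, r1 = energy_response(g0, freq), energy_response(g1, freq)
```

Only a handful of such linear-filter-plus-non-linearity stages are needed in the account the authors defend, in contrast to the many stacked non-linearities of a deep CNN.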
Affiliation(s)
- Hui-Yuan Miao
  - Department of Psychology, Vanderbilt University, Nashville, TN 37240, USA
- Frank Tong
  - Department of Psychology, Vanderbilt University, Nashville, TN 37240, USA
  - Vanderbilt Vision Research Center, Vanderbilt University, Nashville, TN 37240, USA
231
Schütt HH, Kipnis AD, Diedrichsen J, Kriegeskorte N. Statistical inference on representational geometries. eLife 2023; 12:e82566. [PMID: 37610302 PMCID: PMC10446828 DOI: 10.7554/elife.82566]
Abstract
Neuroscience has recently made much progress, expanding the complexity of both neural activity measurements and brain-computational models. However, we lack robust methods for connecting theory and experiment by evaluating our new big models with our new big data. Here, we introduce new inference methods enabling researchers to evaluate and compare models based on the accuracy of their predictions of representational geometries: A good model should accurately predict the distances among the neural population representations (e.g. of a set of stimuli). Our inference methods combine novel 2-factor extensions of crossvalidation (to prevent overfitting to either subjects or conditions from inflating our estimates of model accuracy) and bootstrapping (to enable inferential model comparison with simultaneous generalization to both new subjects and new conditions). We validate the inference methods on data where the ground-truth model is known, by simulating data with deep neural networks and by resampling of calcium-imaging and functional MRI data. Results demonstrate that the methods are valid and conclusions generalize correctly. These data analysis methods are available in an open-source Python toolbox (rsatoolbox.readthedocs.io).
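The central object in this approach, the set of distances among neural population representations (a representational dissimilarity matrix, RDM), can be sketched in a few lines of plain Python. This is a toy illustration on made-up data; the crossvalidated and bootstrapped estimators described in the abstract are what the authors' rsatoolbox package actually implements.

```python
import math

# Sketch of representational-geometry comparison: build an RDM of pairwise
# distances between condition response patterns, then score a model by the
# correlation between its RDM and the data RDM. Toy data, not rsatoolbox.

def rdm(patterns):
    """Upper-triangle Euclidean distances between condition patterns."""
    n = len(patterns)
    return [math.dist(patterns[i], patterns[j])
            for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

data = [[0.0, 0.0], [1.0, 0.1], [0.1, 1.0], [1.1, 1.1]]   # 4 conditions
model = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]  # same geometry, rescaled

score = pearson(rdm(data), rdm(model))  # high: geometries match up to scale
```

The inferential machinery in the paper addresses what this sketch omits: noise in the distance estimates, and generalization of the model comparison to new subjects and new conditions.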
Affiliation(s)
- Heiko H Schütt
  - Zuckerman Institute, Columbia University, New York, United States
232
Gong Z, Zhou M, Dai Y, Wen Y, Liu Y, Zhen Z. A large-scale fMRI dataset for the visual processing of naturalistic scenes. Sci Data 2023; 10:559. [PMID: 37612327 PMCID: PMC10447576 DOI: 10.1038/s41597-023-02471-x]
Abstract
One ultimate goal of visual neuroscience is to understand how the brain processes visual stimuli encountered in the natural environment. Achieving this goal requires records of brain responses under massive amounts of naturalistic stimuli. Although the scientific community has put a lot of effort into collecting large-scale functional magnetic resonance imaging (fMRI) data under naturalistic stimuli, more naturalistic fMRI datasets are still urgently needed. We present here the Natural Object Dataset (NOD), a large-scale fMRI dataset containing responses to 57,120 naturalistic images from 30 participants. NOD strives for a balance between sampling variation between individuals and sampling variation between stimuli. This enables NOD to be utilized not only for determining whether an observation is generalizable across many individuals, but also for testing whether a response pattern is generalized to a variety of naturalistic stimuli. We anticipate that the NOD together with existing naturalistic neuroimaging datasets will serve as a new impetus for our understanding of the visual processing of naturalistic stimuli.
Affiliation(s)
- Zhengxin Gong
  - Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing 100875, China
- Ming Zhou
  - State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
- Yuxuan Dai
  - Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing 100875, China
- Yushan Wen
  - Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing 100875, China
- Youyi Liu
  - State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
- Zonglei Zhen
  - Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing 100875, China
  - State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
233
Kozachkov L, Kastanenka KV, Krotov D. Building transformers from neurons and astrocytes. Proc Natl Acad Sci U S A 2023; 120:e2219150120. [PMID: 37579149 PMCID: PMC10450673 DOI: 10.1073/pnas.2219150120]
Abstract
Glial cells account for between 50% and 90% of all human brain cells, and serve a variety of important developmental, structural, and metabolic functions. Recent experimental efforts suggest that astrocytes, a type of glial cell, are also directly involved in core cognitive processes such as learning and memory. While it is well established that astrocytes and neurons are connected to one another in feedback loops across many timescales and spatial scales, there is a gap in understanding the computational role of neuron-astrocyte interactions. To help bridge this gap, we draw on recent advances in AI and astrocyte imaging technology. In particular, we show that neuron-astrocyte networks can naturally perform the core computation of a Transformer, a particularly successful type of AI architecture. In doing so, we provide a concrete, normative, and experimentally testable account of neuron-astrocyte communication. Because Transformers are so successful across a wide variety of task domains, such as language, vision, and audition, our analysis may help explain the ubiquity, flexibility, and power of the brain's neuron-astrocyte networks.
Affiliation(s)
- Leo Kozachkov
  - MIT-IBM Watson AI Lab, IBM Research, Cambridge, MA 02142
  - Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- Ksenia V. Kastanenka
  - Department of Neurology, MassGeneral Institute for Neurodegenerative Diseases, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02115
- Dmitry Krotov
  - MIT-IBM Watson AI Lab, IBM Research, Cambridge, MA 02142
234
Hebscher M, Bainbridge WA, Voss JL. Neural similarity between overlapping events at learning differentially affects reinstatement across the cortex. Neuroimage 2023; 277:120220. [PMID: 37321360 PMCID: PMC10468827 DOI: 10.1016/j.neuroimage.2023.120220]
Abstract
Episodic memory often involves high overlap between the actors, locations, and objects of everyday events. Under some circumstances, it may be beneficial to distinguish, or differentiate, neural representations of similar events to avoid interference at recall. Alternatively, forming overlapping representations of similar events, or integration, may aid recall by linking shared information between memories. It is currently unclear how the brain supports these seemingly conflicting functions of differentiation and integration. We used multivoxel pattern similarity analysis (MVPA) of fMRI data and neural-network analysis of visual similarity to examine how highly overlapping naturalistic events are encoded in patterns of cortical activity, and how the degree of differentiation versus integration at encoding affects later retrieval. Participants performed an episodic memory task in which they learned and recalled naturalistic video stimuli with high feature overlap. Visually similar videos were encoded in overlapping patterns of neural activity in temporal, parietal, and occipital regions, suggesting integration. We further found that encoding processes differentially predicted later reinstatement across the cortex. In visual processing regions in occipital cortex, greater differentiation at encoding predicted later reinstatement. Higher-level sensory processing regions in temporal and parietal lobes showed the opposite pattern, whereby highly integrated stimuli showed greater reinstatement. Moreover, integration in high-level sensory processing regions during encoding predicted greater accuracy and vividness at recall. These findings provide novel evidence that encoding-related differentiation and integration processes across the cortex have divergent effects on later recall of highly similar naturalistic events.
Affiliation(s)
- Melissa Hebscher
  - Department of Neurology, University of Chicago, Chicago, IL 60637, USA
- Wilma A Bainbridge
  - Department of Psychology, University of Chicago, Chicago, IL 60637, USA
  - The Neuroscience Institute, University of Chicago, Chicago, IL 60637, USA
- Joel L Voss
  - Department of Neurology, University of Chicago, Chicago, IL 60637, USA
235
Veerabadran V, Goldman J, Shankar S, Cheung B, Papernot N, Kurakin A, Goodfellow I, Shlens J, Sohl-Dickstein J, Mozer MC, Elsayed GF. Subtle adversarial image manipulations influence both human and machine perception. Nat Commun 2023; 14:4933. [PMID: 37582834 PMCID: PMC10427626 DOI: 10.1038/s41467-023-40499-0]
Abstract
Although artificial neural networks (ANNs) were inspired by the brain, ANNs exhibit a brittleness not generally observed in human perception. One shortcoming of ANNs is their susceptibility to adversarial perturbations-subtle modulations of natural images that result in changes to classification decisions, such as confidently mislabelling an image of an elephant, initially classified correctly, as a clock. In contrast, a human observer might well dismiss the perturbations as an innocuous imaging artifact. This phenomenon may point to a fundamental difference between human and machine perception, but it drives one to ask whether human sensitivity to adversarial perturbations might be revealed with appropriate behavioral measures. Here, we find that adversarial perturbations that fool ANNs similarly bias human choice. We further show that the effect is more likely driven by higher-order statistics of natural images to which both humans and ANNs are sensitive, rather than by the detailed architecture of the ANN.
Affiliation(s)
- Vijay Veerabadran
  - Google, Mountain View, CA, USA
  - Department of Cognitive Science, University of California, San Diego, CA, USA
- Shreya Shankar
  - Google, Mountain View, CA, USA
  - University of California, Berkeley, CA, USA
- Brian Cheung
  - Google, Mountain View, CA, USA
  - MIT Brain and Cognitive Sciences, Cambridge, MA, USA
236
Bredenberg C, Savin C. Desiderata for normative models of synaptic plasticity. arXiv 2023; arXiv:2308.04988v1. [PMID: 37608931 PMCID: PMC10441445]
Abstract
Normative models of synaptic plasticity use a combination of mathematics and computational simulations to arrive at predictions of behavioral and network-level adaptive phenomena. In recent years, there has been an explosion of theoretical work on these models, but experimental confirmation is relatively limited. In this review, we organize work on normative plasticity models in terms of a set of desiderata which, when satisfied, are designed to guarantee that a model has a clear link between plasticity and adaptive behavior, consistency with known biological evidence about neural plasticity, and specific testable predictions. We then discuss how new models have begun to improve on these criteria and suggest avenues for further development. As prototypes, we provide detailed analyses of two specific models - REINFORCE and the Wake-Sleep algorithm. We provide a conceptual guide to help develop neural learning theories that are precise, powerful, and experimentally testable.
Affiliation(s)
- Colin Bredenberg
  - Center for Neural Science, New York University, New York, NY 10003, USA
  - Mila-Quebec AI Institute, 6666 Rue Saint-Urbain, Montréal, QC H2S 3H1
- Cristina Savin
  - Center for Neural Science, New York University, New York, NY 10003, USA
  - Center for Data Science, New York University, New York, NY 10011, USA
237
Wybo WAM, Tsai MC, Tran VAK, Illing B, Jordan J, Morrison A, Senn W. NMDA-driven dendritic modulation enables multitask representation learning in hierarchical sensory processing pathways. Proc Natl Acad Sci U S A 2023; 120:e2300558120. [PMID: 37523562 PMCID: PMC10410730 DOI: 10.1073/pnas.2300558120]
Abstract
While sensory representations in the brain depend on context, it remains unclear how such modulations are implemented at the biophysical level, and how processing layers further in the hierarchy can extract useful features for each possible contextual state. Here, we demonstrate that dendritic N-Methyl-D-Aspartate spikes can, within physiological constraints, implement contextual modulation of feedforward processing. Such neuron-specific modulations exploit prior knowledge, encoded in stable feedforward weights, to achieve transfer learning across contexts. In a network of biophysically realistic neuron models with context-independent feedforward weights, we show that modulatory inputs to dendritic branches can solve linearly nonseparable learning problems with a Hebbian, error-modulated learning rule. We also demonstrate that local prediction of whether representations originate either from different inputs, or from different contextual modulations of the same input, results in representation learning of hierarchical feedforward weights across processing layers that accommodate a multitude of contexts.
Affiliation(s)
- Willem A. M. Wybo
  - Institute of Neuroscience and Medicine (INM-6) and Institute for Advanced Simulation (IAS-6) and JARA-Institute Brain Structure–Function Relationships (INM-10), Jülich Research Center, DE-52428 Jülich, Germany
- Matthias C. Tsai
  - Department of Physiology, University of Bern, CH-3012 Bern, Switzerland
- Viet Anh Khoa Tran
  - Institute of Neuroscience and Medicine (INM-6) and Institute for Advanced Simulation (IAS-6) and JARA-Institute Brain Structure–Function Relationships (INM-10), Jülich Research Center, DE-52428 Jülich, Germany
  - Department of Computer Science 3, Faculty 1, RWTH Aachen University, DE-52074 Aachen, Germany
- Bernd Illing
  - Laboratory of Computational Neuroscience, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
- Jakob Jordan
  - Department of Physiology, University of Bern, CH-3012 Bern, Switzerland
- Abigail Morrison
  - Institute of Neuroscience and Medicine (INM-6) and Institute for Advanced Simulation (IAS-6) and JARA-Institute Brain Structure–Function Relationships (INM-10), Jülich Research Center, DE-52428 Jülich, Germany
  - Department of Computer Science 3, Faculty 1, RWTH Aachen University, DE-52074 Aachen, Germany
- Walter Senn
  - Department of Physiology, University of Bern, CH-3012 Bern, Switzerland
238
Bernáez Timón L, Ekelmans P, Kraynyukova N, Rose T, Busse L, Tchumatchenko T. How to incorporate biological insights into network models and why it matters. J Physiol 2023; 601:3037-3053. [PMID: 36069408 DOI: 10.1113/jp282755]
Abstract
Due to the staggering complexity of the brain and its neural circuitry, neuroscientists rely on the analysis of mathematical models to elucidate its function. From Hodgkin and Huxley's detailed description of the action potential in 1952 to today, new theories and increasing computational power have opened up novel avenues to study how neural circuits implement the computations that underlie behaviour. Computational neuroscientists have developed many models of neural circuits that differ in complexity, biological realism or emergent network properties. With recent advances in experimental techniques for detailed anatomical reconstructions or large-scale activity recordings, rich biological data have become more available. The challenge when building network models is to reflect experimental results, either through a high level of detail or by finding an appropriate level of abstraction. Meanwhile, machine learning has facilitated the development of artificial neural networks, which are trained to perform specific tasks. While they have proven successful at achieving task-oriented behaviour, they are often abstract constructs that differ in many features from the physiology of brain circuits. Thus, it is unclear whether the mechanisms underlying computation in biological circuits can be investigated by analysing artificial networks that accomplish the same function but differ in their mechanisms. Here, we argue that building biologically realistic network models is crucial to establishing causal relationships between neurons, synapses, circuits and behaviour. More specifically, we advocate for network models that consider the connectivity structure and the recorded activity dynamics while evaluating task performance.
Affiliation(s)
- Laura Bernáez Timón
  - Institute for Physiological Chemistry, University of Mainz Medical Center, Mainz, Germany
- Pierre Ekelmans
  - Frankfurt Institute for Advanced Studies, Frankfurt, Germany
- Nataliya Kraynyukova
  - Institute of Experimental Epileptology and Cognition Research, University of Bonn Medical Center, Bonn, Germany
- Tobias Rose
  - Institute of Experimental Epileptology and Cognition Research, University of Bonn Medical Center, Bonn, Germany
- Laura Busse
  - Division of Neurobiology, Faculty of Biology, LMU Munich, Munich, Germany
  - Bernstein Center for Computational Neuroscience, Munich, Germany
- Tatjana Tchumatchenko
  - Institute for Physiological Chemistry, University of Mainz Medical Center, Mainz, Germany
  - Institute of Experimental Epileptology and Cognition Research, University of Bonn Medical Center, Bonn, Germany
239
Baek S, Park Y, Paik SB. Species-specific wiring of cortical circuits for small-world networks in the primary visual cortex. PLoS Comput Biol 2023; 19:e1011343. [PMID: 37540638 PMCID: PMC10403141 DOI: 10.1371/journal.pcbi.1011343]
Abstract
Long-range horizontal connections (LRCs) are conspicuous anatomical structures in the primary visual cortex (V1) of mammals, yet their detailed functions in relation to visual processing are not fully understood. Here, we show that LRCs are key components to organize a "small-world network" optimized for each size of the visual cortex, enabling the cost-efficient integration of visual information. Using computational simulations of a biologically inspired model neural network, we found that sparse LRCs added to networks, combined with dense local connections, compose a small-world network and significantly enhance image classification performance. We confirmed that the performance of the network appeared to be strongly correlated with the small-world coefficient of the model network under various conditions. Our theoretical model demonstrates that the amount of LRCs to build a small-world network depends on each size of cortex and that LRCs are beneficial only when the size of the network exceeds a certain threshold. Our model simulation of various sizes of cortices validates this prediction and provides an explanation of the species-specific existence of LRCs in animal data. Our results provide insight into a biological strategy of the brain to balance functional performance and resource cost.
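The small-world intuition behind this result can be sketched with a toy graph: adding a few long-range shortcuts to a densely locally connected ring lattice leaves the local wiring intact but sharply reduces the average path length between nodes. Network sizes and shortcut placements below are illustrative, not the paper's model.

```python
from collections import deque

# Toy small-world demonstration: a ring lattice (dense local connections)
# versus the same lattice plus a few long-range shortcuts (LRC analogue).
# Illustrative graph only, far simpler than the paper's model networks.

def ring_lattice(n, k):
    """Each node connects to its k nearest neighbours on each side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, k + 1):
            adj[i].add((i + d) % n)
            adj[(i + d) % n].add(i)
    return adj

def avg_path_length(adj):
    """Mean shortest-path length over all ordered node pairs (BFS)."""
    n = len(adj)
    total = 0
    for s in adj:
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        total += sum(dist.values())
    return total / (n * (n - 1))

local = ring_lattice(60, 2)            # dense local connections only
small_world = ring_lattice(60, 2)
for i in (0, 10, 20):                  # three sparse long-range shortcuts
    small_world[i].add(i + 30)
    small_world[i + 30].add(i)

apl_dense_local = avg_path_length(local)
apl_small_world = avg_path_length(small_world)
# Shortcuts do not remove any local edge, so clustering is preserved while
# global path length drops: the signature trade-off of a small-world network.
```

The paper's small-world coefficient additionally normalizes clustering and path length against random graphs; this sketch only shows the path-length side of that trade-off.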
Affiliation(s)
- Seungdae Baek
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- Youngjin Park
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- Se-Bum Paik
  - Department of Brain and Cognitive Sciences, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
240
Jang H, Tong F. Improved modeling of human vision by incorporating robustness to blur in convolutional neural networks. bioRxiv 2023:2023.07.29.551089. [PMID: 37577646 PMCID: PMC10418076 DOI: 10.1101/2023.07.29.551089]
Abstract
Whenever a visual scene is cast onto the retina, much of it will appear degraded due to poor resolution in the periphery; moreover, optical defocus can cause blur in central vision. However, the pervasiveness of blurry or degraded input is typically overlooked in the training of convolutional neural networks (CNNs). We hypothesized that the absence of blurry training inputs may cause CNNs to rely excessively on high spatial frequency information for object recognition, thereby causing systematic deviations from biological vision. We evaluated this hypothesis by comparing standard CNNs with CNNs trained on a combination of clear and blurry images. We show that blur-trained CNNs outperform standard CNNs at predicting neural responses to objects across a variety of viewing conditions. Moreover, blur-trained CNNs acquire increased sensitivity to shape information and greater robustness to multiple forms of visual noise, leading to improved correspondence with human perception. Our results provide novel neurocomputational evidence that blurry visual experiences are very important for conferring robustness to biological visual systems.
Affiliation(s)
- Hojin Jang
  - Department of Psychology and Vanderbilt Vision Research Center, Vanderbilt University
- Frank Tong
  - Department of Psychology and Vanderbilt Vision Research Center, Vanderbilt University
241
Janik RA. Aesthetics and neural network image representations. Sci Rep 2023; 13:11428. [PMID: 37454170 DOI: 10.1038/s41598-023-38443-9]
Abstract
We analyze the spaces of images encoded by generative neural networks of the BigGAN architecture. We find that generic multiplicative perturbations of neural network parameters away from the photo-realistic point often lead to networks generating images which appear as "artistic renditions" of the corresponding objects. This demonstrates an emergence of aesthetic properties directly from the structure of the photo-realistic visual environment as encoded in its neural network parametrization. Moreover, modifying a deep semantic part of the neural network leads to the appearance of symbolic visual representations. None of the considered networks had any access to images of human-made art.
Affiliation(s)
- Romuald A Janik
  - Institute of Theoretical Physics and Mark Kac Center for Complex Systems Research, Jagiellonian University, ul. Łojasiewicza 11, 30-348, Kraków, Poland

242
242
|
Liu J, Bayle DJ, Spagna A, Sitt JD, Bourgeois A, Lehongre K, Fernandez-Vidal S, Adam C, Lambrecq V, Navarro V, Seidel Malkinson T, Bartolomeo P. Fronto-parietal networks shape human conscious report through attention gain and reorienting. Commun Biol 2023; 6:730. PMID: 37454150. PMCID: PMC10349830. DOI: 10.1038/s42003-023-05108-2. Received 04/28/2023; accepted 07/06/2023. Open access.
Abstract
How do attention and consciousness interact in the human brain? Rival theories of consciousness disagree on the role of fronto-parietal attentional networks in conscious perception. We recorded neural activity from 727 intracerebral contacts in 13 epileptic patients while they detected near-threshold targets preceded by attentional cues. Clustering revealed three neural patterns. First, attention-enhanced conscious report accompanied sustained right-hemisphere fronto-temporal activity in networks connected by the superior longitudinal fasciculus (SLF) II-III, and late accumulation of activity (>300 ms post-target) in bilateral dorso-prefrontal and right-hemisphere orbitofrontal cortex (SLF I-III). Second, attentional reorienting affected conscious report through early, sustained activity in a right-hemisphere network (SLF III). Third, conscious report accompanied left-hemisphere dorsolateral-prefrontal activity. Task modeling with recurrent neural networks revealed multiple clusters matching the identified brain clusters, elucidating the causal relationship between clusters in conscious perception of near-threshold targets. Thus, distinct, hemisphere-asymmetric fronto-parietal networks support attentional gain and reorienting in shaping human conscious experience.
Affiliation(s)
- Jianghao Liu
  - Sorbonne Université, Inserm, CNRS, Paris Brain Institute, ICM, Hôpital de la Pitié-Salpêtrière, 75013, Paris, France
  - Dassault Systèmes, Vélizy-Villacoublay, France
- Alfredo Spagna
  - Sorbonne Université, Inserm, CNRS, Paris Brain Institute, ICM, Hôpital de la Pitié-Salpêtrière, 75013, Paris, France
  - Department of Psychology, Columbia University in the City of New York, New York, NY, 10027, USA
- Jacobo D Sitt
  - Sorbonne Université, Inserm, CNRS, Paris Brain Institute, ICM, Hôpital de la Pitié-Salpêtrière, 75013, Paris, France
- Alexia Bourgeois
  - Laboratory of Cognitive Neurorehabilitation, Faculty of Medicine, University of Geneva, 1206, Geneva, Switzerland
- Katia Lehongre
  - CENIR - Centre de Neuro-Imagerie de Recherche, Paris Brain Institute, ICM, Hôpital de la Pitié-Salpêtrière, 75013, Paris, France
- Sara Fernandez-Vidal
  - CENIR - Centre de Neuro-Imagerie de Recherche, Paris Brain Institute, ICM, Hôpital de la Pitié-Salpêtrière, 75013, Paris, France
- Claude Adam
  - Epilepsy Unit, AP-HP, Pitié-Salpêtrière Hospital, 75013, Paris, France
- Virginie Lambrecq
  - Sorbonne Université, Inserm, CNRS, Paris Brain Institute, ICM, Hôpital de la Pitié-Salpêtrière, 75013, Paris, France
  - Epilepsy Unit, AP-HP, Pitié-Salpêtrière Hospital, 75013, Paris, France
  - Clinical Neurophysiology Department, AP-HP, Pitié-Salpêtrière Hospital, 75013, Paris, France
- Vincent Navarro
  - Sorbonne Université, Inserm, CNRS, Paris Brain Institute, ICM, Hôpital de la Pitié-Salpêtrière, 75013, Paris, France
  - Epilepsy Unit, AP-HP, Pitié-Salpêtrière Hospital, 75013, Paris, France
  - Clinical Neurophysiology Department, AP-HP, Pitié-Salpêtrière Hospital, 75013, Paris, France
- Tal Seidel Malkinson
  - Sorbonne Université, Inserm, CNRS, Paris Brain Institute, ICM, Hôpital de la Pitié-Salpêtrière, 75013, Paris, France
  - CNRS, CRAN, Université de Lorraine, F-54000, Nancy, France
- Paolo Bartolomeo
  - Sorbonne Université, Inserm, CNRS, Paris Brain Institute, ICM, Hôpital de la Pitié-Salpêtrière, 75013, Paris, France

243
Abstract
Flexible behavior requires the creation, updating, and expression of memories to depend on context. While the neural underpinnings of each of these processes have been intensively studied, recent advances in computational modeling revealed a key challenge in context-dependent learning that had been largely ignored previously: Under naturalistic conditions, context is typically uncertain, necessitating contextual inference. We review a theoretical approach to formalizing context-dependent learning in the face of contextual uncertainty and the core computations it requires. We show how this approach begins to organize a large body of disparate experimental observations, from multiple levels of brain organization (including circuits, systems, and behavior) and multiple brain regions (most prominently the prefrontal cortex, the hippocampus, and motor cortices), into a coherent framework. We argue that contextual inference may also be key to understanding continual learning in the brain. This theory-driven perspective places contextual inference as a core component of learning.
Affiliation(s)
- James B Heald
  - Department of Neuroscience and Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Daniel M Wolpert
  - Department of Neuroscience and Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
  - Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
- Máté Lengyel
  - Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
  - Center for Cognitive Computation, Department of Cognitive Science, Central European University, Budapest, Hungary

244
Farzmahdi A, Zarco W, Freiwald W, Kriegeskorte N, Golan T. Emergence of brain-like mirror-symmetric viewpoint tuning in convolutional neural networks. bioRxiv 2023:2023.01.05.522909. PMID: 36711779. PMCID: PMC9881894. DOI: 10.1101/2023.01.05.522909.
Abstract
Primates can recognize objects despite 3D geometric variations such as in-depth rotations. The computational mechanisms that give rise to such invariances are yet to be fully understood. A curious case of partial invariance occurs in the macaque face-patch AL and in fully connected layers of deep convolutional networks in which neurons respond similarly to mirror-symmetric views (e.g., left and right profiles). Why does this tuning develop? Here, we propose a simple learning-driven explanation for mirror-symmetric viewpoint tuning. We show that mirror-symmetric viewpoint tuning for faces emerges in the fully connected layers of convolutional deep neural networks trained on object recognition tasks, even when the training dataset does not include faces. First, using 3D objects rendered from multiple views as test stimuli, we demonstrate that mirror-symmetric viewpoint tuning in convolutional neural network models is not unique to faces: it emerges for multiple object categories with bilateral symmetry. Second, we show why this invariance emerges in the models. Learning to discriminate among bilaterally symmetric object categories induces reflection-equivariant intermediate representations. AL-like mirror-symmetric tuning is achieved when such equivariant responses are spatially pooled by downstream units with sufficiently large receptive fields. These results explain how mirror-symmetric viewpoint tuning can emerge in neural networks, providing a theory of how they might emerge in the primate brain. Our theory predicts that mirror-symmetric viewpoint tuning can emerge as a consequence of exposure to bilaterally symmetric objects beyond the category of faces, and that it can generalize beyond previously experienced object categories.
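The proposed mechanism, reflection-equivariant feature maps followed by spatial pooling with large receptive fields, can be illustrated with a toy filter bank that is closed under mirroring. The filters and image below are arbitrary stand-ins, not the paper's trained networks:

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Plain 2-D cross-correlation (valid region), as in a conv layer."""
    H, W = image.shape
    h, w = kernel.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + h, j:j + w] * kernel)
    return out

rng = np.random.default_rng(0)
f = rng.standard_normal((3, 3))
bank = [f, f[:, ::-1]]      # a filter and its mirror image: closed under reflection

img = rng.standard_normal((8, 8))
mirrored = img[:, ::-1]

def pooled_response(image):
    # Global average pooling over space, summed across the mirror-paired channels.
    # Mirroring the input mirrors each feature map and swaps the two channels,
    # so the pooled sum is unchanged: mirror-symmetric tuning from pooling alone.
    return sum(correlate2d_valid(image, k).mean() for k in bank)

r1 = pooled_response(img)
r2 = pooled_response(mirrored)
```

This is the identity the paper exploits: correlating a mirrored image with a filter equals the mirrored map of correlating the original image with the mirrored filter, so any pooling that is invariant to spatial position and to the channel swap responds equally to a view and its reflection.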
245
Celeghin A, Borriero A, Orsenigo D, Diano M, Méndez Guerrero CA, Perotti A, Petri G, Tamietto M. Convolutional neural networks for vision neuroscience: significance, developments, and outstanding issues. Front Comput Neurosci 2023; 17:1153572. PMID: 37485400. PMCID: PMC10359983. DOI: 10.3389/fncom.2023.1153572. Received 01/29/2023; accepted 06/19/2023. Open access.
Abstract
Convolutional Neural Networks (CNN) are a class of machine learning models predominantly used in computer vision tasks and can achieve human-like performance through learning from experience. Their striking similarities to the structural and functional principles of the primate visual system allow for comparisons between these artificial networks and their biological counterparts, enabling exploration of how visual functions and neural representations may emerge in the real brain from a limited set of computational principles. After considering the basic features of CNNs, we discuss the opportunities and challenges of endorsing CNNs as in silico models of the primate visual system. Specifically, we highlight several emerging notions about the anatomical and physiological properties of the visual system that still need to be systematically integrated into current CNN models. These tenets include the implementation of parallel processing pathways from the early stages of retinal input and the reconsideration of several assumptions concerning the serial progression of information flow. We suggest design choices and architectural constraints that could facilitate a closer alignment with biology and provide causal evidence of the predictive link between the artificial and biological visual systems. Adopting this principled perspective could potentially lead to new research questions and applications of CNNs beyond modeling object recognition.
Affiliation(s)
- Davide Orsenigo
  - Department of Psychology, University of Torino, Turin, Italy
- Matteo Diano
  - Department of Psychology, University of Torino, Turin, Italy
- Marco Tamietto
  - Department of Psychology, University of Torino, Turin, Italy
  - Department of Medical and Clinical Psychology, CoRPS (Center of Research on Psychology in Somatic Diseases), Tilburg University, Tilburg, Netherlands

246
Lim C, Inagaki M, Shinozaki T, Fujita I. Analysis of convolutional neural networks reveals the computational properties essential for subcortical processing of facial expression. Sci Rep 2023; 13:10908. PMID: 37407668. DOI: 10.1038/s41598-023-37995-0. Received 02/09/2023; accepted 06/30/2023. Open access.
Abstract
Perception of facial expression is crucial for primate social interactions. This visual information is processed through the ventral cortical pathway and the subcortical pathway. However, the subcortical pathway exhibits inaccurate processing, and the responsible architectural and physiological properties remain unclear. To investigate this, we constructed and examined convolutional neural networks with three key properties of the subcortical pathway: a shallow layer architecture, concentric receptive fields at the initial processing stage, and a greater degree of spatial pooling. These neural networks achieved modest accuracy in classifying facial expressions. By replacing these properties, individually or in combination, with corresponding cortical features, performance gradually improved. Similar to amygdala neurons, some units in the final processing layer exhibited sensitivity to retina-based spatial frequencies (SFs), while others were sensitive to object-based SFs. Replacement of any of these properties affected the coordinates of the SF encoding. Therefore, all three properties limit the accuracy of facial expression information and are essential for determining the SF representation coordinate. These findings characterize the role of the subcortical computational processes in facial expression recognition.
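The "concentric receptive fields at the initial processing stage" referenced here are conventionally modeled as center-surround, difference-of-Gaussians filters. The construction below is a standard illustrative sketch with arbitrary size and width parameters, not the exact filters used in the study:

```python
import numpy as np

def dog_kernel(size=15, sigma_center=1.0, sigma_surround=3.0):
    """Concentric center-surround receptive field as a difference of Gaussians.
    Each Gaussian is normalized to unit mass, so the filter's response to
    uniform illumination is exactly zero (a band-pass, orientation-free unit)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx**2 + yy**2
    center = np.exp(-r2 / (2 * sigma_center**2))
    surround = np.exp(-r2 / (2 * sigma_surround**2))
    center /= center.sum()
    surround /= surround.sum()
    return center - surround

k = dog_kernel()
```

Because such a filter has no preferred orientation, a front end built from it passes spatial-frequency content while discarding the oriented structure that cortical (Gabor-like) first layers extract, which is one way the subcortical properties in the abstract can limit downstream accuracy.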
Affiliation(s)
- Chanseok Lim
  - Laboratory for Cognitive Neuroscience, Graduate School of Frontier Biosciences, Osaka University, 1-4 Yamadaoka, Suita, Osaka, 565-0871, Japan
  - Perceptual and Cognitive Neuroscience Laboratory, Graduate School of Frontier Biosciences, Osaka University, 1-4 Yamadaoka, Suita, Osaka, 565-0871, Japan
- Mikio Inagaki
  - Laboratory for Cognitive Neuroscience, Graduate School of Frontier Biosciences, Osaka University, 1-4 Yamadaoka, Suita, Osaka, 565-0871, Japan
  - Center for Information and Neural Networks, National Institute of Information and Communications Technology, 1-4 Yamadaoka, Suita, Osaka, 565-0871, Japan
- Takashi Shinozaki
  - Center for Information and Neural Networks, National Institute of Information and Communications Technology, 1-4 Yamadaoka, Suita, Osaka, 565-0871, Japan
  - Computational Neuroscience Laboratory, Faculty of Informatics, Kindai University, 3-4-1 Kowakae, Higashiosaka, Osaka, 577-8502, Japan
- Ichiro Fujita
  - Laboratory for Cognitive Neuroscience, Graduate School of Frontier Biosciences, Osaka University, 1-4 Yamadaoka, Suita, Osaka, 565-0871, Japan
  - Center for Information and Neural Networks, National Institute of Information and Communications Technology, 1-4 Yamadaoka, Suita, Osaka, 565-0871, Japan
  - Research Organization of Science and Technology, Ritsumeikan University, 1-1-1 Noji-Higashi, Kusatsu, Shiga, 525-8577, Japan

247
Dyballa L, Rudzite AM, Hoseini MS, Thapa M, Stryker MP, Field GD, Zucker SW. Population encoding of stimulus features along the visual hierarchy. bioRxiv 2023:2023.06.27.545450. PMID: 37425920. PMCID: PMC10327159. DOI: 10.1101/2023.06.27.545450.
Abstract
The retina and primary visual cortex (V1) both exhibit diverse neural populations sensitive to diverse visual features. Yet it remains unclear how neural populations in each area partition stimulus space to span these features. One possibility is that neural populations are organized into discrete groups of neurons, with each group signaling a particular constellation of features. Alternatively, neurons could be continuously distributed across feature-encoding space. To distinguish these possibilities, we presented a battery of visual stimuli to mouse retina and V1 while measuring neural responses with multi-electrode arrays. Using machine learning approaches, we developed a manifold embedding technique that captures how neural populations partition feature space and how visual responses correlate with physiological and anatomical properties of individual neurons. We show that retinal populations discretely encode features, while V1 populations provide a more continuous representation. Applying the same analysis approach to convolutional neural networks that model visual processing, we demonstrate that they partition features much more similarly to the retina, indicating they are more like big retinas than little brains.
Affiliation(s)
- Mishek Thapa
  - Department of Neurobiology, Duke University
  - Stein Eye Institute, Department of Ophthalmology, David Geffen School of Medicine, University of California, Los Angeles
- Michael P. Stryker
  - Department of Physiology, University of California, San Francisco
  - Kavli Institute for Fundamental Neuroscience, University of California, San Francisco
- Greg D. Field
  - Department of Neurobiology, Duke University
  - Stein Eye Institute, Department of Ophthalmology, David Geffen School of Medicine, University of California, Los Angeles
- Steven W. Zucker
  - Department of Computer Science, Yale University
  - Department of Biomedical Engineering, Yale University

248
Kanagamani T, Chakravarthy VS, Ravindran B, Menon RN. A deep network-based model of hippocampal memory functions under normal and Alzheimer's disease conditions. Front Neural Circuits 2023; 17:1092933. PMID: 37416627. PMCID: PMC10320296. DOI: 10.3389/fncir.2023.1092933. Received 11/08/2022; accepted 06/02/2023. Open access.
Abstract
We present a deep network-based model of the associative memory functions of the hippocampus. The proposed network architecture has two key modules: (1) an autoencoder module, which represents the forward and backward cortico-hippocampal projections, and (2) a module that computes the familiarity of the stimulus and implements hill-climbing over familiarity, representing the dynamics of the loops within the hippocampus. The proposed network is used in two simulation studies. In the first part of the study, the network is used to simulate image pattern completion by autoassociation under normal conditions. In the second part, the proposed network is extended to a heteroassociative memory and is used to simulate a picture-naming task under normal and Alzheimer's disease (AD) conditions. The network is trained on pictures and names of digits from 0 to 9. The encoder layer of the network is partly damaged to simulate AD conditions. As in AD patients, under the moderate damage condition the network recalls superordinate words ("odd" instead of "nine"). Under severe damage conditions, the network gives a null response ("I don't know"). The neurobiological plausibility of the model is discussed extensively.
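The hill-climbing-over-familiarity dynamic described in this abstract can be sketched with a toy binary memory, where familiarity is the similarity to the best-matching stored pattern. The patterns, the similarity measure, and the bit-flip update rule are illustrative stand-ins for the paper's deep autoencoder, not its actual implementation:

```python
import numpy as np

# Stored memory traces (random ±1 patterns standing in for learned attractors).
rng = np.random.default_rng(1)
memories = rng.choice([-1.0, 1.0], size=(3, 32))

def familiarity(x):
    """Familiarity = normalized similarity to the best-matching stored pattern."""
    return (memories @ x).max() / len(x)

def complete(x):
    """Greedy hill-climbing: flip any single bit that strictly raises familiarity.
    The climb can only stop at an exact stored pattern (familiarity 1.0)."""
    x = x.copy()
    improved = True
    while improved:
        improved = False
        for i in range(len(x)):
            trial = x.copy()
            trial[i] = -trial[i]
            if familiarity(trial) > familiarity(x):
                x = trial
                improved = True
    return x

# Degrade a stored pattern by flipping 6 of its 32 bits, then complete it:
# hill-climbing restores the flipped bits (pattern completion by autoassociation).
probe = memories[0].copy()
flip = rng.choice(32, size=6, replace=False)
probe[flip] *= -1
recalled = complete(probe)
```

Partial damage to the memory (in the paper, lesioning the encoder) would flatten this familiarity landscape, so the climb terminates near a coarser attractor, which is the proposed account of superordinate ("odd" for "nine") and null responses.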
Affiliation(s)
- Tamizharasan Kanagamani
  - Laboratory for Computational Neuroscience, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, TN, India
- V. Srinivasa Chakravarthy
  - Laboratory for Computational Neuroscience, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, TN, India
- Balaraman Ravindran
  - Department of Computer Science and Engineering, Robert Bosch Centre for Data Science and AI, Indian Institute of Technology Madras, Chennai, TN, India
- Ramshekhar N. Menon
  - Cognition and Behavioural Neurology Section, Department of Neurology, Sree Chitra Tirunal Institute for Medical Sciences and Technology, Trivandrum, Kerala, India

249
Li D, Chang L. Representational geometry of incomplete faces in macaque face patches. Cell Rep 2023; 42:112673. PMID: 37342911. DOI: 10.1016/j.celrep.2023.112673. Received 12/12/2022; accepted 06/06/2023. Open access.
Abstract
The neural code of faces has been intensively studied in the macaque face patch system. Although the majority of previous studies used complete faces as stimuli, faces are often seen partially in daily life. Here, we investigated how face-selective cells represent two types of incomplete faces: face fragments and occluded faces, with the location of the fragment/occluder and the facial features systematically varied. Contrary to popular belief, we found that the preferred face regions identified with two stimulus types are dissociated in many face cells. This dissociation can be explained by the nonlinear integration of information from different face parts and is closely related to a curved representation of face completeness in the state space, which allows a clear discrimination between different stimulus types. Furthermore, identity-related facial features are represented in a subspace orthogonal to the nonlinear dimension of face completeness, supporting a condition-general code of facial identity.
Affiliation(s)
- Dongyuan Li
  - Institute of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai 200031, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Le Chang
  - Institute of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai 200031, China
  - University of Chinese Academy of Sciences, Beijing 100049, China

250
Kay K, Bonnen K, Denison RN, Arcaro MJ, Barack DL. Tasks and their role in visual neuroscience. Neuron 2023; 111:1697-1713. PMID: 37040765. DOI: 10.1016/j.neuron.2023.03.022. Received 01/26/2023; accepted 03/15/2023.
Abstract
Vision is widely used as a model system to gain insights into how sensory inputs are processed and interpreted by the brain. Historically, careful quantification and control of visual stimuli have served as the backbone of visual neuroscience. There has been less emphasis, however, on how an observer's task influences the processing of sensory inputs. Motivated by diverse observations of task-dependent activity in the visual system, we propose a framework for thinking about tasks, their role in sensory processing, and how we might formally incorporate tasks into our models of vision.
Affiliation(s)
- Kendrick Kay
  - Center for Magnetic Resonance Research, Department of Radiology, University of Minnesota, Minneapolis, MN 55455, USA
- Kathryn Bonnen
  - School of Optometry, Indiana University, Bloomington, IN 47405, USA
- Rachel N Denison
  - Department of Psychological and Brain Sciences, Boston University, Boston, MA 02215, USA
- Mike J Arcaro
  - Department of Psychology, University of Pennsylvania, Philadelphia, PA 19146, USA
- David L Barack
  - Departments of Neuroscience and Philosophy, University of Pennsylvania, Philadelphia, PA 19146, USA