1. Phillips PJ, White D. The state of modelling face processing in humans with deep learning. Br J Psychol 2025. PMID: 40364689; DOI: 10.1111/bjop.12794.
Abstract
Deep learning models trained for facial recognition now surpass the highest performing human participants. Recent evidence suggests that they also model some qualitative aspects of face processing in humans. This review compares the current understanding of deep learning models with psychological models of the face processing system. Psychological models consist of two components that operate on the information encoded when people perceive a face, which we refer to here as 'face codes'. The first component, the core system, extracts face codes from retinal input that encode invariant and changeable properties. The second component, the extended system, links face codes to personal information about a person and their social context. Studies of face codes in existing deep learning models reveal some surprising results. For example, face codes in networks designed for identity recognition also encode expression information, which contrasts with psychological models that separate invariant and changeable properties. Deep learning can also be used to implement candidate models of the face processing system, for example to compare alternative cognitive architectures and codes that might support interchange between core and extended face processing systems. We conclude by summarizing seven key lessons from this research and outlining three open questions for future study.
Affiliation(s)
- David White
- School of Psychology, UNSW Sydney, Sydney, New South Wales, Australia
2. Amato LG, Vergani AA, Lassi M, Carpaneto J, Mazzeo S, Moschini V, Burali R, Salvestrini G, Fabbiani C, Giacomucci G, Galdo G, Morinelli C, Emiliani F, Scarpino M, Padiglioni S, Nacmias B, Sorbi S, Grippo A, Bessi V, Mazzoni A. Personalized brain models link cognitive decline progression to underlying synaptic and connectivity degeneration. Alzheimers Res Ther 2025; 17:74. PMID: 40188185; PMCID: PMC11971895; DOI: 10.1186/s13195-025-01718-6.
Abstract
Cognitive decline is a condition affecting almost one-sixth of the elderly population and is widely regarded as one of the first manifestations of Alzheimer's disease. Despite the extensive body of knowledge on the condition, there is no clear consensus on the structural defects and neurodegeneration processes determining cognitive decline evolution. Here, we introduce a Brain Network Model (BNM) simulating the effects of neurodegeneration on neural activity during cognitive processing. The model incorporates two key parameters accounting for distinct pathological mechanisms: synaptic degeneration, primarily leading to hyperexcitation, and brain disconnection. Through parameter optimization, we successfully replicated individual electroencephalography (EEG) responses recorded during task execution from 145 participants spanning different stages of cognitive decline. The cohort included healthy controls, patients with subjective cognitive decline (SCD), and those with mild cognitive impairment (MCI) of the Alzheimer type. Through model inversion, we generated personalized BNMs for each participant based on individual EEG recordings. These models revealed distinct network configurations corresponding to each patient's cognitive condition, with virtual neurodegeneration levels directly proportional to the severity of cognitive decline. Strikingly, the model uncovered a neurodegeneration-driven phase transition leading to two distinct regimes of neural activity underlying task execution. On either side of this phase transition, increasing synaptic degeneration induced changes in neural activity that closely mirrored experimental observations across cognitive decline stages. This enabled the model to directly link synaptic degeneration and hyperexcitation to cognitive decline severity. Furthermore, the model pinpointed posterior cingulum fiber degeneration as the structural driver of this phase transition. Our findings highlight the potential of BNMs to account for the evolution of neural activity across stages of cognitive decline while elucidating the underlying neurodegenerative mechanisms. This approach provides a novel framework for understanding how structural and functional brain alterations contribute to cognitive deterioration along the Alzheimer's continuum.
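The abstract does not give the model equations; the sketch below is only a toy illustration of how a rate-based brain network model can expose the two degeneration parameters the study describes, with a synaptic-gain term standing in for synaptic degeneration (hyperexcitation) and a connectivity-scaling term standing in for disconnection. All names, equations, and values are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def simulate_bnm(W, g_syn=1.0, c_conn=1.0, T=2.0, dt=1e-3, tau=0.02, seed=0):
    """Toy rate-based brain network model (NOT the paper's equations).

    W      : structural connectivity matrix (n_regions x n_regions)
    g_syn  : local synaptic gain; raising it stands in for synaptic
             degeneration leading to hyperexcitation
    c_conn : global scaling of long-range connections; lowering it
             stands in for brain disconnection
    """
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    r = np.zeros(n)
    rates = []
    for _ in range(int(T / dt)):
        drive = g_syn * (c_conn * W @ r) + rng.normal(0, 0.1, n)  # recurrent input + noise
        r += dt / tau * (-r + np.tanh(drive))                     # leaky rate dynamics
        rates.append(r.copy())
    return np.array(rates)

# Sweep the two "degeneration" parameters and track mean network activity,
# mimicking the idea of fitting them per participant from EEG responses.
W = np.abs(np.random.default_rng(1).normal(0, 1, (10, 10))) / 10
for g_syn, c_conn in [(1.0, 1.0), (1.5, 1.0), (1.0, 0.5)]:
    act = simulate_bnm(W, g_syn, c_conn)
    print(f"g_syn={g_syn}, c_conn={c_conn}: mean rate {act.mean():.3f}")
```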
Affiliation(s)
- Lorenzo Gaetano Amato
- The BioRobotics Institute, Sant'Anna School of Advanced Studies, Pisa, Italy
- Department of Excellence in Robotics and AI, Sant'Anna School of Advanced Studies, Pisa, Italy
- Alberto Arturo Vergani
- The BioRobotics Institute, Sant'Anna School of Advanced Studies, Pisa, Italy
- Department of Excellence in Robotics and AI, Sant'Anna School of Advanced Studies, Pisa, Italy
- Michael Lassi
- The BioRobotics Institute, Sant'Anna School of Advanced Studies, Pisa, Italy
- Department of Excellence in Robotics and AI, Sant'Anna School of Advanced Studies, Pisa, Italy
- Jacopo Carpaneto
- The BioRobotics Institute, Sant'Anna School of Advanced Studies, Pisa, Italy
- Department of Excellence in Robotics and AI, Sant'Anna School of Advanced Studies, Pisa, Italy
- Salvatore Mazzeo
- Research and Innovation Center for Dementia-CRIDEM, Careggi University Hospital, Florence, Italy
- Vita-Salute San Raffaele University, Milan, Italy
- IRCCS Policlinico San Donato, Milan, Italy
- Valentina Moschini
- Skeletal Muscles and Sensory Organs Department, Careggi University Hospital, Florence, Italy
- Giulia Giacomucci
- Department of Neuroscience, Psychology, Drug Research and Child Health, Careggi University Hospital, Florence, Italy
- Giulia Galdo
- Department of Neuroscience, Psychology, Drug Research and Child Health, Careggi University Hospital, Florence, Italy
- Carmen Morinelli
- Department of Neuroscience, Psychology, Drug Research and Child Health, Careggi University Hospital, Florence, Italy
- Filippo Emiliani
- Department of Neuroscience, Psychology, Drug Research and Child Health, Careggi University Hospital, Florence, Italy
- Maenia Scarpino
- Department of Neuroscience, Psychology, Drug Research and Child Health, Careggi University Hospital, Florence, Italy
- Sonia Padiglioni
- Department of Neuroscience, Psychology, Drug Research and Child Health, Careggi University Hospital, Florence, Italy
- Benedetta Nacmias
- IRCCS Fondazione Don Carlo Gnocchi, Florence, Italy
- Department of Neuroscience, Psychology, Drug Research and Child Health, Careggi University Hospital, Florence, Italy
- Sandro Sorbi
- IRCCS Fondazione Don Carlo Gnocchi, Florence, Italy
- Department of Neuroscience, Psychology, Drug Research and Child Health, Careggi University Hospital, Florence, Italy
- Valentina Bessi
- Department of Neuroscience, Psychology, Drug Research and Child Health, Careggi University Hospital, Florence, Italy
- Alberto Mazzoni
- The BioRobotics Institute, Sant'Anna School of Advanced Studies, Pisa, Italy
- Department of Excellence in Robotics and AI, Sant'Anna School of Advanced Studies, Pisa, Italy
3. Mattera A, Alfieri V, Granato G, Baldassarre G. Chaotic recurrent neural networks for brain modelling: A review. Neural Netw 2025; 184:107079. PMID: 39756119; DOI: 10.1016/j.neunet.2024.107079.
Abstract
Even in the absence of external stimuli, the brain is spontaneously active. Indeed, most cortical activity is internally generated by recurrence. Both theoretical and experimental studies suggest that chaotic dynamics characterize this spontaneous activity. While the precise function of brain chaotic activity is still puzzling, we know that chaos confers many advantages. From a computational perspective, chaos enhances the complexity of network dynamics. From a behavioural point of view, chaotic activity could generate the variability required for exploration. Furthermore, information storage and transfer are maximized at the critical border between order and chaos. Despite these benefits, many computational brain models avoid incorporating spontaneous chaotic activity due to the challenges it poses for learning algorithms. In recent years, however, multiple approaches have been proposed to overcome this limitation. As a result, many different algorithms have been developed, initially within the reservoir computing paradigm. Over time, the field has evolved to increase the biological plausibility and performance of the algorithms, sometimes going beyond the reservoir computing framework. In this review article, we examine the computational benefits of chaos and the unique properties of chaotic recurrent neural networks, with a particular focus on those typically utilized in reservoir computing. We also provide a detailed analysis of the algorithms designed to train chaotic RNNs, tracing their historical evolution and highlighting key milestones in their development. Finally, we explore the applications and limitations of chaotic RNNs for brain modelling, consider their potential broader impacts beyond neuroscience, and outline promising directions for future research.
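As a generic illustration of the reservoir-computing setting this review focuses on (not code from the review), the sketch below builds a random recurrent network whose gain pushes it into spontaneous, chaos-like activity and trains only a linear ridge-regression readout on top of it. The gain value, network size, and target signal are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 300                                          # reservoir size
g = 1.5                                          # gain > 1 pushes the dynamics toward chaos
W = g * rng.normal(0, 1 / np.sqrt(N), (N, N))    # fixed random recurrent weights

def run_reservoir(W, steps=500, dt=0.1, tau=1.0):
    """Spontaneous (input-free) activity of a rate-based recurrent network."""
    x = rng.normal(0, 0.5, W.shape[0])
    states = []
    for _ in range(steps):
        x += dt / tau * (-x + W @ np.tanh(x))    # leaky recurrent dynamics
        states.append(np.tanh(x).copy())
    return np.array(states)

states = run_reservoir(W)

# Reservoir-computing recipe: keep the chaotic network fixed and train only a
# linear readout (ridge regression), e.g. to reproduce a target signal.
target = np.sin(np.linspace(0, 20, len(states)))
lam = 1e-2
w_out = np.linalg.solve(states.T @ states + lam * np.eye(N), states.T @ target)
print("readout training error:", np.mean((states @ w_out - target) ** 2))
```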
Affiliation(s)
- Andrea Mattera
- Institute of Cognitive Sciences and Technology, National Research Council, Via Romagnosi 18a, I-00196, Rome, Italy.
- Valerio Alfieri
- Institute of Cognitive Sciences and Technology, National Research Council, Via Romagnosi 18a, I-00196, Rome, Italy; International School of Advanced Studies, Center for Neuroscience, University of Camerino, Via Gentile III Da Varano, 62032, Camerino, Italy
- Giovanni Granato
- Institute of Cognitive Sciences and Technology, National Research Council, Via Romagnosi 18a, I-00196, Rome, Italy
- Gianluca Baldassarre
- Institute of Cognitive Sciences and Technology, National Research Council, Via Romagnosi 18a, I-00196, Rome, Italy
4. Sun L, Cui K, Hu J, Dong L, Liu L, Jia J, Yu J, Yang J. Impaired glymphatic system in patent foramen ovale based on diffusion tensor imaging analysis along the perivascular space. Quant Imaging Med Surg 2025; 15:2987-2999. PMID: 40235821; PMCID: PMC11994554; DOI: 10.21037/qims-24-1963.
Abstract
Background: Patent foramen ovale (PFO) is often complicated by cerebral diseases. Notably, PFO is disproportionately prevalent in cryptogenic stroke patients, but the underlying mechanism is not yet clear. This study aimed to investigate whether there was a decline in the diffusion tensor imaging analysis along the perivascular space (DTI-ALPS) index, which partially indicates interstitial fluid (ISF) dynamics and glymphatic system function, in PFO patients, which had not been reported before, and to discuss the glymphatic metabolism mechanism by which PFO causes cryptogenic stroke.
Methods: In total, 52 PFO patients and 50 age- and gender-matched healthy controls (HCs) who underwent diffusion tensor imaging (DTI) with magnetic resonance imaging (MRI) scanning were included in the study. Diffusivity maps in the x-axis (Dxx), y-axis (Dyy), and z-axis (Dzz), and the DTI-ALPS index from the projection and association fibers were extracted, and differences between the PFO and HC groups were analyzed.
Results: The PFO patients had significantly higher Dxx and Dyy from the projection fibers, higher Dzz from the association fibers, lower DTI-ALPS indexes in both hemispheres, and higher Dxx from the association fibers in the left hemisphere than the HCs (P<0.01). The PFO patients had a lower DTI-ALPS index than the HCs in the left (1.358±0.116 vs. 1.624±0.281, P<0.001) and right (1.360±0.135 vs. 1.531±0.208, P<0.001) hemispheres. The areas under the receiver operating characteristic (ROC) curve were 0.83 with an ALPS index cut-off value of 1.434 in the left hemisphere, and 0.76 with an ALPS index cut-off value of 1.420 in the right hemisphere. Further, paired samples t-tests revealed slight lateral differences in the DTI-ALPS index between the left and right hemispheres in the HCs (P=0.012). The reduction in the DTI-ALPS index was greater in the left hemisphere (0.267±0.042) than in the right hemisphere (0.171±0.035).
Conclusions: The PFO patients showed a decrease in the DTI-ALPS index, which partly indicates ISF dynamic disorder and glymphatic system dysfunction, especially in the dominant hemisphere. The DTI-ALPS index could serve as a neuroimaging biomarker for PFO. Further, the impaired glymphatic system in PFO may increase the risk of stroke.
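For orientation, the DTI-ALPS index is conventionally computed as the ratio of diffusivity along the perivascular (x) direction in projection and association fibers to the diffusivity perpendicular to it (y in projection fibers, z in association fibers). The snippet below sketches that standard calculation with made-up numbers; it is not the study's data or code.

```python
import numpy as np

def alps_index(Dxx_proj, Dxx_assoc, Dyy_proj, Dzz_assoc):
    """DTI-ALPS index as commonly defined: mean diffusivity along the
    perivascular direction (x, in both fibre types) divided by mean
    diffusivity perpendicular to it (y in projection fibres, z in
    association fibres)."""
    return np.mean([Dxx_proj, Dxx_assoc]) / np.mean([Dyy_proj, Dzz_assoc])

# Illustrative (made-up) diffusivities in 10^-3 mm^2/s
print(alps_index(1.10, 0.95, 0.70, 0.65))   # higher ratio: stronger perivascular diffusion
print(alps_index(0.95, 0.85, 0.75, 0.72))   # lower ratio, the pattern reported in PFO patients
```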
Affiliation(s)
- Liqiang Sun
- Department of Medical Imaging, The Second Hospital of Hebei Medical University, Shijiazhuang, China
- Department of Radiology, Hebei General Hospital of Hebei Medical University, Shijiazhuang, China
- Kaige Cui
- Department of Medical Imaging, The Second Hospital of Hebei Medical University, Shijiazhuang, China
- Jing Hu
- Department of Medical Imaging, The Second Hospital of Hebei Medical University, Shijiazhuang, China
- Liqing Dong
- Department of Ultrasound, Hebei General Hospital of Hebei Medical University, Shijiazhuang, China
- Liying Liu
- Department of Medical Imaging, The Second Hospital of Hebei Medical University, Shijiazhuang, China
- Juan Jia
- Department of Medical Imaging, The Second Hospital of Hebei Medical University, Shijiazhuang, China
- Jiaqi Yu
- Department of Medical Imaging, The Second Hospital of Hebei Medical University, Shijiazhuang, China
- Jiping Yang
- Department of Medical Imaging, The Second Hospital of Hebei Medical University, Shijiazhuang, China
5. Kanwisher N. Animal models of the human brain: Successes, limitations, and alternatives. Curr Opin Neurobiol 2025; 90:102969. PMID: 39914250; DOI: 10.1016/j.conb.2024.102969.
Abstract
The last three decades of research in human cognitive neuroscience have given us an initial "parts list" for the human mind in the form of a set of cortical regions with distinct and often very specific functions. But current neuroscientific methods in humans have limited ability to reveal exactly what these regions represent and compute, the causal role of each in behavior, and the interactions among regions that produce real-world cognition. Animal models can help to answer these questions when homologues exist in other species, like the face system in macaques. When homologues do not exist in animals, for example for speech and music perception, and understanding of language or other people's thoughts, intracranial recordings in humans play a central role, along with a new alternative to animal models: artificial neural networks.
Affiliation(s)
- Nancy Kanwisher
- Department of Brain & Cognitive Sciences, Massachusetts Institute of Technology, United States.
6. Béna G, Goodman DFM. Dynamics of specialization in neural modules under resource constraints. Nat Commun 2025; 16:187. PMID: 39746951; PMCID: PMC11695987; DOI: 10.1038/s41467-024-55188-9.
Abstract
The brain is structurally and functionally modular, although recent evidence has raised questions about the extent of both types of modularity. Using a simple, toy artificial neural network setup that allows for precise control, we find that structural modularity does not in general guarantee functional specialization (across multiple measures of specialization). Further, in this setup (1) specialization only emerges when features of the environment are meaningfully separable, (2) specialization preferentially emerges when the network is strongly resource-constrained, and (3) these findings are qualitatively similar across several different variations of network architectures. Finally, we show that functional specialization varies dynamically across time, and these dynamics depend on both the timing and bandwidth of information flow in the network. We conclude that a static notion of specialization is likely too simple a framework for understanding intelligence in situations of real-world complexity, from biology to brain-inspired neuromorphic systems.
7. Duyck S, Costantino AI, Bracci S, Op de Beeck H. A computational deep learning investigation of animacy perception in the human brain. Commun Biol 2024; 7:1718. PMID: 39741161; DOI: 10.1038/s42003-024-07415-8.
Abstract
The functional organization of the human object vision pathway distinguishes between animate and inanimate objects. To understand animacy perception, we explore the case of zoomorphic objects resembling animals. While the perception of these objects as animal-like seems obvious to humans, this "Animal bias" reveals a striking discrepancy between the human brain and deep neural networks (DNNs). We computationally investigated the potential origins of this bias. We successfully induced this bias in DNNs trained explicitly with zoomorphic objects. Alternative training schedules failed to cause an Animal bias. We considered the superordinate distinction between animate and inanimate classes, the sensitivity to faces and bodies, the bias for shape over texture, the role of ecologically valid categories, recurrent connections, and language-informed visual processing. These findings provide computational evidence that the Animal bias for zoomorphic objects is a unique property of human perception, yet one that can be explained by human learning history.
Affiliation(s)
- Stefanie Duyck
- Brain and Cognition, Faculty of Psychology and Educational Sciences, KU Leuven, Leuven, Belgium
- Andrea I Costantino
- Brain and Cognition, Faculty of Psychology and Educational Sciences, KU Leuven, Leuven, Belgium.
- Stefania Bracci
- Center for Mind/Brain Sciences (CIMeC), University of Trento, Trento, Italy
- Hans Op de Beeck
- Brain and Cognition, Faculty of Psychology and Educational Sciences, KU Leuven, Leuven, Belgium
8. Hosseini E, Casto C, Zaslavsky N, Conwell C, Richardson M, Fedorenko E. Universality of representation in biological and artificial neural networks. bioRxiv [Preprint] 2024:2024.12.26.629294. PMID: 39764030; PMCID: PMC11703180; DOI: 10.1101/2024.12.26.629294.
Abstract
Many artificial neural networks (ANNs) trained with ecologically plausible objectives on naturalistic data align with behavior and neural representations in biological systems. Here, we show that this alignment is a consequence of convergence onto the same representations by high-performing ANNs and by brains. We developed a method to identify stimuli that systematically vary the degree of inter-model representation agreement. Across language and vision, we then showed that stimuli from high- and low-agreement sets predictably modulated model-to-brain alignment. We also examined which stimulus features distinguish high- from low-agreement sentences and images. Our results establish representation universality as a core component in the model-to-brain alignment and provide a new approach for using ANNs to uncover the structure of biological representations and computations.
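The abstract does not spell out how inter-model representation agreement is measured; one common choice, shown here purely as an illustrative assumption, is to correlate the models' representational dissimilarity matrices (RDMs) over a shared stimulus set.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(features):
    """Condensed representational dissimilarity matrix from a stimulus x unit matrix."""
    return pdist(features, metric="correlation")

rng = np.random.default_rng(0)
stimuli = rng.normal(size=(50, 20))             # 50 shared stimuli in a latent space
model_a = stimuli @ rng.normal(size=(20, 128))  # two "models" embedding the same stimuli
model_b = stimuli @ rng.normal(size=(20, 256))

agreement, _ = spearmanr(rdm(model_a), rdm(model_b))
print(f"inter-model RDM agreement (Spearman rho): {agreement:.2f}")
```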
Affiliation(s)
- Eghbal Hosseini
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Colton Casto
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Program in Speech and Hearing Bioscience and Technology (SHBT), Harvard University, Boston, MA, USA
- Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Allston, MA, USA
- Noga Zaslavsky
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Psychology, New York University, New York, NY, USA
- Colin Conwell
- Department of Psychology, Harvard University, Cambridge, MA, USA
- Mark Richardson
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Program in Speech and Hearing Bioscience and Technology (SHBT), Harvard University, Boston, MA, USA
- Department of Neurosurgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Evelina Fedorenko
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Program in Speech and Hearing Bioscience and Technology (SHBT), Harvard University, Boston, MA, USA
9. Li M, Su Y, Huang HY, Cheng J, Hu X, Zhang X, Wang H, Qin Y, Wang X, Lindquist KA, Liu Z, Zhang D. Language-specific representation of emotion-concept knowledge causally supports emotion inference. iScience 2024; 27:111401. PMID: 39669430; PMCID: PMC11635025; DOI: 10.1016/j.isci.2024.111401.
Abstract
Humans no doubt use language to communicate about their emotional experiences, but does language in turn help humans understand emotions, or is language just a vehicle of communication? This study used a form of artificial intelligence (AI) known as large language models (LLMs) to assess whether language-based representations of emotion causally contribute to the AI's ability to generate inferences about the emotional meaning of novel situations. Fourteen attributes of human emotion concept representation were found to be represented by the LLM's distinct artificial neuron populations. By manipulating these attribute-related neurons, we in turn demonstrated the role of emotion concept knowledge in generative emotion inference. The attribute-specific performance deterioration was related to the importance of different attributes in human mental space. Our findings provide a proof of concept that even an LLM can learn about emotions in the absence of sensory-motor representations and highlight the contribution of language-derived emotion-concept knowledge for emotion inference.
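As a schematic of the kind of neuron-level manipulation described above (ablating attribute-related units and measuring the effect on the model's output), here is a toy PyTorch sketch. The tiny network, the chosen unit indices, and the hook-based lesioning are illustrative assumptions, not the study's model or pipeline.

```python
import torch
import torch.nn as nn

# Toy stand-in for one layer of an LLM; the real study manipulated neurons
# inside a large pretrained model, which is only mimicked here.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

neurons_to_ablate = [3, 7, 11]   # hypothetical "attribute-related" hidden units

def ablate_hook(module, inputs, output):
    output = output.clone()
    output[:, neurons_to_ablate] = 0.0   # zero out the selected neurons
    return output

handle = model[1].register_forward_hook(ablate_hook)   # lesion applied after the ReLU

x = torch.randn(8, 16)
with torch.no_grad():
    lesioned_logits = model(x)
handle.remove()
with torch.no_grad():
    intact_logits = model(x)

# Compare intact vs lesioned outputs to quantify the contribution of the ablated units.
print("mean output change after ablation:",
      (lesioned_logits - intact_logits).abs().mean().item())
```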
Affiliation(s)
- Ming Li
- Department of Psychological and Cognitive Sciences, Tsinghua University, Beijing, China
- Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China
- Yusheng Su
- Department of Computer Science and Technology, Tsinghua University, Beijing, China
- Hsiu-Yuan Huang
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
- Jiali Cheng
- Miner School of Computer and Information Sciences, University of Massachusetts Lowell, Lowell, MA 01854, USA
- Xin Hu
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Xinmiao Zhang
- Department of Psychological and Cognitive Sciences, Tsinghua University, Beijing, China
- Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China
- Huadong Wang
- Department of Computer Science and Technology, Tsinghua University, Beijing, China
- Yujia Qin
- Department of Computer Science and Technology, Tsinghua University, Beijing, China
- Xiaozhi Wang
- Department of Computer Science and Technology, Tsinghua University, Beijing, China
- Kristen A. Lindquist
- Department of Psychology and Neuroscience, University of North Carolina, Chapel Hill, NC 27599, USA
- Zhiyuan Liu
- Department of Computer Science and Technology, Tsinghua University, Beijing, China
- Dan Zhang
- Department of Psychological and Cognitive Sciences, Tsinghua University, Beijing, China
- Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China
10. Han Z, Sereno AB. Understanding Cortical Streams from a Computational Perspective. J Cogn Neurosci 2024; 36:2618-2626. PMID: 38319677; PMCID: PMC11602005; DOI: 10.1162/jocn_a_02121.
Abstract
The two visual cortical streams hypothesis, which suggests object properties (what) are processed separately from spatial properties (where), has a longstanding history, and much evidence has accumulated to support its conjectures. Nevertheless, in the last few decades, conflicting evidence has mounted that demands some explanation and modification. For example, there is evidence of (1) shape-related activity (fMRI) and shape selectivity (physiology) in the dorsal stream, similar to the ventral stream, and likewise spatial activations (fMRI) and spatial selectivity (physiology) in the ventral stream, similar to the dorsal stream; and (2) multiple segregated subpathways within each stream. In addition, the idea of segregation of various aspects of multiple objects in a scene raises questions about how these properties of multiple objects are then properly re-associated or bound back together to accurately perceive, remember, or make decisions. We will briefly review the history of the two-stream hypothesis, discuss competing accounts that challenge current thinking, and propose ideas on why the brain has segregated pathways. We will present ideas based on our own data using artificial neural networks (1) to reveal encoding differences for what and where that arise in a two-pathway neural network, (2) to show how these encoding differences can clarify previous conflicting findings, and (3) to elucidate the computational advantages of segregated pathways. Furthermore, we will discuss whether neural networks need to have multiple subpathways for different visual attributes. We will also discuss the binding problem (how to correctly associate the different attributes of each object together when there are multiple objects each with multiple attributes in a scene) and possible solutions to the binding problem. Finally, we will briefly discuss problems and limitations with existing models and potential fruitful future directions.
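A minimal sketch of the kind of two-pathway network the authors describe, with a shared front end and segregated "what" and "where" readouts, is given below; the architecture, layer sizes, and task outputs are assumptions for illustration only, not the authors' model.

```python
import torch
import torch.nn as nn

class TwoStreamNet(nn.Module):
    """Shared visual front end followed by segregated identity ('what') and
    location ('where') pathways; encodings in the two branches can then be compared."""
    def __init__(self, n_classes=10, n_locations=9):
        super().__init__()
        self.shared = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.Flatten())
        self.what = nn.Sequential(nn.Linear(8 * 28 * 28, 64), nn.ReLU(), nn.Linear(64, n_classes))
        self.where = nn.Sequential(nn.Linear(8 * 28 * 28, 64), nn.ReLU(), nn.Linear(64, n_locations))

    def forward(self, x):
        h = self.shared(x)
        return self.what(h), self.where(h)   # identity logits, location logits

net = TwoStreamNet()
images = torch.randn(4, 1, 28, 28)           # dummy 28x28 inputs
what_logits, where_logits = net(images)
print(what_logits.shape, where_logits.shape)
```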
Affiliation(s)
- Anne B Sereno
- Purdue University
- Indiana University School of Medicine
11. Wiese H, Schweinberger SR, Kovács G. The neural dynamics of familiar face recognition. Neurosci Biobehav Rev 2024; 167:105943. PMID: 39557351; DOI: 10.1016/j.neubiorev.2024.105943.
Abstract
Humans are highly efficient at recognising familiar faces. However, previous EEG/ERP research has given a partial and fragmented account of the neural basis of this remarkable ability. We argue that this is related to insufficient consideration of fundamental characteristics of familiar face recognition. These include image-independence (recognition across different pictures), levels of familiarity (familiar faces vary hugely in duration and intensity of our exposure to them), automaticity (we cannot voluntarily withhold from recognising a familiar face), and domain-selectivity (the degree to which face familiarity effects are selective). We review recent EEG/ERP work, combining uni- and multivariate methods, that has systematically targeted these shortcomings. We present a theoretical account of familiar face recognition, dividing it into early visual, domain-sensitive and domain-general phases, and integrating image-independence and levels of familiarity. Our account incorporates classic and more recent concepts, such as multi-dimensional face representation and coarse-to-fine processing. While several questions remain to be addressed, this new account represents a major step forward in our understanding of the neurophysiological basis of familiar face recognition.
12. Tian X, Song Y, Liu J. Decoding face identity: A reverse-correlation approach using deep learning. Cognition 2024; 254:106008. PMID: 39550877; DOI: 10.1016/j.cognition.2024.106008.
Abstract
Face recognition is crucial for social interactions. Traditional approaches primarily rely on subjective judgment, utilizing a pre-selected set of facial features based on literature or intuition to identify critical facial features for face recognition. In this study, we adopted a reverse-correlation approach, aligning responses of a deep convolutional neural network (DCNN) with its internal representations to objectively identify facial features pivotal for face recognition. Specifically, we trained a DCNN, namely VGG-FD, to possess human-like capability in discriminating facial identities. A representational similarity analysis (RSA) was employed to characterize VGG-FD's performance metrics, which was subsequently reverse-correlated with its representations in layers capable of discriminating facial identities. Our analysis revealed a higher likelihood of face pairs being perceived as different identities when their representations significantly differed in areas such as the eyes, eyebrows, or central facial region, suggesting the significance of the eyes as facial parts and of the central facial region as an integral part of the face configuration in face recognition. In summary, our study leveraged DCNNs to identify critical facial features for face discrimination in a hypothesis-neutral, data-driven manner, thereby advocating for the adoption of this new paradigm to explore critical facial features across various face recognition tasks.
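The reverse-correlation logic described above can be illustrated with a toy classification-image computation: representational difference maps for face pairs are averaged according to the network's same/different identity decisions. Everything in the snippet (the random "feature maps", the simulated decisions, the planted eye-region effect) is an assumption for illustration, not the study's data or code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in: each "face" is a 32x32 feature map; in the actual study the
# maps come from identity-discriminating layers of a face-trained DCNN.
n_pairs = 2000
maps_a = rng.normal(size=(n_pairs, 32, 32))
maps_b = rng.normal(size=(n_pairs, 32, 32))
diff_maps = (maps_a - maps_b) ** 2

# Simulated decisions: pairs differing more in a central "eye region" are more
# often judged as different identities (a planted effect, purely illustrative).
eye_region = diff_maps[:, 10:16, 8:24].mean(axis=(1, 2))
judged_different = eye_region + 0.3 * rng.normal(size=n_pairs) > np.median(eye_region)

# Reverse correlation / classification image: where do representational
# differences predict "different identity" responses?
classification_image = diff_maps[judged_different].mean(0) - diff_maps[~judged_different].mean(0)
peak = np.unravel_index(np.argmax(classification_image), classification_image.shape)
print("region most diagnostic of 'different identity' responses:", peak)
```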
Affiliation(s)
- Xue Tian
- Faculty of Psychology, Tianjin Normal University, Tianjin 300387, China
- Yiying Song
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China.
- Jia Liu
- Department of Psychology and Tsinghua Laboratory of Brain & Intelligence, Tsinghua University, Beijing, China.
13. Jiang T, Zhou G. Semantic Content in Face Representation: Essential for Proficient Recognition of Unfamiliar Faces by Good Recognizers. Cogn Sci 2024; 48:e70020. PMID: 39587972; DOI: 10.1111/cogs.70020.
Abstract
Face recognition is adapted to achieve goals of social interactions, which rely on further processing of the semantic information of faces, beyond visual computations. Here, we explored the semantic content of face representation apart from the visual component, and tested their relations to face recognition performance. Specifically, we propose that enhanced visual or semantic coding could underlie the advantage of familiar over unfamiliar face recognition, as well as the superior recognition performance of skilled face recognizers. We asked participants to freely describe familiar/unfamiliar faces using words or phrases, and converted these descriptions into semantic vectors. Face semantics were transformed into quantifiable face vectors by aggregating these word/phrase vectors. We also extracted visual features from a deep convolutional neural network and obtained the visual representation of familiar/unfamiliar faces. Semantic and visual representations were used to predict perceptual representation generated from a behavior rating task separately in different groups (bad/good face recognizers in familiar-face/unfamiliar-face conditions). Comparisons revealed that although long-term memory facilitated visual feature extraction for familiar faces compared to unfamiliar faces, good recognizers compensated for this disparity by incorporating more semantic information for unfamiliar faces, a strategy not observed in bad recognizers. This study highlights the significance of semantics in recognizing unfamiliar faces.
Affiliation(s)
- Tong Jiang
- Department of Psychology, Sun Yat-sen University
- Guomei Zhou
- Department of Psychology, Sun Yat-sen University
14. Shekhar M, Rahnev D. Human-like dissociations between confidence and accuracy in convolutional neural networks. PLoS Comput Biol 2024; 20:e1012578. PMID: 39541396; PMCID: PMC11594416; DOI: 10.1371/journal.pcbi.1012578.
Abstract
Prior research has shown that manipulating stimulus energy by changing both stimulus contrast and variability results in confidence-accuracy dissociations in humans. Specifically, even when performance is matched, higher stimulus energy leads to higher confidence. The most common explanation for this effect, derived from cognitive modeling, is the positive evidence heuristic where confidence neglects evidence that disconfirms the choice. However, an alternative explanation is the signal-and-variance-increase hypothesis, according to which these dissociations arise from changes in the separation and variance of perceptual representations. Because artificial neural networks lack built-in confidence heuristics, they can serve as a test for the necessity of confidence heuristics in explaining confidence-accuracy dissociations. Therefore, we tested whether confidence-accuracy dissociations induced by stimulus energy manipulations emerge naturally in convolutional neural networks (CNNs). We found that, across three different energy manipulations, CNNs produced confidence-accuracy dissociations similar to those found in humans. This effect was present for a range of CNN architectures from shallow 4-layer networks to very deep ones, such as VGG-19 and ResNet-50 pretrained on ImageNet. Further, we traced back the reason for the confidence-accuracy dissociations in all CNNs to the same signal-and-variance increase that has been proposed for humans: higher stimulus energy increased the separation and variance of evidence distributions in the CNNs' output layer leading to higher confidence even for matched accuracy. These findings cast doubt on the necessity of the positive evidence heuristic to explain human confidence and establish CNNs as promising models for testing cognitive theories of human behavior.
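The signal-and-variance-increase account can be reproduced in a few lines: if both the separation and the variance of the evidence distributions grow together, accuracy stays matched while softmax-style confidence rises. The two-choice simulation below is a sketch of that argument with arbitrary parameter values, not the authors' CNN analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(mu, sigma, n=100_000):
    """Two-choice decisions from an 'output layer': evidence for the correct
    class ~ N(mu, sigma), for the incorrect class ~ N(0, sigma).
    Confidence is the two-class softmax probability of the chosen class."""
    correct_ev = rng.normal(mu, sigma, n)
    incorrect_ev = rng.normal(0.0, sigma, n)
    accuracy = (correct_ev > incorrect_ev).mean()
    chosen = np.maximum(correct_ev, incorrect_ev)
    unchosen = np.minimum(correct_ev, incorrect_ev)
    confidence = 1.0 / (1.0 + np.exp(-(chosen - unchosen)))
    return accuracy, confidence.mean()

# "Low energy" vs "high energy": signal and variance are both scaled up so that
# accuracy stays matched, yet mean confidence rises, reproducing the dissociation.
acc_lo, conf_lo = simulate(mu=1.0, sigma=1.0)
acc_hi, conf_hi = simulate(mu=2.0, sigma=2.0)
print(f"low energy : accuracy {acc_lo:.3f}, confidence {conf_lo:.3f}")
print(f"high energy: accuracy {acc_hi:.3f}, confidence {conf_hi:.3f}")
```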
Affiliation(s)
- Medha Shekhar
- School of Psychology, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Dobromir Rahnev
- School of Psychology, Georgia Institute of Technology, Atlanta, Georgia, United States of America
15. Bourne JA, Cichy RM, Kiorpes L, Morrone MC, Arcaro MJ, Nielsen KJ. Development of Higher-Level Vision: A Network Perspective. J Neurosci 2024; 44:e1291242024. PMID: 39358020; PMCID: PMC11450542; DOI: 10.1523/jneurosci.1291-24.2024.
Abstract
Most studies on the development of the visual system have focused on the mechanisms shaping early visual stages up to the level of primary visual cortex (V1). Much less is known about the development of the stages after V1 that handle the higher visual functions fundamental to everyday life. The standard model for the maturation of these areas is that it occurs sequentially, according to the positions of areas in the adult hierarchy. Yet, the existing literature reviewed here paints a different picture, one in which the adult configuration emerges through a sequence of unique network configurations that are not mere partial versions of the adult hierarchy. In addition to studying higher visual development per se to fill major gaps in knowledge, it will be crucial to adopt a network-level perspective in future investigations to unravel normal developmental mechanisms, identify vulnerabilities to developmental disorders, and eventually devise treatments for these disorders.
Affiliation(s)
- James A Bourne
- Section on Cellular and Cognitive Neurodevelopment, Systems Neurodevelopment Laboratory, National Institute of Mental Health, Bethesda, Maryland 20814
- Radoslaw M Cichy
- Department of Education and Psychology, Freie Universität Berlin, Berlin 14195, Germany
- Berlin School of Mind and Brain, Humboldt-Universität zu Berlin, Berlin 10099, Germany
- Einstein Center for Neurosciences Berlin, Charité-Universitätsmedizin Berlin, Berlin 10117, Germany
- Bernstein Center for Computational Neuroscience Berlin, Humboldt-Universität zu Berlin, Berlin 10099, Germany
- Lynne Kiorpes
- Center for Neural Science, New York University, New York, New York 10003
- Maria Concetta Morrone
- IRCCS Fondazione Stella Maris, Pisa 56128, Italy
- Department of Translational Research on New Technologies in Medicine and Surgery, University of Pisa, Pisa 56126, Italy
- Michael J Arcaro
- Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania 19104
- Kristina J Nielsen
- Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205
- Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, Maryland 21218
16. Liu X, He D, Zhu M, Li Y, Lin L, Cai Q. Hemispheric dominance in reading system alters contribution to face processing lateralization across development. Dev Cogn Neurosci 2024; 69:101418. PMID: 39059053; PMCID: PMC11331717; DOI: 10.1016/j.dcn.2024.101418.
Abstract
Face processing is right-hemisphere dominant. This lateralization can be affected by co-lateralization within the same system and by influences between different systems, such as neural competition arising from reading acquisition. Yet, how this relationship pattern changes through development remains unknown. This study examined the lateralization of core face processing and word processing in different age groups. By comparing fMRI data from 36 school-aged children and 40 young adults, we investigated whether there are age and regional effects on lateralization, and how relationships between lateralization within and between systems change across development. Our results showed significant right hemispheric lateralization in the core face system and left hemispheric lateralization in reading-related areas for both age groups when viewing faces and texts passively. While all participants showed stronger lateralization in brain regions higher in the functional hierarchy when viewing faces, only adults exhibited this lateralization when viewing texts. In both age cohorts, there was intra-system co-lateralization for face processing, whereas an inter-system relationship was only found in adults. Specifically, functional lateralization of Broca's area during reading negatively predicted functional asymmetry in the FFA during face perception. This study provides initial neuroimaging evidence for the reading-induced neural competition theory from a maturational perspective in Chinese cohorts.
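Lateralization in studies like this one is typically summarized with a lateralization index of the form (L - R) / (L + R); the snippet below shows that calculation with made-up activation values, as an illustration only.

```python
def lateralization_index(left_activation, right_activation):
    """Standard lateralization index: positive values indicate left dominance,
    negative values right dominance. Inputs are per-hemisphere summary measures
    (e.g., mean beta values or suprathreshold voxel counts)."""
    return (left_activation - right_activation) / (left_activation + right_activation)

# Illustrative (made-up) values: a face-selective region responding to faces
# and a reading-related region responding to text.
print(lateralization_index(left_activation=2.1, right_activation=3.4))  # < 0: right-lateralized
print(lateralization_index(left_activation=3.8, right_activation=1.9))  # > 0: left-lateralized
```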
Affiliation(s)
- Xinyang Liu
- Key Laboratory of Brain Functional Genomics (MOE & STCSM), Affiliated Mental Health Center (ECNU), Institute of Brain and Education Innovation, School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China.
- Danni He
- Key Laboratory of Brain Functional Genomics (MOE & STCSM), Affiliated Mental Health Center (ECNU), Institute of Brain and Education Innovation, School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China
- Miaomiao Zhu
- Key Laboratory of Brain Functional Genomics (MOE & STCSM), Affiliated Mental Health Center (ECNU), Institute of Brain and Education Innovation, School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China
- Yinghui Li
- Key Laboratory of Brain Functional Genomics (MOE & STCSM), Affiliated Mental Health Center (ECNU), Institute of Brain and Education Innovation, School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China
- Longnian Lin
- Key Laboratory of Brain Functional Genomics (MOE & STCSM), Affiliated Mental Health Center (ECNU), Institute of Brain and Education Innovation, School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China; Shanghai Center for Brain Science and Brain-Inspired Technology, East China Normal University, China; NYU-ECNU Institute of Brain and Cognitive Science, New York University, Shanghai, China; School of Life Science Department, East China Normal University, Shanghai 200062, China.
- Qing Cai
- Key Laboratory of Brain Functional Genomics (MOE & STCSM), Affiliated Mental Health Center (ECNU), Institute of Brain and Education Innovation, School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China; Shanghai Changning Mental Health Center, Shanghai 200335, China; Shanghai Center for Brain Science and Brain-Inspired Technology, East China Normal University, China; NYU-ECNU Institute of Brain and Cognitive Science, New York University, Shanghai, China.
17. Kar K, DiCarlo JJ. The Quest for an Integrated Set of Neural Mechanisms Underlying Object Recognition in Primates. Annu Rev Vis Sci 2024; 10:91-121. PMID: 38950431; DOI: 10.1146/annurev-vision-112823-030616.
Abstract
Inferences made about objects via vision, such as rapid and accurate categorization, are core to primate cognition despite the algorithmic challenge posed by varying viewpoints and scenes. Until recently, the brain mechanisms that support these capabilities were deeply mysterious. However, over the past decade, this scientific mystery has been illuminated by the discovery and development of brain-inspired, image-computable, artificial neural network (ANN) systems that rival primates in these behavioral feats. Apart from fundamentally changing the landscape of artificial intelligence, modified versions of these ANN systems are the current leading scientific hypotheses of an integrated set of mechanisms in the primate ventral visual stream that support core object recognition. What separates brain-mapped versions of these systems from prior conceptual models is that they are sensory computable, mechanistic, anatomically referenced, and testable (SMART). In this article, we review and provide perspective on the brain mechanisms addressed by the current leading SMART models. We review their empirical brain and behavioral alignment successes and failures, discuss the next frontiers for an even more accurate mechanistic understanding, and outline the likely applications.
Affiliation(s)
- Kohitij Kar
- Department of Biology, Centre for Vision Research, and Centre for Integrative and Applied Neuroscience, York University, Toronto, Ontario, Canada;
- James J DiCarlo
- Department of Brain and Cognitive Sciences, MIT Quest for Intelligence, and McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA;
18. Tousi E, Mur M. The face inversion effect through the lens of deep neural networks. Proc Biol Sci 2024; 291:20241342. PMID: 39137884; PMCID: PMC11321844; DOI: 10.1098/rspb.2024.1342.
Affiliation(s)
- Ehsan Tousi
- Department of Psychology, Western University, 1151 Richmond Street, London, Ontario N6A 3K7, Canada
- Neuroscience Graduate Program, Western University, 1151 Richmond Street, London, Ontario N6A 3K7, Canada
- Marieke Mur
- Department of Psychology, Western University, 1151 Richmond Street, London, Ontario N6A 3K7, Canada
- Department of Computer Science, Western University, 1151 Richmond Street, London, Ontario N6A 3K7, Canada
19. Margalit E, Lee H, Finzi D, DiCarlo JJ, Grill-Spector K, Yamins DLK. A unifying framework for functional organization in early and higher ventral visual cortex. Neuron 2024; 112:2435-2451.e7. PMID: 38733985; PMCID: PMC11257790; DOI: 10.1016/j.neuron.2024.04.018.
Abstract
A key feature of cortical systems is functional organization: the arrangement of functionally distinct neurons in characteristic spatial patterns. However, the principles underlying the emergence of functional organization in the cortex are poorly understood. Here, we develop the topographic deep artificial neural network (TDANN), the first model to predict several aspects of the functional organization of multiple cortical areas in the primate visual system. We analyze the factors driving the TDANN's success and find that it balances two objectives: learning a task-general sensory representation and maximizing the spatial smoothness of responses according to a metric that scales with cortical surface area. In turn, the representations learned by the TDANN are more brain-like than in spatially unconstrained models. Finally, we provide evidence that the TDANN's functional organization balances performance with between-area connection length. Our results offer a unified principle for understanding the functional organization of the primate ventral visual system.
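The TDANN's spatial objective can be caricatured as a penalty that rewards nearby units on a simulated cortical sheet for responding similarly; the toy loss below is an assumption-laden sketch of that idea, not the paper's actual objective, metric, or cortical scaling.

```python
import numpy as np

def spatial_smoothness_loss(responses, positions, sigma=1.0):
    """Toy spatial-correlation penalty: units assigned nearby positions on a
    simulated cortical sheet should have similar response profiles.
    responses: stimuli x units; positions: units x 2. NOT the TDANN loss."""
    corr = np.corrcoef(responses.T)                              # unit-by-unit response similarity
    d = np.linalg.norm(positions[:, None] - positions[None, :], axis=-1)
    target = np.exp(-d / sigma)                                  # similarity expected to fall with distance
    mask = ~np.eye(len(positions), dtype=bool)
    return np.mean((corr[mask] - target[mask]) ** 2)

rng = np.random.default_rng(0)
responses = rng.normal(size=(200, 50))         # 200 stimuli x 50 units
positions = rng.uniform(0, 10, size=(50, 2))   # unit positions on a 2D sheet
print("smoothness penalty:", spatial_smoothness_loss(responses, positions))
```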
Affiliation(s)
- Eshed Margalit
- Neurosciences Graduate Program, Stanford University, Stanford, CA 94305, USA.
- Hyodong Lee
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Dawn Finzi
- Department of Psychology, Stanford University, Stanford, CA 94305, USA; Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- James J DiCarlo
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Center for Brains Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Kalanit Grill-Spector
- Department of Psychology, Stanford University, Stanford, CA 94305, USA; Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA 94305, USA
- Daniel L K Yamins
- Department of Psychology, Stanford University, Stanford, CA 94305, USA; Department of Computer Science, Stanford University, Stanford, CA 94305, USA; Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA 94305, USA
20. Nassan M. Proposal for a Mechanistic Disease Conceptualization in Clinical Neurosciences: The Neural Network Components (NNC) Model. Harv Rev Psychiatry 2024; 32:150-159. PMID: 38990903; DOI: 10.1097/hrp.0000000000000399.
Abstract
Clinical neurosciences, and psychiatry specifically, have been challenged by the lack of a comprehensive and practical framework that explains the core mechanistic processes of variable psychiatric presentations. Current conceptualization and classification of psychiatric presentations are primarily centered on a non-biologically based clinical descriptive approach. Despite various attempts, advances in neuroscience research have not led to an improved conceptualization or mechanistic classification of psychiatric disorders. This perspective article proposes a new, work-in-progress framework for conceptualizing psychiatric presentations based on neural network components (NNC). This framework could guide the development of mechanistic disease classification, improve understanding of underpinning pathology, and provide specific intervention targets. This model also has the potential to dissolve artificial barriers between the fields of psychiatry and neurology.
Affiliation(s)
- Malik Nassan
- From Mesulam Center for Cognitive Neurology and Alzheimer's Disease, Northwestern University, Chicago, IL; Department of Neurology and Department of Psychiatry and Behavioral Sciences, Northwestern University Feinberg School of Medicine (Dr. Nassan)
21. Kumar S, Sumers TR, Yamakoshi T, Goldstein A, Hasson U, Norman KA, Griffiths TL, Hawkins RD, Nastase SA. Shared functional specialization in transformer-based language models and the human brain. Nat Commun 2024; 15:5523. PMID: 38951520; PMCID: PMC11217339; DOI: 10.1038/s41467-024-49173-5.
Abstract
When processing language, the brain is thought to deploy specialized computations to construct meaning from complex linguistic structures. Recently, artificial neural networks based on the Transformer architecture have revolutionized the field of natural language processing. Transformers integrate contextual information across words via structured circuit computations. Prior work has focused on the internal representations ("embeddings") generated by these circuits. In this paper, we instead analyze the circuit computations directly: we deconstruct these computations into the functionally-specialized "transformations" that integrate contextual information across words. Using functional MRI data acquired while participants listened to naturalistic stories, we first verify that the transformations account for considerable variance in brain activity across the cortical language network. We then demonstrate that the emergent computations performed by individual, functionally-specialized "attention heads" differentially predict brain activity in specific cortical regions. These heads fall along gradients corresponding to different layers and context lengths in a low-dimensional cortical space.
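The encoding-model logic described above (regressing brain activity on features extracted per attention head and comparing held-out prediction across heads) can be sketched as follows; all shapes, the ridge penalty, and the random data are illustrative assumptions rather than the paper's pipeline.

```python
import numpy as np
from numpy.linalg import solve

rng = np.random.default_rng(0)

# Toy setup: per-head "transformation" features for each word/TR, regressed
# onto voxel responses with ridge regression. Sizes are arbitrary.
n_samples, n_heads, head_dim, n_voxels = 500, 12, 64, 100
head_features = rng.normal(size=(n_samples, n_heads, head_dim))
brain = rng.normal(size=(n_samples, n_voxels))

def ridge_r2(X, Y, lam=10.0, train=400):
    """Fit ridge regression on a training split and return per-voxel held-out R^2."""
    Xtr, Xte, Ytr, Yte = X[:train], X[train:], Y[:train], Y[train:]
    W = solve(Xtr.T @ Xtr + lam * np.eye(X.shape[1]), Xtr.T @ Ytr)
    pred = Xte @ W
    ss_res = ((Yte - pred) ** 2).sum(0)
    ss_tot = ((Yte - Yte.mean(0)) ** 2).sum(0)
    return 1 - ss_res / ss_tot

# Fit one encoding model per attention head and compare held-out prediction,
# asking which heads best predict which voxels.
for h in range(3):                      # first few heads for brevity
    r2 = ridge_r2(head_features[:, h, :], brain)
    print(f"head {h}: mean held-out R^2 = {r2.mean():.3f}")
```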
Affiliation(s)
- Sreejan Kumar
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, 08540, USA.
- Theodore R Sumers
- Department of Computer Science, Princeton University, Princeton, NJ, 08540, USA.
- Takateru Yamakoshi
- Faculty of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo, 113-0033, Japan
- Ariel Goldstein
- Department of Cognitive and Brain Sciences and Business School, Hebrew University, Jerusalem, 9190401, Israel
- Uri Hasson
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, 08540, USA
- Department of Psychology, Princeton University, Princeton, NJ, 08540, USA
- Kenneth A Norman
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, 08540, USA
- Department of Psychology, Princeton University, Princeton, NJ, 08540, USA
- Thomas L Griffiths
- Department of Computer Science, Princeton University, Princeton, NJ, 08540, USA
- Department of Psychology, Princeton University, Princeton, NJ, 08540, USA
- Robert D Hawkins
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, 08540, USA
- Department of Psychology, Princeton University, Princeton, NJ, 08540, USA
- Samuel A Nastase
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, 08540, USA.
22. Feng X, Deng N, Yu W, Peng Z, Su D, Kang W, Cheng B. Review: Application of Bionic-Structured Materials in Solid-State Electrolytes for High-Performance Lithium Metal Batteries. ACS Nano 2024; 18:15387-15415. PMID: 38843224; DOI: 10.1021/acsnano.4c02547.
Abstract
Solid-state lithium metal batteries (SSLMBs) have gained significant attention in energy storage research due to their high energy density and significantly improved safety. But there are still certain problems with lithium dendrite growth, interface stability, and room-temperature practicality. Nature continually inspires human development and intricate design strategies to achieve optimal structural applications. Innovative solid-state electrolytes (SSEs), inspired by diverse natural species, have demonstrated exceptional physical, chemical, and mechanical properties. This review provides an overview of typical bionic-structured materials in SSEs, particularly those mimicking plant and animal structures, with a focus on their latest advancements in applications of solid-state lithium metal batteries. Commencing from plant structures encompassing roots, trunks, leaves, flowers, fruits, and cellular levels, the detailed influence of biomimetic strategies on SSE design and electrochemical performance are presented in this review. Subsequently, the recent progress of animal-inspired nanostructures in SSEs is summarized, including layered structures, surface morphologies, and interface compatibility in both two-dimensional (2D) and three-dimensional (3D) aspects. Finally, we also evaluate the current challenges and provide a concise outlook on future research directions. We anticipate that the review will provide useful information for future reference regarding the design of bionic-structured materials in SSEs.
Collapse
Affiliation(s)
- Xiaofan Feng
- State Key Laboratory of Separation Membranes and Membrane Processes/National Center for International Joint Research on Separation Membranes, School of Textile Science and Engineering, Tiangong University, Tianjin 300387, People's Republic of China
| | - Nanping Deng
- State Key Laboratory of Separation Membranes and Membrane Processes/National Center for International Joint Research on Separation Membranes, School of Textile Science and Engineering, Tiangong University, Tianjin 300387, People's Republic of China
| | - Wen Yu
- State Key Laboratory of Separation Membranes and Membrane Processes/National Center for International Joint Research on Separation Membranes, School of Textile Science and Engineering, Tiangong University, Tianjin 300387, People's Republic of China
| | - Zhaozhao Peng
- State Key Laboratory of Separation Membranes and Membrane Processes/National Center for International Joint Research on Separation Membranes, School of Textile Science and Engineering, Tiangong University, Tianjin 300387, People's Republic of China
| | - Dongyue Su
- State Key Laboratory of Separation Membranes and Membrane Processes/National Center for International Joint Research on Separation Membranes, School of Textile Science and Engineering, Tiangong University, Tianjin 300387, People's Republic of China
| | - Weimin Kang
- State Key Laboratory of Separation Membranes and Membrane Processes/National Center for International Joint Research on Separation Membranes, School of Textile Science and Engineering, Tiangong University, Tianjin 300387, People's Republic of China
| | - Bowen Cheng
- State Key Laboratory of Separation Membranes and Membrane Processes/National Center for International Joint Research on Separation Membranes, School of Textile Science and Engineering, Tiangong University, Tianjin 300387, People's Republic of China
| |
Collapse
|
23
|
Mahowald K, Ivanova AA, Blank IA, Kanwisher N, Tenenbaum JB, Fedorenko E. Dissociating language and thought in large language models. Trends Cogn Sci 2024; 28:517-540. [PMID: 38508911 DOI: 10.1016/j.tics.2024.01.011] [Citation(s) in RCA: 29] [Impact Index Per Article: 29.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2023] [Revised: 01/31/2024] [Accepted: 01/31/2024] [Indexed: 03/22/2024]
Abstract
Large language models (LLMs) have come closest among all models to date to mastering human language, yet opinions about their linguistic and cognitive capabilities remain split. Here, we evaluate LLMs using a distinction between formal linguistic competence (knowledge of linguistic rules and patterns) and functional linguistic competence (understanding and using language in the world). We ground this distinction in human neuroscience, which has shown that formal and functional competence rely on different neural mechanisms. Although LLMs are surprisingly good at formal competence, their performance on functional competence tasks remains spotty and often requires specialized fine-tuning and/or coupling with external modules. We posit that models that use language in human-like ways would need to master both of these competence types, which, in turn, could require the emergence of separate mechanisms specialized for formal versus functional linguistic competence.
Collapse
|
24
|
Ren Y, Bashivan P. How well do models of visual cortex generalize to out of distribution samples? PLoS Comput Biol 2024; 20:e1011145. [PMID: 38820563 PMCID: PMC11216589 DOI: 10.1371/journal.pcbi.1011145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2023] [Revised: 07/01/2024] [Accepted: 04/29/2024] [Indexed: 06/02/2024] Open
Abstract
Unit activity in certain deep neural networks (DNNs) is remarkably similar to neuronal population responses to static images along the primate ventral visual cortex. Linear combinations of DNN unit activities are widely used to build predictive models of neuronal activity in the visual cortex. Nevertheless, prediction performance in these models is often investigated on stimulus sets consisting of everyday objects in naturalistic settings. Recent work has revealed a generalization gap in how these models predict neuronal responses to synthetically generated out-of-distribution (OOD) stimuli. Here, we investigated how recent progress in improving DNNs' object recognition generalization, as well as various DNN design choices such as architecture, learning algorithm, and dataset, have impacted the generalization gap in neural predictivity. We came to the surprising conclusion that performance on none of the common computer vision OOD object recognition benchmarks is predictive of OOD neural predictivity performance. Furthermore, we found that adversarially robust models often yield substantially higher generalization in neural predictivity, although the degree of robustness itself was not predictive of the neural predictivity score. These results suggest that improving object recognition behavior on current benchmarks alone may not lead to more general models of neurons in the primate ventral visual cortex.
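The toy simulation below (not the paper's pipeline) illustrates how such a generalization gap is measured: a linear readout from model features is fit on "naturalistic" stimuli and then evaluated on held-out naturalistic versus OOD stimuli, under the assumption that the model features become a poorer proxy for the neurons' true drivers on OOD images.

```python
# Toy illustration of measuring the generalization gap (not the paper's pipeline).
# Model features are simulated as an imperfect proxy of the latent features that
# actually drive the neurons; the proxy is assumed to degrade for OOD stimuli.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
n_feat, n_neurons = 100, 50
W_true = rng.standard_normal((n_feat, n_neurons))

def simulate(n, proxy_noise):
    latent = rng.standard_normal((n, n_feat))                 # features driving the neurons
    model_feats = latent + proxy_noise * rng.standard_normal((n, n_feat))
    responses = latent @ W_true + 0.5 * rng.standard_normal((n, n_neurons))
    return model_feats, responses

X_fit, Y_fit = simulate(400, proxy_noise=0.2)   # naturalistic stimuli used for fitting
X_id,  Y_id  = simulate(100, proxy_noise=0.2)   # held-out naturalistic stimuli
X_ood, Y_ood = simulate(100, proxy_noise=2.0)   # synthetic OOD stimuli: proxy breaks down

readout = Ridge(alpha=10.0).fit(X_fit, Y_fit)

def median_r(X, Y):
    P = readout.predict(X)
    return np.median([np.corrcoef(P[:, i], Y[:, i])[0, 1] for i in range(n_neurons)])

print("in-distribution r:", round(median_r(X_id, Y_id), 3))
print("OOD r:            ", round(median_r(X_ood, Y_ood), 3))
```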
Collapse
Affiliation(s)
- Yifei Ren
- Department of Computer Science, McGill University, Montreal, Canada
| | - Pouya Bashivan
- Department of Computer Science, McGill University, Montreal, Canada
- Department of Physiology, McGill University, Montreal, Canada
- Mila, Université de Montréal, Montreal, Canada
| |
Collapse
|
25
|
Farzmahdi A, Zarco W, Freiwald WA, Kriegeskorte N, Golan T. Emergence of brain-like mirror-symmetric viewpoint tuning in convolutional neural networks. eLife 2024; 13:e90256. [PMID: 38661128 PMCID: PMC11142642 DOI: 10.7554/elife.90256] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2023] [Accepted: 04/25/2024] [Indexed: 04/26/2024] Open
Abstract
Primates can recognize objects despite 3D geometric variations such as in-depth rotations. The computational mechanisms that give rise to such invariances are yet to be fully understood. A curious case of partial invariance occurs in the macaque face-patch AL and in fully connected layers of deep convolutional networks in which neurons respond similarly to mirror-symmetric views (e.g. left and right profiles). Why does this tuning develop? Here, we propose a simple learning-driven explanation for mirror-symmetric viewpoint tuning. We show that mirror-symmetric viewpoint tuning for faces emerges in the fully connected layers of convolutional deep neural networks trained on object recognition tasks, even when the training dataset does not include faces. First, using 3D objects rendered from multiple views as test stimuli, we demonstrate that mirror-symmetric viewpoint tuning in convolutional neural network models is not unique to faces: it emerges for multiple object categories with bilateral symmetry. Second, we show why this invariance emerges in the models. Learning to discriminate among bilaterally symmetric object categories induces reflection-equivariant intermediate representations. AL-like mirror-symmetric tuning is achieved when such equivariant responses are spatially pooled by downstream units with sufficiently large receptive fields. These results explain how mirror-symmetric viewpoint tuning can emerge in neural networks, providing a theory of how they might emerge in the primate brain. Our theory predicts that mirror-symmetric viewpoint tuning can emerge as a consequence of exposure to bilaterally symmetric objects beyond the category of faces, and that it can generalize beyond previously experienced object categories.
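A minimal numpy illustration of the proposed mechanism follows: if intermediate feature maps are reflection-equivariant (a mirrored image yields horizontally mirrored feature maps), then spatial pooling over a large receptive field produces identical responses to mirror-symmetric views. The feature maps here are random stand-ins for convolutional activations.

```python
# If a mirrored image yields mirrored feature maps (reflection equivariance),
# global spatial pooling gives identical responses to the two mirror views.
import numpy as np

rng = np.random.default_rng(0)
fmap_left = rng.standard_normal((256, 14, 14))    # activations for a "left profile" view
fmap_right = fmap_left[:, :, ::-1]                # equivariance assumption: right profile = mirrored maps

pooled_left = fmap_left.mean(axis=(1, 2))         # pooling by a unit with a full-field receptive field
pooled_right = fmap_right.mean(axis=(1, 2))

print(np.allclose(pooled_left, pooled_right))     # True: mirror-symmetric tuning after pooling
# Without pooling, the flattened maps are far from identical:
print(np.corrcoef(fmap_left.ravel(), fmap_right.ravel())[0, 1])
```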
Collapse
Affiliation(s)
- Amirhossein Farzmahdi
- Laboratory of Neural Systems, The Rockefeller University, New York, United States
- School of Cognitive Sciences, Institute for Research in Fundamental Sciences, Tehran, Islamic Republic of Iran
| | - Wilbert Zarco
- Laboratory of Neural Systems, The Rockefeller University, New York, United States
| | - Winrich A Freiwald
- Laboratory of Neural Systems, The Rockefeller University, New York, United States
- The Center for Brains, Minds & Machines, Cambridge, United States
| | - Nikolaus Kriegeskorte
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, United States
- Department of Psychology, Columbia University, New York, United States
- Department of Neuroscience, Columbia University, New York, United States
- Department of Electrical Engineering, Columbia University, New York, United States
| | - Tal Golan
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, United States
| |
Collapse
|
26
|
Liu P, Bo K, Ding M, Fang R. Emergence of Emotion Selectivity in Deep Neural Networks Trained to Recognize Visual Objects. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.04.16.537079. [PMID: 37163104 PMCID: PMC10168209 DOI: 10.1101/2023.04.16.537079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/11/2023]
Abstract
Recent neuroimaging studies have shown that the visual cortex plays an important role in representing the affective significance of visual input. The origin of these affect-specific visual representations is debated: are they intrinsic to the visual system, or do they arise through reentry from emotion-processing structures such as the amygdala? We examined this problem by combining convolutional neural network (CNN) models of the human ventral visual cortex, pre-trained on ImageNet, with two datasets of affective images. Our results show that (1) in all layers of the CNN models, there were artificial neurons that responded consistently and selectively to neutral, pleasant, or unpleasant images, and (2) lesioning these neurons by setting their output to 0, or enhancing them by increasing their gain, led to decreased or increased emotion recognition performance, respectively. These results support the idea that the visual system may have the intrinsic ability to represent the affective significance of visual input and suggest that CNNs offer a fruitful platform for testing neuroscientific theories.
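A hedged PyTorch sketch of this kind of unit-level manipulation is shown below: zeroing ("lesioning") or gain-scaling a chosen set of channels in one convolutional layer via a forward hook before re-running inference. The model, layer, and channel indices are illustrative placeholders, not those used in the study.

```python
# Lesion or gain-scale selected channels of one conv layer with a forward hook.
import torch
import torchvision.models as models

model = models.alexnet(weights=None).eval()       # stand-in for the pretrained CNNs in the paper
target_layer = model.features[8]                  # a mid-level conv layer (illustrative choice)
selected_channels = [3, 17, 42]                   # hypothetical "emotion-selective" units

def make_hook(channels, gain=0.0):
    # gain=0.0 lesions the channels; gain > 1.0 enhances them
    def hook(module, inputs, output):
        output = output.clone()
        output[:, channels] = gain * output[:, channels]
        return output
    return hook

handle = target_layer.register_forward_hook(make_hook(selected_channels, gain=0.0))
with torch.no_grad():
    logits = model(torch.randn(4, 3, 224, 224))   # random images as stand-ins
handle.remove()
print(logits.shape)
```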
Collapse
Affiliation(s)
- Peng Liu
- J. Crayton Pruitt Family Department of Biomedical Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, FL, USA
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
| | - Ke Bo
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
| | - Mingzhou Ding
- J. Crayton Pruitt Family Department of Biomedical Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, FL, USA
| | - Ruogu Fang
- J. Crayton Pruitt Family Department of Biomedical Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, FL, USA
- Center for Cognitive Aging and Memory, McKnight Brain Institute, University of Florida, Gainesville, FL, USA
| |
Collapse
|
27
|
Liu P, Bo K, Ding M, Fang R. Emergence of Emotion Selectivity in Deep Neural Networks Trained to Recognize Visual Objects. PLoS Comput Biol 2024; 20:e1011943. [PMID: 38547053 PMCID: PMC10977720 DOI: 10.1371/journal.pcbi.1011943] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2023] [Accepted: 02/24/2024] [Indexed: 04/02/2024] Open
Abstract
Recent neuroimaging studies have shown that the visual cortex plays an important role in representing the affective significance of visual input. The origin of these affect-specific visual representations is debated: are they intrinsic to the visual system, or do they arise through reentry from emotion-processing structures such as the amygdala? We examined this problem by combining convolutional neural network (CNN) models of the human ventral visual cortex, pre-trained on ImageNet, with two datasets of affective images. Our results show that, in all layers of the CNN models, there were artificial neurons that responded consistently and selectively to neutral, pleasant, or unpleasant images, and that lesioning these neurons by setting their output to zero, or enhancing them by increasing their gain, led to decreased or increased emotion recognition performance, respectively. These results support the idea that the visual system may have the intrinsic ability to represent the affective significance of visual input and suggest that CNNs offer a fruitful platform for testing neuroscientific theories.
Collapse
Affiliation(s)
- Peng Liu
- J. Crayton Pruitt Family Department of Biomedical Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, Florida, United States of America
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, New Hampshire, United States of America
| | - Ke Bo
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, New Hampshire, United States of America
| | - Mingzhou Ding
- J. Crayton Pruitt Family Department of Biomedical Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, Florida, United States of America
| | - Ruogu Fang
- J. Crayton Pruitt Family Department of Biomedical Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, Florida, United States of America
- Center for Cognitive Aging and Memory, McKnight Brain Institute, University of Florida, Gainesville, Florida, United States of America
| |
Collapse
|
28
|
Stoinski LM, Perkuhn J, Hebart MN. THINGSplus: New norms and metadata for the THINGS database of 1854 object concepts and 26,107 natural object images. Behav Res Methods 2024; 56:1583-1603. [PMID: 37095326 PMCID: PMC10991023 DOI: 10.3758/s13428-023-02110-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 03/13/2023] [Indexed: 04/26/2023]
Abstract
To study visual and semantic object representations, the need for well-curated object concepts and images has grown significantly over the past years. To address this, we have previously developed THINGS, a large-scale database of 1854 systematically sampled object concepts with 26,107 high-quality naturalistic images of these concepts. With THINGSplus, we significantly extend THINGS by adding concept- and image-specific norms and metadata for all 1854 concepts and one copyright-free image example per concept. Concept-specific norms were collected for the properties of real-world size, manmadeness, preciousness, liveliness, heaviness, naturalness, ability to move or be moved, graspability, holdability, pleasantness, and arousal. Further, we provide 53 superordinate categories as well as typicality ratings for all their members. Image-specific metadata includes a nameability measure, based on human-generated labels of the objects depicted in the 26,107 images. Finally, we identified one new public domain image per concept. Property (M = 0.97, SD = 0.03) and typicality ratings (M = 0.97, SD = 0.01) demonstrate excellent consistency, with the subsequently collected arousal ratings as the only exception (r = 0.69). Our property (M = 0.85, SD = 0.11) and typicality (r = 0.72, 0.74, 0.88) data correlated strongly with external norms, again with the lowest validity for arousal (M = 0.41, SD = 0.08). To summarize, THINGSplus provides a large-scale, externally validated extension to existing object norms and an important extension to THINGS, allowing detailed selection of stimuli and control variables for a wide range of research interested in visual object processing, language, and semantic memory.
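In practice, such norms are typically used to filter and balance stimulus sets. The pandas sketch below is only illustrative; the file and column names are hypothetical placeholders rather than the actual THINGSplus field names.

```python
# Hypothetical example of selecting stimuli with concept-level norms.
import pandas as pd

norms = pd.read_csv("thingsplus_concept_norms.csv")   # hypothetical file name
# e.g., pick small, graspable, man-made concepts and rank by typicality
selection = norms.query(
    "real_world_size < 0.3 and graspability > 0.7 and manmadeness > 0.8"
).sort_values("typicality", ascending=False).head(50)
print(selection[["concept", "typicality"]])
```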
Collapse
Affiliation(s)
- Laura M Stoinski
- Max Planck Institute for Human Cognitive & Brain Sciences, Leipzig, Germany.
| | - Jonas Perkuhn
- Max Planck Institute for Human Cognitive & Brain Sciences, Leipzig, Germany
| | - Martin N Hebart
- Max Planck Institute for Human Cognitive & Brain Sciences, Leipzig, Germany
- Justus Liebig University, Gießen, Germany
| |
Collapse
|
29
|
Wang J, Cao R, Chakravarthula PN, Li X, Wang S. A critical period for developing face recognition. PATTERNS (NEW YORK, N.Y.) 2024; 5:100895. [PMID: 38370121 PMCID: PMC10873156 DOI: 10.1016/j.patter.2023.100895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/17/2023] [Revised: 11/09/2023] [Accepted: 11/14/2023] [Indexed: 02/20/2024]
Abstract
Face learning has important critical periods during development. However, the computational mechanisms of critical periods remain unknown. Here, we conducted a series of in silico experiments and showed that, similar to humans, deep artificial neural networks exhibited critical periods during which a stimulus deficit could impair the development of face learning. Face learning could be restored only when information was provided within the critical period; outside of it, the model could no longer incorporate new information. We further provided a computational account of this effect in terms of the learning rate and demonstrated an alternative approach, based on knowledge distillation and attention transfer, that partially recovers the model outside of the critical period. Finally, we showed that model performance and recovery were associated with identity-selective units and with the model's correspondence to the primate visual system. Our study not only reveals computational mechanisms underlying face learning but also points to strategies for restoring impaired face learning.
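Knowledge distillation, one of the recovery strategies mentioned above, is commonly implemented as a weighted sum of a hard-label loss and a softened teacher-matching loss. The sketch below shows the standard formulation, not necessarily the paper's exact setup.

```python
# Standard knowledge-distillation loss: soft teacher targets + hard labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Weighted sum of soft-target KL divergence and hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# toy usage with random logits for a 10-way (e.g., identity) classification
student = torch.randn(8, 10, requires_grad=True)
teacher = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = distillation_loss(student, teacher, labels)
loss.backward()
```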
Collapse
Affiliation(s)
- Jinge Wang
- Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA
| | - Runnan Cao
- Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA
- Department of Radiology, Washington University in St. Louis, St. Louis, MO 63110, USA
| | | | - Xin Li
- Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA
- Department of Computer Science, University at Albany, Albany, NY 12222, USA
| | - Shuo Wang
- Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA
- Department of Radiology, Washington University in St. Louis, St. Louis, MO 63110, USA
| |
Collapse
|
30
|
Shoham A, Grosbard ID, Patashnik O, Cohen-Or D, Yovel G. Using deep neural networks to disentangle visual and semantic information in human perception and memory. Nat Hum Behav 2024:10.1038/s41562-024-01816-9. [PMID: 38332339 DOI: 10.1038/s41562-024-01816-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2023] [Accepted: 12/22/2023] [Indexed: 02/10/2024]
Abstract
Mental representations of familiar categories are composed of visual and semantic information. Disentangling the contributions of visual and semantic information in humans is challenging because they are intermixed in mental representations. Deep neural networks trained on images, on text, or on paired images and text now enable us to disentangle human mental representations into their visual, visual-semantic, and semantic components. Here we used these deep neural networks to uncover the content of human mental representations of familiar faces and objects when they are viewed or recalled from memory. The results show a larger visual than semantic contribution when images are viewed and a reversed pattern when they are recalled. We further reveal a previously unknown unique contribution of an integrated visual-semantic representation in both perception and memory. We propose a new framework in which visual and semantic information contribute independently and interactively to mental representations in perception and memory.
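The general logic can be sketched as a multiple regression of human pairwise (dis)similarities on dissimilarities derived from a vision-only model, a language-only model, and a joint vision-language model. All matrices below are random stand-ins for the real model and human data.

```python
# Regress human pairwise dissimilarities on model-derived dissimilarities.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_items = 40
iu = np.triu_indices(n_items, k=1)                 # unique item pairs

def random_rdm():
    m = rng.random((n_items, n_items))
    m = (m + m.T) / 2
    return m[iu]                                   # vectorized upper triangle

visual, semantic, joint = random_rdm(), random_rdm(), random_rdm()
# stand-in "human" data built as a known mixture plus noise
human = 0.5 * visual + 0.2 * semantic + 0.3 * joint + 0.1 * rng.random(visual.shape)

X = np.column_stack([visual, semantic, joint])
betas = LinearRegression().fit(X, human).coef_
print(dict(zip(["visual", "semantic", "visual-semantic"], betas.round(2))))
```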
Collapse
Affiliation(s)
- Adva Shoham
- School of Psychological Sciences, Tel Aviv University, Tel Aviv, Israel.
| | - Idan Daniel Grosbard
- School of Psychological Sciences, Tel Aviv University, Tel Aviv, Israel
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
| | - Or Patashnik
- The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
| | - Daniel Cohen-Or
- The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
| | - Galit Yovel
- School of Psychological Sciences, Tel Aviv University, Tel Aviv, Israel.
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel.
| |
Collapse
|
31
|
Shekhar M, Rahnev D. Human-like dissociations between confidence and accuracy in convolutional neural networks. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.02.01.578187. [PMID: 38352596 PMCID: PMC10862905 DOI: 10.1101/2024.02.01.578187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 02/25/2024]
Abstract
Prior research has shown that manipulating stimulus energy by changing both stimulus contrast and variability results in confidence-accuracy dissociations in humans. Specifically, even when performance is matched, higher stimulus energy leads to higher confidence. The most common explanation for this effect is the positive evidence heuristic, where confidence neglects evidence that disconfirms the choice. However, an alternative explanation is the signal-and-variance-increase hypothesis, according to which these dissociations arise from low-level changes in the separation and variance of perceptual representations. Because artificial neural networks lack built-in confidence heuristics, they can serve as a test for the necessity of confidence heuristics in explaining confidence-accuracy dissociations. Therefore, we tested whether confidence-accuracy dissociations induced by stimulus energy manipulations emerge naturally in convolutional neural networks (CNNs). We found that, across three different energy manipulations, CNNs produced confidence-accuracy dissociations similar to those found in humans. This effect was present for a range of CNN architectures, from shallow 4-layer networks to very deep ones such as VGG-19 and ResNet-50 pretrained on ImageNet. Further, we traced back the reason for the confidence-accuracy dissociations in all CNNs to the same signal-and-variance increase that has been proposed for humans: higher stimulus energy increased the separation and variance of the CNNs' internal representations, leading to higher confidence even for matched accuracy. These findings cast doubt on the necessity of the positive evidence heuristic to explain human confidence and establish CNNs as promising models for adjudicating between low-level, stimulus-driven and high-level, cognitive explanations of human behavior.
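The signal-and-variance-increase account is easy to illustrate with a toy simulation (not the paper's models): scaling both the separation and the standard deviation of the internal evidence distributions leaves accuracy unchanged while increasing confidence, here read out as distance from the decision criterion.

```python
# Toy signal-detection simulation: matched accuracy, higher confidence with "energy".
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

def simulate(mu, sigma):
    stim = rng.choice([-1, 1], size=n)                 # two stimulus classes
    evidence = stim * mu + sigma * rng.standard_normal(n)
    accuracy = np.mean(np.sign(evidence) == stim)
    confidence = np.mean(np.abs(evidence))             # simple readout of decision certainty
    return accuracy, confidence

for mu, sigma in [(1.0, 1.0), (2.0, 2.0)]:             # same d' = 2*mu/sigma, higher "energy"
    acc, conf = simulate(mu, sigma)
    print(f"mu={mu}, sigma={sigma}: accuracy={acc:.3f}, confidence={conf:.2f}")
```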
Collapse
Affiliation(s)
- Medha Shekhar
- School of Psychology, Georgia Institute of Technology, Atlanta, GA
| | - Dobromir Rahnev
- School of Psychology, Georgia Institute of Technology, Atlanta, GA
| |
Collapse
|
32
|
Cao R, Wang J, Brunner P, Willie JT, Li X, Rutishauser U, Brandmeir NJ, Wang S. Neural mechanisms of face familiarity and learning in the human amygdala and hippocampus. Cell Rep 2024; 43:113520. [PMID: 38151023 PMCID: PMC10834150 DOI: 10.1016/j.celrep.2023.113520] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2022] [Revised: 09/12/2023] [Accepted: 11/14/2023] [Indexed: 12/29/2023] Open
Abstract
Recognizing familiar faces and learning new faces play an important role in social cognition. However, the underlying neural computational mechanisms remain unclear. Here, we record from single neurons in the human amygdala and hippocampus and find a greater neuronal representational distance between pairs of familiar faces than unfamiliar faces, suggesting that neural representations for familiar faces are more distinct. Representational distance increases with exposures to the same identity, suggesting that neural face representations are sharpened with learning and familiarization. Furthermore, representational distance is positively correlated with visual dissimilarity between faces, and exposure to visually similar faces increases representational distance, thus sharpening neural representations. Finally, we construct a computational model that demonstrates an increase in the representational distance of artificial units with training. Together, our results suggest that the neuronal population geometry, quantified by the representational distance, encodes face familiarity, similarity, and learning, forming the basis of face recognition and memory.
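The core measure is simple to state: one response vector per face across the recorded neuronal population, with representational distance taken as 1 minus the Pearson correlation between vectors. The sketch below uses stand-in firing rates.

```python
# Representational distance between two faces from population firing-rate vectors.
import numpy as np

rng = np.random.default_rng(0)
pop_face_a = rng.poisson(5.0, size=80).astype(float)   # stand-in rates, 80 recorded neurons
pop_face_b = rng.poisson(5.0, size=80).astype(float)

def representational_distance(a, b):
    return 1.0 - np.corrcoef(a, b)[0, 1]

print(round(representational_distance(pop_face_a, pop_face_b), 3))
```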
Collapse
Affiliation(s)
- Runnan Cao
- Department of Radiology, Washington University in St. Louis, St. Louis, MO 63110, USA; Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA.
| | - Jinge Wang
- Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA
| | - Peter Brunner
- Department of Neurosurgery, Washington University in St. Louis, St. Louis, MO 63110, USA
| | - Jon T Willie
- Department of Neurosurgery, Washington University in St. Louis, St. Louis, MO 63110, USA
| | - Xin Li
- Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA
| | - Ueli Rutishauser
- Departments of Neurosurgery and Neurology, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
| | | | - Shuo Wang
- Department of Radiology, Washington University in St. Louis, St. Louis, MO 63110, USA; Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA; Department of Neurosurgery, Washington University in St. Louis, St. Louis, MO 63110, USA.
| |
Collapse
|
33
|
Op de Beeck H, Bracci S. Going after the bigger picture: Using high-capacity models to understand mind and brain. Behav Brain Sci 2023; 46:e404. [PMID: 38054291 DOI: 10.1017/s0140525x2300153x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/07/2023]
Abstract
Deep neural networks (DNNs) provide a unique opportunity to move towards a generic modelling framework in psychology. The high representational capacity of these models combined with the possibility for further extensions has already allowed us to investigate the forest, namely the complex landscape of representations and processes that underlie human cognition, without forgetting about the trees, which include individual psychological phenomena.
Collapse
Affiliation(s)
| | - Stefania Bracci
- Center for Mind/Brain Sciences, University of Trento, Rovereto, Italy
| |
Collapse
|
34
|
Jiahui G, Feilong M, Visconti di Oleggio Castello M, Nastase SA, Haxby JV, Gobbini MI. Modeling naturalistic face processing in humans with deep convolutional neural networks. Proc Natl Acad Sci U S A 2023; 120:e2304085120. [PMID: 37847731 PMCID: PMC10614847 DOI: 10.1073/pnas.2304085120] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2023] [Accepted: 09/11/2023] [Indexed: 10/19/2023] Open
Abstract
Deep convolutional neural networks (DCNNs) trained for face identification can rival and even exceed human-level performance. The ways in which the internal face representations in DCNNs relate to human cognitive representations and brain activity are not well understood. Nearly all previous studies focused on static face image processing with rapid display times and ignored the processing of naturalistic, dynamic information. To address this gap, we developed the largest naturalistic dynamic face stimulus set in human neuroimaging research (700+ naturalistic video clips of unfamiliar faces). We used this naturalistic dataset to compare representational geometries estimated from DCNNs, behavioral responses, and brain responses. We found that DCNN representational geometries were consistent across architectures, cognitive representational geometries were consistent across raters in a behavioral arrangement task, and neural representational geometries in face areas were consistent across brains. Representational geometries in late, fully connected DCNN layers, which are optimized for individuation, were much more weakly correlated with cognitive and neural geometries than were geometries in late-intermediate layers. The late-intermediate face-DCNN layers successfully matched cognitive representational geometries, as measured with a behavioral arrangement task that primarily reflected categorical attributes, and correlated with neural representational geometries in known face-selective topographies. Our study suggests that current DCNNs successfully capture neural cognitive processes for categorical attributes of faces but less accurately capture individuation and dynamic features.
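Comparisons of representational geometry across DCNNs, behavior, and brains typically proceed by building a representational dissimilarity matrix (RDM) for each system and correlating their upper triangles. The sketch below uses random data as stand-ins for DCNN-layer features, behavioral arrangement coordinates, and voxel responses.

```python
# Compare representational geometries via RDMs and Spearman correlation.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_faces = 30
dcnn_feats = rng.standard_normal((n_faces, 512))    # e.g., a late-intermediate DCNN layer
behav_coords = rng.standard_normal((n_faces, 2))    # e.g., 2D behavioral arrangement positions
brain_feats = rng.standard_normal((n_faces, 200))   # e.g., voxels in a face-selective ROI

rdm_dcnn = pdist(dcnn_feats, metric="correlation")    # condensed RDMs (upper triangles)
rdm_brain = pdist(brain_feats, metric="correlation")
rdm_behav = pdist(behav_coords, metric="euclidean")   # distances in the arrangement space

print("DCNN vs brain:   ", round(spearmanr(rdm_dcnn, rdm_brain)[0], 3))
print("DCNN vs behavior:", round(spearmanr(rdm_dcnn, rdm_behav)[0], 3))
```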
Collapse
Affiliation(s)
- Guo Jiahui
- Center for Cognitive Neuroscience, Dartmouth College, Hanover, NH 03755
| | - Ma Feilong
- Center for Cognitive Neuroscience, Dartmouth College, Hanover, NH 03755
| | | | - Samuel A. Nastase
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544
| | - James V. Haxby
- Center for Cognitive Neuroscience, Dartmouth College, Hanover, NH 03755
| | - M. Ida Gobbini
- Department of Medical and Surgical Sciences, University of Bologna, Bologna 40138, Italy
- Istituti di Ricovero e Cura a Carattere Scientifico, Istituto delle Scienze Neurologiche di Bologna, Bologna 40139, Italy
| |
Collapse
|
35
|
van Dyck LE, Gruber WR. Modeling Biological Face Recognition with Deep Convolutional Neural Networks. J Cogn Neurosci 2023; 35:1521-1537. [PMID: 37584587 DOI: 10.1162/jocn_a_02040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/17/2023]
Abstract
Deep convolutional neural networks (DCNNs) have become the state-of-the-art computational models of biological object recognition. Their remarkable success has helped vision science break new ground, and recent efforts have started to transfer this achievement to research on biological face recognition. In this regard, face detection can be investigated by comparing face-selective biological neurons and brain areas to artificial neurons and model layers. Similarly, face identification can be examined by comparing in vivo and in silico multidimensional "face spaces." In this review, we summarize the first studies that use DCNNs to model biological face recognition. On the basis of a broad spectrum of behavioral and computational evidence, we conclude that DCNNs are useful models that closely resemble the general hierarchical organization of face recognition in the ventral visual pathway and the core face network. In two exemplary spotlights, we emphasize the unique scientific contributions of these models. First, studies on face detection in DCNNs indicate that elementary face selectivity emerges automatically through feedforward processing even in the absence of visual experience. Second, studies on face identification in DCNNs suggest that identity-specific experience and generative mechanisms facilitate this particular challenge. Taken together, as this novel modeling approach enables close control of predisposition (i.e., architecture) and experience (i.e., training data), it may be suited to inform long-standing debates on the substrates of biological face recognition.
Collapse
|
36
|
Cao R, Zhang N, Yu H, Webster PJ, Paul LK, Li X, Lin C, Wang S. Comprehensive Social Trait Judgments From Faces in Autism Spectrum Disorder. Psychol Sci 2023; 34:1121-1145. [PMID: 37671893 PMCID: PMC10626626 DOI: 10.1177/09567976231192236] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2022] [Accepted: 07/13/2023] [Indexed: 09/07/2023] Open
Abstract
Processing social information from faces is difficult for individuals with autism spectrum disorder (ASD). However, it remains unclear whether individuals with ASD make high-level social trait judgments from faces in the same way as neurotypical individuals. Here, we comprehensively addressed this question using naturalistic face images and representatively sampled traits. Despite similar underlying dimensional structures across traits, online adult participants with self-reported ASD showed different judgments and reduced specificity within each trait compared with neurotypical individuals. Deep neural networks revealed that these group differences were driven by specific types of faces and differential utilization of features within a face. Our results were replicated in well-characterized in-lab participants and partially generalized to more controlled face images (a preregistered study). By investigating social trait judgments in a broader population, including individuals with neurodevelopmental variations, we found important theoretical implications for the fundamental dimensions, variations, and potential behavioral consequences of social cognition.
Collapse
Affiliation(s)
- Runnan Cao
- Department of Radiology, Washington University in St. Louis
- Lane Department of Computer Science and Electrical Engineering, West Virginia University
| | - Na Zhang
- Lane Department of Computer Science and Electrical Engineering, West Virginia University
| | - Hongbo Yu
- Department of Psychological & Brain Sciences, University of California, Santa Barbara
| | - Paula J. Webster
- Department of Chemical and Biomedical Engineering, West Virginia University
| | - Lynn K. Paul
- Division of the Humanities and Social Sciences, California Institute of Technology
| | - Xin Li
- Lane Department of Computer Science and Electrical Engineering, West Virginia University
| | - Chujun Lin
- Department of Psychology, University of California, San Diego
| | - Shuo Wang
- Department of Radiology, Washington University in St. Louis
- Lane Department of Computer Science and Electrical Engineering, West Virginia University
| |
Collapse
|
37
|
Lin C, Bulls LS, Tepfer LJ, Vyas AD, Thornton MA. Advancing Naturalistic Affective Science with Deep Learning. AFFECTIVE SCIENCE 2023; 4:550-562. [PMID: 37744976 PMCID: PMC10514024 DOI: 10.1007/s42761-023-00215-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/11/2023] [Accepted: 08/03/2023] [Indexed: 09/26/2023]
Abstract
People express their own emotions and perceive others' emotions via a variety of channels, including facial movements, body gestures, vocal prosody, and language. Studying these channels of affective behavior offers insight into both the experience and perception of emotion. Prior research has predominantly focused on studying individual channels of affective behavior in isolation using tightly controlled, non-naturalistic experiments. This approach limits our understanding of emotion in more naturalistic contexts where different channels of information tend to interact. Traditional methods struggle to address this limitation: manually annotating behavior is time-consuming, making it infeasible to do at large scale; manually selecting and manipulating stimuli based on hypotheses may neglect unanticipated features, potentially generating biased conclusions; and common linear modeling approaches cannot fully capture the complex, nonlinear, and interactive nature of real-life affective processes. In this methodology review, we describe how deep learning can be applied to address these challenges to advance a more naturalistic affective science. First, we describe current practices in affective research and explain why existing methods face challenges in revealing a more naturalistic understanding of emotion. Second, we introduce deep learning approaches and explain how they can be applied to tackle three main challenges: quantifying naturalistic behaviors, selecting and manipulating naturalistic stimuli, and modeling naturalistic affective processes. Finally, we describe the limitations of these deep learning methods, and how these limitations might be avoided or mitigated. By detailing the promise and the peril of deep learning, this review aims to pave the way for a more naturalistic affective science.
Collapse
Affiliation(s)
- Chujun Lin
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH USA
| | - Landry S. Bulls
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH USA
| | - Lindsey J. Tepfer
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH USA
| | - Amisha D. Vyas
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH USA
| | - Mark A. Thornton
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH USA
| |
Collapse
|
38
|
Dobs K, Yuan J, Martinez J, Kanwisher N. Behavioral signatures of face perception emerge in deep neural networks optimized for face recognition. Proc Natl Acad Sci U S A 2023; 120:e2220642120. [PMID: 37523537 PMCID: PMC10410721 DOI: 10.1073/pnas.2220642120] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2022] [Accepted: 06/08/2023] [Indexed: 08/02/2023] Open
Abstract
Human face recognition is highly accurate and exhibits a number of distinctive and well-documented behavioral "signatures" such as the use of a characteristic representational space, the disproportionate performance cost when stimuli are presented upside down, and the drop in accuracy for faces from races the participant is less familiar with. These and other phenomena have long been taken as evidence that face recognition is "special". But why does human face perception exhibit these properties in the first place? Here, we use deep convolutional neural networks (CNNs) to test the hypothesis that all of these signatures of human face perception result from optimization for the task of face recognition. Indeed, as predicted by this hypothesis, these phenomena are all found in CNNs trained on face recognition, but not in CNNs trained on object recognition, even when additionally trained to detect faces while matching the amount of face experience. To test whether these signatures are in principle specific to faces, we optimized a CNN on car discrimination and tested it on upright and inverted car images. As we found for face perception, the car-trained network showed a drop in performance for inverted vs. upright cars. Similarly, CNNs trained on inverted faces produced an inverted face inversion effect. These findings show that the behavioral signatures of human face perception reflect and are well explained as the result of optimization for the task of face recognition, and that the nature of the computations underlying this task may not be so special after all.
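A hedged sketch of how an inversion effect can be quantified in a trained network is given below: classification accuracy on upright versus vertically flipped versions of the same images. The model, images, and labels are placeholders, not the face-trained CNNs or datasets used in the paper.

```python
# Measure an "inversion effect" as the accuracy drop for vertically flipped images.
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()       # placeholder for a face-trained CNN
images = torch.randn(16, 3, 224, 224)              # placeholder image batch
labels = torch.randint(0, 1000, (16,))             # placeholder identity labels

def accuracy(batch):
    with torch.no_grad():
        return (model(batch).argmax(dim=1) == labels).float().mean().item()

upright_acc = accuracy(images)
inverted_acc = accuracy(torch.flip(images, dims=[2]))   # flip along the height dimension
print(f"inversion effect (upright - inverted): {upright_acc - inverted_acc:.3f}")
```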
Collapse
Affiliation(s)
- Katharina Dobs
- Department of Psychology, Justus Liebig University Giessen, Giessen 35394, Germany
- Center for Mind, Brain and Behavior (CMBB), University of Marburg and Justus Liebig University Giessen, Marburg 35302, Germany
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Joanne Yuan
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Julio Martinez
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- Department of Psychology, Stanford University, Stanford, CA 94305
| | - Nancy Kanwisher
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
| |
Collapse
|
39
|
Deen B, Schwiedrzik CM, Sliwa J, Freiwald WA. Specialized Networks for Social Cognition in the Primate Brain. Annu Rev Neurosci 2023; 46:381-401. [PMID: 37428602 PMCID: PMC11115357 DOI: 10.1146/annurev-neuro-102522-121410] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/12/2023]
Abstract
Primates have evolved diverse cognitive capabilities to navigate their complex social world. To understand how the brain implements critical social cognitive abilities, we describe functional specialization in the domains of face processing, social interaction understanding, and mental state attribution. Systems for face processing are specialized from the level of single cells to populations of neurons within brain regions to hierarchically organized networks that extract and represent abstract social information. Such functional specialization is not confined to the sensorimotor periphery but appears to be a pervasive theme of primate brain organization all the way to the apex regions of cortical hierarchies. Circuits processing social information are juxtaposed with parallel systems involved in processing nonsocial information, suggesting common computations applied to different domains. The emerging picture of the neural basis of social cognition is a set of distinct but interacting subnetworks involved in component processes such as face perception and social reasoning, traversing large parts of the primate brain.
Collapse
Affiliation(s)
- Ben Deen
- Psychology Department & Tulane Brain Institute, Tulane University, New Orleans, Louisiana, USA
| | - Caspar M Schwiedrzik
- Neural Circuits and Cognition Lab, European Neuroscience Institute Göttingen, A Joint Initiative of the University Medical Center Göttingen and the Max Planck Society; Perception and Plasticity Group, German Primate Center, Leibniz Institute for Primate Research; and Leibniz-Science Campus Primate Cognition, Göttingen, Germany
| | - Julia Sliwa
- Sorbonne Université, Institut du Cerveau, ICM, Inserm, CNRS, APHP, Hôpital de la Pitié Salpêtrière, Paris, France
| | - Winrich A Freiwald
- Laboratory of Neural Systems and The Price Family Center for the Social Brain, The Rockefeller University, New York, NY, USA;
- The Center for Brains, Minds and Machines, Cambridge, Massachusetts, USA
| |
Collapse
|
40
|
Farzmahdi A, Zarco W, Freiwald W, Kriegeskorte N, Golan T. Emergence of brain-like mirror-symmetric viewpoint tuning in convolutional neural networks. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.01.05.522909. [PMID: 36711779 PMCID: PMC9881894 DOI: 10.1101/2023.01.05.522909] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]
Abstract
Primates can recognize objects despite 3D geometric variations such as in-depth rotations. The computational mechanisms that give rise to such invariances are yet to be fully understood. A curious case of partial invariance occurs in the macaque face-patch AL and in fully connected layers of deep convolutional networks in which neurons respond similarly to mirror-symmetric views (e.g., left and right profiles). Why does this tuning develop? Here, we propose a simple learning-driven explanation for mirror-symmetric viewpoint tuning. We show that mirror-symmetric viewpoint tuning for faces emerges in the fully connected layers of convolutional deep neural networks trained on object recognition tasks, even when the training dataset does not include faces. First, using 3D objects rendered from multiple views as test stimuli, we demonstrate that mirror-symmetric viewpoint tuning in convolutional neural network models is not unique to faces: it emerges for multiple object categories with bilateral symmetry. Second, we show why this invariance emerges in the models. Learning to discriminate among bilaterally symmetric object categories induces reflection-equivariant intermediate representations. AL-like mirror-symmetric tuning is achieved when such equivariant responses are spatially pooled by downstream units with sufficiently large receptive fields. These results explain how mirror-symmetric viewpoint tuning can emerge in neural networks, providing a theory of how they might emerge in the primate brain. Our theory predicts that mirror-symmetric viewpoint tuning can emerge as a consequence of exposure to bilaterally symmetric objects beyond the category of faces, and that it can generalize beyond previously experienced object categories.
Collapse
|
41
|
Feather J, Chung S. Unveiling the benefits of multitasking in disentangled representation formation. Trends Cogn Sci 2023:S1364-6613(23)00127-4. [PMID: 37357063 DOI: 10.1016/j.tics.2023.05.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2023] [Accepted: 05/22/2023] [Indexed: 06/27/2023]
Abstract
Johnston and Fusi recently investigated the emergence of disentangled representations when a neural network was trained to perform multiple simultaneous tasks. Such experiments explore the benefits of flexible representations and add to a growing field of research investigating the representational geometry of artificial and biological neural networks.
Collapse
Affiliation(s)
- Jenelle Feather
- Center for Computational Neuroscience, Flatiron Institute, NY, USA; Center for Neural Science, New York University, NY, USA
| | - SueYeon Chung
- Center for Computational Neuroscience, Flatiron Institute, NY, USA; Center for Neural Science, New York University, NY, USA.
| |
Collapse
|
42
|
Doshi FR, Konkle T. Cortical topographic motifs emerge in a self-organized map of object space. SCIENCE ADVANCES 2023; 9:eade8187. [PMID: 37343093 PMCID: PMC10284546 DOI: 10.1126/sciadv.ade8187] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/09/2022] [Accepted: 05/17/2023] [Indexed: 06/23/2023]
Abstract
The human ventral visual stream has a highly systematic organization of object information, but the causal pressures driving these topographic motifs are highly debated. Here, we use self-organizing principles to learn a topographic representation of the data manifold of a deep neural network representational space. We find that a smooth mapping of this representational space showed many brain-like motifs, with a large-scale organization by animacy and real-world object size, supported by mid-level feature tuning, with naturally emerging face- and scene-selective regions. While some theories of the object-selective cortex posit that these differently tuned regions of the brain reflect a collection of distinctly specified functional modules, the present work provides computational support for an alternate hypothesis that the tuning and topography of the object-selective cortex reflect a smooth mapping of a unified representational space.
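The self-organizing-map idea at the heart of this approach can be written compactly: units on a 2D grid learn weight vectors over a feature space so that nearby units come to prefer similar features. The sketch below is a generic SOM over random stand-in DNN features, not the authors' exact training procedure.

```python
# Minimal self-organizing map over stand-in DNN feature vectors.
import numpy as np

rng = np.random.default_rng(0)
features = rng.standard_normal((2000, 64))          # stand-in DNN representations of images
grid_h, grid_w = 20, 20
weights = rng.standard_normal((grid_h, grid_w, 64)) * 0.1
gy, gx = np.meshgrid(np.arange(grid_h), np.arange(grid_w), indexing="ij")

for step in range(5000):
    x = features[rng.integers(len(features))]
    # best-matching unit on the grid
    dists = np.linalg.norm(weights - x, axis=2)
    by, bx = np.unravel_index(np.argmin(dists), dists.shape)
    # neighborhood width and learning rate shrink over training
    sigma = 5.0 * np.exp(-step / 2000)
    lr = 0.5 * np.exp(-step / 2000)
    neighborhood = np.exp(-((gy - by) ** 2 + (gx - bx) ** 2) / (2 * sigma ** 2))
    weights += lr * neighborhood[..., None] * (x - weights)

# after training, neighboring grid units have correlated tuning (a smooth "map")
print(np.corrcoef(weights[0, 0], weights[0, 1])[0, 1])
```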
Collapse
Affiliation(s)
- Fenil R. Doshi
- Department of Psychology and Center for Brain Sciences, Harvard University, Cambridge, MA, USA
| | | |
Collapse
|
43
|
Doerig A, Sommers RP, Seeliger K, Richards B, Ismael J, Lindsay GW, Kording KP, Konkle T, van Gerven MAJ, Kriegeskorte N, Kietzmann TC. The neuroconnectionist research programme. Nat Rev Neurosci 2023:10.1038/s41583-023-00705-w. [PMID: 37253949 DOI: 10.1038/s41583-023-00705-w] [Citation(s) in RCA: 44] [Impact Index Per Article: 22.0] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 04/21/2023] [Indexed: 06/01/2023]
Abstract
Artificial neural networks (ANNs) inspired by biology are beginning to be widely used to model behavioural and neural data, an approach we call 'neuroconnectionism'. ANNs have been not only lauded as the current best models of information processing in the brain but also criticized for failing to account for basic cognitive functions. In this Perspective article, we propose that arguing about the successes and failures of a restricted set of current ANNs is the wrong approach to assess the promise of neuroconnectionism for brain science. Instead, we take inspiration from the philosophy of science, and in particular from Lakatos, who showed that the core of a scientific research programme is often not directly falsifiable but should be assessed by its capacity to generate novel insights. Following this view, we present neuroconnectionism as a general research programme centred around ANNs as a computational language for expressing falsifiable theories about brain computation. We describe the core of the programme, the underlying computational framework and its tools for testing specific neuroscientific hypotheses and deriving novel understanding. Taking a longitudinal view, we review past and present neuroconnectionist projects and their responses to challenges and argue that the research programme is highly progressive, generating new and otherwise unreachable insights into the workings of the brain.
Collapse
Affiliation(s)
- Adrien Doerig
- Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany.
- Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands.
| | - Rowan P Sommers
- Department of Neurobiology of Language, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
| | - Katja Seeliger
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Blake Richards
- Department of Neurology and Neurosurgery, McGill University, Montréal, QC, Canada
- School of Computer Science, McGill University, Montréal, QC, Canada
- Mila, Montréal, QC, Canada
- Montréal Neurological Institute, Montréal, QC, Canada
- Learning in Machines and Brains Program, CIFAR, Toronto, ON, Canada
| | | | | | - Konrad P Kording
- Learning in Machines and Brains Program, CIFAR, Toronto, ON, Canada
- Departments of Bioengineering and Neuroscience, University of Pennsylvania, Philadelphia, PA, USA
| | | | | | | | - Tim C Kietzmann
- Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany
| |
Collapse
|
44
|
Margalit E, Lee H, Finzi D, DiCarlo JJ, Grill-Spector K, Yamins DLK. A Unifying Principle for the Functional Organization of Visual Cortex. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.05.18.541361. [PMID: 37292946 PMCID: PMC10245753 DOI: 10.1101/2023.05.18.541361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
A key feature of many cortical systems is functional organization: the arrangement of neurons with specific functional properties in characteristic spatial patterns across the cortical surface. However, the principles underlying the emergence and utility of functional organization are poorly understood. Here we develop the Topographic Deep Artificial Neural Network (TDANN), the first unified model to accurately predict the functional organization of multiple cortical areas in the primate visual system. We analyze the key factors responsible for the TDANN's success and find that it strikes a balance between two specific objectives: achieving a task-general sensory representation that is self-supervised, and maximizing the smoothness of responses across the cortical sheet according to a metric that scales relative to cortical surface area. In turn, the representations learned by the TDANN are lower dimensional and more brain-like than those in models that lack a spatial smoothness constraint. Finally, we provide evidence that the TDANN's functional organization balances performance with inter-area connection length, and use the resulting models for a proof-of-principle optimization of cortical prosthetic design. Our results thus offer a unified principle for understanding functional organization and a novel view of the functional role of the visual system in particular.
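A spatial-smoothness objective in the spirit described above can be sketched as follows (this is a generic penalty, not the TDANN's exact loss): units are assigned fixed 2D cortical positions, and the loss rewards response correlations that fall off with cortical distance.

```python
# Generic spatial-smoothness penalty: pairwise response correlations should
# track inverse cortical distance. Positions and responses are stand-ins.
import torch

n_units, n_stim = 100, 64
positions = torch.rand(n_units, 2) * 10.0            # simulated cortical coordinates (mm)
responses = torch.randn(n_stim, n_units, requires_grad=True)

def spatial_smoothness_loss(responses, positions):
    r = responses - responses.mean(dim=0, keepdim=True)
    r = r / (r.norm(dim=0, keepdim=True) + 1e-8)
    resp_corr = r.T @ r                               # unit-by-unit response correlations
    inv_dist = 1.0 / (1.0 + torch.cdist(positions, positions))
    iu = torch.triu_indices(n_units, n_units, offset=1)
    a, b = resp_corr[iu[0], iu[1]], inv_dist[iu[0], iu[1]]
    a, b = a - a.mean(), b - b.mean()
    # negative correlation between the two pairwise quantities: lower is smoother
    return -(a * b).sum() / (a.norm() * b.norm() + 1e-8)

loss = spatial_smoothness_loss(responses, positions)
loss.backward()
print(loss.item())
```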
Collapse
Affiliation(s)
- Eshed Margalit
- Neurosciences Graduate Program, Stanford University, Stanford, CA 94305
| | - Hyodong Lee
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Dawn Finzi
- Department of Psychology, Stanford University, Stanford, CA 94305
- Department of Computer Science, Stanford University, Stanford, CA 94305
| | - James J DiCarlo
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
- Center for Brains Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Kalanit Grill-Spector
- Department of Psychology, Stanford University, Stanford, CA 94305
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA 94305
| | - Daniel L K Yamins
- Department of Psychology, Stanford University, Stanford, CA 94305
- Department of Computer Science, Stanford University, Stanford, CA 94305
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA 94305
| |
Collapse
|
45
|
Yovel G, Grosbard I, Abudarham N. Deep learning models challenge the prevailing assumption that face-like effects for objects of expertise support domain-general mechanisms. Proc Biol Sci 2023; 290:20230093. [PMID: 37161322 PMCID: PMC10170201 DOI: 10.1098/rspb.2023.0093] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2023] [Accepted: 04/04/2023] [Indexed: 05/11/2023] Open
Abstract
The question of whether task performance is best achieved by domain-specific or domain-general processing mechanisms is fundamental for both artificial and biological systems. This question has generated a fierce debate in the study of expert object recognition. Because humans are experts in face recognition, face-like neural and cognitive effects for objects of expertise were considered support for domain-general mechanisms. However, effects of domain, experience, and level of categorization are confounded in human studies, which may lead to erroneous inferences. To overcome these limitations, we trained deep learning algorithms on different domains (objects, faces, birds) and levels of categorization (basic, subordinate, individual), matched for amount of experience. Like humans, the models generated a larger inversion effect for faces than for objects. Importantly, a face-like inversion effect was found for individual-based categorization of non-faces (birds), but only in a network specialized for that domain. Thus, contrary to prevalent assumptions, face-like effects for objects of expertise do not support domain-general mechanisms but may originate from domain-specific mechanisms. More generally, we show how deep learning algorithms can be used to dissociate factors that are inherently confounded in the natural environment of biological organisms to test hypotheses about their isolated contributions to cognition and behaviour.
Collapse
Affiliation(s)
- Galit Yovel
- School of Psychological Sciences, Tel Aviv University, Tel Aviv 69987, Israel
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv 69987, Israel
| | - Idan Grosbard
- School of Psychological Sciences, Tel Aviv University, Tel Aviv 69987, Israel
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv 69987, Israel
| | - Naphtali Abudarham
- School of Psychological Sciences, Tel Aviv University, Tel Aviv 69987, Israel
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv 69987, Israel
| |
Collapse
|
46
|
Bracci S, Mraz J, Zeman A, Leys G, Op de Beeck H. The representational hierarchy in human and artificial visual systems in the presence of object-scene regularities. PLoS Comput Biol 2023; 19:e1011086. [PMID: 37115763 PMCID: PMC10171658 DOI: 10.1371/journal.pcbi.1011086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2022] [Revised: 05/10/2023] [Accepted: 04/09/2023] [Indexed: 04/29/2023] Open
Abstract
Human vision is still largely unexplained. Computer vision has made impressive progress on this front, but it is still unclear to what extent artificial neural networks approximate human object vision at the behavioral and neural levels. Here, we investigated whether machine object vision mimics the representational hierarchy of human object vision with an experimental design that allows testing within-domain representations for animals and scenes, as well as across-domain representations reflecting their real-world contextual regularities, such as animal-scene pairs that often co-occur in the visual environment. We found that DCNNs trained on object recognition acquire representations, in their late processing stage, that closely capture human conceptual judgements about the co-occurrence of animals and their typical scenes. Likewise, the DCNNs' representational hierarchy shows surprising similarities with the representational transformations emerging in domain-specific ventrotemporal areas up to domain-general frontoparietal areas. Despite these remarkable similarities, the underlying information processing differs. The ability of neural networks to learn a human-like, high-level conceptual representation of object-scene co-occurrence depends upon the amount of object-scene co-occurrence present in the image set, thus highlighting the fundamental role of training history. Further, although mid/high-level DCNN layers represent the category division for animals and scenes as observed in ventrotemporal cortex (VTC), their information content shows reduced domain-specific representational richness. To conclude, by testing within- and between-domain selectivity while manipulating contextual regularities, we reveal unknown similarities and differences in the information processing strategies employed by human and artificial visual systems.
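The model-to-brain and model-to-behaviour comparison implied here is a standard representational similarity analysis. A minimal sketch follows, with illustrative function names and an assumed Spearman rank comparison; it is not the authors' code.

import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(activations):
    # activations: (n_stimuli, n_features) -> condensed representational dissimilarity matrix
    return pdist(activations, metric='correlation')

def layer_vs_reference(layer_activations, reference_rdm_condensed):
    # Rank-correlate a DCNN layer's RDM with a reference RDM derived from humans
    # (e.g., fMRI responses in a region of interest, or conceptual co-occurrence judgements).
    rho, _ = spearmanr(rdm(layer_activations), reference_rdm_condensed)
    return rho

Repeating this comparison layer by layer and region by region is what yields the hierarchy correspondence the abstract describes.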
Collapse
Affiliation(s)
- Stefania Bracci
- Center for Mind/Brain Sciences-CIMeC, University of Trento, Rovereto, Italy
- KU Leuven, Leuven Brain Institute, Brain & Cognition Research Unit, Leuven, Belgium
| | - Jakob Mraz
- KU Leuven, Leuven Brain Institute, Brain & Cognition Research Unit, Leuven, Belgium
| | - Astrid Zeman
- KU Leuven, Leuven Brain Institute, Brain & Cognition Research Unit, Leuven, Belgium
| | - Gaëlle Leys
- KU Leuven, Leuven Brain Institute, Brain & Cognition Research Unit, Leuven, Belgium
| | - Hans Op de Beeck
- KU Leuven, Leuven Brain Institute, Brain & Cognition Research Unit, Leuven, Belgium
| |
Collapse
|
47
|
Avcu E, Hwang M, Brown KS, Gow DW. A tale of two lexica: Investigating computational pressures on word representation with neural networks. Front Artif Intell 2023; 6:1062230. [PMID: 37051161 PMCID: PMC10083378 DOI: 10.3389/frai.2023.1062230] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2022] [Accepted: 03/10/2023] [Indexed: 03/28/2023] Open
Abstract
Introduction: The notion of a single localized store of word representations has become increasingly less plausible as evidence has accumulated for the widely distributed neural representation of wordform, grounded in motor, perceptual, and conceptual processes. Here, we attempt to combine machine learning methods and neurobiological frameworks to propose a computational model of the brain systems potentially responsible for wordform representation. We tested the hypothesis that the functional specialization of word representation in the brain is driven partly by computational optimization. This hypothesis directly addresses the distinct problems of mapping sound to articulation versus mapping sound to meaning. Results: We found that artificial neural networks trained on the mapping between sound and articulation performed poorly in recognizing the mapping between sound and meaning, and vice versa. Moreover, a network trained on both tasks simultaneously could not discover the features required for efficient mapping between sound and higher-level cognitive states compared to the other two models. Furthermore, these networks developed internal representations reflecting specialized, task-optimized functions without explicit training. Discussion: Together, these findings demonstrate that different task-directed representations lead to more focused responses and better performance of a machine or algorithm and, hypothetically, the brain. Thus, we suggest that the functional specialization of word representation mirrors a computational optimization strategy, given the nature of the tasks that the human brain faces.
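The cross-task test summarized in the Results can be sketched as a frozen-encoder linear probe: freeze a network trained on one mapping and ask how well its features support the other. This is an assumed reconstruction of the logic, not the authors' models or training details.

import torch
import torch.nn as nn

def linear_probe_accuracy(encoder, X, y, n_classes, epochs=50, lr=1e-2):
    # Freeze an encoder trained on one mapping (e.g., sound -> articulation)
    # and fit only a linear readout for the other (e.g., sound -> meaning).
    encoder.eval()
    with torch.no_grad():
        feats = encoder(X)
    probe = nn.Linear(feats.shape[1], n_classes)
    opt = torch.optim.Adam(probe.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(probe(feats), y).backward()
        opt.step()
    with torch.no_grad():
        return (probe(feats).argmax(dim=1) == y).float().mean().item()

Low transfer accuracy in both directions, relative to within-task accuracy, is the pattern reported here as evidence of computational pressure toward specialization.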
Collapse
Affiliation(s)
- Enes Avcu
- Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States
| | | | - Kevin Scott Brown
- Department of Pharmaceutical Sciences and School of Chemical, Biological, and Environmental Engineering, Oregon State University, Corvallis, OR, United States
| | - David W. Gow
- Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States
- Athinoula A. Martinos Center for Biomedical Imaging Massachusetts General Hospital, Charlestown, MA, United States
- Department of Psychology, Salem State University, Salem, MA, United States
- Harvard-MIT Division of Health Sciences and Technology, Boston, MA, United States
| |
Collapse
|
48
|
Kanwisher N, Khosla M, Dobs K. Using artificial neural networks to ask 'why' questions of minds and brains. Trends Neurosci 2023; 46:240-254. [PMID: 36658072 DOI: 10.1016/j.tins.2022.12.008] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2022] [Revised: 11/29/2022] [Accepted: 12/22/2022] [Indexed: 01/19/2023]
Abstract
Neuroscientists have long characterized the properties and functions of the nervous system, and are increasingly succeeding in answering how brains perform the tasks they do. But the question of why brains work the way they do is asked less often. The new ability to optimize artificial neural networks (ANNs) for performance on human-like tasks now enables us to approach these 'why' questions by asking when the properties of networks optimized for a given task mirror the behavioral and neural characteristics of humans performing the same task. Here we highlight the recent success of this strategy in explaining why the visual and auditory systems work the way they do, at both behavioral and neural levels.
Collapse
Affiliation(s)
- Nancy Kanwisher
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Meenakshi Khosla
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Katharina Dobs
- Department of Psychology, Justus Liebig University Giessen, Giessen, Germany; Center for Mind, Brain and Behavior (CMBB), University of Marburg and Justus Liebig University, Giessen, Germany.
| |
Collapse
|
49
|
Schwartz E, O’Nell K, Saxe R, Anzellotti S. Challenging the Classical View: Recognition of Identity and Expression as Integrated Processes. Brain Sci 2023; 13:296. [PMID: 36831839 PMCID: PMC9954353 DOI: 10.3390/brainsci13020296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2022] [Revised: 02/01/2023] [Accepted: 02/02/2023] [Indexed: 02/12/2023] Open
Abstract
Recent neuroimaging evidence challenges the classical view that face identity and facial expression are processed by segregated neural pathways, showing that information about identity and expression are encoded within common brain regions. This article tests the hypothesis that integrated representations of identity and expression arise spontaneously within deep neural networks. A subset of the CelebA dataset is used to train a deep convolutional neural network (DCNN) to label face identity (chance = 0.06%, accuracy = 26.5%), and the FER2013 dataset is used to train a DCNN to label facial expression (chance = 14.2%, accuracy = 63.5%). The identity-trained and expression-trained networks each successfully transfer to labeling both face identity and facial expression on the Karolinska Directed Emotional Faces dataset. This study demonstrates that DCNNs trained to recognize face identity and DCNNs trained to recognize facial expression spontaneously develop representations of facial expression and face identity, respectively. Furthermore, a congruence coefficient analysis reveals that features distinguishing between identities and features distinguishing between expressions become increasingly orthogonal from layer to layer, suggesting that deep neural networks disentangle representational subspaces corresponding to different sources.
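The layer-wise orthogonality analysis can be made concrete with Tucker's congruence coefficient. The sketch below assumes, purely for illustration, that per-layer feature directions separating identities and separating expressions have already been extracted; it is not the authors' code.

import numpy as np

def congruence_coefficient(x, y):
    # Tucker's congruence coefficient between two feature-weight vectors or matrices.
    return float(np.sum(x * y) / np.sqrt(np.sum(x ** 2) * np.sum(y ** 2)))

def layerwise_congruence(identity_dirs, expression_dirs):
    # One value per layer; values approaching 0 indicate increasingly
    # orthogonal identity and expression subspaces across depth.
    return [congruence_coefficient(i, e) for i, e in zip(identity_dirs, expression_dirs)]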
Collapse
Affiliation(s)
- Emily Schwartz
- Department of Psychology and Neuroscience, Boston College, Boston, MA 02467, USA
| | - Kathryn O’Nell
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH 03755, USA
| | - Rebecca Saxe
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Stefano Anzellotti
- Department of Psychology and Neuroscience, Boston College, Boston, MA 02467, USA
| |
Collapse
|
50
|
Han Z, Sereno A. Identifying and Localizing Multiple Objects Using Artificial Ventral and Dorsal Cortical Visual Pathways. Neural Comput 2023; 35:249-275. [PMID: 36543331 DOI: 10.1162/neco_a_01559] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2022] [Accepted: 10/09/2022] [Indexed: 12/24/2022]
Abstract
In our previous study (Han & Sereno, 2022a), we found that two artificial cortical visual pathways trained for either identity or space actively retain information about both identity and space, independently and differently. We also found that this independently and differently retained information about identity and space in two separate pathways may be necessary to accurately and optimally recognize and localize objects. One limitation of our previous study was that there was only one object in each visual image, whereas in reality there may be multiple objects in a scene. In this study, we find we are able to generalize our findings to object recognition and localization tasks where multiple objects are present in each visual image. We constrain the binding problem by training the identity network pathway to report the identities of objects in a given order according to the relative spatial relationships between the objects, given that most visual cortical areas, including high-level ventral stream areas, retain spatial information. Under these conditions, we find that artificial neural networks with two pathways for identity and space perform better on multiple-object recognition and localization tasks (higher average testing accuracy, lower testing accuracy variance, less training time) than artificial neural networks with a single pathway. We also find that the required number of training samples and the required training time increase quickly, and potentially exponentially, as the number of objects in each image increases, and we suggest that binding information from multiple objects simultaneously within any network (cortical area) induces conflict or competition and may be part of the reason why our brain has limited attentional and visual working memory capacities.
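The contrast between single- and two-pathway architectures can be sketched as a shared input feeding two separate pathways, one reporting ordered identities and one reporting locations. Layer sizes, the slot-based output format, and all parameter names below are placeholders, not the authors' architecture.

import torch
import torch.nn as nn

class TwoPathwayNet(nn.Module):
    def __init__(self, in_dim=784, hidden=256, n_objects=2, n_classes=10, n_locations=16):
        super().__init__()
        # Two separate pathways, analogous to ventral ("what") and dorsal ("where") streams.
        self.identity_path = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                           nn.Linear(hidden, n_objects * n_classes))
        self.spatial_path = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                          nn.Linear(hidden, n_objects * n_locations))
        self.n_objects, self.n_classes, self.n_locations = n_objects, n_classes, n_locations

    def forward(self, x):
        # Identity logits per object slot, with slots ordered by the objects'
        # relative spatial positions; location logits per slot.
        ident = self.identity_path(x).view(-1, self.n_objects, self.n_classes)
        loc = self.spatial_path(x).view(-1, self.n_objects, self.n_locations)
        return ident, loc

A single-pathway baseline of comparable size, trained to produce both outputs from one hidden representation, is the comparison against which the dual-pathway advantage described above would be measured.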
Collapse
Affiliation(s)
- Zhixian Han
- Department of Psychological Sciences, Purdue University, West Lafayette, IN 47907, U.S.A.
| | - Anne Sereno
- Department of Psychological Sciences and Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN 47907, U.S.A.
| |
Collapse
|