1. Phillips PJ, White D. The state of modelling face processing in humans with deep learning. Br J Psychol 2025. [PMID: 40364689] [DOI: 10.1111/bjop.12794]
Abstract
Deep learning models trained for facial recognition now surpass the highest-performing human participants. Recent evidence suggests that they also model some qualitative aspects of face processing in humans. This review compares the current understanding of deep learning models with psychological models of the face processing system. Psychological models consist of two components that operate on the information encoded when people perceive a face, which we refer to here as 'face codes'. The first component, the core system, extracts face codes from retinal input that encode invariant and changeable properties. The second component, the extended system, links face codes to personal information about a person and their social context. Studies of face codes in existing deep learning models reveal some surprising results. For example, face codes in networks designed for identity recognition also encode expression information, which contrasts with psychological models that separate invariant and changeable properties. Deep learning can also be used to implement candidate models of the face processing system, for example to compare alternative cognitive architectures and codes that might support interchange between core and extended face processing systems. We conclude by summarizing seven key lessons from this research and outlining three open questions for future study.
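A minimal sketch of the kind of probe analysis behind the claim that identity codes also carry expression information: train a linear classifier to decode expression labels from identity-network embeddings and compare its accuracy to chance. The embeddings and labels below are random placeholders, and the choice of network layer and expression set is an assumption, not something specified in the review.

```python
# Linear-probe sketch: do "face codes" from an identity-trained network also
# carry expression information? In practice the embeddings would come from the
# penultimate layer of a face-identification CNN run on an expression-labelled
# face set; here they are random placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_faces, dim, n_expressions = 600, 512, 7            # e.g. 7 basic expressions (assumption)
face_codes = rng.normal(size=(n_faces, dim))          # placeholder identity embeddings
expression_labels = rng.integers(0, n_expressions, size=n_faces)

# If expression decodes well above chance (1/7 here), the identity code is not
# purely "invariant": changeable properties leak into it.
probe = LogisticRegression(max_iter=2000)
scores = cross_val_score(probe, face_codes, expression_labels, cv=5)
print(f"expression decoding accuracy: {scores.mean():.2f} (chance = {1/n_expressions:.2f})")
```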
Affiliation(s)
- David White
- School of Psychology, UNSW Sydney, Sydney, New South Wales, Australia
2. Shoura M, Walther DB, Nestor A. Unraveling other-race face perception with GAN-based image reconstruction. Behav Res Methods 2025; 57:115. [PMID: 40087201] [DOI: 10.3758/s13428-025-02636-z]
Abstract
The other-race effect (ORE) is the disadvantage in recognizing faces of a race other than one's own. While its prevalence is well documented behaviorally, the representational basis of the ORE remains unclear. This study employs StyleGAN2, a deep learning technique for generating photorealistic images, to uncover face representations and to investigate the ORE's representational basis. To this end, we collected pairwise visual similarity ratings for same- and other-race faces from East Asian and White participants exhibiting robust levels of ORE. Leveraging the significant overlap in representational similarity between the GAN's latent space and perceptual representations in human participants, we designed an image reconstruction approach aimed at revealing internal face representations from behavioral similarity data. This methodology yielded hyper-realistic depictions of face percepts, with reconstruction accuracy well above chance, as well as an accuracy advantage for same-race over other-race reconstructions, mirroring the ORE in both populations. Further, a comparison of reconstructions across participant race revealed a novel age bias: other-race face reconstructions appeared younger than their same-race counterparts. Thus, our work proposes a new approach to exploiting GANs for image reconstruction and opens new avenues in the study of the ORE.
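An illustrative sketch of the reconstruction logic (behavioral similarity ratings to a low-dimensional face space, then to generator latents). The actual pipeline uses StyleGAN2 latents and human ratings; both are replaced here by random stand-ins, and the ridge mapping is an assumed linking step rather than the authors' exact method.

```python
# Sketch: embed pairwise similarity ratings with MDS, then learn a linear map
# from the behavioral face space to (stand-in) generator latents; the held-out
# face's predicted latent would then be decoded into an image by the generator.
import numpy as np
from sklearn.manifold import MDS
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
n_faces, latent_dim, behav_dim = 40, 512, 20

# Stand-in behavioral data: a symmetric dissimilarity matrix (1 - similarity).
sim = rng.uniform(0, 1, size=(n_faces, n_faces))
dissim = 1 - (sim + sim.T) / 2
np.fill_diagonal(dissim, 0)

# Step 1: recover a behavioral face space from the ratings with metric MDS.
behav_space = MDS(n_components=behav_dim, dissimilarity="precomputed",
                  random_state=0).fit_transform(dissim)

# Step 2: map behavioral coordinates to placeholder GAN latents, leaving one
# face out; its predicted latent is what would be fed to the generator.
latents = rng.normal(size=(n_faces, latent_dim))      # placeholder StyleGAN2 latents
train, test = slice(0, n_faces - 1), slice(n_faces - 1, n_faces)
mapper = Ridge(alpha=1.0).fit(behav_space[train], latents[train])
reconstructed_latent = mapper.predict(behav_space[test])
print(reconstructed_latent.shape)                     # (1, 512)
```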
Affiliation(s)
- Moaz Shoura
- Department of Psychology at Scarborough, University of Toronto, 1265 Military Trail, Scarborough, ON, M1C 1A4, Canada.
- Dirk B Walther
- Department of Psychology, University of Toronto, Toronto, Canada
- Adrian Nestor
- Department of Psychology at Scarborough, University of Toronto, 1265 Military Trail, Scarborough, ON, M1C 1A4, Canada
3. Kanwisher N. Animal models of the human brain: Successes, limitations, and alternatives. Curr Opin Neurobiol 2025; 90:102969. [PMID: 39914250] [DOI: 10.1016/j.conb.2024.102969]
Abstract
The last three decades of research in human cognitive neuroscience have given us an initial "parts list" for the human mind in the form of a set of cortical regions with distinct and often very specific functions. But current neuroscientific methods in humans have limited ability to reveal exactly what these regions represent and compute, the causal role of each in behavior, and the interactions among regions that produce real-world cognition. Animal models can help to answer these questions when homologues exist in other species, like the face system in macaques. When homologues do not exist in animals, for example for speech and music perception, and understanding of language or other people's thoughts, intracranial recordings in humans play a central role, along with a new alternative to animal models: artificial neural networks.
Affiliation(s)
- Nancy Kanwisher
- Department of Brain & Cognitive Sciences, Massachusetts Institute of Technology, United States.
4. Duyck S, Costantino AI, Bracci S, Op de Beeck H. A computational deep learning investigation of animacy perception in the human brain. Commun Biol 2024; 7:1718. [PMID: 39741161] [DOI: 10.1038/s42003-024-07415-8]
Abstract
The functional organization of the human object vision pathway distinguishes between animate and inanimate objects. To understand animacy perception, we explore the case of zoomorphic objects that resemble animals. While perceiving these objects as animal-like seems obvious to humans, this "Animal bias" marks a striking discrepancy between the human brain and deep neural networks (DNNs). We computationally investigated the potential origins of this bias and successfully induced it in DNNs trained explicitly with zoomorphic objects, whereas alternative training schedules failed to induce an Animal bias. We considered the superordinate distinction between animate and inanimate classes, the sensitivity for faces and bodies, the bias for shape over texture, the role of ecologically valid categories, recurrent connections, and language-informed visual processing. These findings provide computational support that the Animal bias for zoomorphic objects is a unique property of human perception, yet one that can be explained by human learning history.
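One hedged way to operationalize an "Animal bias" in a network's feature space, which is not necessarily the paper's own analysis: check whether zoomorphic objects sit closer to real animals than to ordinary inanimate objects in the model's representation. Features below are random placeholders standing in for activations from a chosen DNN layer.

```python
# Quantify a simple Animal-bias index: mean zoomorphic-to-animal feature
# correlation minus mean zoomorphic-to-inanimate correlation.
import numpy as np

rng = np.random.default_rng(2)
feat = {cat: rng.normal(size=(50, 256))               # 50 images x 256 features per category
        for cat in ("animals", "zoomorphic", "inanimate")}

def mean_correlation(a, b):
    """Average Pearson correlation between all image pairs across two categories."""
    a = (a - a.mean(1, keepdims=True)) / a.std(1, keepdims=True)
    b = (b - b.mean(1, keepdims=True)) / b.std(1, keepdims=True)
    return (a @ b.T / a.shape[1]).mean()

to_animals = mean_correlation(feat["zoomorphic"], feat["animals"])
to_inanimate = mean_correlation(feat["zoomorphic"], feat["inanimate"])
print(f"Animal-bias index: {to_animals - to_inanimate:+.3f}  (> 0 suggests a bias)")
```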
Affiliation(s)
- Stefanie Duyck
- Brain and Cognition, Faculty of Psychology and Educational Sciences, KU Leuven, Leuven, Belgium
- Andrea I Costantino
- Brain and Cognition, Faculty of Psychology and Educational Sciences, KU Leuven, Leuven, Belgium
- Stefania Bracci
- Center for Mind/Brain Sciences (CIMeC), University of Trento, Trento, Italy
- Hans Op de Beeck
- Brain and Cognition, Faculty of Psychology and Educational Sciences, KU Leuven, Leuven, Belgium
5. Bongrand P. Should Artificial Intelligence Play a Durable Role in Biomedical Research and Practice? Int J Mol Sci 2024; 25:13371. [PMID: 39769135] [PMCID: PMC11676049] [DOI: 10.3390/ijms252413371]
Abstract
During the last decade, artificial intelligence (AI) has been applied to nearly all domains of human activity, including scientific research. It is thus warranted to ask whether AI should be durably involved in biomedical research. This problem was addressed by examining three complementary questions. (i) What are the major barriers currently met by biomedical investigators? It is suggested that during the last two decades there was a shift towards a growing need to elucidate complex systems, and that this need was not sufficiently met by previously successful methods such as theoretical modeling or computer simulation. (ii) What is the potential of AI to meet the aforementioned need? It is suggested that recent AI methods are well suited to performing classification and prediction tasks on multivariate systems, and may help with data interpretation, provided their efficiency is properly validated. (iii) Recent representative results obtained with machine learning suggest that AI efficiency may be comparable to that of human operators. It is concluded that AI should durably play an important role in biomedical practice. Also, as already suggested in other scientific domains such as physics, combining AI with conventional methods might generate further progress and new applications involving heuristics and data interpretation.
Affiliation(s)
- Pierre Bongrand
- Laboratory Adhesion and Inflammation (LAI), Inserm UMR 1067, CNRS UMR 7333, Aix-Marseille Université UM 61, 13009 Marseille, France
6. Prince JS, Alvarez GA, Konkle T. Contrastive learning explains the emergence and function of visual category-selective regions. Sci Adv 2024; 10:eadl1776. [PMID: 39321304] [PMCID: PMC11423896] [DOI: 10.1126/sciadv.adl1776]
Abstract
Modular and distributed coding theories of category selectivity along the human ventral visual stream have long existed in tension. Here, we present a reconciling framework, contrastive coding, based on a series of analyses relating category selectivity within biological and artificial neural networks. We discover that, in models trained with contrastive self-supervised objectives over a rich natural image diet, category-selective tuning naturally emerges for faces, bodies, scenes, and words. Further, lesions of these model units lead to selective, dissociable recognition deficits, highlighting their distinct functional roles in information processing. Finally, these pre-identified units can predict neural responses in all corresponding face-, scene-, body-, and word-selective regions of human visual cortex, under a highly constrained sparse positive encoding procedure. The success of this single model indicates that brain-like functional specialization can emerge without category-specific learning pressures, as the system learns to untangle rich image content. Contrastive coding, therefore, provides a unifying account of object category emergence and representation in the human brain.
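A brief sketch of the selectivity-and-lesioning logic described above: rank units by a face-versus-object contrast, silence the most face-selective ones, and look at the effect on face-evoked responses. The activations are random placeholders for a real model layer, and the d'-like selectivity index is an assumption rather than the paper's exact criterion.

```python
# Identify "face units" by a selectivity contrast, then lesion (zero) them.
import numpy as np

rng = np.random.default_rng(3)
n_units = 1024
acts = {"faces": rng.normal(1.0, 1.0, size=(200, n_units)),
        "objects": rng.normal(0.0, 1.0, size=(200, n_units))}

face_mean, other_mean = acts["faces"].mean(0), acts["objects"].mean(0)
pooled_sd = np.sqrt((acts["faces"].var(0) + acts["objects"].var(0)) / 2) + 1e-9
selectivity = (face_mean - other_mean) / pooled_sd     # d'-like index per unit

face_selective = np.argsort(selectivity)[-50:]         # top 50 most face-selective units

def lesion(activation_batch, units):
    out = activation_batch.copy()
    out[:, units] = 0.0                                # silence the chosen units
    return out

lesioned_faces = lesion(acts["faces"], face_selective)
print("mean face-evoked response before vs after lesion:",
      acts["faces"].mean().round(3), lesioned_faces.mean().round(3))
```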
Affiliation(s)
- Jacob S Prince
- Department of Psychology, Harvard University, Cambridge, MA, USA
- George A Alvarez
- Department of Psychology, Harvard University, Cambridge, MA, USA
- Talia Konkle
- Department of Psychology, Harvard University, Cambridge, MA, USA
- Center for Brain Science, Harvard University, Cambridge, MA, USA
- Kempner Institute for Biological and Artificial Intelligence, Harvard University, Cambridge, MA, USA
7. Wood JN, Pandey L, Wood SMW. Digital Twin Studies for Reverse Engineering the Origins of Visual Intelligence. Annu Rev Vis Sci 2024; 10:145-170. [PMID: 39292554] [DOI: 10.1146/annurev-vision-101322-103628]
Abstract
What are the core learning algorithms in brains? Nativists propose that intelligence emerges from innate domain-specific knowledge systems, whereas empiricists propose that intelligence emerges from domain-general systems that learn domain-specific knowledge from experience. We address this debate by reviewing digital twin studies designed to reverse engineer the learning algorithms in newborn brains. In digital twin studies, newborn animals and artificial agents are raised in the same environments and tested with the same tasks, permitting direct comparison of their learning abilities. Supporting empiricism, digital twin studies show that domain-general algorithms learn animal-like object perception when trained on the first-person visual experiences of newborn animals. Supporting nativism, digital twin studies show that domain-general algorithms produce innate domain-specific knowledge when trained on prenatal experiences (retinal waves). We argue that learning across humans, animals, and machines can be explained by a universal principle, which we call space-time fitting. Space-time fitting explains both empiricist and nativist phenomena, providing a unified framework for understanding the origins of intelligence.
Affiliation(s)
- Justin N Wood
- Informatics Department, Indiana University Bloomington, Bloomington, Indiana, USA
- Cognitive Science Program, Indiana University Bloomington, Bloomington, Indiana, USA
- Neuroscience Department, Indiana University Bloomington, Bloomington, Indiana, USA
- Lalit Pandey
- Informatics Department, Indiana University Bloomington, Bloomington, Indiana, USA
- Samantha M W Wood
- Informatics Department, Indiana University Bloomington, Bloomington, Indiana, USA
- Cognitive Science Program, Indiana University Bloomington, Bloomington, Indiana, USA
- Neuroscience Department, Indiana University Bloomington, Bloomington, Indiana, USA
8. Tousi E, Mur M. The face inversion effect through the lens of deep neural networks. Proc Biol Sci 2024; 291:20241342. [PMID: 39137884] [PMCID: PMC11321844] [DOI: 10.1098/rspb.2024.1342]
Affiliation(s)
- Ehsan Tousi
- Department of Psychology, Western University, 1151 Richmond Street, London, Ontario N6A 3K7, Canada
- Neuroscience Graduate Program, Western University, 1151 Richmond Street, London, Ontario N6A 3K7, Canada
- Marieke Mur
- Department of Psychology, Western University, 1151 Richmond Street, London, Ontario N6A 3K7, Canada
- Department of Computer Science, Western University, 1151 Richmond Street, London, Ontario N6A 3K7, Canada
9. Yovel G, Abudarham N. Why psychologists should embrace rather than abandon DNNs. Behav Brain Sci 2023; 46:e414. [PMID: 38054326] [DOI: 10.1017/s0140525x2300167x]
Abstract
Deep neural networks (DNNs) are powerful computational models, which generate complex, high-level representations that were missing in previous models of human cognition. By studying these high-level representations, psychologists can now gain new insights into the nature and origin of human high-level vision that were not possible with traditional handcrafted models. Abandoning DNNs would be a huge oversight for psychological sciences.
Affiliation(s)
- Galit Yovel
- School of Psychological Sciences, Tel Aviv University, Tel Aviv, Israel; https://people.socsci.tau.ac.il/mu/galityovel/
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Naphtali Abudarham
- School of Psychological Sciences, Tel Aviv University, Tel Aviv, Israel; https://people.socsci.tau.ac.il/mu/galityovel/
10. Dobs K, Yuan J, Martinez J, Kanwisher N. Behavioral signatures of face perception emerge in deep neural networks optimized for face recognition. Proc Natl Acad Sci U S A 2023; 120:e2220642120. [PMID: 37523537] [PMCID: PMC10410721] [DOI: 10.1073/pnas.2220642120]
Abstract
Human face recognition is highly accurate and exhibits a number of distinctive and well-documented behavioral "signatures", such as the use of a characteristic representational space, the disproportionate performance cost when stimuli are presented upside down, and the drop in accuracy for faces from races the participant is less familiar with. These and other phenomena have long been taken as evidence that face recognition is "special". But why does human face perception exhibit these properties in the first place? Here, we use deep convolutional neural networks (CNNs) to test the hypothesis that all of these signatures of human face perception result from optimization for the task of face recognition. Indeed, as predicted by this hypothesis, these phenomena are all found in CNNs trained on face recognition, but not in CNNs trained on object recognition, even when additionally trained to detect faces while matching the amount of face experience. To test whether these signatures are in principle specific to faces, we optimized a CNN on car discrimination and tested it on upright and inverted car images. As we found for face perception, the car-trained network showed a drop in performance for inverted vs. upright cars. Similarly, CNNs trained on inverted faces produced an inverted face inversion effect. These findings show that the behavioral signatures of human face perception reflect, and are well explained by, optimization for the task of face recognition, and that the nature of the computations underlying this task may not be so special after all.
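A hedged sketch of how an inversion effect can be quantified in a model, in the spirit of this study: run the same identity-matching test on upright and upside-down images and take the accuracy difference. The "embedding" below is a placeholder random projection rather than a trained face-recognition CNN, and the images and identity labels are synthetic.

```python
# Face-inversion-effect measurement: upright minus inverted matching accuracy.
import numpy as np

rng = np.random.default_rng(4)
n_ids, imgs_per_id, h, w = 20, 2, 32, 32
images = rng.uniform(size=(n_ids * imgs_per_id, h, w))
labels = np.repeat(np.arange(n_ids), imgs_per_id)

proj = rng.normal(size=(h * w, 128))
def embed(imgs):
    return imgs.reshape(len(imgs), -1) @ proj          # stand-in for CNN features

def matching_accuracy(imgs):
    """1-nearest-neighbour identity matching, excluding the probe image itself."""
    z = embed(imgs)
    z /= np.linalg.norm(z, axis=1, keepdims=True)
    sims = z @ z.T
    np.fill_diagonal(sims, -np.inf)                    # exclude self-matches
    return (labels[sims.argmax(1)] == labels).mean()

upright = matching_accuracy(images)
inverted = matching_accuracy(images[:, ::-1, :])       # flip images upside down
print(f"inversion effect (upright - inverted accuracy): {upright - inverted:+.3f}")
```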
Affiliation(s)
- Katharina Dobs
- Department of Psychology, Justus Liebig University Giessen, Giessen 35394, Germany
- Center for Mind, Brain and Behavior (CMBB), University of Marburg and Justus Liebig University Giessen, Marburg 35302, Germany
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
- Joanne Yuan
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- Julio Martinez
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- Department of Psychology, Stanford University, Stanford, CA 94305
- Nancy Kanwisher
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139