1
|
Bülthoff I, Manno L, Zhao M. Varying sex and identity of faces affects face categorization differently in humans and computational models. Sci Rep 2023; 13:16120. [PMID: 37752212 PMCID: PMC10522766 DOI: 10.1038/s41598-023-43169-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Accepted: 09/20/2023] [Indexed: 09/28/2023] Open
Abstract
Our faces display socially important sex and identity information. How perceptually independent are these facial characteristics? Here, we used a sex categorization task to investigate how changing faces in terms of either their sex or identity affects sex categorization of those faces, whether these manipulations affect sex categorization similarly when the original faces were personally familiar or unknown, and whether computational models trained for sex classification respond similarly to human observers. Our results show that varying faces along either the sex or the identity dimension affects their sex categorization. When the sex was swapped (e.g., female faces became male looking, Experiment 1), sex categorization performance was different from that with the original unchanged faces, and significantly more so for people who were familiar with the original faces than those who were not. When the identity of the faces was manipulated by caricaturing or anti-caricaturing them (these manipulations either augment or diminish idiosyncratic facial information, Experiment 2), sex categorization performance to caricatured, original, and anti-caricatured faces increased in that order, independently of face familiarity. Moreover, our face manipulations showed different effects upon computational models trained for sex classification and elicited different patterns of responses in humans and computational models. These results not only support the notion that the sex and identity of faces are processed integratively by human observers but also demonstrate that computational models of face categorization may not capture key characteristics of human face categorization.
|
2
|
Roy Chowdhury P, Singh Wadhwa A, Tyagi N. Brain inspired face recognition: A computational framework. Cogn Syst Res 2022. [DOI: 10.1016/j.cogsys.2022.11.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
|
3
|
Face dissimilarity judgments are predicted by representational distance in morphable and image-computable models. Proc Natl Acad Sci U S A 2022; 119:e2115047119. [PMID: 35767642 PMCID: PMC9271164 DOI: 10.1073/pnas.2115047119] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 11/29/2022] Open
Abstract
Discerning the subtle differences between individuals’ faces is crucial for social functioning. It requires us not only to solve general challenges of object recognition (e.g., invariant recognition over changes in view or lighting) but also to be attuned to the specific ways in which face structure varies. Three-dimensional morphable models based on principal component analyses of real faces provide descriptions of statistical differences between faces, as well as tools to generate novel faces. We rendered large sets of realistic face pairs from such a model and collected similarity and same/different identity judgments. The statistical model predicted human perception as well as state-of-the-art image-computable neural networks. Results underscore the statistical tuning of face encoding. Human vision is attuned to the subtle differences between individual faces. Yet we lack a quantitative way of predicting how similar two face images look and whether they appear to show the same person. Principal component–based three-dimensional (3D) morphable models are widely used to generate stimuli in face perception research. These models capture the distribution of real human faces in terms of dimensions of physical shape and texture. How well does a “face space” based on these dimensions capture the similarity relationships humans perceive among faces? To answer this, we designed a behavioral task to collect dissimilarity and same/different identity judgments for 232 pairs of realistic faces. Stimuli sampled geometric relationships in a face space derived from principal components of 3D shape and texture (Basel face model [BFM]). We then compared a wide range of models in their ability to predict the data, including the BFM from which faces were generated, an active appearance model derived from face photographs, and image-computable models of visual perception. Euclidean distance in the BFM explained both dissimilarity and identity judgments surprisingly well. 
In a comparison against 16 diverse models, BFM distance was competitive with representational distances in state-of-the-art deep neural networks (DNNs), including novel DNNs trained on BFM synthetic identities or BFM latents. Models capturing the distribution of face shape and texture across individuals are not only useful tools for stimulus generation. They also capture important information about how faces are perceived, suggesting that human face representations are tuned to the statistical distribution of faces.
|
4
|
Bülthoff I, Zhao M. Average faces: How does the averaging process change faces physically and perceptually? Cognition 2021; 216:104867. [PMID: 34364004 DOI: 10.1016/j.cognition.2021.104867] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/14/2020] [Revised: 07/27/2021] [Accepted: 07/28/2021] [Indexed: 12/01/2022]
Abstract
Average faces have been used frequently in face recognition studies, either as a theoretical concept (e.g., face norm) or as a tool to manipulate facial attributes (e.g., modifying identity strength). Nonetheless, how the face averaging process (the creation of average faces using an increasing number of faces) changes the resulting averaged faces and our ability to differentiate between them remains to be elucidated. Here we addressed these questions by combining 3D-face averaging, eye-movement tracking, and the computation of image-based face similarity. Participants judged whether two average faces showed the same person while we systematically increased their average level (i.e., number of faces being averaged). Our results showed, with increasing averaging, both a nonlinear increase of the computational similarity between the resulting average faces and a nonlinear decrease of face discrimination performance. Participants' performance dropped from near-ceiling level when two different faces had been averaged together to chance level when 80 faces were mixed. We also found a nonlinear relationship between face similarity and face discrimination performance, which was fitted nicely with an exponential function. Furthermore, when the comparison task became more challenging, participants performed more fixations onto the faces. Nonetheless, the distribution of fixations across facial features (eyes, nose, mouth, and the center area of a face) remained unchanged. These results not only set new constraints on the theoretical characterization of the average face and its role in establishing face norms but also offer practical guidance for creating approximated face norms to manipulate face identity.
Affiliation(s)
- Mintao Zhao
  - Max Planck Institute for Biological Cybernetics, Germany; University of East Anglia, United Kingdom
|
5
|
Newport C, Wallis G, Siebeck UE. Object recognition in fish: accurate discrimination across novel views of an unfamiliar object category (human faces). Anim Behav 2018. [DOI: 10.1016/j.anbehav.2018.09.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/31/2022]
|
6
|
Hurley RS, Mesulam MM, Sridhar J, Rogalski EJ, Thompson CK. A nonverbal route to conceptual knowledge involving the right anterior temporal lobe. Neuropsychologia 2018; 117:92-101. [PMID: 29802865 DOI: 10.1016/j.neuropsychologia.2018.05.019] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 11/21/2017] [Revised: 05/18/2018] [Accepted: 05/22/2018] [Indexed: 11/19/2022]
Abstract
The semantic variant of primary progressive aphasia (PPA-S) is diagnosed based on impaired single-word comprehension, but nonverbal impairments in face and object recognition can also be present, particularly in later disease stages. PPA-S is associated with focal atrophy in the left anterior temporal lobe (ATL), often accompanied by a lesser degree of atrophy in the right ATL. According to a dual-route account, the left ATL is critical for verbal access to conceptual knowledge while nonverbal access to conceptual knowledge depends upon the integrity of right ATL. Consistent with this view, single-word comprehension deficits in PPA-S have consistently been linked to the degree of atrophy in left ATL. In the current study we examined object processing and cortical thickness in 19 patients diagnosed with PPA-S, to evaluate the hypothesis that nonverbal object impairments would instead be determined by the amount of atrophy in the right ATL. All patients demonstrated inability to access conceptual knowledge on standardized tests with word stimuli: they were unable to match spoken words with their corresponding pictures on the Peabody Picture Vocabulary Test. Only a minority of patients, however, performed abnormally on an experimental thematic verification task, which requires judgments as to whether pairs of object pictures are thematically-associated, and does not rely on auditory or visual word input. The entire PPA-S group showed cortical thinning in left ATL, but atrophy in right ATL was more prominent in the subgroup with low verification scores. Thematic verification scores were correlated with cortical thickness in the right rather than left ATL, an asymmetric mapping which persisted when controlling for the degree of atrophy in the contralateral hemisphere. 
These results are consistent with a dual-route account of conceptual knowledge: breakdown of the verbal left hemispheric route produces an aphasic syndrome, which is only accompanied by visual object processing impairments when the nonverbal right hemispheric route is also compromised.
Affiliation(s)
- Robert S Hurley
  - Cognitive Neurology & Alzheimer's Disease Center, Northwestern University, Chicago, IL 60611, USA; Department of Neurology, Northwestern University, Chicago, IL 60611, USA; Department of Psychology, Cleveland State University, Cleveland, OH 44115, USA
- M-Marsel Mesulam
  - Cognitive Neurology & Alzheimer's Disease Center, Northwestern University, Chicago, IL 60611, USA; Department of Neurology, Northwestern University, Chicago, IL 60611, USA
- Jaiashre Sridhar
  - Cognitive Neurology & Alzheimer's Disease Center, Northwestern University, Chicago, IL 60611, USA
- Emily J Rogalski
  - Cognitive Neurology & Alzheimer's Disease Center, Northwestern University, Chicago, IL 60611, USA
- Cynthia K Thompson
  - Cognitive Neurology & Alzheimer's Disease Center, Northwestern University, Chicago, IL 60611, USA; Department of Neurology, Northwestern University, Chicago, IL 60611, USA; Department of Communications Sciences and Disorders, Northwestern University, Chicago, IL 60611, USA
|
7
|
Sampaio C, Reinke V, Mathews J, Swart A, Wallinger S. High confidence in falsely recognizing prototypical faces. Q J Exp Psychol (Hove) 2018; 71:1348-1356. [DOI: 10.1080/17470218.2017.1329844] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/19/2022]
Abstract
We applied a metacognitive approach to investigate confidence in recognition of prototypical faces. Participants were presented with sets of faces constructed digitally as deviations from prototype/base faces. Participants were then tested with a simple recognition task (Experiment 1) or a multiple-choice task (Experiment 2) for old and new items plus new prototypes, and they showed a high rate of confident false alarms to the prototypes. The confidence-accuracy relationship in this face recognition paradigm was found to be positive for standard items but negative for the prototypes; thus, it was contingent on the nature of the items used. The data have implications for lineups that employ match-to-suspect strategies.
Affiliation(s)
- Cristina Sampaio
  - Department of Psychology, Western Washington University, Bellingham, WA, USA
- Victoria Reinke
  - Department of Psychology, Western Washington University, Bellingham, WA, USA
- Jeffrey Mathews
  - Department of Psychology, Western Washington University, Bellingham, WA, USA
- Alexandra Swart
  - Department of Psychology, Western Washington University, Bellingham, WA, USA
- Stephen Wallinger
  - Department of Psychology, Western Washington University, Bellingham, WA, USA
|
8
|
Orth UR, Cornwell TB, Ohlhoff J, Naber C. Seeing faces: The role of brand visual processing and social connection in brand liking. European Journal of Social Psychology 2017. [DOI: 10.1002/ejsp.2245] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022]
Affiliation(s)
- Ulrich R. Orth
  - Consumer Psychology; Christian-Albrechts-University Kiel; Kiel Germany
  - Business School; University of Adelaide; Adelaide Australia
  - Ehrenberg-Bass-Institute for Marketing Science; University of South Australia; Adelaide Australia
- T. Bettina Cornwell
  - Lundquist College of Business, Lillis 476; University of Oregon; Eugene Oregon USA
- Jana Ohlhoff
  - Department of A&F Marketing-Consumer Psychology; Christian-Albrechts-University Kiel; Kiel Germany
- Christiane Naber
  - Department of A&F Marketing-Consumer Psychology; Christian-Albrechts-University Kiel; Kiel Germany
|
9
|
Damon F, Mottier H, Méary D, Pascalis O. A Review of Attractiveness Preferences in Infancy: From Faces to Objects. Adaptive Human Behavior and Physiology 2017. [DOI: 10.1007/s40750-017-0071-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/19/2022]
|
10
|
Damon F, Méary D, Quinn PC, Lee K, Simpson EA, Paukner A, Suomi SJ, Pascalis O. Preference for facial averageness: Evidence for a common mechanism in human and macaque infants. Sci Rep 2017; 7:46303. [PMID: 28406237 PMCID: PMC5390246 DOI: 10.1038/srep46303] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 09/15/2016] [Accepted: 03/14/2017] [Indexed: 11/12/2022] Open
Abstract
Human adults and infants show a preference for average faces, which could stem from a general processing mechanism and may be shared among primates. However, little is known about preference for facial averageness in monkeys. We used a comparative developmental approach and eye-tracking methodology to assess visual attention in human and macaque infants to faces naturally varying in their distance from a prototypical face. In Experiment 1, we examined the preference for faces relatively close to or far from the prototype in 12-month-old human infants with human adult female faces. Infants preferred faces closer to the average than faces farther from it. In Experiment 2, we measured the looking time of 3-month-old rhesus macaques (Macaca mulatta) viewing macaque faces varying in their distance from the prototype. Like human infants, macaque infants looked longer to faces closer to the average. In Experiments 3 and 4, both species were presented with unfamiliar categories of faces (i.e., macaque infants tested with adult macaque faces; human infants and adults tested with infant macaque faces) and showed no prototype preferences, suggesting that the prototypicality effect is experience-dependent. Overall, the findings suggest a common processing mechanism across species, leading to averageness preferences in primates.
Affiliation(s)
- Fabrice Damon
  - Univ. Grenoble-Alpes, LPNC, France
  - CNRS, LPNC, UMR 5105, France
- David Méary
  - Univ. Grenoble-Alpes, LPNC, France
  - CNRS, LPNC, UMR 5105, France
- Annika Paukner
  - Eunice Kennedy Shriver National Institute of Child Health and Human Development, USA
- Stephen J. Suomi
  - Eunice Kennedy Shriver National Institute of Child Health and Human Development, USA
|
11
|
Erickson WB, Lampinen JM, Frowd CD, Mahoney G. When age-progressed images are unreliable: The roles of external features and age range. Sci Justice 2017; 57:136-143. [PMID: 28284439 DOI: 10.1016/j.scijus.2016.11.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 05/01/2016] [Revised: 11/20/2016] [Accepted: 11/29/2016] [Indexed: 10/20/2022]
Abstract
When children go missing for many years, investigators commission age-progressed images from forensic artists to depict an updated appearance. These images have anecdotal success, and systematic research has found they lead to accurate recognition rates comparable to outdated photos. The present study examines the reliability of age progressions of the same individuals created by different artists. Eight artists first generated age progressions of eight targets across three age ranges. Eighty-five participants then evaluated the similarity of these images against other images depicting the same targets progressed at the same age ranges, viewing either whole faces or faces with external features concealed. Similarities were highest over shorter age ranges and when external features were concealed. Implications drawn from theory and application are discussed.
|
12
|
Discrimination of human faces by archerfish (Toxotes chatareus). Sci Rep 2016; 6:27523. [PMID: 27272551 PMCID: PMC4895153 DOI: 10.1038/srep27523] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Received: 12/02/2015] [Accepted: 05/20/2016] [Indexed: 11/30/2022] Open
Abstract
Two rival theories of how humans recognize faces exist: (i) recognition is innate, relying on specialized neocortical circuitry, and (ii) recognition is a learned expertise, relying on general object recognition pathways. Here, we explore whether animals without a neocortex can learn to recognize human faces. Human facial recognition has previously been demonstrated for birds; however, they are now known to possess neocortex-like structures. Also, with much of the work done in domesticated pigeons, one cannot rule out the possibility that they have developed adaptations for human face recognition. Fish do not appear to possess neocortex-like cells, and given their lack of direct exposure to humans, are unlikely to have evolved any specialized capabilities for human facial recognition. Using a two-alternative forced-choice procedure, we show that archerfish (Toxotes chatareus) can learn to discriminate a large number of human face images (Experiment 1, 44 faces), even after controlling for colour, head-shape and brightness (Experiment 2, 18 faces). This study not only demonstrates that archerfish have impressive pattern discrimination abilities, but also provides evidence that a vertebrate lacking a neocortex and without an evolutionary prerogative to discriminate human faces can nonetheless do so to a high degree of accuracy.
|
13
|
A specialized face-processing model inspired by the organization of monkey face patches explains several face-specific phenomena observed in humans. Sci Rep 2016; 6:25025. [PMID: 27113635 PMCID: PMC4844965 DOI: 10.1038/srep25025] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 09/28/2015] [Accepted: 04/08/2016] [Indexed: 11/30/2022] Open
Abstract
Converging reports indicate that face images are processed through specialized neural networks in the brain, i.e., face patches in monkeys and the fusiform face area (FFA) in humans. These studies were designed to find out how faces are processed in the visual system compared to other objects. Yet, the underlying mechanism of face processing is not completely revealed. Here, we show that a hierarchical computational model, inspired by electrophysiological evidence on face processing in primates, is able to generate representational properties similar to those observed in monkey face patches (posterior, middle and anterior patches). Since the most important goal of sensory neuroscience is linking the neural responses with behavioral outputs, we test whether the proposed model, which is designed to account for neural responses in monkey face patches, is also able to predict well-documented behavioral face phenomena observed in humans. We show that the proposed model satisfies several cognitive face effects, such as the composite face effect and the idea of canonical face views. Our model provides insights about the underlying computations that transfer visual information from posterior to anterior face patches.
|
14
|
Wood JN, Prasad A, Goldman JG, Wood SMW. Enhanced learning of natural visual sequences in newborn chicks. Anim Cogn 2016; 19:835-45. [PMID: 27079969 DOI: 10.1007/s10071-016-0982-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 10/30/2015] [Revised: 02/06/2016] [Accepted: 03/31/2016] [Indexed: 10/21/2022]
Abstract
To what extent are newborn brains designed to operate over natural visual input? To address this question, we used a high-throughput controlled-rearing method to examine whether newborn chicks (Gallus gallus) show enhanced learning of natural visual sequences at the onset of vision. We took the same set of images and grouped them into either natural sequences (i.e., sequences showing different viewpoints of the same real-world object) or unnatural sequences (i.e., sequences showing different images of different real-world objects). When raised in virtual worlds containing natural sequences, newborn chicks developed the ability to recognize familiar images of objects. Conversely, when raised in virtual worlds containing unnatural sequences, newborn chicks' object recognition abilities were severely impaired. In fact, the majority of the chicks raised with the unnatural sequences failed to recognize familiar images of objects despite acquiring over 100 h of visual experience with those images. Thus, newborn chicks show enhanced learning of natural visual sequences at the onset of vision. These results indicate that newborn brains are designed to operate over natural visual input.
Affiliation(s)
- Justin N Wood
  - Department of Psychology, University of Southern California, Building SGM, Room 501, 3620 South McClintock Avenue, Los Angeles, CA, 90089, USA
- Aditya Prasad
  - Department of Psychology, University of Southern California, Building SGM, Room 501, 3620 South McClintock Avenue, Los Angeles, CA, 90089, USA
- Jason G Goldman
  - Department of Psychology, University of Southern California, Building SGM, Room 501, 3620 South McClintock Avenue, Los Angeles, CA, 90089, USA
- Samantha M W Wood
  - Department of Psychology, University of Southern California, Building SGM, Room 501, 3620 South McClintock Avenue, Los Angeles, CA, 90089, USA
|
15
|
Dahl CD, Rasch MJ, Bülthoff I, Chen CC. Integration or separation in the processing of facial properties--a computational view. Sci Rep 2016; 6:20247. [PMID: 26829891 PMCID: PMC4735755 DOI: 10.1038/srep20247] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 09/06/2015] [Accepted: 12/31/2015] [Indexed: 11/10/2022] Open
Abstract
A face recognition system ought to read out information about the identity, facial expression and invariant properties of faces, such as sex and race. A current debate is whether separate neural units in the brain deal with these face properties individually or whether a single neural unit processes in parallel all aspects of faces. While the focus of studies has been directed toward the processing of identity and facial expression, little research exists on the processing of invariant aspects of faces. In a theoretical framework we tested whether a system can deal with identity in combination with sex, race or facial expression using the same underlying mechanism. We used dimension reduction to describe how the representational face space organizes face properties when trained on different aspects of faces. When trained to learn identities, the system not only successfully recognized identities, but also was immediately able to classify sex and race, suggesting that no additional system for the processing of invariant properties is needed. However, training on identity was insufficient for the recognition of facial expressions and vice versa. We provide a theoretical approach on the interconnection of invariant facial properties and the separation of variant and invariant facial properties.
Affiliation(s)
- Christoph D. Dahl
  - Department of Psychology, National Taiwan University, Roosevelt Road, Taipei 106, Taiwan
  - Department of Comparative Cognition, Institute of Biology, University of Neuchâtel, 2000, Rue Emile-Argand 11, Neuchâtel, Switzerland
- Malte J. Rasch
  - State Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research, Beijing Normal University, Xinjiekouwai Street 19, 100875 Beijing, China
- Isabelle Bülthoff
  - Max Planck Institute for Biological Cybernetics, Human Perception, Cognition and Action, Spemannstrasse 38, 72074 Tübingen, Germany
- Chien-Chung Chen
  - Department of Psychology, National Taiwan University, Roosevelt Road, Taipei 106, Taiwan
|
16
|
Leibo JZ, Liao Q, Anselmi F, Poggio T. The Invariance Hypothesis Implies Domain-Specific Regions in Visual Cortex. PLoS Comput Biol 2015; 11:e1004390. [PMID: 26496457 PMCID: PMC4619805 DOI: 10.1371/journal.pcbi.1004390] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 06/17/2014] [Accepted: 05/11/2015] [Indexed: 12/24/2022] Open
Abstract
Is visual cortex made up of general-purpose information processing machinery, or does it consist of a collection of specialized modules? If prior knowledge, acquired from learning a set of objects is only transferable to new objects that share properties with the old, then the recognition system's optimal organization must be one containing specialized modules for different object classes. Our analysis starts from a premise we call the invariance hypothesis: that the computational goal of the ventral stream is to compute an invariant-to-transformations and discriminative signature for recognition. The key condition enabling approximate transfer of invariance without sacrificing discriminability turns out to be that the learned and novel objects transform similarly. This implies that the optimal recognition system must contain subsystems trained only with data from similarly-transforming objects and suggests a novel interpretation of domain-specific regions like the fusiform face area (FFA). Furthermore, we can define an index of transformation-compatibility, computable from videos, that can be combined with information about the statistics of natural vision to yield predictions for which object categories ought to have domain-specific regions in agreement with the available data. The result is a unifying account linking the large literature on view-based recognition with the wealth of experimental evidence concerning domain-specific regions.
Affiliation(s)
- Joel Z. Leibo
  - Center for Brains, Minds, and Machines, MIT, Cambridge, Massachusetts, United States of America
  - McGovern Institute for Brain Research, MIT, Cambridge, Massachusetts, United States of America
- Qianli Liao
  - Center for Brains, Minds, and Machines, MIT, Cambridge, Massachusetts, United States of America
  - McGovern Institute for Brain Research, MIT, Cambridge, Massachusetts, United States of America
- Fabio Anselmi
  - Center for Brains, Minds, and Machines, MIT, Cambridge, Massachusetts, United States of America
  - McGovern Institute for Brain Research, MIT, Cambridge, Massachusetts, United States of America
  - Istituto Italiano di Tecnologia, Genova, Italy
- Tomaso Poggio
  - Center for Brains, Minds, and Machines, MIT, Cambridge, Massachusetts, United States of America
  - McGovern Institute for Brain Research, MIT, Cambridge, Massachusetts, United States of America
  - Istituto Italiano di Tecnologia, Genova, Italy
|
17
|
Robinson L, Rolls ET. Invariant visual object recognition: biologically plausible approaches. Biological Cybernetics 2015; 109:505-35. [PMID: 26335743 PMCID: PMC4572081 DOI: 10.1007/s00422-015-0658-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 02/03/2015] [Accepted: 08/14/2015] [Indexed: 06/01/2023]
Abstract
Key properties of inferior temporal cortex neurons are described, and then, the biological plausibility of two leading approaches to invariant visual object recognition in the ventral visual system is assessed to investigate whether they account for these properties. Experiment 1 shows that VisNet performs object classification with random exemplars comparably to HMAX, except that the final layer C neurons of HMAX have a very non-sparse representation (unlike that in the brain) that provides little information in the single-neuron responses about the object class. Experiment 2 shows that VisNet forms invariant representations when trained with different views of each object, whereas HMAX performs poorly when assessed with a biologically plausible pattern association network, as HMAX has no mechanism to learn view invariance. Experiment 3 shows that VisNet neurons do not respond to scrambled images of faces, and thus encode shape information. HMAX neurons responded with similarly high rates to the unscrambled and scrambled faces, indicating that low-level features including texture may be relevant to HMAX performance. Experiment 4 shows that VisNet can learn to recognize objects even when the view provided by the object changes catastrophically as it transforms, whereas HMAX has no learning mechanism in its S-C hierarchy that provides for view-invariant learning. This highlights some requirements for the neurobiological mechanisms of high-level vision, and how some different approaches perform, in order to help understand the fundamental underlying principles of invariant visual object recognition in the ventral visual stream.
Affiliation(s)
- Leigh Robinson
  - Department of Computer Science, University of Warwick, Coventry, UK
- Edmund T Rolls
  - Department of Computer Science, University of Warwick, Coventry, UK
  - Oxford Centre for Computational Neuroscience, Oxford, UK
|
18
|
Boutet I, Taler V, Collin CA. On the particular vulnerability of face recognition to aging: a review of three hypotheses. Front Psychol 2015; 6:1139. [PMID: 26347670 PMCID: PMC4543816 DOI: 10.3389/fpsyg.2015.01139] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 04/30/2015] [Accepted: 07/22/2015] [Indexed: 11/30/2022] Open
Abstract
Age-related face recognition deficits are characterized by high false alarms to unfamiliar faces, are not as pronounced for other complex stimuli, and are only partially related to general age-related impairments in cognition. This paper reviews some of the underlying processes likely to be implicated in these deficits by focusing on areas where contradictions abound as a means to highlight avenues for future research. Research pertaining to the three following hypotheses is presented: (i) perceptual deterioration, (ii) encoding of configural information, and (iii) difficulties in recollecting contextual information. The evidence surveyed provides support for the idea that all three factors are likely to contribute, under certain conditions, to the deficits in face recognition seen in older adults. We discuss how these different factors might interact in the context of a generic framework of the different stages implicated in face recognition. Several suggestions for future investigations are outlined.
Affiliation(s)
- Isabelle Boutet
  - School of Psychology, University of Ottawa, Ottawa, ON, Canada
- Vanessa Taler
  - School of Psychology, University of Ottawa, Ottawa, ON, Canada; School of Psychology, Bruyère Research Institute, Ottawa, ON, Canada
|
19
|
Eguchi A, Mender BMW, Evans BD, Humphreys GW, Stringer SM. Computational modeling of the neural representation of object shape in the primate ventral visual system. Front Comput Neurosci 2015; 9:100. [PMID: 26300766 PMCID: PMC4523947 DOI: 10.3389/fncom.2015.00100] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 06/07/2015] [Accepted: 07/17/2015] [Indexed: 11/13/2022] Open
Abstract
Neurons in successive stages of the primate ventral visual pathway encode the spatial structure of visual objects. In this paper, we investigate through computer simulation how these cell firing properties may develop through unsupervised visually-guided learning. Individual neurons in the model are shown to exploit statistical regularity and temporal continuity of the visual inputs during training to learn firing properties that are similar to neurons in V4 and TEO. Neurons in V4 encode the conformation of boundary contour elements at a particular position within an object regardless of the location of the object on the retina, while neurons in TEO integrate information from multiple boundary contour elements. This representation goes beyond mere object recognition, in which neurons simply respond to the presence of a whole object, but provides an essential foundation from which the brain is subsequently able to recognize the whole object.
Affiliation(s)
- Akihiro Eguchi
  - Department of Experimental Psychology, Oxford Centre for Theoretical Neuroscience and Artificial Intelligence, Oxford University, Oxford, UK
- Bedeho M. W. Mender
  - Department of Experimental Psychology, Oxford Centre for Theoretical Neuroscience and Artificial Intelligence, Oxford University, Oxford, UK
- Benjamin D. Evans
  - Department of Experimental Psychology, Oxford Centre for Theoretical Neuroscience and Artificial Intelligence, Oxford University, Oxford, UK
- Glyn W. Humphreys
  - Department of Experimental Psychology, Oxford Cognitive Neuropsychology Centre, Oxford University, Oxford, UK
- Simon M. Stringer
  - Department of Experimental Psychology, Oxford Centre for Theoretical Neuroscience and Artificial Intelligence, Oxford University, Oxford, UK
|
20
|
Blaser R, Heyser C. Spontaneous object recognition: a promising approach to the comparative study of memory. Front Behav Neurosci 2015; 9:183. [PMID: 26217207 PMCID: PMC4498097 DOI: 10.3389/fnbeh.2015.00183] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.8] [Received: 04/02/2015] [Accepted: 06/29/2015] [Indexed: 01/11/2023] Open
Abstract
Spontaneous recognition of a novel object is a popular measure of exploratory behavior, perception and recognition memory in rodent models. Because of its relative simplicity and speed of testing, the variety of stimuli that can be used, and its ecological validity across species, it is also an attractive task for comparative research. To date, variants of this test have been used with vertebrate and invertebrate species, but the methods have seldom been sufficiently standardized to allow cross-species comparison. Here, we review the methods necessary for the study of novel object recognition in mammalian and non-mammalian models, as well as the results of these experiments. Critical to the use of this test is an understanding of the organism's initial response to a novel object, the modulation of exploration by context, and species differences in object perception and exploratory behaviors. We argue that with appropriate consideration of species differences in perception, object affordances, and natural exploratory behaviors, the spontaneous object recognition test can be a valid and versatile tool for translational research with non-mammalian models.
Affiliation(s)
- Rachel Blaser
  - Department of Psychological Sciences, University of San Diego, San Diego, CA, USA
- Charles Heyser
  - Behavioral Testing Core, Department of Neurosciences, University of California, San Diego, San Diego, CA, USA
|
21
|
Abstract
The idea that faces are represented within a structured face space (Valentine, Quarterly Journal of Experimental Psychology 43: 161-204, 1991) has gained considerable experimental support, from both physiological and perceptual studies. Recent work has also shown that faces can even be recognized haptically, that is, from touch alone. Although some evidence favors congruent processing strategies in the visual and haptic processing of faces, the question of how similar the two modalities are in terms of face processing remains open. Here, this question was addressed by asking whether there is evidence for a haptic face space, and if so, how it compares to visual face space. For this, a physical face space was created, consisting of six laser-scanned individual faces, their morphed average, 50%-morphs between two individual faces, as well as 50%-morphs of the individual faces with the average, resulting in a set of 19 faces. Participants then rated either the visual or haptic pairwise similarity of the tangible 3-D face shapes. Multidimensional scaling analyses showed that both modalities extracted perceptual spaces that conformed to critical predictions of the face space framework, hence providing support for similar processing of complex face shapes in haptics and vision. Despite the overall similarities, however, systematic differences also emerged between the visual and haptic data. These differences are discussed in the context of face processing and complex-shape processing in vision and haptics.
|
22
|
Ramon M. Differential Processing of Vertical Interfeature Relations Due to Real-Life Experience with Personally Familiar Faces. Perception 2015; 44:368-82. [DOI: 10.1068/p7909] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 10/23/2022]
Abstract
Identification of personally familiar faces is possibly the most complex and likewise efficient task achieved by the human visual system, yet to date the mechanisms underlying this extreme proficiency remain largely unknown. Building on empirical evidence from unfamiliar face processing in healthy populations and neuropsychological patients, the present work aimed to determine the type of information processed differently due to repeated, real-life experience with faces. A modulatory effect of familiarity was observed for processing of vertical interfeature distances, which have been suggested to rely on holistic processing skills. Contrariwise, no such effect was found for processing of information that can be discriminated locally (i.e., featural cues, interocular distances). The results indicate that familiarity-related advantages in face processing may arise from more efficient, or increased, holistic processing.
Affiliation(s)
- Meike Ramon
  - Institute of Research in Psychology and Institute of Neuroscience, University of Louvain, 10 place du Cardinal Mercier, B1348 Louvain-La-Neuve, Belgium; and Institute of Neuroscience and Psychology, University of Glasgow, UK
|
23
|
Dahl CD, Chen CC, Rasch MJ. Own-race and own-species advantages in face perception: a computational view. Sci Rep 2014; 4:6654. [PMID: 25323815 PMCID: PMC4200398 DOI: 10.1038/srep06654] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 06/30/2014] [Accepted: 09/10/2014] [Indexed: 11/17/2022] Open
Abstract
The frequency with which an organism is exposed to a particular type of face influences recognition performance. For example, Asians are better at individuating Asian than Caucasian faces, known as the own-race advantage. Similarly, humans in general are better at individuating human than monkey faces, known as the own-species advantage. It is an open question whether the underlying mechanisms causing these effects are similar. We hypothesize that these processes are governed by neural plasticity of the face discrimination system to retain optimal discrimination performance in its environment. Using common face features derived from a set of images from various face classes, we show that maximizing the feature variance between different individuals while ensuring minimal variance within individuals achieved good discrimination performance on own-class faces when selecting a subset of feature dimensions. Further, the selected subset of features does not necessarily lead to an optimal performance on the other class of faces. Thus, the face discrimination system continuously re-optimizes its space constraint face representation to optimize recognition performance on the current distribution of faces in its environment. This model can account for both the own-race and own-species advantages. We name this approach Space Constraint Optimized Representational Embedding (SCORE).
Affiliation(s)
- Christoph D. Dahl
  - Department of Psychology, National Taiwan University, Roosevelt Road, Taipei, Taiwan (ROC)
- Chien-Chung Chen
  - Department of Psychology, National Taiwan University, Roosevelt Road, Taipei, Taiwan (ROC)
- Malte J. Rasch
  - State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, China
  - Center for Collaboration and Innovation in Brain and Learning Sciences, Beijing Normal University, China
|
24
|
Rolls ET, Webb TJ. Finding and recognizing objects in natural scenes: complementary computations in the dorsal and ventral visual systems. Front Comput Neurosci 2014; 8:85. [PMID: 25161619 PMCID: PMC4130325 DOI: 10.3389/fncom.2014.00085] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 05/21/2014] [Accepted: 07/16/2014] [Indexed: 01/09/2023] Open
Abstract
Searching for and recognizing objects in complex natural scenes is implemented by multiple saccades until the eyes reach within the reduced receptive field sizes of inferior temporal cortex (IT) neurons. We analyze and model how the dorsal and ventral visual streams both contribute to this. Saliency detection in the dorsal visual system including area LIP is modeled by graph-based visual saliency, and allows the eyes to fixate potential objects within several degrees. Visual information at the fixated location subtending approximately 9° corresponding to the receptive fields of IT neurons is then passed through a four layer hierarchical model of the ventral cortical visual system, VisNet. We show that VisNet can be trained using a synaptic modification rule with a short-term memory trace of recent neuronal activity to capture both the required view and translation invariances to allow in the model approximately 90% correct object recognition for 4 objects shown in any view across a range of 135° anywhere in a scene. The model was able to generalize correctly within the four trained views and the 25 trained translations. This approach analyses the principles by which complementary computations in the dorsal and ventral visual cortical streams enable objects to be located and recognized in complex natural scenes.
Affiliation(s)
- Edmund T. Rolls
  - Department of Computer Science, University of Warwick, Coventry, UK
  - Oxford Centre for Computational Neuroscience, Oxford, UK
- Tristan J. Webb
  - Department of Computer Science, University of Warwick, Coventry, UK
|
25
|
Webb TJ, Rolls ET. Deformation-specific and deformation-invariant visual object recognition: pose vs. identity recognition of people and deforming objects. Front Comput Neurosci 2014; 8:37. [PMID: 24744725 PMCID: PMC3978248 DOI: 10.3389/fncom.2014.00037] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 11/18/2013] [Accepted: 03/12/2014] [Indexed: 11/18/2022] Open
Abstract
When we see a human sitting down, standing up, or walking, we can recognize one of these poses independently of the individual, or we can recognize the individual person, independently of the pose. The same issues arise for deforming objects. For example, if we see a flag deformed by the wind, either blowing out or hanging languidly, we can usually recognize the flag, independently of its deformation; or we can recognize the deformation independently of the identity of the flag. We hypothesize that these types of recognition can be implemented by the primate visual system using temporo-spatial continuity as objects transform as a learning principle. In particular, we hypothesize that pose or deformation can be learned under conditions in which large numbers of different people are successively seen in the same pose, or objects in the same deformation. We also hypothesize that person-specific representations that are independent of pose, and object-specific representations that are independent of deformation and view, could be built, when individual people or objects are observed successively transforming from one pose or deformation and view to another. These hypotheses were tested in a simulation of the ventral visual system, VisNet, that uses temporal continuity, implemented in a synaptic learning rule with a short-term memory trace of previous neuronal activity, to learn invariant representations. It was found that depending on the statistics of the visual input, either pose-specific or deformation-specific representations could be built that were invariant with respect to individual and view; or that identity-specific representations could be built that were invariant with respect to pose or deformation and view. We propose that this is how pose-specific and pose-invariant, and deformation-specific and deformation-invariant, perceptual representations are built in the brain.
Affiliation(s)
- Tristan J. Webb
  - Department of Computer Science, University of Warwick, Coventry, UK
- Edmund T. Rolls
  - Department of Computer Science, University of Warwick, Coventry, UK
  - Oxford Centre for Computational Neuroscience, Oxford, UK
|
26
|
Watson TL, Robbins RA. The nature of holistic processing in face and object recognition: current opinions. Front Psychol 2014; 5:3. [PMID: 24478737 PMCID: PMC3901004 DOI: 10.3389/fpsyg.2014.00003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 12/08/2013] [Accepted: 01/03/2014] [Indexed: 11/14/2022] Open
Affiliation(s)
- Tamara L Watson
  - Foundational Processes of Behaviour Research Laboratories, School of Social Science and Psychology, University of Western Sydney, Sydney, NSW, Australia
- Rachel A Robbins
  - Foundational Processes of Behaviour Research Laboratories, School of Social Science and Psychology, University of Western Sydney, Sydney, NSW, Australia
|