1. Sun Y, Yao L, Fu Q. Crossmodal Correspondence Mediates Crossmodal Transfer from Visual to Auditory Stimuli in Category Learning. J Intell 2024; 12:80. PMID: 39330459; PMCID: PMC11433196; DOI: 10.3390/jintelligence12090080.
Abstract
This article investigated whether crossmodal correspondence, as a sensory translation phenomenon, can mediate crossmodal transfer from visual to auditory stimuli in category learning and whether multimodal category learning can influence the crossmodal correspondence between auditory and visual stimuli. Experiment 1 showed that the category knowledge acquired from elevation stimuli affected the categorization of pitch stimuli when there were robust crossmodal correspondence effects between elevation and size, indicating that crossmodal transfer occurred between elevation and pitch stimuli. Experiments 2 and 3 revealed that the size category knowledge could not be transferred to the categorization of pitches, but interestingly, size and pitch category learning determined the direction of the pitch-size correspondence, suggesting that the pitch-size correspondence was not stable and could be determined using multimodal category learning. Experiment 4 provided further evidence that there was no crossmodal transfer between size and pitch, due to the absence of a robust pitch-size correspondence. These results demonstrated that crossmodal transfer can occur between audio-visual stimuli with crossmodal correspondence, and multisensory category learning can change the corresponding relationship between audio-visual stimuli. These findings suggest that crossmodal transfer and crossmodal correspondence share similar abstract representations, which can be mediated by semantic content such as category labels.
Affiliation(s)
- Ying Sun
- State Key Laboratory of Brain and Cognitive Science, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 101408, China
- College of Humanities and Education, Inner Mongolia Medical University, Hohhot 010110, China
- Liansheng Yao
- State Key Laboratory of Brain and Cognitive Science, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 101408, China
- Qiufang Fu
- State Key Laboratory of Brain and Cognitive Science, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 101408, China
2. Cao S, Kelly J, Nyugen C, Chow HM, Leonardo B, Sabov A, Ciaramitaro VM. Prior visual experience increases children's use of effective haptic exploration strategies in audio-tactile sound-shape correspondences. J Exp Child Psychol 2024; 241:105856. PMID: 38306737; DOI: 10.1016/j.jecp.2023.105856.
Abstract
Sound-shape correspondence refers to the preferential mapping of information across the senses, such as associating a nonsense word like bouba with rounded abstract shapes and kiki with spiky abstract shapes. Here we focused on audio-tactile (AT) sound-shape correspondences between nonsense words and abstract shapes that are felt but not seen. Despite previous research indicating a role for visual experience in establishing AT associations, it remains unclear how visual experience facilitates AT correspondences. Here we investigated one hypothesis: seeing the abstract shapes improves haptic exploration by (a) increasing effective haptic strategies and/or (b) decreasing ineffective haptic strategies. We analyzed five haptic strategies in video recordings of 6- to 8-year-old children obtained in a previous study. We found the dominant strategy used to explore shapes differed based on visual experience. Effective strategies, which provide information about shape, were dominant in participants with prior visual experience, whereas ineffective strategies, which do not provide information about shape, were dominant in participants without prior visual experience. With prior visual experience, poking, an effective and efficient strategy, was dominant, whereas without prior visual experience, uncategorizable and ineffective strategies were dominant. These findings suggest that prior visual experience of abstract shapes in 6- to 8-year-olds can increase the effectiveness and efficiency of haptic exploration, potentially explaining why prior visual experience can increase the strength of AT sound-shape correspondences.
Affiliation(s)
- Shibo Cao
- Department of Psychology, University of Massachusetts Boston, Boston, MA 02125, USA
- Julia Kelly
- Department of Psychology, University of Massachusetts Boston, Boston, MA 02125, USA
- Cuong Nyugen
- Department of Psychology, University of Massachusetts Boston, Boston, MA 02125, USA
- Hiu Mei Chow
- Department of Psychology, University of Massachusetts Boston, Boston, MA 02125, USA; Department of Psychology, St. Thomas University, Fredericton, New Brunswick E3B 5G3, Canada
- Brianna Leonardo
- Department of Psychology, University of Massachusetts Boston, Boston, MA 02125, USA
- Aleksandra Sabov
- Department of Psychology, University of Massachusetts Boston, Boston, MA 02125, USA
- Vivian M Ciaramitaro
- Department of Psychology, University of Massachusetts Boston, Boston, MA 02125, USA
3. Fiser J, Lengyel G. Statistical Learning in Vision. Annu Rev Vis Sci 2022; 8.
Abstract
Vision and learning have long been considered to be two areas of research linked only distantly. However, recent developments in vision research have changed the conceptual definition of vision from a signal-evaluating process to a goal-oriented interpreting process, and this shift binds learning, together with the resulting internal representations, intimately to vision. In this review, we consider various types of learning (perceptual, statistical, and rule/abstract) associated with vision in the past decades and argue that they represent differently specialized versions of the fundamental learning process, which must be captured in its entirety when applied to complex visual processes. We show why the generalized version of statistical learning can provide the appropriate setup for such a unified treatment of learning in vision, what computational framework best accommodates this kind of statistical learning, and what plausible neural scheme could feasibly implement this framework. Finally, we list the challenges that the field of statistical learning faces in fulfilling the promise of being the right vehicle for advancing our understanding of vision in its entirety.
Affiliation(s)
- József Fiser
- Department of Cognitive Science, Center for Cognitive Computation, Central European University, Vienna 1100, Austria
- Gábor Lengyel
- Department of Brain and Cognitive Sciences, University of Rochester, Rochester, New York 14627, USA
4. Yang Y, Piantadosi ST. One model for the learning of language. Proc Natl Acad Sci U S A 2022; 119:e2021865119. PMID: 35074868; PMCID: PMC8812683; DOI: 10.1073/pnas.2021865119.
Abstract
A major goal of linguistics and cognitive science is to understand what class of learning systems can acquire natural language. Until recently, the computational requirements of language have been used to argue that learning is impossible without a highly constrained hypothesis space. Here, we describe a learning system that is maximally unconstrained, operating over the space of all computations, and is able to acquire many of the key structures present in natural language from positive evidence alone. We demonstrate this by providing the same learning model with data from 74 distinct formal languages which have been argued to capture key features of language, have been studied in experimental work, or come from an interesting complexity class. The model is able to successfully induce the latent system generating the observed strings from small amounts of evidence in almost all cases, including for regular (e.g., aⁿ, [Formula: see text], and [Formula: see text]), context-free (e.g., [Formula: see text], and [Formula: see text]), and context-sensitive (e.g., [Formula: see text], and xx) languages, as well as for many languages studied in learning experiments. These results show that relatively small amounts of positive evidence can support learning of rich classes of generative computations over structures. The model provides an idealized learning setup upon which additional cognitive constraints and biases can be formalized.
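To make the abstract's core inference concrete, the sketch below illustrates Bayesian learning of a formal language from positive examples only, using the size principle (more restrictive languages assign higher probability to each consistent string). It is a toy illustration, not the authors' model: the hypothesis set, string length cutoff, and example data are assumptions, and the actual model searches a far richer space of programs.

```python
# Toy sketch (illustrative assumptions, not the paper's implementation):
# score a few hand-coded candidate languages against positive example strings.
from itertools import product

MAX_LEN = 8

def enumerate_strings(max_len, alphabet="ab"):
    for n in range(1, max_len + 1):
        for s in product(alphabet, repeat=n):
            yield "".join(s)

# Candidate hypotheses: membership predicates over strings of a, b.
HYPOTHESES = {
    "a^n":     lambda s: set(s) <= {"a"},
    "(ab)^n":  lambda s: s == "ab" * (len(s) // 2) and len(s) % 2 == 0,
    "a^n b^n": lambda s: s == "a" * (len(s) // 2) + "b" * (len(s) // 2) and len(s) % 2 == 0,
    "any a,b": lambda s: True,
}

def posterior(data, prior=None):
    prior = prior or {h: 1.0 / len(HYPOTHESES) for h in HYPOTHESES}
    post = {}
    for name, member in HYPOTHESES.items():
        extension = [s for s in enumerate_strings(MAX_LEN) if member(s)]
        like = 1.0
        for s in data:
            # size principle: each string is drawn uniformly from the language
            like *= (1.0 / len(extension)) if s in extension else 0.0
        post[name] = prior[name] * like
    z = sum(post.values()) or 1.0
    return {h: p / z for h, p in post.items()}

if __name__ == "__main__":
    observed = ["ab", "aabb", "aaabbb"]            # positive evidence only
    for h, p in sorted(posterior(observed).items(), key=lambda kv: -kv[1]):
        print(f"{h:>8}: {p:.3f}")                  # 'a^n b^n' dominates
```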
Affiliation(s)
- Yuan Yang
- College of Computing, Georgia Institute of Technology, Atlanta, GA 30332
- Steven T Piantadosi
- Department of Psychology, Helen Wills Neuroscience Institute, University of California, Berkeley, CA 94720
5. Visual and Tactile Sensory Systems Share Common Features in Object Recognition. eNeuro 2021; 8:ENEURO.0101-21.2021. PMID: 34544756; PMCID: PMC8493885; DOI: 10.1523/eneuro.0101-21.2021.
Abstract
Although we use our visual and tactile sensory systems interchangeably for object recognition on a daily basis, little is known about the mechanism underlying this ability. This study examined how 3D shape features of objects form two congruent and interchangeable visual and tactile perceptual spaces in healthy male and female participants. Since active exploration plays an important role in shape processing, a virtual reality environment was used to visually explore 3D objects called digital embryos without using the tactile sense. In addition, during the tactile procedure, blindfolded participants actively palpated a 3D-printed version of the same objects with both hands. We first demonstrated that the visual and tactile perceptual spaces were highly similar. We then extracted a series of 3D shape features to investigate how visual and tactile exploration can lead to the correct identification of the relationships between objects. The results indicate that both modalities share the same shape features to form highly similar veridical spaces. This finding suggests that visual and tactile systems might apply similar cognitive processes to sensory inputs that enable humans to rely merely on one modality in the absence of another to recognize surrounding objects.
6. Cavdan M, Drewing K, Doerschner K. The look and feel of soft are similar across different softness dimensions. J Vis 2021; 21:20. PMID: 34581768; PMCID: PMC8479577; DOI: 10.1167/jov.21.10.20.
Abstract
The softness of objects can be perceived through several senses. For instance, to judge the softness of a cat's fur, we not only look at it but often also run our fingers through its coat. Recently, we have shown that haptically perceived softness covaries with the compliance, viscosity, granularity, and furriness of materials (Dovencioglu, Üstün, Doerschner, & Drewing, 2020). However, it is unknown whether vision can provide similar information about the various aspects of perceived softness. Here, we investigated this question in an experiment with three conditions: in the haptic condition, blindfolded participants explored materials with their hands, in the static visual condition participants were presented with close-up photographs of the same materials, and in the dynamic visual condition participants watched videos of the hand-material interactions that were recorded in the haptic condition. After haptically or visually exploring the materials, participants rated them on various attributes. Our results show a high overall perceptual correspondence among the three experimental conditions. With a few exceptions, this correspondence tended to be strongest between haptic and dynamic visual conditions. These results are discussed with respect to information potentially available through the senses, or through prior experience, when judging the softness of materials.
Affiliation(s)
- Müge Cavdan
- Justus Liebig University, Department of Psychology, Giessen, Germany
- Knut Drewing
- Justus Liebig University, Department of Psychology, Giessen, Germany
- Katja Doerschner
- Justus Liebig University, Department of Psychology, Giessen, Germany
- Bilkent University, National Magnetic Resonance Research Center, Ankara, Turkey
7. Piantadosi ST. The computational origin of representation. Minds Mach (Dordr) 2021; 31:1-58. PMID: 34305318; PMCID: PMC8300595; DOI: 10.1007/s11023-020-09540-9.
Abstract
Each of our theories of mental representation provides some insight into how the mind works. However, these insights often seem incompatible, as the debates between symbolic, dynamical, emergentist, sub-symbolic, and grounded approaches to cognition attest. Mental representations, whatever they are, must share many features with each of our theories of representation, and yet there are few hypotheses about how a synthesis could be possible. Here, I develop a theory of the underpinnings of symbolic cognition that shows how sub-symbolic dynamics may give rise to higher-level cognitive representations of structures, systems of knowledge, and algorithmic processes. This theory implements a version of conceptual role semantics by positing an internal universal representation language in which learners may create mental models to capture dynamics they observe in the world. The theory formalizes one account of how truly novel conceptual content may arise, allowing us to explain how even elementary logical and computational operations may be learned from a more primitive basis. I provide an implementation that learns to represent a variety of structures, including logic, number, kinship trees, regular languages, context-free languages, domains of theories like magnetism, dominance hierarchies, list structures, quantification, and computational primitives like repetition, reversal, and recursion. This account is based on simple discrete dynamical processes that could be implemented in a variety of different physical or biological systems. In particular, I describe how the required dynamics can be directly implemented in a connectionist framework. The resulting theory provides an "assembly language" for cognition, where high-level theories of symbolic computation can be implemented in simple dynamics that themselves could be encoded in biologically plausible systems.
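As a concrete illustration of the claim that elementary logical and computational operations can be constructed from a more primitive basis, the sketch below uses standard Church-style encodings written as plain Python lambdas. This is only a toy analogue in the spirit of the paper's construction; the particular encodings are textbook ones, not anything taken from the paper itself.

```python
# Illustrative sketch: booleans and numbers built from nothing but function application.
TRUE  = lambda x: lambda y: x
FALSE = lambda x: lambda y: y
AND   = lambda p: lambda q: p(q)(p)
OR    = lambda p: lambda q: p(p)(q)
NOT   = lambda p: p(FALSE)(TRUE)

# Church numerals: the numeral n means "apply f to x exactly n times".
ZERO = lambda f: lambda x: x
SUCC = lambda n: lambda f: lambda x: f(n(f)(x))
ADD  = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

def to_bool(b):
    """Decode an encoded boolean for inspection."""
    return b(True)(False)

def to_int(n):
    """Decode an encoded numeral for inspection."""
    return n(lambda k: k + 1)(0)

if __name__ == "__main__":
    print(to_bool(AND(TRUE)(NOT(FALSE))))            # True
    two, three = SUCC(SUCC(ZERO)), SUCC(SUCC(SUCC(ZERO)))
    print(to_int(ADD(two)(three)))                   # 5
```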
8. Rule JS, Riesenhuber M. Leveraging Prior Concept Learning Improves Generalization From Few Examples in Computational Models of Human Object Recognition. Front Comput Neurosci 2021; 14:586671. PMID: 33510629; PMCID: PMC7835122; DOI: 10.3389/fncom.2020.586671.
Abstract
Humans quickly and accurately learn new visual concepts from sparse data, sometimes just a single example. The impressive performance of artificial neural networks, which hierarchically pool afferents across scales and positions, suggests that the hierarchical organization of the human visual system is critical to its accuracy. These approaches, however, require orders of magnitude more examples than human learners. We used a benchmark deep learning model to show that the hierarchy can also be leveraged to vastly improve the speed of learning. We specifically show how previously learned but broadly tuned conceptual representations can be used to learn visual concepts from as few as two positive examples; reusing visual representations from earlier in the visual hierarchy, as in prior approaches, requires significantly more examples to perform comparably. These results suggest techniques for learning even more efficiently and provide a biologically plausible way to learn new visual concepts from few examples.
Affiliation(s)
- Joshua S. Rule
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, United States
- Maximilian Riesenhuber
- Department of Neuroscience, Georgetown University Medical Center, Washington, DC, United States
9. Junker FB, Schlaffke L, Axmacher N, Schmidt-Wilcke T. Impact of multisensory learning on perceptual and lexical processing of unisensory Morse code. Brain Res 2021; 1755:147259. PMID: 33422535; DOI: 10.1016/j.brainres.2020.147259.
Abstract
Multisensory learning profits from stimulus congruency at different levels of processing. In the current study, we sought to investigate whether multisensory learning can potentially be based on high-level feature congruency (same meaning) without perceptual congruency (same time) and how this relates to changes in brain function and behaviour. 50 subjects learned to decode Morse code (MC) either in unisensory or different multisensory manners. During unisensory learning, the MC was trained as sequences of auditory trains. For low-level congruent (perceptual) multisensory learning, MC was applied as tactile stimulation to the left hand simultaneously with the auditory stimulation. In contrast, high-level congruent multisensory learning involved auditory training, followed by the production of MC sequences requiring motor actions, and thereby excludes perceptual congruency. After learning, group differences were observed within three distinct brain regions while processing unisensory (auditory) MC. Both types of multisensory learning were associated with increased activation in the right inferior frontal gyrus. Multisensory low-level learning elicited additional activation in the somatosensory cortex, while multisensory high-level learners showed a reduced activation in the inferior parietal lobule, which is relevant for decoding MC. Furthermore, differences in brain function associated with multisensory learning were related to behavioural reaction times for both multisensory learning groups. Overall, our data support the idea that multisensory learning is potentially based on high-level features without perceptual congruency. Furthermore, learning of multisensory associations involves neural representations of the stimulus features involved in learning, but also shares common brain activation (i.e., the right IFG), which seems to serve as a site of multisensory integration.
Affiliation(s)
- F B Junker
- Department of Neuropsychology, Institute of Cognitive Neuroscience, Faculty of Psychology, Ruhr-University Bochum, Universitätsstraße 150, D-44801 Bochum, Germany; Department of Clinical Neuroscience and Medical Psychology, Heinrich Heine University, Universitätsstraße 1, D-40225 Düsseldorf, Germany
- L Schlaffke
- Department for Neurology, BG-University Hospital Bergmannsheil, Bürkle de la Camp-Platz 1, D-44789 Bochum, Germany
- N Axmacher
- Department of Neuropsychology, Institute of Cognitive Neuroscience, Faculty of Psychology, Ruhr-University Bochum, Universitätsstraße 150, D-44801 Bochum, Germany
- T Schmidt-Wilcke
- Department of Clinical Neuroscience and Medical Psychology, Heinrich Heine University, Universitätsstraße 1, D-40225 Düsseldorf, Germany; Department of Neurology, St. Mauritius Clinic, Strümper Str. 111, D-40670 Meerbusch, Germany
10. Kalashnikova M, Goswami U, Burnham D. Novel word learning deficits in infants at family risk for dyslexia. Dyslexia 2020; 26:3-17. PMID: 31994263; DOI: 10.1002/dys.1649.
Abstract
Children of reading age diagnosed with dyslexia show deficits in reading and spelling skills, but early markers of later dyslexia are already present in infancy in auditory processing and phonological domains. Deficits in lexical development are not typically associated with dyslexia. Nevertheless, it is possible that early auditory/phonological deficits would have detrimental effects on the encoding and storage of novel lexical items. Word-learning difficulties have been demonstrated in school-aged dyslexic children using paired associate learning tasks, but earlier manifestations in infants who are at family risk for dyslexia have not been investigated. This study assessed novel word learning in 19-month-old infants at risk for dyslexia (by virtue of having one dyslexic parent) and infants not at risk for any developmental disorder. Infants completed a word-learning task that required them to map two novel words to their corresponding novel referents. Not at-risk infants showed increased looking time to the novel referents at test compared with at-risk infants. These findings demonstrate, for the first time, that at-risk infants show differences in novel word-learning (fast-mapping) tasks compared with not at-risk infants. Our findings have implications for the development and consolidation of early lexical and phonological skills in infants at family risk of later dyslexia.
Affiliation(s)
- Marina Kalashnikova
- BCBL Basque Center on Cognition, Brain and Language, San Sebastian, Spain
- The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Sydney, New South Wales, Australia
- Usha Goswami
- Centre for Neuroscience in Education, University of Cambridge, Cambridge, UK
- Denis Burnham
- The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Sydney, New South Wales, Australia
11. Escudero P, Kalashnikova M. Infants use phonetic detail in speech perception and word learning when detail is easy to perceive. J Exp Child Psychol 2019; 190:104714. PMID: 31734323; DOI: 10.1016/j.jecp.2019.104714.
Abstract
Infants successfully discriminate speech sound contrasts that belong to their native language's phonemic inventory in auditory-only paradigms, but they encounter difficulties in distinguishing the same contrasts in the context of word learning. These difficulties are usually attributed to the fact that infants' attention to the phonetic detail in novel words is attenuated when they must allocate additional cognitive resources demanded by word-learning tasks. The current study investigated 15-month-old infants' ability to distinguish novel words that differ by a single vowel in an auditory discrimination paradigm (Experiment 1) and a word-learning paradigm (Experiment 2). These experiments aimed to tease apart whether infants' performance is dependent solely on the specific acoustic properties of the target vowels or on the context of the task. Experiment 1 showed that infants were able to discriminate only a contrast marked by a large difference along a static dimension (the vowels' second formant), whereas they were not able to discriminate a contrast with a small phonetic distance between its vowels, due to the dynamic nature of the vowels. In Experiment 2, infants did not succeed at learning words containing the same contrast they were able to discriminate in Experiment 1. The current findings demonstrate that both the specific acoustic properties of vowels in infants' native language and the task presented continue to play a significant role in early speech perception well into the second year of life.
Affiliation(s)
- Paola Escudero
- MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Penrith, NSW 2751, Australia; ARC Centre of Excellence for the Dynamics of Language, The Australian National University, Canberra, ACT 2601, Australia
- Marina Kalashnikova
- MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Penrith, NSW 2751, Australia; Basque Center on Cognition, Brain and Language, 20009 Donostia, Gipuzkoa, Spain
12. Jacobs RA, Xu C. Can multisensory training aid visual learning? A computational investigation. J Vis 2019; 19:1. PMID: 31480074; DOI: 10.1167/19.11.1.
Abstract
Although real-world environments are often multisensory, visual scientists typically study visual learning in unisensory environments containing visual signals only. Here, we use deep or artificial neural networks to address the question, Can multisensory training aid visual learning? We examine a network's internal representations of objects based on visual signals in two conditions: (a) when the network is initially trained with both visual and haptic signals, and (b) when it is initially trained with visual signals only. Our results demonstrate that a network trained in a visual-haptic environment (in which visual, but not haptic, signals are orientation-dependent) tends to learn visual representations containing useful abstractions, such as the categorical structure of objects, and also learns representations that are less sensitive to imaging parameters, such as viewpoint or orientation, that are irrelevant for object recognition or classification tasks. We conclude that researchers studying perceptual learning in vision-only contexts may be overestimating the difficulties associated with important perceptual learning problems. Although multisensory perception has its own challenges, perceptual learning can become easier when it is considered in a multisensory setting.
Affiliation(s)
- Robert A Jacobs
- Department of Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA
- Chenliang Xu
- Department of Computer Science, University of Rochester, Rochester, NY, USA
13. Lengyel G, Žalalytė G, Pantelides A, Ingram JN, Fiser J, Lengyel M, Wolpert DM. Unimodal statistical learning produces multimodal object-like representations. eLife 2019; 8:43942. PMID: 31042148; PMCID: PMC6529220; DOI: 10.7554/elife.43942.
Abstract
The concept of objects is fundamental to cognition and is defined by a consistent set of sensory properties and physical affordances. Although it is unknown how the abstract concept of an object emerges, most accounts assume that visual or haptic boundaries are crucial in this process. Here, we tested an alternative hypothesis that boundaries are not essential but simply reflect a more fundamental principle: consistent visual or haptic statistical properties. Using a novel visuo-haptic statistical learning paradigm, we familiarised participants with objects defined solely by across-scene statistics provided either visually or through physical interactions. We then tested them on both a visual familiarity and a haptic pulling task, thus measuring both within-modality learning and across-modality generalisation. Participants showed strong within-modality learning and ‘zero-shot’ across-modality generalisation which were highly correlated. Our results demonstrate that humans can segment scenes into objects, without any explicit boundary cues, using purely statistical information.
Affiliation(s)
- Gábor Lengyel
- Department of Cognitive Science, Central European University, Budapest, Hungary
- Goda Žalalytė
- Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
- Alexandros Pantelides
- Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
- James N Ingram
- Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom; Zuckerman Mind Brain Behavior Institute, Department of Neuroscience, Columbia University, New York, United States
- József Fiser
- Department of Cognitive Science, Central European University, Budapest, Hungary
- Máté Lengyel
- Department of Cognitive Science, Central European University, Budapest, Hungary; Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
- Daniel M Wolpert
- Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom; Zuckerman Mind Brain Behavior Institute, Department of Neuroscience, Columbia University, New York, United States
14. Yildirim I, Wu J, Kanwisher N, Tenenbaum J. An integrative computational architecture for object-driven cortex. Curr Opin Neurobiol 2019; 55:73-81. PMID: 30825704; PMCID: PMC6548583; DOI: 10.1016/j.conb.2019.01.010.
Abstract
Objects in motion activate multiple cortical regions in every lobe of the human brain. Do these regions represent a collection of independent systems, or is there an overarching functional architecture spanning all of object-driven cortex? Inspired by recent work in artificial intelligence (AI), machine learning, and cognitive science, we consider the hypothesis that these regions can be understood as a coherent network implementing an integrative computational system that unifies the functions needed to perceive, predict, reason about, and plan with physical objects, as in the paradigmatic case of using or making tools. Our proposal draws on a modeling framework that combines multiple AI methods, including causal generative models, hybrid symbolic-continuous planning algorithms, and neural recognition networks, with object-centric, physics-based representations. We review evidence relating specific components of our proposal to the specific regions that comprise object-driven cortex, and lay out future research directions with the goal of building a complete functional and mechanistic account of this system.
Affiliation(s)
- Ilker Yildirim
- Center for Brains, Minds, and Machines, MIT, Cambridge, MA 02138, United States; Department of Brain & Cognitive Science, MIT, Cambridge, MA 02138, United States
- Jiajun Wu
- Center for Brains, Minds, and Machines, MIT, Cambridge, MA 02138, United States; Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA 02138, United States
- Nancy Kanwisher
- Center for Brains, Minds, and Machines, MIT, Cambridge, MA 02138, United States; McGovern Institute for Brain Research, MIT, Cambridge, MA 02138, United States; Department of Brain & Cognitive Science, MIT, Cambridge, MA 02138, United States
- Joshua Tenenbaum
- Center for Brains, Minds, and Machines, MIT, Cambridge, MA 02138, United States; McGovern Institute for Brain Research, MIT, Cambridge, MA 02138, United States; Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA 02138, United States; Department of Brain & Cognitive Science, MIT, Cambridge, MA 02138, United States
15. Carducci P, Schwing R, Huber L, Truppa V. Tactile information improves visual object discrimination in kea, Nestor notabilis, and capuchin monkeys, Sapajus spp. Anim Behav 2018. DOI: 10.1016/j.anbehav.2017.11.018.
16. Elleström L. Bridging the gap between image and metaphor through cross-modal iconicity. Dimensions of Iconicity 2017. DOI: 10.1075/ill.15.10ell.
17. Hernández-Pérez R, Cuaya LV, Rojas-Hortelano E, Reyes-Aguilar A, Concha L, de Lafuente V. Tactile object categories can be decoded from the parietal and lateral-occipital cortices. Neuroscience 2017; 352:226-235. DOI: 10.1016/j.neuroscience.2017.03.038.
18. Kalashnikova M, Burnham D. Novel Word Learning, Reading Difficulties, and Phonological Processing Skills. Dyslexia 2016; 22:101-119. PMID: 27146374; DOI: 10.1002/dys.1525.
Abstract
Visual-verbal paired associate learning (PAL) refers to the ability to establish an arbitrary association between a visual referent and an unfamiliar label. It is now established that this ability is impaired in children with dyslexia, but the source of this deficit is yet to be specified. This study assesses PAL performance in children with reading difficulties using a modified version of the PAL paradigm, comprising a comprehension and a production phase, to determine whether the PAL deficit lies in children's ability to establish and retain novel object-novel word associations or their ability to retrieve the learned novel labels for production. Results showed that while children with reading difficulties required significantly more trials to learn the object-word associations, when they were required to use these associations in a comprehension-referent selection task, their accuracy and speed did not differ from controls. Nevertheless, children with reading difficulties were significantly less successful when they were required to produce the learned novel labels in response to the visual stimuli. Thus, these results indicate that while children with reading difficulties are successful at establishing visual-verbal associations, they have a deficit in the verbal production component of PAL tasks, which may relate to a more general underlying impairment in auditory or phonological processing. Copyright © 2016 John Wiley & Sons, Ltd.
Affiliation(s)
- Denis Burnham
- The MARCS Institute, Western Sydney University, Penrith, Australia
19. Piantadosi ST, Jacobs RA. Four Problems Solved by the Probabilistic Language of Thought. Curr Dir Psychol Sci 2016. DOI: 10.1177/0963721415609581.
Abstract
We argue for the advantages of the probabilistic language of thought (pLOT), a recently emerging approach to modeling human cognition. Work using this framework demonstrates how the pLOT (a) refines the debate between symbols and statistics in cognitive modeling, (b) permits theories that draw on insights from both nativist and empiricist approaches, (c) explains the origins of novel and complex computational concepts, and (d) provides a framework for abstraction that can link sensation and conception. In each of these areas, the pLOT provides a productive middle ground between historical divides in cognitive psychology, pointing to a promising way forward for the field.
Affiliation(s)
- Robert A. Jacobs
- Department of Brain and Cognitive Sciences, University of Rochester
20. Erdogan G, Yildirim I, Jacobs RA. From Sensory Signals to Modality-Independent Conceptual Representations: A Probabilistic Language of Thought Approach. PLoS Comput Biol 2015; 11:e1004610. PMID: 26554704; PMCID: PMC4640543; DOI: 10.1371/journal.pcbi.1004610.
Abstract
People learn modality-independent, conceptual representations from modality-specific sensory signals. Here, we hypothesize that any system that accomplishes this feat will include three components: a representational language for characterizing modality-independent representations, a set of sensory-specific forward models for mapping from modality-independent representations to sensory signals, and an inference algorithm for inverting forward models, that is, an algorithm for using sensory signals to infer modality-independent representations. To evaluate this hypothesis, we instantiate it in the form of a computational model that learns object shape representations from visual and/or haptic signals. The model uses a probabilistic grammar to characterize modality-independent representations of object shape, uses a computer graphics toolkit and a human hand simulator to map from object representations to visual and haptic features, respectively, and uses a Bayesian inference algorithm to infer modality-independent object representations from visual and/or haptic signals. Simulation results show that the model infers identical object representations when an object is viewed, grasped, or both. That is, the model's percepts are modality invariant. We also report the results of an experiment in which different subjects rated the similarity of pairs of objects in different sensory conditions, and show that the model provides a very accurate account of subjects' ratings. Conceptually, this research significantly contributes to our understanding of modality invariance, an important type of perceptual constancy, by demonstrating how modality-independent representations can be acquired and used. Methodologically, it provides an important contribution to cognitive modeling, particularly an emerging probabilistic language-of-thought approach, by showing how symbolic and statistical approaches can be combined in order to understand aspects of human perception.
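The three-component hypothesis can be sketched in a few lines: a small hypothesis space standing in for the representational language, two toy forward models, and enumeration-based Bayesian inversion. Everything specific below (the part-based hypotheses, the area/aperture forward models, the noise level) is an illustrative assumption, not the paper's probabilistic grammar, graphics toolkit, or hand simulator.

```python
# Minimal sketch of the three components, with toy stand-ins.
import itertools
import numpy as np

rng = np.random.default_rng(1)

# 1) Hypothesis space: objects made of 1-3 parts, each part 'small' or 'large'.
SIZES = {"small": 1.0, "large": 2.0}
HYPOTHESES = [parts for n in (1, 2, 3)
              for parts in itertools.product(SIZES, repeat=n)]

# 2) Per-modality forward models (deterministic part; noise added at observation time).
def vision_fm(parts):   # toy stand-in: projected area per part
    return np.array([SIZES[p] ** 2 for p in parts])

def haptics_fm(parts):  # toy stand-in: grip aperture per part
    return np.array([SIZES[p] for p in parts])

def observe(parts, sigma=0.1):
    v = vision_fm(parts) + rng.normal(0, sigma, len(parts))
    h = haptics_fm(parts) + rng.normal(0, sigma, len(parts))
    return v, h

# 3) Bayesian inversion under a uniform prior: score each hypothesis by the
# Gaussian likelihood of whichever signals are available.
def log_like(x, mu, sigma=0.1):
    if len(x) != len(mu):
        return -np.inf
    return float(-((x - mu) ** 2).sum() / (2 * sigma ** 2))

def infer(v=None, h=None):
    scores = []
    for hyp in HYPOTHESES:
        s = 0.0
        if v is not None:
            s += log_like(v, vision_fm(hyp))
        if h is not None:
            s += log_like(h, haptics_fm(hyp))
        scores.append(s)
    return HYPOTHESES[int(np.argmax(scores))]

true_obj = ("large", "small", "large")
v, h = observe(true_obj)
print("vision only :", infer(v=v))
print("haptics only:", infer(h=h))
print("both        :", infer(v=v, h=h))   # same modality-independent percept
```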
Affiliation(s)
- Goker Erdogan
- Department of Brain & Cognitive Sciences, University of Rochester, Rochester, New York, United States of America
- Ilker Yildirim
- Department of Brain & Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Laboratory of Neural Systems, The Rockefeller University, New York, New York, United States of America
- Robert A. Jacobs
- Department of Brain & Cognitive Sciences, University of Rochester, Rochester, New York, United States of America
21. Chen L, Rogers TT. A Model of Emergent Category-specific Activation in the Posterior Fusiform Gyrus of Sighted and Congenitally Blind Populations. J Cogn Neurosci 2015; 27:1981-99. DOI: 10.1162/jocn_a_00834.
Abstract
Theories about the neural bases of semantic knowledge range between two poles, one proposing that distinct brain regions are innately dedicated to different conceptual domains and the other suggesting that all concepts are encoded within a single network. Category-sensitive functional activations in the fusiform cortex of the congenitally blind have been taken to support the former view but also raise several puzzles. We use neural network models to assess a hypothesis that spans the two poles: The interesting functional activation patterns reflect the base connectivity of a domain-general semantic network. Both similarities and differences between sighted and congenitally blind groups can emerge through learning in a neural network, but only in architectures adopting real anatomical constraints. Surprisingly, the same constraints suggest a novel account of a quite different phenomenon: the dyspraxia observed in patients with semantic impairments from anterior temporal pathology. From this work, we suggest that the cortical semantic network is wired not to encode knowledge of distinct conceptual domains but to promote learning about both conceptual and affordance structure in the environment.
Affiliation(s)
- Lang Chen
- University of Wisconsin–Madison
- Stanford Cognitive and Systems Neuroscience Laboratory, Palo Alto, CA
22. Prause N, Park J, Leung S, Miller G. Women's Preferences for Penis Size: A New Research Method Using Selection among 3D Models. PLoS One 2015; 10:e0133079. PMID: 26332467; PMCID: PMC4558040; DOI: 10.1371/journal.pone.0133079.
Abstract
Women's preferences for penis size may affect men's comfort with their own bodies and may have implications for sexual health. Studies of women's penis size preferences typically have relied on their abstract ratings or selecting amongst 2D, flaccid images. This study used haptic stimuli to allow assessment of women's size recall accuracy for the first time, as well as examine their preferences for erect penis sizes in different relationship contexts. Women (N = 75) selected amongst 33 3D models. Women recalled model size accurately using this method, although they made more errors with respect to penis length than circumference. Women preferred a penis of slightly larger circumference and length for one-time (length = 6.4 inches/16.3 cm, circumference = 5.0 inches/12.7 cm) versus long-term (length = 6.3 inches/16.0 cm, circumference = 4.8 inches/12.2 cm) sexual partners. These first estimates of erect penis size preferences using 3D models suggest women accurately recall size and prefer penises only slightly larger than average.
Affiliation(s)
- Nicole Prause
- Department of Psychiatry, University of California Los Angeles, Los Angeles, California, United States of America
- Jaymie Park
- Department of Psychiatry, University of California Los Angeles, Los Angeles, California, United States of America
- Shannon Leung
- Department of Psychiatry, University of California Los Angeles, Los Angeles, California, United States of America
- Geoffrey Miller
- Department of Psychology, University of New Mexico, Albuquerque, New Mexico, United States of America
23. The eyes grasp, the hands see: metric category knowledge transfers between vision and touch. Psychon Bull Rev 2015; 21:976-85. PMID: 24307250; DOI: 10.3758/s13423-013-0563-4.
Abstract
Categorization of seen objects is often determined by the shapes of objects. However, shape is not exclusive to the visual modality: The haptic system also is expert at identifying shapes. Hence, an important question for understanding shape processing is whether humans store separate modality-dependent shape representations, or whether information is integrated into one multisensory representation. To answer this question, we created a metric space of computer-generated novel objects varying in shape. These objects were then printed using a 3-D printer, to generate tangible stimuli. In a categorization experiment, participants first explored the objects visually and haptically. We found that both modalities led to highly similar categorization behavior. Next, participants were trained either visually or haptically on shape categories within the metric space. As expected, visual training increased visual performance, and haptic training increased haptic performance. Importantly, however, we found that visual training also improved haptic performance, and vice versa. Two additional experiments showed that the location of the categorical boundary in the metric space also transferred across modalities, as did heightened discriminability of objects adjacent to the boundary. This observed transfer of metric category knowledge across modalities indicates that visual and haptic forms of shape information are integrated into a shared multisensory representation.
24. Ursino M, Cuppini C, Magosso E. Neurocomputational approaches to modelling multisensory integration in the brain: A review. Neural Netw 2014; 60:141-65. DOI: 10.1016/j.neunet.2014.08.003.
25. Learning multisensory representations for auditory-visual transfer of sequence category knowledge: a probabilistic language of thought approach. Psychon Bull Rev 2014; 22:673-86. PMID: 25338656; DOI: 10.3758/s13423-014-0734-y.
Abstract
If a person is trained to recognize or categorize objects or events using one sensory modality, the person can often recognize or categorize those same (or similar) objects and events via a novel modality. This phenomenon is an instance of cross-modal transfer of knowledge. Here, we study the Multisensory Hypothesis which states that people extract the intrinsic, modality-independent properties of objects and events, and represent these properties in multisensory representations. These representations underlie cross-modal transfer of knowledge. We conducted an experiment evaluating whether people transfer sequence category knowledge across auditory and visual domains. Our experimental data clearly indicate that we do. We also developed a computational model accounting for our experimental results. Consistent with the probabilistic language of thought approach to cognitive modeling, our model formalizes multisensory representations as symbolic "computer programs" and uses Bayesian inference to learn these representations. Because the model demonstrates how the acquisition and use of amodal, multisensory representations can underlie cross-modal transfer of knowledge, and because the model accounts for subjects' experimental performances, our work lends credence to the Multisensory Hypothesis. Overall, our work suggests that people automatically extract and represent objects' and events' intrinsic properties, and use these properties to process and understand the same (and similar) objects and events when they are perceived through novel sensory modalities.
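A minimal sketch of the idea that an abstract, modality-independent sequence representation supports cross-modal transfer is given below. The role-pattern encoding, the two toy categories, and the token sets are illustrative assumptions, not the paper's probabilistic program-induction model.

```python
# Toy sketch: category knowledge stored as an amodal pattern over token *roles*,
# so a rule learned from auditory sequences applies unchanged to visual sequences.
def pattern(seq):
    """Abstract, modality-independent description: map tokens to roles A, B, C..."""
    roles = {}
    return tuple(roles.setdefault(tok, "ABCDEFG"[len(roles)]) for tok in seq)

def learn_category(labelled_sequences):
    """Store the set of abstract patterns seen for each category label."""
    categories = {}
    for seq, label in labelled_sequences:
        categories.setdefault(label, set()).add(pattern(seq))
    return categories

def classify(seq, categories):
    p = pattern(seq)
    matches = [label for label, pats in categories.items() if p in pats]
    return matches[0] if matches else None

# Training in one modality: auditory tokens (tones).
auditory_training = [
    (("low", "high", "low"), "cat1"),    # ABA-like structure
    (("mid", "low", "mid"), "cat1"),
    (("low", "high", "high"), "cat2"),   # ABB-like structure
    (("mid", "low", "low"), "cat2"),
]
categories = learn_category(auditory_training)

# Zero-shot transfer to a novel modality: visual tokens (shapes).
print(classify(("circle", "square", "circle"), categories))   # cat1
print(classify(("circle", "square", "square"), categories))   # cat2
```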
26. Recognizing familiar objects by hand and foot: Haptic shape perception generalizes to inputs from unusual locations and untrained body parts. Atten Percept Psychophys 2014; 76:541-58. PMID: 24197503; DOI: 10.3758/s13414-013-0559-1.
Abstract
The limits of generalization of our 3-D shape recognition system to identifying objects by touch were investigated by testing exploration at unusual locations and using untrained effectors. In Experiments 1 and 2, people found identification by hand of real objects, plastic 3-D models of objects, and raised line drawings placed in front of themselves no easier than when exploration was behind their back. Experiment 3 compared one-handed, two-handed, one-footed, and two-footed haptic object recognition of familiar objects. Recognition by foot was slower (7 vs. 13 s) and much less accurate (9 % vs. 47 % errors) than recognition by either one or both hands. Nevertheless, item difficulty was similar across hand and foot exploration, and there was a strong correlation between an individual's hand and foot performance. Furthermore, foot recognition was better with the largest 20 of the 80 items (32 % errors), suggesting that physical limitations hampered exploration by foot. Thus, object recognition by hand generalized efficiently across the spatial location of stimuli, while object recognition by foot seemed surprisingly good given that no prior training was provided. Active touch (haptics) thus efficiently extracts 3-D shape information and accesses stored representations of familiar objects from novel modes of input.
27. Lacey S, Sathian K. Visuo-haptic multisensory object recognition, categorization, and representation. Front Psychol 2014; 5:730. PMID: 25101014; PMCID: PMC4102085; DOI: 10.3389/fpsyg.2014.00730.
Abstract
Visual and haptic unisensory object processing show many similarities in terms of categorization, recognition, and representation. In this review, we discuss how these similarities contribute to multisensory object processing. In particular, we show that similar unisensory visual and haptic representations lead to a shared multisensory representation underlying both cross-modal object recognition and view-independence. This shared representation suggests a common neural substrate and we review several candidate brain regions, previously thought to be specialized for aspects of visual processing, that are now known also to be involved in analogous haptic tasks. Finally, we lay out the evidence for a model of multisensory object recognition in which top-down and bottom-up pathways to the object-selective lateral occipital complex are modulated by object familiarity and individual differences in object and spatial imagery.
Affiliation(s)
- Simon Lacey
- Department of Neurology, Emory University School of Medicine, Atlanta, GA, USA
- K Sathian
- Department of Neurology, Emory University School of Medicine, Atlanta, GA, USA; Department of Rehabilitation Medicine, Emory University School of Medicine, Atlanta, GA, USA; Department of Psychology, Emory University School of Medicine, Atlanta, GA, USA; Rehabilitation Research and Development Center of Excellence, Atlanta Veterans Affairs Medical Center, Decatur, GA, USA
28. Hemmer P, Persaud K. Interaction between categorical knowledge and episodic memory across domains. Front Psychol 2014; 5:584. PMID: 24966848; PMCID: PMC4052730; DOI: 10.3389/fpsyg.2014.00584.
Abstract
Categorical knowledge and episodic memory have traditionally been viewed as separate lines of inquiry. Here, we present a perspective on the interrelatedness of categorical knowledge and reconstruction from memory. We address three underlying questions: What knowledge do people bring to the task of remembering? How do people integrate that knowledge with episodic memory? Is this the optimal way for the memory system to work? In a review of five studies spanning four category domains (discrete, continuous, temporal, and linguistic), we evaluate the relative contribution and the structure of influence of categorical knowledge on long-term episodic memory. These studies suggest a robustness of people's knowledge of the statistical regularities of the environment, and provide converging evidence of the quality and influence of category knowledge on reconstructive memory. Lastly, we argue that combining categorical knowledge and episodic memory is an efficient strategy of the memory system.
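One standard way to formalize this kind of integration is Bayesian combination of a noisy episodic trace with a category prior; the sketch below shows the familiar Gaussian case, with numbers chosen purely for illustration rather than taken from the reviewed studies, which use richer hierarchical priors.

```python
# Simple sketch: recalled values are pulled toward the category mean,
# and more strongly so when the episodic trace is noisier.
def reconstruct(trace, trace_sd, category_mean, category_sd):
    """Posterior mean for a Gaussian category prior and Gaussian memory noise."""
    w = (1 / trace_sd**2) / (1 / trace_sd**2 + 1 / category_sd**2)
    return w * trace + (1 - w) * category_mean

# Remembering the size of a studied item (arbitrary units): category mean 10.
print(reconstruct(trace=14.0, trace_sd=1.0, category_mean=10.0, category_sd=3.0))  # ~13.6, mostly memory
print(reconstruct(trace=14.0, trace_sd=4.0, category_mean=10.0, category_sd=3.0))  # ~11.4, pulled to category
```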
Affiliation(s)
- Pernille Hemmer
- Department of Psychology, Rutgers University, Piscataway, NJ, USA
- Kimele Persaud
- Department of Psychology, Rutgers University, Piscataway, NJ, USA
29. Marton ZC, Balint-Benczedi F, Mozos OM, Blodow N, Kanezaki A, Goron LC, Pangercic D, Beetz M. Part-Based Geometric Categorization and Object Reconstruction in Cluttered Table-Top Scenes. J Intell Robot Syst 2014. DOI: 10.1007/s10846-013-0011-8.