1
Zbären GA, Kapur M, Meissner SN, Wenderoth N. Inferring occluded projectile motion changes connectivity within a visuo-fronto-parietal network. Brain Struct Funct 2024; 229:1605-1615. [PMID: 38914897 PMCID: PMC11374914 DOI: 10.1007/s00429-024-02815-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Accepted: 06/03/2024] [Indexed: 06/26/2024]
Abstract
Anticipating the behaviour of moving objects in the physical environment is essential for a wide range of daily actions. This ability is thought to rely on mental simulations and has been shown to involve frontoparietal and early visual areas. Yet, the connectivity patterns between these regions during intuitive physical inference remain largely unknown. In this study, participants underwent fMRI while performing a task requiring them to infer the parabolic trajectory of an occluded ball falling under Newtonian physics, and a control task. Building on our previous research showing that when solving the physical inference task, early visual areas encode task-specific and perception-like information about the inferred trajectory, the present study aimed to (i) identify regions that are functionally coupled with early visual areas during the physical inference task, and (ii) investigate changes in effective connectivity within this network of regions. We found that early visual areas are functionally connected to a set of parietal and premotor regions when inferring occluded trajectories. Using dynamic causal modelling, we show that predicting occluded trajectories is associated with changes in effective connectivity within a parieto-premotor network, which may drive internally generated early visual activity in a top-down fashion. These findings offer new insights into the interaction between early visual and frontoparietal regions during physical inference, contributing to our understanding of the neural mechanisms underlying the ability to predict physical outcomes.
Affiliation(s)
- Gabrielle Aude Zbären
- Neural Control of Movement Lab, Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland.
- Manu Kapur
- Professorship for Learning Sciences and Higher Education, ETH Zurich, Zurich, Switzerland
- Sarah Nadine Meissner
- Neural Control of Movement Lab, Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland
- Nicole Wenderoth
- Neural Control of Movement Lab, Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland
- Future Health Technologies, Singapore-ETH Centre, Campus for Research Excellence And Technological Enterprise (CREATE), Singapore, Singapore
2
Friedrich J, Fischer MH, Raab M. Invariant representations in abstract concept grounding - the physical world in grounded cognition. Psychon Bull Rev 2024:10.3758/s13423-024-02522-3. [PMID: 38806790 DOI: 10.3758/s13423-024-02522-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/26/2024] [Indexed: 05/30/2024]
Abstract
Grounded cognition states that mental representations of concepts consist of experiential aspects. For example, the concept "cup" consists of the sensorimotor experiences from interactions with cups. Typical modalities in which concepts are grounded are: The sensorimotor system (including interoception), emotion, action, language, and social aspects. Here, we argue that this list should be expanded to include physical invariants (unchanging features of physical motion; e.g., gravity, momentum, friction). Research on physical reasoning consistently demonstrates that physical invariants are represented as fundamentally as other grounding substrates, and therefore should qualify. We assess several theories of concept representation (simulation, conceptual metaphor, conceptual spaces, predictive processing) and their positions on physical invariants. We find that the classic grounded cognition theories, simulation and conceptual metaphor theory, have not considered physical invariants, while conceptual spaces and predictive processing have. We conclude that physical invariants should be included into grounded cognition theories, and that the core mechanisms of simulation and conceptual metaphor theory are well suited to do this. Furthermore, conceptual spaces and predictive processing are very promising and should also be integrated with grounded cognition in the future.
Affiliation(s)
- Jannis Friedrich
- German Sport University Cologne, Germany, Am Sportpark Müngersdorf 6, 50933, Cologne, Germany.
- Martin H Fischer
- Psychology Department, University of Potsdam, Karl-Liebknecht-Strasse 24-25, House 14 D - 14476, Potsdam-Golm, Germany
- Markus Raab
- German Sport University Cologne, Germany, Am Sportpark Müngersdorf 6, 50933, Cologne, Germany
3
Huang T, Liu J. A stochastic world model on gravity for stability inference. eLife 2024; 12:RP88953. [PMID: 38712832 DOI: 10.7554/elife.88953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2024] [Open Access]
Abstract
The fact that objects without proper support will fall to the ground is not only a natural phenomenon, but also common sense in mind. Previous studies suggest that humans may infer objects' stability through a world model that performs mental simulations with a priori knowledge of gravity acting upon the objects. Here we measured participants' sensitivity to gravity to investigate how the world model works. We found that the world model on gravity was not a faithful replica of the physical laws, but instead encoded gravity's vertical direction as a Gaussian distribution. The world model with this stochastic feature fit nicely with participants' subjective sense of objects' stability and explained the illusion that taller objects are perceived as more likely to fall. Furthermore, a computational model with reinforcement learning revealed that the stochastic characteristic likely originated from experience-dependent comparisons between predictions formed by internal simulations and the realities observed in the external world, which illustrated the ecological advantage of stochastic representation in balancing accuracy and speed for efficient stability inference. The stochastic world model on gravity provides an example of how a priori knowledge of the physical world is implemented in mind that helps humans operate flexibly in open-ended environments.
Affiliation(s)
- Taicheng Huang
- Department of Psychological and Cognitive Sciences & Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China
- Jia Liu
- Department of Psychological and Cognitive Sciences & Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China
4
Yildirim I, Paul LA. From task structures to world models: what do LLMs know? Trends Cogn Sci 2024; 28:404-415. [PMID: 38443199 DOI: 10.1016/j.tics.2024.02.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Revised: 02/03/2024] [Accepted: 02/13/2024] [Indexed: 03/07/2024]
Abstract
In what sense does a large language model (LLM) have knowledge? We answer by granting LLMs 'instrumental knowledge': knowledge gained by using next-word generation as an instrument. We then ask how instrumental knowledge is related to the ordinary, 'worldly knowledge' exhibited by humans, and explore this question in terms of the degree to which instrumental knowledge can be said to incorporate the structured world models of cognitive science. We discuss ways LLMs could recover degrees of worldly knowledge and suggest that such recovery will be governed by an implicit, resource-rational tradeoff between world models and tasks. Our answer to this question extends beyond the capabilities of a particular AI system and challenges assumptions about the nature of knowledge and intelligence.
Affiliation(s)
- Ilker Yildirim
- Department of Psychology, Yale University, New Haven, CT, USA; Department of Statistics and Data Science, Yale University, New Haven, CT, USA; Wu-Tsai Institute, Yale University, New Haven, CT, USA; Foundations of Data Science Institute, Yale University, New Haven, CT, USA.
- L A Paul
- Department of Philosophy, Yale University, New Haven, CT, USA; Wu-Tsai Institute, Yale University, New Haven, CT, USA; Munich Center for Mathematical Philosophy, Ludwig Maximilian University of Munich, Munich, Germany.
5
Fischer J. Physical reasoning is the missing link between action goals and kinematics: A comment on "An active inference model of hierarchical action understanding, learning, and imitation" by Proietti et al. Phys Life Rev 2024; 48:198-200. [PMID: 38350304 DOI: 10.1016/j.plrev.2023.08.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Accepted: 08/24/2023] [Indexed: 02/15/2024]
Affiliation(s)
- Jason Fischer
- Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, USA.
6
Liu Y, Ayzenberg V, Lourenco SF. Object geometry serves humans' intuitive physics of stability. Sci Rep 2024; 14:1701. [PMID: 38242998 PMCID: PMC10799025 DOI: 10.1038/s41598-024-51677-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Accepted: 01/08/2024] [Indexed: 01/21/2024] [Open Access]
Abstract
How do humans judge physical stability? A prevalent account emphasizes the mental simulation of physical events implemented by an intuitive physics engine in the mind. Here we test the extent to which the perceptual features of object geometry are sufficient for supporting judgments of falling direction. In all experiments, adults and children judged the falling direction of a tilted object and, across experiments, objects differed in the geometric features (i.e., geometric centroid, object height, base size and/or aspect ratio) relevant to the judgment. Participants' performance was compared to computational models trained on geometric features, as well as a deep convolutional neural network (ResNet-50), none of which incorporated mental simulation. Adult and child participants' performance was well fit by models of object geometry, particularly the geometric centroid. ResNet-50 also provided a good account of human performance. Altogether, our findings suggest that object geometry may be sufficient for judging the falling direction of tilted objects, independent of mental simulation.
Affiliation(s)
- Yaxin Liu
- Emory University, 36 Eagle Row, Atlanta, GA, 30322, USA.
7
Karakose-Akbiyik S, Sussman O, Wurm MF, Caramazza A. The Role of Agentive and Physical Forces in the Neural Representation of Motion Events. J Neurosci 2024; 44:e1363232023. [PMID: 38050107 PMCID: PMC10860628 DOI: 10.1523/jneurosci.1363-23.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 11/14/2023] [Accepted: 11/19/2023] [Indexed: 12/06/2023] [Open Access]
Abstract
How does the brain represent information about motion events in relation to agentive and physical forces? In this study, we investigated the neural activity patterns associated with observing animated actions of agents (e.g., an agent hitting a chair) in comparison to similar movements of inanimate objects that were either shaped solely by the physics of the scene (e.g., gravity causing an object to fall down a hill and hit a chair) or initiated by agents (e.g., a visible agent causing an object to hit a chair). Using an fMRI-based multivariate pattern analysis (MVPA), this design allowed testing where in the brain the neural activity patterns associated with motion events change as a function of, or are invariant to, agentive versus physical forces behind them. A total of 29 human participants (nine male) participated in the study. Cross-decoding revealed a shared neural representation of animate and inanimate motion events that is invariant to agentive or physical forces in regions spanning frontoparietal and posterior temporal cortices. In contrast, the right lateral occipitotemporal cortex showed a higher sensitivity to agentive events, while the left dorsal premotor cortex was more sensitive to information about inanimate object events that were solely shaped by the physics of the scene.
Affiliation(s)
- Oliver Sussman
- Department of Psychology, Harvard University, Cambridge, Massachusetts 02138
- Moritz F Wurm
- Center for Mind/Brain Sciences - CIMeC, University of Trento, 38068 Rovereto, Italy
- Alfonso Caramazza
- Department of Psychology, Harvard University, Cambridge, Massachusetts 02138
- Center for Mind/Brain Sciences - CIMeC, University of Trento, 38068 Rovereto, Italy
8
Nayebi A, Rajalingham R, Jazayeri M, Yang GR. Neural Foundations of Mental Simulation: Future Prediction of Latent Representations on Dynamic Scenes. ARXIV 2023:arXiv:2305.11772v2. [PMID: 37292459 PMCID: PMC10246064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Humans and animals have a rich and flexible understanding of the physical world, which enables them to infer the underlying dynamical trajectories of objects and events, plausible future states, and use that to plan and anticipate the consequences of actions. However, the neural mechanisms underlying these computations are unclear. We combine a goal-driven modeling approach with dense neurophysiological data and high-throughput human behavioral readouts that contain thousands of comparisons to directly impinge on this question. Specifically, we construct and evaluate several classes of sensory-cognitive networks to predict the future state of rich, ethologically-relevant environments, ranging from self-supervised end-to-end models with pixel-wise or object-slot objectives, to models that future predict in the latent space of purely static image-pretrained or dynamic video-pretrained foundation models. We find that "scale is not all you need", and that many state-of-the-art machine learning models fail to perform well on our neural and behavioral benchmarks for future prediction. In fact, only one class of models matches these data well overall. We find that neural responses are currently best predicted by models trained to predict the future state of their environment in the latent space of pretrained foundation models optimized for dynamic scenes in a self-supervised manner. These models also approach the neurons' ability to predict the environmental state variables that are visually hidden from view, despite not being explicitly trained to do so. Finally, we find that not all foundation model latents are equal. Notably, models that future predict in the latent space of video foundation models that are optimized to support a diverse range of egocentric sensorimotor tasks, reasonably match both human behavioral error patterns and neural dynamics across all environmental scenarios that we were able to test. 
Overall, these findings suggest that the neural mechanisms and behaviors of primate mental simulation have strong inductive biases associated with them, and are thus far most consistent with being optimized to future predict on reusable visual representations that are useful for Embodied AI more generally.
Affiliation(s)
- Aran Nayebi
- McGovern Institute for Brain Research, MIT; Cambridge, MA 02139
- Rishi Rajalingham
- McGovern Institute for Brain Research, MIT; Cambridge, MA 02139
- Reality Labs, Meta; 390 9th Ave, New York, NY 10001
- Mehrdad Jazayeri
- McGovern Institute for Brain Research, MIT; Cambridge, MA 02139
- Department of Brain and Cognitive Sciences, MIT; Cambridge, MA 02139
- Guangyu Robert Yang
- McGovern Institute for Brain Research, MIT; Cambridge, MA 02139
- Department of Brain and Cognitive Sciences, MIT; Cambridge, MA 02139
- Department of Electrical Engineering and Computer Science, MIT; Cambridge, MA 02139
9
Emonds AMX, Srinath R, Nielsen KJ, Connor CE. Object representation in a gravitational reference frame. eLife 2023; 12:e81701. [PMID: 37561119 PMCID: PMC10414968 DOI: 10.7554/elife.81701] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Accepted: 07/19/2023] [Indexed: 08/11/2023] [Open Access]
Abstract
When your head tilts laterally, as in sports, reaching, and resting, your eyes counterrotate less than 20%, and thus eye images rotate, over a total range of about 180°. Yet, the world appears stable and vision remains normal. We discovered a neural strategy for rotational stability in anterior inferotemporal cortex (IT), the final stage of object vision in primates. We measured object orientation tuning of IT neurons in macaque monkeys tilted +25 and -25° laterally, producing ~40° difference in retinal image orientation. Among IT neurons with consistent object orientation tuning, 63% remained stable with respect to gravity across tilts. Gravitational tuning depended on vestibular/somatosensory but also visual cues, consistent with previous evidence that IT processes scene cues for gravity's orientation. In addition to stability across image rotations, an internal gravitational reference frame is important for physical understanding of a world where object position, posture, structure, shape, movement, and behavior interact critically with gravity.
Affiliation(s)
- Alexandriya MX Emonds
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, United States
- Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, United States
- Ramanujan Srinath
- Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, United States
- Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, United States
- Kristina J Nielsen
- Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, United States
- Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, United States
- Charles E Connor
- Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, United States
- Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, United States
10
Marciniak Dg Agra K, Dg Agra P. F = ma. Is the macaque brain Newtonian? Cogn Neuropsychol 2023; 39:376-408. [PMID: 37045793 DOI: 10.1080/02643294.2023.2191843] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/14/2023]
Abstract
Intuitive Physics, the ability to anticipate how the physical events involving mass objects unfold in time and space, is a central component of intelligent systems. Intuitive physics is a promising tool for gaining insight into mechanisms that generalize across species because both humans and non-human primates are subject to the same physical constraints when engaging with the environment. Physical reasoning abilities are widely present within the animal kingdom, but monkeys, with acute 3D vision and a high level of dexterity, appreciate and manipulate the physical world in much the same way humans do.
Affiliation(s)
- Karolina Marciniak Dg Agra
- The Rockefeller University, Laboratory of Neural Circuits, New York, NY, USA
- Center for Brains, Minds and Machines, Cambridge, MA, USA
- Pedro Dg Agra
- The Rockefeller University, Laboratory of Neural Circuits, New York, NY, USA
- Center for Brains, Minds and Machines, Cambridge, MA, USA
11
Osiurak F, Claidière N, Federico G. Bringing cumulative technological culture beyond copying versus reasoning. Trends Cogn Sci 2023; 27:30-42. [PMID: 36283920 DOI: 10.1016/j.tics.2022.09.024] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/29/2022] [Revised: 09/28/2022] [Accepted: 09/29/2022] [Indexed: 11/06/2022]
Abstract
The dominant view of cumulative technological culture suggests that high-fidelity transmission rests upon a high-fidelity copying ability, which allows individuals to reproduce the tool-use actions performed by others without needing to understand them (i.e., without causal reasoning). The opposition between copying versus reasoning is well accepted but with little supporting evidence. In this article, we investigate this distinction by examining the cognitive science literature on tool use. Evidence indicates that the ability to reproduce others' tool-use actions requires causal understanding, which questions the copying versus reasoning distinction and the cognitive reality of the so-called copying ability. We conclude that new insights might be gained by considering causal understanding as a key driver of cumulative technological culture.
Affiliation(s)
- François Osiurak
- Laboratoire d'Étude des Mécanismes Cognitifs, Université de Lyon, 5 avenue Pierre Mendès France, 69676 Bron Cedex, France; Institut Universitaire de France, 1 rue Descartes, 75231 Paris Cedex 5, France.
- Nicolas Claidière
- Aix-Marseille Univ, CNRS, LPC, 3 Place Victor Hugo, 13331 Marseille, France
- Giovanni Federico
- IRCCS Synlab SDN S.p.A., Via Emanuele Gianturco 113, 80143, Naples, Italy
12
Fischer J, Mahon BZ. What tool representation, intuitive physics, and action have in common: The brain's first-person physics engine. Cogn Neuropsychol 2021; 38:455-467. [PMID: 35994054 DOI: 10.1080/02643294.2022.2106126] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/16/2022] [Revised: 07/17/2022] [Accepted: 07/21/2022] [Indexed: 10/15/2022]
Abstract
An overlapping set of brain regions in parietal and frontal cortex are engaged by different types of tasks and stimuli: (i) making inferences about the physical structure and dynamics of the world, (ii) passively viewing, or actively interacting with, manipulable objects, and (iii) planning and execution of reaching and grasping actions. We suggest the observed neural overlap is because a common superordinate computation is engaged by each of those different tasks: A forward model of physical reasoning about how first-person actions will affect the world and be affected by unfolding physical events. This perspective offers an account of why some physical predictions are systematically incorrect - there can be a mismatch between how physical scenarios are experimentally framed and the native format of the inferences generated by the brain's first-person physics engine. This perspective generates new empirical expectations about the conditions under which physical reasoning may exhibit systematic biases.
Affiliation(s)
- Jason Fischer
- Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, USA
- Bradford Z Mahon
- Department of Psychology, Carnegie Mellon University, Pittsburgh, PA, USA
- Carnegie Mellon Neuroscience Institute, Pittsburgh, PA, USA