1
Danwitz L, von Helversen B. Observational learning of exploration-exploitation strategies in bandit tasks. Cognition 2025; 259:106124. [PMID: 40117983 DOI: 10.1016/j.cognition.2025.106124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Revised: 02/07/2025] [Accepted: 03/12/2025] [Indexed: 03/23/2025]
Abstract
In decision-making scenarios, individuals often face the challenge of balancing between exploring new options and exploiting known ones, a dynamic known as the exploration-exploitation trade-off. In such situations, people frequently have the opportunity to observe others' actions. Yet little is known about when, how, and from whom individuals use observational learning in the exploration-exploitation dilemma. In two experiments, participants completed multiple nine-armed bandit tasks, either independently or while observing a fictitious agent using either an explorative or equally successful exploitative strategy. To analyze participants' behaviors, we used a reinforcement learning model (simplified Kalman Filter) to extract parameters for both copying and exploration at the individual level. Results showed that participants copied the observed agents' choices by adding a bonus to the individually estimated value of the observed action. While most participants appeared to use an unconditional copying approach, a subset of participants adopted a copy-when-uncertain approach, that is, copying more when uncertain about the optimal action based on their individually acquired knowledge. Further, participants adjusted their exploration strategies in alignment with those observed. We discuss to what extent this can be understood as a form of emulation. Results on participants' preferences to copy from explorative versus exploitative agents are ambiguous. Contrary to expectations, similarity or dissimilarity between participants' and agents' exploration tendencies had no impact on observational learning. These results shed light on humans' processing of social and non-social information in exploration scenarios and conditions of observational learning.
Affiliation(s)
- Ludwig Danwitz
- Department of Psychology, University of Bremen, Germany.
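The copying mechanism summarized in the Danwitz and von Helversen abstract above (a bonus added to the individually estimated value of the observed action, on top of a simplified Kalman filter) can be illustrated with a minimal sketch. The function names, the uncertainty-based exploration bonus, the softmax choice rule, and all parameter values are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import math


def kalman_update(mean, var, reward, noise_var=1.0):
    """Update one arm's reward estimate after observing a reward.

    noise_var is an assumed observation-noise variance.
    """
    gain = var / (var + noise_var)          # Kalman gain
    new_mean = mean + gain * (reward - mean)
    new_var = (1.0 - gain) * var            # uncertainty shrinks after each observation
    return new_mean, new_var


def observational_choice_probabilities(means, variances, observed_arm=None,
                                       copy_bonus=0.0, exploration_bonus=0.0,
                                       temperature=1.0):
    """Softmax over arm values (assumed choice rule).

    exploration_bonus scales each arm's standard deviation (assumption);
    copy_bonus is added to the arm the observed agent chose.
    """
    values = [m + exploration_bonus * math.sqrt(v)
              for m, v in zip(means, variances)]
    if observed_arm is not None:
        values[observed_arm] += copy_bonus
    peak = max(values)                      # subtract the maximum for numerical stability
    weights = [math.exp((v - peak) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical usage: nine arms with identical estimates, agent observed on arm 3.
probs = observational_choice_probabilities([0.0] * 9, [1.0] * 9,
                                           observed_arm=3, copy_bonus=1.0)
```

With identical individual estimates, the copy bonus alone makes the observed arm the most likely choice; a copy-when-uncertain variant could scale `copy_bonus` by the learner's own uncertainty, which this sketch leaves out.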
2
Wu CM, Deffner D, Kahl B, Meder B, Ho MH, Kurvers RHJM. Adaptive mechanisms of social and asocial learning in immersive collective foraging. Nat Commun 2025; 16:3539. [PMID: 40280950 PMCID: PMC12032219 DOI: 10.1038/s41467-025-58365-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Accepted: 03/13/2025] [Indexed: 04/29/2025] Open
Abstract
Human cognition is distinguished by our ability to adapt to different environments and circumstances. Yet the mechanisms driving adaptive behavior have predominantly been studied in separate asocial and social contexts, with an integrated framework remaining elusive. Here, we use a collective foraging task in a virtual Minecraft environment to integrate these two fields, by leveraging automated transcriptions of visual field data combined with high-resolution spatial trajectories. Our behavioral analyses capture both the structure and temporal dynamics of social interactions, which are then directly tested using computational models sequentially predicting each foraging decision. These results reveal that adaptation mechanisms of both asocial foraging and selective social learning are driven by individual foraging success (rather than social factors). Furthermore, it is the degree of adaptivity, of both asocial and social learning, that best predicts individual performance. These findings not only integrate theories across asocial and social domains, but also provide key insights into the adaptability of human decision-making in complex and dynamic social landscapes.
Affiliation(s)
- Charley M Wu
- Human and Machine Cognition Lab, University of Tübingen, Tübingen, Germany.
- Centre for Adaptive Rationality, Max Planck Institute for Human Development, Berlin, Germany.
- Department of Computational Neuroscience, Max Planck Institute for Biological Cybernetics, Tübingen, Germany.
- Dominik Deffner
- Centre for Adaptive Rationality, Max Planck Institute for Human Development, Berlin, Germany
- Excellence Cluster: Science of Intelligence, Technical University Berlin, Berlin, Germany
- Department of Psychology, University of Marburg, Marburg, Germany
- Benjamin Kahl
- Centre for Adaptive Rationality, Max Planck Institute for Human Development, Berlin, Germany
- Excellence Cluster: Science of Intelligence, Technical University Berlin, Berlin, Germany
- Björn Meder
- Institute for Mind, Brain and Behavior, Department of Psychology, Health and Medical University, Potsdam, Germany
- Mark H Ho
- Department of Psychology, New York University, New York, NY, USA
- Ralf H J M Kurvers
- Centre for Adaptive Rationality, Max Planck Institute for Human Development, Berlin, Germany
- Excellence Cluster: Science of Intelligence, Technical University Berlin, Berlin, Germany
3
García-Arch J, Korn CW, Fuentemilla L. Self-utility distance as a computational approach to understanding self-concept clarity. Communications Psychology 2025; 3:50. [PMID: 40133620 PMCID: PMC11937342 DOI: 10.1038/s44271-025-00231-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2024] [Accepted: 03/13/2025] [Indexed: 03/27/2025]
Abstract
Self-concept stability and cohesion are crucial for psychological functioning and well-being, yet the mechanisms that underpin this fundamental aspect of human cognition remain underexplored. Integrating insights from cognitive and personality psychology with reinforcement learning, we introduce Self-Utility Distance (SUD), a metric quantifying the dissimilarities between individuals' self-concept attributes and their expected utility value. In Study 1 (n = 155), participants provided self- and expected utility ratings using a set of predefined adjectives. SUD showed a significant negative relationship with Self-Concept Clarity that persisted after accounting for individuals' Self-Esteem. In Study 2 (n = 323), we found that SUD provides incremental predictive accuracy over Ideal-Self and Ought-Self discrepancies in the prediction of Self-Concept Clarity. In Study 3 (n = 85), we investigated the mechanistic principles underlying Self-Utility Distance. Participants conducted a social learning task where they learned about trait utilities from a reference group. We formalized different computational models to investigate the strategies individuals use to adjust trait utility estimates in response to environmental feedback. Through Hierarchical Bayesian Inference, we found evidence that participants utilized their self-concept to modulate trait utility learning, effectively avoiding the maximization of Self-Utility Distance. Our findings provide insights into self-concept dynamics that might help understand the maintenance of adaptive and maladaptive traits.
Affiliation(s)
- Josué García-Arch
- Department of Cognition, Development and Education Psychology, Faculty of Psychology, University of Barcelona, Barcelona, Spain.
- Institute of Neuroscience (UBNeuro), University of Barcelona, Barcelona, Spain.
- Christoph W Korn
- Section Social Neuroscience, Department of General Psychiatry, University of Heidelberg, Heidelberg, Germany
- Lluís Fuentemilla
- Department of Cognition, Development and Education Psychology, Faculty of Psychology, University of Barcelona, Barcelona, Spain
- Institute of Neuroscience (UBNeuro), University of Barcelona, Barcelona, Spain
- Bellvitge Institute for Biomedical Research, 08908 Hospitalet de Llobregat, Barcelona, Spain
4
Suganuma H, Naito A, Katahira K, Kameda T. When to stop social learning from a predecessor in an information-foraging task. Evolutionary Human Sciences 2025; 7:e2. [PMID: 39935447 PMCID: PMC11810515 DOI: 10.1017/ehs.2024.29] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Revised: 05/04/2024] [Accepted: 06/01/2024] [Indexed: 02/13/2025] Open
Abstract
Striking a balance between individual and social learning is one of the key capabilities that support adaptation under uncertainty. Although intergenerational transmission of information is ubiquitous, little is known about when and how newcomers switch from learning loyally from preceding models to exploring independently. Using a behavioural experiment, we investigated how social information available from a preceding demonstrator affects the timing of becoming independent and individual performance thereafter. Participants worked on a 30-armed bandit task for 100 trials. For the first 15 trials, participants simply observed the choices of a demonstrator who had accumulated more knowledge about the environment and passively received rewards from the demonstrator's choices. Thereafter, participants could switch to making independent choices at any time. We had three conditions differing in the social information available from the demonstrator: choice only, reward only or both. Results showed that both participants' strategies about when to stop observational learning and their behavioural patterns after independence depended on the available social information. Participants generally failed to make the best use of previously observed social information in their subsequent independent choices, suggesting the importance of direct communication beyond passive observation for better intergenerational transmission under uncertainty. Implications for cultural evolution are discussed.
Affiliation(s)
- Hidezo Suganuma
- Department of Social Psychology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
- Aoi Naito
- School of Environmental Society, Institute of Science Tokyo, 3-3-6 Shibaura, Minato-ku, Tokyo 108-0023, Japan
- Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan
- Kentaro Katahira
- Human Informatics and Interaction Research Institute, National Institute of Advanced Industrial Science and Technology, Tsukuba, Ibaraki 305-8566, Japan
- Tatsuya Kameda
- Faculty of Mathematical Informatics, Meiji Gakuin University, 1518 Kamikuratachou, Totsuka-ku, Yokohama, 244-8539 Japan
- Center for Interdisciplinary Informatics, Meiji Gakuin University, 1-2-37 Shirokanedai, Minato-ku, Tokyo 108-8636, Japan
- Center for Experimental Research in Social Sciences, Hokkaido University, N10W7, Kita-ku, Sapporo, Hokkaido 060-0810, Japan
- Brain Science Institute, Tamagawa University, 6-1-1 Tamagawagakuen, Machida, Tokyo, 194-8610 Japan
5
Selbing I, Becker N, Pan Y, Lindström B, Olsson A. Effects of described demonstrator ability on brain and behavior when learning from others. npj Science of Learning 2025; 10:4. [PMID: 39819873 PMCID: PMC11739481 DOI: 10.1038/s41539-024-00292-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Accepted: 12/16/2024] [Indexed: 01/19/2025]
Abstract
Observational learning enables us to make decisions by watching others' behaviors. The quality of such learning depends on the abilities of those we observe, but also on our beliefs about those abilities. We have previously demonstrated that observers learned better from demonstrators described as high vs. low in ability, regardless of their actual performance. The current study aimed to conceptually replicate these findings, and explore the neural mechanisms involved. Forty-five participants performed an observational learning task while undergoing functional magnetic resonance imaging (fMRI). We hypothesized that participants would perform better when demonstrators were described as having high vs. low ability. Unexpectedly, participants performed equally well regardless of described demonstrator ability. The behavioral effects of biased observational learning seem to be driven by mentalizing processes together with general learning and decision-making processes.
Affiliation(s)
- Ida Selbing
- Division of Psychology, Karolinska Institutet, Solna, Sweden.
- Nina Becker
- Division of Psychology, Karolinska Institutet, Solna, Sweden
- Yafeng Pan
- Division of Psychology, Karolinska Institutet, Solna, Sweden
- Department of Psychology and Behavioral Sciences, Zhejiang University, Hangzhou, China
- Björn Lindström
- Division of Psychology, Karolinska Institutet, Solna, Sweden
- Andreas Olsson
- Division of Psychology, Karolinska Institutet, Solna, Sweden
6
Anlló H, Salamander G, Raihani N, Palminteri S, Hertz U. Experience and advice consequences shape information sharing strategies. Communications Psychology 2024; 2:123. [PMID: 39702539 DOI: 10.1038/s44271-024-00175-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Accepted: 12/05/2024] [Indexed: 12/21/2024]
Abstract
Individuals often rely on the advice of more experienced peers to minimise uncertainty and increase success likelihood. In most domains where knowledge is acquired through experience, advisers are themselves continuously learning. Here we examine the way advising behaviour changes throughout the learning process, and the way individual traits and costs and benefits of giving advice shape this behaviour. We ran a series of experiments implementing a decision task within a reinforcement learning framework, where participants could decide to share their choices as advice to others. Participants were overall likely to share their choices as advice, even on the first trial before learning. Tendency to share advice and advice quality increased as advisers learned about the value of choices, and moved from exploratory to exploitative behaviour. The introduction of consequences to advising resulted in a shift of the overall tendency to give advice, lowering it when advising implicated monetary loss, and increasing it when advising held reputational value. Individual differences in social anxiety levels were associated with lower tendency to share exploratory decisions. Our results show that advisers tend to share choices that are backed by their own experience, but that this relationship can be altered by advice-consequences and individual traits.
Affiliation(s)
- Hernán Anlló
- Département d'études cognitives, École normale Supérieure-Université Paris Sciences et Lettres, Paris, France.
- Gil Salamander
- Department of Cognitive Sciences, University of Haifa, Haifa, Israel
- Nichola Raihani
- Department of Experimental Psychology, University College of London, London, UK
- School of Psychology, University of Auckland, Auckland, New Zealand
- Stefano Palminteri
- Département d'études cognitives, École normale Supérieure-Université Paris Sciences et Lettres, Paris, France
- Uri Hertz
- Department of Cognitive Sciences, University of Haifa, Haifa, Israel.
- The Institute of Information Processing and Decision Making (IIPDM), University of Haifa, Haifa, Israel.
7
Kang P, Moisa M, Lindström B, Soutschek A, Ruff CC, Tobler PN. Causal involvement of dorsomedial prefrontal cortex in learning the predictability of observable actions. Nat Commun 2024; 15:8305. [PMID: 39333062 PMCID: PMC11436984 DOI: 10.1038/s41467-024-52559-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Accepted: 09/11/2024] [Indexed: 09/29/2024] Open
Abstract
Social learning is well established across species. While recent neuroimaging studies show that dorsomedial prefrontal cortex (DMPFC/preSMA) activation correlates with observational learning signals, the precise computations that are implemented by DMPFC/preSMA have remained unclear. To identify whether DMPFC/preSMA supports learning from observed outcomes or observed actions, or possibly encodes even a higher order factor (such as the reliability of the demonstrator), we downregulate DMPFC/preSMA excitability with continuous theta burst stimulation (cTBS) and assess different forms of observational learning. Relative to a vertex-cTBS control condition, DMPFC/preSMA downregulation decreases performance during action-based learning but has no effect on outcome-based learning. Computational modeling reveals that DMPFC/preSMA cTBS disrupts learning the predictability, a proxy of reliability, of the demonstrator and modulates the rate of learning from observed actions. Thus, our results suggest that the DMPFC is causally involved in observational action learning, mainly by adjusting the speed of learning about the predictability of the demonstrator.
Affiliation(s)
- Pyungwon Kang
- Zurich Center for Neuroeconomics, Department of Economics, University of Zurich, Zurich, Switzerland.
- Marius Moisa
- Zurich Center for Neuroeconomics, Department of Economics, University of Zurich, Zurich, Switzerland
- Björn Lindström
- Department of Clinical Neuroscience, Division for Psychology, Karolinska Institute, Stockholm, Sweden
- Alexander Soutschek
- Ludwig Maximilian University Munich, Department for Psychology, Munich, Germany
- Christian C Ruff
- Zurich Center for Neuroeconomics, Department of Economics, University of Zurich, Zurich, Switzerland
- Philippe N Tobler
- Zurich Center for Neuroeconomics, Department of Economics, University of Zurich, Zurich, Switzerland
- Neuroscience Center Zurich, ETH Zurich and University of Zurich, Zurich, Switzerland
8
Witt A, Toyokawa W, Lala KN, Gaissmaier W, Wu CM. Humans flexibly integrate social information despite interindividual differences in reward. Proc Natl Acad Sci U S A 2024; 121:e2404928121. [PMID: 39302964 PMCID: PMC11441569 DOI: 10.1073/pnas.2404928121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Accepted: 08/19/2024] [Indexed: 09/22/2024] Open
Abstract
There has been much progress in understanding human social learning, including recent studies integrating social information into the reinforcement learning framework. Yet previous studies often assume identical payoffs between observer and demonstrator, overlooking the diversity of social information in real-world interactions. We address this gap by introducing a socially correlated bandit task that accommodates payoff differences among participants, allowing for the study of social learning under more realistic conditions. Our Social Generalization (SG) model, tested through evolutionary simulations and two online experiments, outperforms existing models by incorporating social information into the generalization process, but treating it as noisier than individual observations. Our findings suggest that human social learning is more flexible than previously believed, with the SG model indicating a potential resource-rational trade-off where social learning partially replaces individual exploration. This research highlights the flexibility of humans' social learning, allowing us to integrate social information from others with different preferences, skills, or goals.
Affiliation(s)
- Alexandra Witt
- Human and Machine Cognition Lab, University of Tübingen, Tübingen 72074, Germany
- Wataru Toyokawa
- Social Psychology and Decision Sciences, Department of Psychology, University of Konstanz, Konstanz 78464, Germany
- Computational Group Dynamics Unit, RIKEN Center for Brain Science, RIKEN, Wako 351-0198, Japan
- Kevin N. Lala
- School of Biology, University of St Andrews, St Andrews KY16 9AJ, United Kingdom
- Wolfgang Gaissmaier
- Social Psychology and Decision Sciences, Department of Psychology, University of Konstanz, Konstanz 78464, Germany
- Charley M. Wu
- Human and Machine Cognition Lab, University of Tübingen, Tübingen 72074, Germany
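The core idea in the Witt et al. abstract above, that social information enters the same learning process as individual observations but is treated as noisier, can be illustrated with a deliberately simplified sketch. The published Social Generalization model works within a generalization framework that is not reproduced here; the single-option Gaussian update, the function name, and the parameter `social_noise_var` are assumptions for illustration only.

```python
def gaussian_belief_update(mean, var, observation, noise_var=1.0,
                           is_social=False, social_noise_var=3.0):
    """Bayesian update of a Gaussian belief about one option's reward.

    Social observations get extra assumed noise (social_noise_var),
    so they move the belief less than an identical individual observation.
    """
    effective_noise = noise_var + (social_noise_var if is_social else 0.0)
    gain = var / (var + effective_noise)
    new_mean = mean + gain * (observation - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var


# Hypothetical usage: the same observation, seen individually vs. socially.
own_mean, own_var = gaussian_belief_update(0.0, 1.0, 1.0)
social_mean, social_var = gaussian_belief_update(0.0, 1.0, 1.0, is_social=True)
```

In this sketch the social observation still reduces uncertainty, only by less, which is one way to picture social learning partially replacing individual exploration.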
9
Schultner DT, Lindström BR, Cikara M, Amodio DM. Transmission of social bias through observational learning. Science Advances 2024; 10:eadk2030. [PMID: 38941465 PMCID: PMC11212708 DOI: 10.1126/sciadv.adk2030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Accepted: 05/22/2024] [Indexed: 06/30/2024]
Abstract
People often rely on social learning (learning by observing others' actions and outcomes) to form preferences in advance of their own direct experiences. Although social learning is typically adaptive, we investigated whether it may also contribute to the formation and spread of prejudice. In six experiments (n = 1550), we demonstrate that by merely observing interactions between a prejudiced actor and social group members, observers acquired the prejudices of the actor. Moreover, observers were unaware of the actors' bias, misattributing their acquired group preferences to the behavior of group members, despite identical behavior between groups. Computational modeling revealed that this effect was due to value shaping, whereby one's preferences are shaped by another's actions toward a target, in addition to the target's reward feedback. These findings identify social learning as a potent mechanism of prejudice formation that operates implicitly and supports the transmission of intergroup bias.
Affiliation(s)
- David T. Schultner
- Faculty of Social and Behavioral Sciences, Department of Psychology, University of Amsterdam, Amsterdam, Netherlands
- Björn R. Lindström
- Division of Psychology, Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden
- Mina Cikara
- Graduate School of Arts and Sciences, Department of Psychology, Harvard University, Cambridge, MA, USA
- David M. Amodio
- Faculty of Social and Behavioral Sciences, Department of Psychology, University of Amsterdam, Amsterdam, Netherlands
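Value shaping, as described in the Schultner et al. abstract above, means that an observer's value for a target is shaped by the actor's action toward it in addition to the target's reward feedback. A minimal sketch of one such update follows; the additive form, the function name, and the parameters `learning_rate` and `shaping_weight` are assumptions for illustration, since the abstract does not give the model equations.

```python
def observe_actor_choice(values, chosen_target, reward,
                         learning_rate=0.2, shaping_weight=0.1):
    """Return updated observer values after watching one actor choice.

    The chosen target's value moves toward the observed reward (outcome
    learning) and also receives a fixed increment simply because the
    actor chose it (assumed form of value shaping).
    """
    updated = dict(values)
    prediction_error = reward - updated[chosen_target]
    updated[chosen_target] += learning_rate * prediction_error + shaping_weight
    return updated


# Hypothetical usage: the observer watches the actor choose group "A" once.
values = observe_actor_choice({"A": 0.0, "B": 0.0}, "A", reward=0.5)
```

Setting `shaping_weight` to zero reduces the update to plain outcome-based learning, which gives a simple baseline to compare against.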
10
Bavard S, Stuchlý E, Konovalov A, Gluth S. Humans can infer social preferences from decision speed alone. PLoS Biol 2024; 22:e3002686. [PMID: 38900903 PMCID: PMC11189591 DOI: 10.1371/journal.pbio.3002686] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 05/21/2024] [Indexed: 06/22/2024] Open
Abstract
Humans are known to be capable of inferring hidden preferences and beliefs of their conspecifics when observing their decisions. While observational learning based on choices has been explored extensively, the question of how response times (RT) impact our learning of others' social preferences has received little attention. Yet, while observing choices alone can inform us about the direction of preference, it reveals little about the strength of this preference. In contrast, RT provides a continuous measure of strength of preference with faster responses indicating stronger preferences and slower responses signaling hesitation or uncertainty. Here, we outline a preregistered orthogonal design to investigate the involvement of both choices and RT in learning and inferring others' social preferences. Participants observed other people's behavior in a social preferences task (Dictator Game), seeing either their choices, RT, both, or no information. By coupling behavioral analyses with computational modeling, we show that RT is predictive of social preferences and that observers were able to infer those preferences even when receiving only RT information. Based on these findings, we propose a novel observational reinforcement learning model that closely matches participants' inferences in all relevant conditions. In contrast to previous literature suggesting that, from a Bayesian perspective, people should be able to learn equally well from choices and RT, we show that observers' behavior substantially deviates from this prediction. Our study elucidates a hitherto unknown sophistication in human observational learning but also identifies important limitations to this ability.
Affiliation(s)
- Sophie Bavard
- Department of Psychology, University of Hamburg, Hamburg, Germany
- Erik Stuchlý
- Department of Psychology, University of Hamburg, Hamburg, Germany
- Arkady Konovalov
- Centre for Human Brain Health, School of Psychology, University of Birmingham, Birmingham, United Kingdom
- Sebastian Gluth
- Department of Psychology, University of Hamburg, Hamburg, Germany
11
Zurek N, Aljadeff N, Khoury D, Aplin LM, Lotem A. Social demonstration of colour preference improves the learning of associated demonstrated actions. Anim Cogn 2024; 27:31. [PMID: 38592559 PMCID: PMC11004050 DOI: 10.1007/s10071-024-01865-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2023] [Revised: 03/07/2024] [Accepted: 03/07/2024] [Indexed: 04/10/2024]
Abstract
We studied how different types of social demonstration improve house sparrows' (Passer domesticus) success in solving a foraging task that requires both operant learning (opening covers) and discrimination learning (preferring covers of the rewarding colour). We provided learners with either paired demonstration (of both cover opening and colour preference), action-only demonstration (of opening white covers only), or no demonstration (a companion bird eating without covers). We found that sparrows failed to learn the two tasks with no demonstration, and learned them best with a paired demonstration. Interestingly, the action of cover opening was learned faster with paired rather than action-only demonstration despite being equally demonstrated in both. We also found that only with paired demonstration, the speed of operant (action) learning was related to the demonstrator's level of activity. Colour preference (i.e. discrimination learning) was eventually acquired by all sparrows that learned to open covers, even without social demonstration of colour preference. Thus, adding a demonstration of colour preference was actually more important for operant learning, possibly as a result of increasing the similarity between the demonstrated and the learned tasks, thereby increasing the learner's attention to the actions of the demonstrator. Giving more attention to individuals in similar settings may be an adaptive strategy directing social learners to focus on ecologically relevant behaviours and on tasks that are likely to be learned successfully.
Affiliation(s)
- Noam Zurek
- School of Zoology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Na'ama Aljadeff
- School of Zoology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Donya Khoury
- School of Zoology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Lucy M Aplin
- Department of Evolutionary Biology and Environmental Science, University of Zurich, Zurich, Switzerland
- Division of Ecology and Evolution, Research School of Biology, The Australian National University, Canberra, Australia
- Arnon Lotem
- School of Zoology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel.
12
Wang R, Wang M, Zhao Q, Gong Y, Zuo L, Zheng X, Gao H. A Novel Obstacle Traversal Method for Multiple Robotic Fish Based on Cross-Modal Variational Autoencoders and Imitation Learning. Biomimetics (Basel) 2024; 9:221. [PMID: 38667232 PMCID: PMC11048022 DOI: 10.3390/biomimetics9040221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2024] [Revised: 04/01/2024] [Accepted: 04/02/2024] [Indexed: 04/28/2024] Open
Abstract
Precision control of multiple robotic fish visual navigation in complex underwater environments has long been a challenging issue in the field of underwater robotics. To address this problem, this paper proposes a multi-robot fish obstacle traversal technique based on the combination of cross-modal variational autoencoder (CM-VAE) and imitation learning. Firstly, the overall framework of the robotic fish control system is introduced, where the first-person view of the robotic fish is encoded into a low-dimensional latent space using CM-VAE, and then different latent features in the space are mapped to the velocity commands of the robotic fish through imitation learning. Finally, to validate the effectiveness of the proposed method, experiments are conducted on linear, S-shaped, and circular gate frame trajectories with both single and multiple robotic fish. Analysis reveals that the visual navigation method proposed in this paper can stably traverse various types of gate frame trajectories. Compared to end-to-end learning and purely unsupervised image reconstruction, the proposed control strategy demonstrates superior performance, offering a new solution for the intelligent navigation of robotic fish in complex environments.
Affiliation(s)
- Ruilong Wang
- School of Information and Electrical Engineering, Shandong Jianzhu University, Jinan 250101, China
- Ming Wang
- School of Information and Electrical Engineering, Shandong Jianzhu University, Jinan 250101, China
- Qianchuan Zhao
- Department of Automation, Tsinghua University, Beijing 100084, China
- Yanling Gong
- School of Information and Electrical Engineering, Shandong Jianzhu University, Jinan 250101, China
- Lingchen Zuo
- School of Information and Electrical Engineering, Shandong Jianzhu University, Jinan 250101, China
- Xuehan Zheng
- School of Information and Electrical Engineering, Shandong Jianzhu University, Jinan 250101, China
- He Gao
- School of Information and Electrical Engineering, Shandong Jianzhu University, Jinan 250101, China
- Shandong Zhengchen Technology Co., Ltd., Jinan 250101, China
13
Tump AN, Deffner D, Pleskac TJ, Romanczuk P, Kurvers RHJM. A Cognitive Computational Approach to Social and Collective Decision-Making. Perspectives on Psychological Science 2024; 19:538-551. [PMID: 37671891 PMCID: PMC10913326 DOI: 10.1177/17456916231186964] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 09/07/2023]
Abstract
Collective dynamics play a key role in everyday decision-making. Whether social influence promotes the spread of accurate information and ultimately results in adaptive behavior or leads to false information cascades and maladaptive social contagion strongly depends on the cognitive mechanisms underlying social interactions. Here we argue that cognitive modeling, in tandem with experiments that allow collective dynamics to emerge, can mechanistically link cognitive processes at the individual and collective levels. We illustrate the strength of this cognitive computational approach with two highly successful cognitive models that have been applied to interactive group experiments: evidence-accumulation and reinforcement-learning models. We show how these approaches make it possible to simultaneously study (a) how individual cognition drives social systems, (b) how social systems drive individual cognition, and (c) the dynamic feedback processes between the two layers.
Affiliation(s)
- Alan N. Tump
- Center for Adaptive Rationality, Max Planck Institute for Human Development
- Science of Intelligence, Technische Universität Berlin
- Dominik Deffner
- Center for Adaptive Rationality, Max Planck Institute for Human Development
- Science of Intelligence, Technische Universität Berlin
- Pawel Romanczuk
- Science of Intelligence, Technische Universität Berlin
- Institute for Theoretical Biology, Department of Biology, Humboldt Universität zu Berlin
- Bernstein Center for Computational Neuroscience Berlin
- Ralf H. J. M. Kurvers
- Center for Adaptive Rationality, Max Planck Institute for Human Development
- Science of Intelligence, Technische Universität Berlin
|
14
|
Pereg M, Hertz U, Ben-Artzi I, Shahar N. Disentangling the contribution of individual and social learning processes in human advice-taking behavior. npj Science of Learning 2024; 9:4. [PMID: 38245562 PMCID: PMC10799906 DOI: 10.1038/s41539-024-00214-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2023] [Accepted: 01/03/2024] [Indexed: 01/22/2024]
Abstract
The study of social learning examines how individuals learn from others by means of observation, imitation, or compliance with advice. However, it still remains largely unknown whether social learning processes have a distinct contribution to behavior, independent from non-social trial-and-error learning that often occurs simultaneously. 153 participants completed a reinforcement learning task, where they were asked to make choices to gain rewards. Advice from an artificial teacher was presented in 60% of the trials, allowing us to compare choice behavior with and without advice. Results showed a strong and reliable tendency to follow advice (test-retest reliability ~0.73). Computational modeling suggested a unique contribution of three distinct learning strategies: (a) individual learning (i.e., learning the value of actions, independent of advice), (b) informed advice-taking (i.e., learning the value of following advice), and (c) non-informed advice-taking (i.e., a constant bias to follow advice regardless of outcome history). Comparing artificial and empirical data provided specific behavioral regression signatures to both informed and non-informed advice taking processes. We discuss the theoretical implications of integrating internal and external information during the learning process.
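As a rough illustration of how a constant advice bias can sit alongside individually learned values, the sketch below adds a fixed bonus to the advised action inside a softmax choice rule and updates values with a delta rule. It is not the model fitted in the paper: the function names, the `advice_bias` and `inverse_temperature` parameters, and the softmax form are all assumptions made for this example.

```python
import math


def choice_probabilities(q_values, advised_action=None, advice_bias=0.0,
                         inverse_temperature=1.0):
    """Softmax over action values with a constant bonus on the advised action.

    Hypothetical sketch: advice_bias stands in for a non-informed tendency
    to follow advice regardless of outcome history.
    """
    utilities = list(q_values)
    if advised_action is not None:
        utilities[advised_action] += advice_bias
    scaled = [inverse_temperature * u for u in utilities]
    largest = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - largest) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def update_value(q_values, action, reward, learning_rate=0.1):
    """Delta-rule update of the chosen action's value (individual learning)."""
    updated = list(q_values)
    updated[action] += learning_rate * (reward - updated[action])
    return updated
```

With equal learned values, the advised action is chosen more often whenever `advice_bias` is positive; on trials without advice (`advised_action=None`) the rule reduces to an ordinary softmax.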
Affiliation(s)
- Maayan Pereg
- School of Psychological Sciences, Tel Aviv University, Tel Aviv, Israel
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Minducate Center for the Science of Learning, Sagol School of Neuroscience, Tel Aviv, Israel
- Department of Psychology, Achva Academic College, Arugot, Israel
- Uri Hertz
- Department of Cognitive Sciences, University of Haifa, Haifa, Israel
- Institute of Information Processing and Decision Making, University of Haifa, Haifa, Israel
- Ido Ben-Artzi
- School of Psychological Sciences, Tel Aviv University, Tel Aviv, Israel
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Minducate Center for the Science of Learning, Sagol School of Neuroscience, Tel Aviv, Israel
- Nitzan Shahar
- School of Psychological Sciences, Tel Aviv University, Tel Aviv, Israel
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
|
15
|
Cushman F. Computational Social Psychology. Annu Rev Psychol 2024; 75:625-652. [PMID: 37540891 DOI: 10.1146/annurev-psych-021323-040420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/06/2023]
Abstract
Social psychologists attempt to explain how we interact by appealing to basic principles of how we think. To make good on this ambition, they are increasingly relying on an interconnected set of formal tools that model inference, attribution, value-guided decision making, and multi-agent interactions. By reviewing progress in each of these areas and highlighting the connections between them, we can better appreciate the structure of social thought and behavior, while also coming to understand when, why, and how formal tools can be useful for social psychologists.
Affiliation(s)
- Fiery Cushman
- Department of Psychology, Harvard University, Cambridge, Massachusetts, USA
|
16
|
Hawkins RD, Berdahl AM, Pentland A'S, Tenenbaum JB, Goodman ND, Krafft PM. Flexible social inference facilitates targeted social learning when rewards are not observable. Nat Hum Behav 2023; 7:1767-1776. [PMID: 37591983 DOI: 10.1038/s41562-023-01682-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/25/2020] [Accepted: 07/20/2023] [Indexed: 08/19/2023]
Abstract
Groups coordinate more effectively when individuals are able to learn from others' successes. But acquiring such knowledge is not always easy, especially in real-world environments where success is hidden from public view. We suggest that social inference capacities may help bridge this gap, allowing individuals to update their beliefs about others' underlying knowledge and success from observable trajectories of behaviour. We compared our social inference model against simpler heuristics in three studies of human behaviour in a collective-sensing task. Experiment 1 demonstrated that average performance improved as a function of group size at a rate greater than predicted by heuristic models. Experiment 2 introduced artificial agents to evaluate how individuals selectively rely on social information. Experiment 3 generalized these findings to a more complex reward landscape. Taken together, our findings provide insight into the relationship between individual social cognition and the flexibility of collective behaviour.
Affiliation(s)
- Robert D Hawkins
- Department of Psychology, Stanford University, Stanford, CA, USA
- Department of Psychology, University of Wisconsin-Madison, Madison, WI, USA
- Andrew M Berdahl
- School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA, USA
- Noah D Goodman
- Department of Psychology, Stanford University, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
- P M Krafft
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA
- Creative Computing Institute, University of Arts London, London, UK
|
17
|
Navidi P, Saeedpour S, Ershadmanesh S, Hossein MM, Bahrami B. Prosocial learning: Model-based or model-free? PLoS One 2023; 18:e0287563. [PMID: 37352225 PMCID: PMC10289351 DOI: 10.1371/journal.pone.0287563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Accepted: 06/07/2023] [Indexed: 06/25/2023] Open
Abstract
Prosocial learning involves the acquisition of knowledge and skills necessary for making decisions that benefit others. We asked if, in the context of value-based decision-making, there is any difference between learning strategies for oneself vs. for others. We implemented a 2-step reinforcement learning paradigm in which participants learned, in separate blocks, to make decisions for themselves or for a present other confederate who evaluated their performance. We replicated the canonical features of the model-based and model-free reinforcement learning in our results. The behaviour of the majority of participants was best explained by a mixture of the model-based and model-free control, while most participants relied more heavily on MB control, and this strategy enhanced their learning success. Regarding our key self-other hypothesis, we did not find any significant difference between the behavioural performances nor in the model-based parameters of learning when comparing self and other conditions.
Affiliation(s)
- Parisa Navidi
- Department of Cognitive Psychology, Institute for Cognitive Science Studies, Tehran, Iran
- Sepehr Saeedpour
- Department of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
- Sara Ershadmanesh
- School of Cognitive Sciences, Institute for Research in Fundamental Sciences, Tehran, Iran
- Department of Computational Neuroscience, MPI for Biological Cybernetics, Tuebingen, Germany
- Bahador Bahrami
- Crowd Cognition Group, Department of General Psychology and Education, Ludwig Maximilians University, Munich, Germany
|
18
|
Zhang W, Liu Y, Dong Y, He W, Yao S, Xu Z, Mu Y. How we learn social norms: a three-stage model for social norm learning. Front Psychol 2023; 14:1153809. [PMID: 37333598 PMCID: PMC10272593 DOI: 10.3389/fpsyg.2023.1153809] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Accepted: 05/03/2023] [Indexed: 06/20/2023] Open
Abstract
As social animals, humans are unique to make the world function well by developing, maintaining, and enforcing social norms. As a prerequisite among these norm-related processes, learning social norms can act as a basis that helps us quickly coordinate with others, which is beneficial to social inclusion when people enter into a new environment or experience certain sociocultural changes. Given the positive effects of learning social norms on social order and sociocultural adaptability in daily life, there is an urgent need to understand the underlying mechanisms of social norm learning. In this article, we review a set of works regarding social norms and highlight the specificity of social norm learning. We then propose an integrated model of social norm learning containing three stages, i.e., pre-learning, reinforcement learning, and internalization, map a potential brain network in processing social norm learning, and further discuss the potential influencing factors that modulate social norm learning. Finally, we outline a couple of future directions along this line, including theoretical (i.e., societal and individual differences in social norm learning), methodological (i.e., longitudinal research, experimental methods, neuroimaging studies), and practical issues.
Affiliation(s)
- Wen Zhang
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Yunhan Liu
- School of Humanities and Social Science, Chinese University of Hong Kong, Shenzhen, China
- Yixuan Dong
- Faculty of Education, Beijing Normal University, Beijing, China
- Wanna He
- Department of Psychology and Behavioral Sciences, Zhejiang University, Hangzhou, China
- Shiming Yao
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Ziqian Xu
- Graziadio Business School of Business and Management, Pepperdine University, Los Angeles, CA, United States
- Yan Mu
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
|
19
|
Adaptive learning strategies in purely observational learning. Current Psychology 2022. [DOI: 10.1007/s12144-022-03904-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
|
20
|
Incorporating social knowledge structures into computational models. Nat Commun 2022; 13:6205. [PMID: 36266284 PMCID: PMC9584930 DOI: 10.1038/s41467-022-33418-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/06/2021] [Accepted: 09/16/2022] [Indexed: 12/24/2022] Open
Abstract
To navigate social interactions successfully, humans need to continuously learn about the personality traits of other people (e.g., how helpful or aggressive is the other person?). However, formal models that capture the complexities of social learning processes are currently lacking. In this study, we specify and test potential strategies that humans can employ for learning about others. Standard Rescorla-Wagner (RW) learning models only capture parts of the learning process because they neglect inherent knowledge structures and omit previously acquired knowledge. We therefore formalize two social knowledge structures and implement them in hybrid RW models to test their usefulness across multiple social learning tasks. We name these concepts granularity (knowledge structures about personality traits that can be utilized at different levels of detail during learning) and reference points (previous knowledge formalized into representations of average people within a social group). In five behavioural experiments, results from model comparisons and statistical analyses indicate that participants efficiently combine the concepts of granularity and reference points, with the specific combinations in models depending on the people and traits that participants learned about. Overall, our experiments demonstrate that variants of RW algorithms, which incorporate social knowledge structures, describe crucial aspects of the dynamics at play when people interact with each other.
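For readers unfamiliar with the standard Rescorla-Wagner rule that the abstract takes as its starting point, a minimal sketch follows. It shows only the basic delta-rule update, not the hybrid models tested in the paper. Using `initial_estimate` as a stand-in for a reference point (prior knowledge about an average group member) is an assumption made for illustration, as are the function and parameter names.

```python
def rescorla_wagner(outcomes, learning_rate=0.2, initial_estimate=0.0):
    """Return the running estimate after each observed outcome.

    Standard delta rule: estimate <- estimate + learning_rate * (outcome - estimate).
    initial_estimate is a hypothetical stand-in for prior knowledge
    (for example, a group-average rating); it is not taken from the paper.
    """
    estimate = initial_estimate
    trajectory = []
    for outcome in outcomes:
        prediction_error = outcome - estimate
        estimate += learning_rate * prediction_error
        trajectory.append(estimate)
    return trajectory
```

Starting from an informative `initial_estimate` leaves the update rule itself unchanged; it only shifts where learning begins, which is one simple way prior knowledge could enter such a model.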
|
21
|
Hofmans L, van den Bos W. Social learning across adolescence: A Bayesian neurocognitive perspective. Dev Cogn Neurosci 2022; 58:101151. [PMID: 36183664 PMCID: PMC9526184 DOI: 10.1016/j.dcn.2022.101151] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/22/2022] [Revised: 09/14/2022] [Accepted: 09/15/2022] [Indexed: 01/13/2023] Open
Abstract
Adolescence is a period of social re-orientation in which we are generally more prone to peer influence and the updating of our beliefs based on social information, also called social learning, than in any other stage of our life. However, how do we know when to use social information and whose information to use and how does this ability develop across adolescence? Here, we review the social learning literature from a behavioral, neural and computational viewpoint, focusing on the development of brain systems related to executive functioning, value-based decision-making and social cognition. We put forward a Bayesian reinforcement learning framework that incorporates social learning about value associated with particular behavior and uncertainty in our environment and experiences. We discuss how this framework can inform us about developmental changes in social learning, including how the assessment of uncertainty and the ability to adaptively discriminate between information from different social sources change across adolescence. By combining reward-based decision-making in the domains of both informational and normative influence, this framework explains both negative and positive social peer influence in adolescence.
Affiliation(s)
- Lieke Hofmans
- Department of Developmental Psychology, University of Amsterdam, Amsterdam, the Netherlands
- Correspondence to: Nieuwe Achtergracht 129, room G1.05, 1018WS Amsterdam, the Netherlands
- Wouter van den Bos
- Department of Developmental Psychology, University of Amsterdam, Amsterdam, the Netherlands
- Amsterdam Brain and Cognition Center, University of Amsterdam, Amsterdam, the Netherlands
- Center for Adaptive Rationality, Max Planck Institute for Human Development, Berlin, Germany
|
22
|
Chimento M, Barrett BJ, Kandler A, Aplin LM. Cultural diffusion dynamics depend on behavioural production rules. Proc Biol Sci 2022; 289:20221001. [PMID: 35946158 PMCID: PMC9363993 DOI: 10.1098/rspb.2022.1001] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/25/2022] Open
Abstract
Culture is an outcome of both the acquisition of knowledge about behaviour through social transmission, and its subsequent production by individuals. Acquisition and production are often discussed or modelled interchangeably, yet to date no study has explored the consequences of their interaction for cultural diffusions. We present a generative model that integrates the two, and ask how variation in production rules might influence diffusion dynamics. Agents make behavioural choices that change as they learn from their productions. Their repertoires may also change, and the acquisition of behaviour is conditioned on its frequency. We analyse the diffusion of a novel behaviour through social networks, yielding generalizable predictions of how individual-level behavioural production rules influence population-level diffusion dynamics. We then investigate how linking acquisition and production might affect the performance of two commonly used inferential models for social learning; network-based diffusion analysis, and experience-weighted attraction models. We find that the influence that production rules have on diffusion dynamics has consequences for how inferential methods are applied to empirical data. Our model illuminates the differences between social learning and social influence, demonstrates the overlooked role of reinforcement learning in cultural diffusions, and allows for clearer discussions about social learning strategies.
Affiliation(s)
- Michael Chimento
- Cognitive and Cultural Ecology Research Group, Max Planck Institute of Animal Behavior, Am Obstberg 1, Radolfzell 78315, Germany
- Centre for the Advanced Study of Collective Behaviour, University of Konstanz, Universitätsstraße 10, Konstanz 78464, Germany
- Department of Biology, University of Konstanz, Universitätsstraße 10, Konstanz 78464, Germany
- Brendan J Barrett
- Department for the Ecology of Animal Societies, Max Planck Institute of Animal Behavior, Am Obstberg 1, Radolfzell 78315, Germany
- Centre for the Advanced Study of Collective Behaviour, University of Konstanz, Universitätsstraße 10, Konstanz 78464, Germany
- Department of Biology, University of Konstanz, Universitätsstraße 10, Konstanz 78464, Germany
- Department of Human Behavior, Ecology and Culture, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, Leipzig 04103, Germany
- Anne Kandler
- Department of Human Behavior, Ecology and Culture, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, Leipzig 04103, Germany
- Lucy M Aplin
- Cognitive and Cultural Ecology Research Group, Max Planck Institute of Animal Behavior, Am Obstberg 1, Radolfzell 78315, Germany
- Centre for the Advanced Study of Collective Behaviour, University of Konstanz, Universitätsstraße 10, Konstanz 78464, Germany
- Division of Ecology and Evolution, Research School of Biology, The Australian National University, 46 Sullivan Creek Road, Canberra, Australian Capital Territory 2600, Australia
|
23
|
Matsuo Y, LeCun Y, Sahani M, Precup D, Silver D, Sugiyama M, Uchibe E, Morimoto J. Deep learning, reinforcement learning, and world models. Neural Netw 2022; 152:267-275. [DOI: 10.1016/j.neunet.2022.03.037] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 08/15/2021] [Revised: 02/19/2022] [Accepted: 03/28/2022] [Indexed: 12/01/2022]
|
24
|
Naito A, Katahira K, Kameda T. Insights about the common generative rule underlying an information foraging task can be facilitated via collective search. Sci Rep 2022; 12:8047. [PMID: 35577854 PMCID: PMC9110753 DOI: 10.1038/s41598-022-12126-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2022] [Accepted: 04/04/2022] [Indexed: 11/09/2022] Open
Abstract
Social learning is beneficial for efficient information search in unfamiliar environments ("within-task" learning). In the real world, however, possible search spaces are often so large that decision makers are incapable of covering all options, even if they pool their information collectively. One strategy to handle such overload is developing generalizable knowledge that extends to multiple related environments ("across-task" learning). However, it is unknown whether and how social information may facilitate such across-task learning. Here, we investigated participants' social learning processes across multiple laboratory foraging sessions in spatially correlated reward landscapes that were generated according to a common rule. The results showed that paired participants were able to improve efficiency in information search across sessions more than solo participants. Computational analysis of participants' choice-behaviors revealed that such improvement across sessions was related to better understanding of the common generative rule. Rule understanding was correlated within a pair, suggesting that social interaction is a key to the improvement of across-task learning.
Affiliation(s)
- Aoi Naito
- Department of Social Psychology, The University of Tokyo, Tokyo, 113-0033, Japan
- Japan Society for the Promotion of Science, Tokyo, 102-0083, Japan
- Kentaro Katahira
- Human Informatics and Interaction Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, 305-8566, Japan
- Tatsuya Kameda
- Department of Social Psychology, The University of Tokyo, Tokyo, 113-0033, Japan
- Brain Science Institute, Tamagawa University, Tokyo, 194-8610, Japan
- Center for Experimental Research in Social Sciences, Hokkaido University, Sapporo, 060-0810, Japan
|
25
|
Toyokawa W, Gaissmaier W. Conformist social learning leads to self-organised prevention against adverse bias in risky decision making. eLife 2022; 11:75308. [PMID: 35535494 PMCID: PMC9090329 DOI: 10.7554/elife.75308] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/05/2021] [Accepted: 04/01/2022] [Indexed: 11/13/2022] Open
Abstract
Given the ubiquity of potentially adverse behavioural bias owing to myopic trial-and-error learning, it seems paradoxical that improvements in decision-making performance through conformist social learning, a process widely considered to be bias amplification, still prevail in animal collective behaviour. Here we show, through model analyses and large-scale interactive behavioural experiments with 585 human subjects, that conformist influence can indeed promote favourable risk taking in repeated experience-based decision making, even though many individuals are systematically biased towards adverse risk aversion. Although strong positive feedback conferred by copying the majority's behaviour could result in unfavourable informational cascades, our differential equation model of collective behavioural dynamics identified a key role for increasing exploration by negative feedback arising when a weak minority influence undermines the inherent behavioural bias. This 'collective behavioural rescue', emerging through coordination of positive and negative feedback, highlights a benefit of collective learning in a broader range of environmental conditions than previously assumed and resolves the ostensible paradox of adaptive collective behavioural flexibility under conformist influences.
Affiliation(s)
- Wataru Toyokawa
- Department of Psychology, University of Konstanz, Konstanz, Germany
- Wolfgang Gaissmaier
- Department of Psychology, University of Konstanz, Konstanz, Germany
- Centre for the Advanced Study of Collective Behaviour, University of Konstanz, Konstanz, Germany
|
26
|
Mahmoodi A, Nili H, Bang D, Mehring C, Bahrami B. Distinct neurocomputational mechanisms support informational and socially normative conformity. PLoS Biol 2022; 20:e3001565. [PMID: 35239647 PMCID: PMC8893340 DOI: 10.1371/journal.pbio.3001565] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 06/26/2021] [Accepted: 02/02/2022] [Indexed: 11/18/2022] Open
Abstract
A change of mind in response to social influence could be driven by informational conformity to increase accuracy, or by normative conformity to comply with social norms such as reciprocity. Disentangling the behavioural, cognitive, and neurobiological underpinnings of informational and normative conformity have proven elusive. Here, participants underwent fMRI while performing a perceptual task that involved both advice-taking and advice-giving to human and computer partners. The concurrent inclusion of 2 different social roles and 2 different social partners revealed distinct behavioural and neural markers for informational and normative conformity. Dorsal anterior cingulate cortex (dACC) BOLD response tracked informational conformity towards both human and computer but tracked normative conformity only when interacting with humans. A network of brain areas (dorsomedial prefrontal cortex (dmPFC) and temporoparietal junction (TPJ)) that tracked normative conformity increased their functional coupling with the dACC when interacting with humans. These findings enable differentiating the neural mechanisms by which different types of conformity shape social changes of mind.
Affiliation(s)
- Ali Mahmoodi
- Bernstein Centre Freiburg, University of Freiburg, Freiburg, Germany
- Faculty of Biology, University of Freiburg, Freiburg, Germany
- Wellcome Centre for Integrative Neuroimaging, Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom
- Hamed Nili
- Wellcome Centre for Integrative Neuroimaging, Centre for Functional Magnetic Resonance Imaging of the Brain, University of Oxford, Oxford, United Kingdom
- Department of Excellence for Neural Information Processing, Center for Molecular Neurobiology (ZMNH), University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Dan Bang
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
- Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom
- Carsten Mehring
- Bernstein Centre Freiburg, University of Freiburg, Freiburg, Germany
- Faculty of Biology, University of Freiburg, Freiburg, Germany
- Bahador Bahrami
- Faculty of Psychology and Educational Sciences, Ludwig Maximilian University, Munich, Germany
- Department of Psychology, Royal Holloway, University of London, Egham, United Kingdom
- Center for Adaptive Rationality, Max Planck Institute for Human Development, Berlin, Germany
|
27
|
Hertz U, Bell V, Raihani N. Trusting and learning from others: immediate and long-term effects of learning from observation and advice. Proc Biol Sci 2021; 288:20211414. [PMID: 34666522 PMCID: PMC8527195 DOI: 10.1098/rspb.2021.1414] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/21/2021] [Accepted: 09/29/2021] [Indexed: 11/16/2022] Open
Abstract
Social learning underpins our species's extraordinary success. Learning through observation has been investigated in several species, but learning from advice, where information is intentionally broadcast, is less understood. We used a pre-registered, online experiment (n = 1492) combined with computational modelling to examine learning through observation and advice. Participants were more likely to immediately follow advice than to copy an observed choice, but this was dependent upon trust in the adviser: highly paranoid participants were less likely to follow advice in the short term. Reinforcement learning modelling revealed two distinct patterns regarding the long-term effects of social information: some individuals relied fully on social information, whereas others reverted to trial-and-error learning. This variation may affect the prevalence and fidelity of socially transmitted information. Our results highlight the privileged status of advice relative to observation and how the assimilation of intentionally broadcast information is affected by trust in others.
Affiliation(s)
- Uri Hertz
- Department of Cognitive Sciences, University of Haifa, Haifa, Israel
- Vaughan Bell
- Department of Clinical, Education and Health Psychology, University College London, London, UK
- Nichola Raihani
- Department of Experimental Psychology, University College London, WC1H 0AP, London, UK
|
28
|
Najar A, Chetouani M. Reinforcement Learning With Human Advice: A Survey. Front Robot AI 2021; 8:584075. [PMID: 34141726 PMCID: PMC8205518 DOI: 10.3389/frobt.2021.584075] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/16/2020] [Accepted: 03/03/2021] [Indexed: 11/13/2022] Open
Abstract
In this paper, we provide an overview of the existing methods for integrating human advice into a reinforcement learning process. We first propose a taxonomy of the different forms of advice that can be provided to a learning agent. We then describe the methods that can be used for interpreting advice when its meaning is not determined beforehand. Finally, we review different approaches for integrating advice into the learning process.
Affiliation(s)
- Anis Najar
- Laboratoire de Neurosciences Cognitives Computationnelles, INSERM U960, Paris, France
- Mohamed Chetouani
- Institute for Intelligent Systems and Robotics, Sorbonne Université, CNRS UMR 7222, Paris, France
|