1. Glynatsi NE, Akin E, Nowak MA, Hilbe C. Conditional cooperation with longer memory. Proc Natl Acad Sci U S A 2024; 121:e2420125121. [PMID: 39642203 DOI: 10.1073/pnas.2420125121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2024] [Accepted: 11/04/2024] [Indexed: 12/08/2024]
Abstract
Direct reciprocity is a wide-spread mechanism for the evolution of cooperation. In repeated interactions, players can condition their behavior on previous outcomes. A well-known approach is given by reactive strategies, which respond to the coplayer's previous move. Here, we extend reactive strategies to longer memories. A reactive-n strategy takes into account the sequence of the last n moves of the coplayer. A reactive-n counting strategy responds to how often the coplayer cooperated during the last n rounds. We derive an algorithm to identify the partner strategies within these strategy sets. Partner strategies are those that ensure mutual cooperation without exploitation. We give explicit conditions for all partner strategies among reactive-2, reactive-3 strategies, and reactive-n counting strategies. To further explore the role of memory, we perform evolutionary simulations. We vary several key parameters, such as the cost-to-benefit ratio of cooperation, the error rate, and the strength of selection. Within the strategy sets we consider, we find that longer memory tends to promote cooperation. This positive effect of memory is particularly pronounced when individuals take into account the precise sequence of moves.
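The reactive-n counting strategies described in this abstract can be illustrated with a minimal sketch. It assumes a strategy is a tuple `r` of length n + 1, where `r[k]` is the probability of cooperating when the coplayer cooperated k times in the last n rounds. The function names and the choice to treat missing early rounds as cooperation are illustrative assumptions, not taken from the paper.

```python
import random

def cooperation_count(coplayer_history, n):
    """Count the coplayer's cooperative moves over the last n rounds.

    Assumption: rounds before the start of the game count as cooperation.
    """
    padded = ["C"] * n + list(coplayer_history)
    recent = padded[len(padded) - n:]
    return recent.count("C")

def counting_move(r, coplayer_history, rng=random):
    """Pick 'C' or 'D' for a reactive-n counting strategy r = (r_0, ..., r_n)."""
    n = len(r) - 1
    k = cooperation_count(coplayer_history, n)
    return "C" if rng.random() < r[k] else "D"

# Hypothetical reactive-2 counting strategy: cooperate unless the coplayer
# defected in both of the last two rounds.
forgiving = (0.0, 1.0, 1.0)
print(counting_move(forgiving, ["C", "D"]))
print(counting_move(forgiving, ["D", "D"]))
```

A general reactive-n strategy would instead need one probability for each of the 2^n possible sequences of the coplayer's last n moves, which is why the counting variant is the more compact of the two sets.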
Affiliation(s)
- Nikoleta E Glynatsi
  - Max Planck Research Group Dynamics of Social Behavior, Max Planck Institute for Evolutionary Biology, Plön 24306, Germany
- Ethan Akin
  - Department of Mathematics, The City College of New York, New York, NY 10031
- Martin A Nowak
  - Department of Mathematics, Harvard University, Cambridge, MA 02138
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138
- Christian Hilbe
  - Max Planck Research Group Dynamics of Social Behavior, Max Planck Institute for Evolutionary Biology, Plön 24306, Germany
2. Pulcu E. Individualistic attitudes in Iterated Prisoner's Dilemma undermine evolutionary fitness and may drive cooperative human players to extinction. R Soc Open Sci 2024; 11:230867. [PMID: 38550758 PMCID: PMC10977385 DOI: 10.1098/rsos.230867] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Revised: 12/22/2023] [Accepted: 02/09/2024] [Indexed: 04/26/2024]
Abstract
Inarguably, humans perform the richest plethora of prosocial behaviours in the animal kingdom, and these are important for understanding how humans navigate their social environment. The success and failure of strategies human players devise also have implications for determining long-term socio-economic/evolutionary fitness. Following the footsteps of Press and Dyson (2012), I implemented their evolutionary game-theoretic modelling from Iterated Prisoner's Dilemma (a behavioural economic probe of interpersonal cooperation) and re-analysed already published data on human proposer behaviour in the Ultimatum Game (a behavioural economic probe of altruistic punishment) involving 50 human participants versus stochastic computerized opponents with prosocial and individualistic social value orientations. Although the results indicate that it is more likely to break cycles of mutual defection in ecosystems in which humans interact with individualistic opponents, analysis of social-economic fitness at the Markov stationary states suggested that this comes at an evolutionary cost. Overall, human players acted in a significantly more cooperative manner than their opponents, but they failed to overcome extortion from individualistic agents, risking 'extinction' in 70% of the cases. These findings demonstrate human players might be short-sighted, and social interactive decision strategies they devise while adjusting to different types of opponents may not be optimal in the long run.
Affiliation(s)
- Erdem Pulcu
  - Department of Psychiatry, Psychopharmacology and Emotion Research Lab, Computational Psychiatry Lab, University of Oxford, Oxford, UK
3. Leng A, Lian Z, Lien JW, Zheng J. Revisiting the Asymmetric Matching Pennies Contradiction in China. Behav Sci (Basel) 2023; 13:757. [PMID: 37754035 PMCID: PMC10525248 DOI: 10.3390/bs13090757] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Revised: 08/28/2023] [Accepted: 08/30/2023] [Indexed: 09/28/2023]
Abstract
The asymmetric matching pennies contradiction posits that contrary to the prediction of mixed-strategy Nash equilibrium, experimental subjects' choices are, in practice, based heavily on the magnitudes of their own payoffs. Own-payoff effects are robustly confirmed in the literature. Closely following the experimental setups in the literature which support the contradiction, we conduct a series of asymmetric matching pennies games in China, hypothesizing play which is closer to equilibrium frequencies than previously found. Contrary to previous experiments which were conducted in the United States, we find that there are essentially no own-payoff effects among Row players who face large payoff asymmetry. In a Quantal Response Equilibrium framework allowing for altruism or spite, the behavior of our subjects corresponded to a positive spite parameter, whereas the results of previous studies corresponded to altruism. Our results may be consistent with recent psychology literature that finds people from collectivist cultures are substantially more adept at taking the perspective of others compared with people from individualist cultures, a feature of the reasoning needed to obtain mixed-strategy equilibrium.
Affiliation(s)
- Ailin Leng
  - Center for Economic Research, Shandong University, Jinan 250100, China (J.Z.)
- Zeng Lian
  - International Business School, Beijing Foreign Studies University, Beijing 100089, China
- Jaimie W. Lien
  - Center for Economic Research, Shandong University, Jinan 250100, China (J.Z.)
- Jie Zheng
  - Center for Economic Research, Shandong University, Jinan 250100, China (J.Z.)
4. Chen X, Fu F. Outlearning extortioners: unbending strategies can foster reciprocal fairness and cooperation. PNAS Nexus 2023; 2:pgad176. [PMID: 37287707 PMCID: PMC10244001 DOI: 10.1093/pnasnexus/pgad176] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/20/2022] [Revised: 05/14/2023] [Accepted: 05/16/2023] [Indexed: 06/09/2023]
Abstract
Recent theory shows that extortioners taking advantage of the zero-determinant (ZD) strategy can unilaterally claim an unfair share of the payoffs in the Iterated Prisoner's Dilemma. It is thus suggested that against a fixed extortioner, any adapting coplayer should be subdued with full cooperation as their best response. In contrast, recent experiments demonstrate that human players often choose not to accede to extortion out of concern for fairness, actually causing extortioners to suffer more loss than themselves. In light of this, here we reveal fair-minded strategies that are unbending to extortion such that any payoff-maximizing extortioner ultimately will concede in their own interest by offering a fair split in head-to-head matches. We find and characterize multiple general classes of such unbending strategies, including generous ZD strategies and Win-Stay, Lose-Shift (WSLS) as particular examples. When against fixed unbending players, extortioners are forced with consequentially increasing losses whenever intending to demand a more unfair share. Our analysis also pivots to the importance of payoff structure in determining the superiority of ZD strategies and in particular their extortion ability. We show that an extortionate ZD player can be even outperformed by, for example, WSLS, if the total payoff of unilateral cooperation is smaller than that of mutual defection. Unbending strategies can be used to outlearn evolutionary extortioners and catalyze the evolution of Tit-for-Tat-like strategies out of ZD players. Our work has implications for promoting fairness and resisting extortion so as to uphold a just and cooperative society.
Affiliation(s)
- Xingru Chen
  - School of Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
  - Department of Mathematics, Dartmouth College, Hanover, 03755 NH, USA
- Feng Fu
  - Department of Mathematics, Dartmouth College, Hanover, 03755 NH, USA
  - Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth, Lebanon, 03756 NH, USA
5. McAvoy A, Kates-Harbeck J, Chatterjee K, Hilbe C. Evolutionary instability of selfish learning in repeated games. PNAS Nexus 2022; 1:pgac141. [PMID: 36714856 PMCID: PMC9802390 DOI: 10.1093/pnasnexus/pgac141] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/06/2021] [Accepted: 07/22/2022] [Indexed: 02/01/2023]
Abstract
Across many domains of interaction, both natural and artificial, individuals use past experience to shape future behaviors. The results of such learning processes depend on what individuals wish to maximize. A natural objective is one's own success. However, when two such "selfish" learners interact with each other, the outcome can be detrimental to both, especially when there are conflicts of interest. Here, we explore how a learner can align incentives with a selfish opponent. Moreover, we consider the dynamics that arise when learning rules themselves are subject to evolutionary pressure. By combining extensive simulations and analytical techniques, we demonstrate that selfish learning is unstable in most classical two-player repeated games. If evolution operates on the level of long-run payoffs, selection instead favors learning rules that incorporate social (other-regarding) preferences. To further corroborate these results, we analyze data from a repeated prisoner's dilemma experiment. We find that selfish learning is insufficient to explain human behavior when there is a trade-off between payoff maximization and fairness.
Affiliation(s)
- Christian Hilbe
  - Max Planck Research Group: Dynamics of Social Behavior, Max Planck Institute for Evolutionary Biology, Plön, Germany
6. Miyagawa D, Mamiya A, Ichinose G. Adapting paths against zero-determinant strategies in repeated prisoner's dilemma games. J Theor Biol 2022; 549:111211. [PMID: 35810777 DOI: 10.1016/j.jtbi.2022.111211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2021] [Revised: 06/23/2022] [Accepted: 06/28/2022] [Indexed: 10/17/2022]
Abstract
Long-term cooperation, competition, or exploitation among individuals can be modeled through repeated games. In repeated games, Press and Dyson discovered zero-determinant (ZD) strategies that enforce a special relationship between two players. This special relationship implies that a ZD player can unilaterally impose a linear payoff relationship to the opponent regardless of the opponent's strategies. A ZD player also has a property that can lead the opponent to an unconditional cooperation if the opponent tries to improve its payoff. This property has been mathematically confirmed by Chen and Zinger. Humans often underestimate a payoff obtained in the future. However, such discounting was not considered in their analysis. Here, we mathematically explored whether a ZD player can lead the opponent to an unconditional cooperation even if a discount factor is incorporated. Specifically, we represented the expected payoff with a discount factor as the form of determinants and calculated whether the values obtained by partially differentiating each factor in the strategy vector become positive. As a result, we proved that the strategy vector ends up as an unconditional cooperation even when starting from any initial strategy. This result was confirmed through numerical calculations. We extended the applicability of ZD strategies to real world problems.
Affiliation(s)
- Daiki Miyagawa
  - Department of Mathematical and Systems Engineering, Shizuoka University, 3-5-1 Johoku, Naka-ku, Hamamatsu 432-8561, Japan
- Azumi Mamiya
  - Nagoya Works, Mitsubishi Electric Corporation, 5-1-14, Yada-minami, Higashi-ku, Nagoya 461-8670, Japan
- Genki Ichinose
  - Department of Mathematical and Systems Engineering, Shizuoka University, 3-5-1 Johoku, Naka-ku, Hamamatsu 432-8561, Japan
7. Self-Serving Dishonesty Partially Substitutes Fairness in Motivating Cooperation When People Are Treated Fairly. Int J Environ Res Public Health 2022; 19:ijerph19106326. [PMID: 35627863 PMCID: PMC9140579 DOI: 10.3390/ijerph19106326] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/12/2022] [Revised: 05/14/2022] [Accepted: 05/17/2022] [Indexed: 02/05/2023]
Abstract
Fairness is a key expectation in social interactions. Its violation leads to adverse reactions, including non-cooperation and dishonesty. The present study aimed to examine how (1) fair (unfair) treatment may drive cooperation (defection) and honesty (self-serving dishonesty), (2) dishonesty primes further moral disengagement and reduced cooperation, and (3) dishonesty weakens (substitutes) the effect of fairness on cooperation. The prisoner’s dilemma (Experiment 1 and 2) and die-rolling task (Experiment 2) were employed for capturing cooperation and dishonest behaviors, respectively. To manipulate perceived unfairness, participants were randomly assigned to play the prisoner’s dilemma game, where players either choose more cooperation (fair condition) or defection (unfair condition). Results of Experiment 1 (n = 102) suggested that participants perceive higher unfairness and behave less cooperatively when the other player primarily chooses defection. Results of Exp. 2 (n = 240) (a) confirmed Exp. 1 results, (b) showed that players in the unfair condition also show more self-serving dishonest behavior, and (c) that dishonest behavior weakens the effect of fairness on cooperation. Together, these results extended previous work by highlighting the self-serving lies when the opponent is fair trigger higher cooperation, presumably as a means to alleviate self-reflective moral emotions or restore justice.
8. Cheng Z, Chen G, Hong Y. Misperception influence on zero-determinant strategies in iterated Prisoner's Dilemma. Sci Rep 2022; 12:5174. [PMID: 35338188 PMCID: PMC8956668 DOI: 10.1038/s41598-022-08750-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2021] [Accepted: 02/17/2022] [Indexed: 11/09/2022]
Abstract
Zero-determinant (ZD) strategies have attracted wide attention in Iterated Prisoner's Dilemma (IPD) games, since the player equipped with ZD strategies can unilaterally enforce the two players' expected utilities subjected to a linear relation. On the other hand, uncertainties, which may be caused by misperception, occur in IPD inevitably in practical circumstances. To better understand the situation, we consider the influence of misperception on ZD strategies in IPD, where the two players, player X and player Y, have different cognitions, but player X detects the misperception and it is believed to make ZD strategies by player Y. We provide a necessary and sufficient condition for the ZD strategies in IPD with misperception, where there is also a linear relationship between players' utilities in player X's cognition. Then we explore bounds of players' expected utility deviation from a linear relationship in player X's cognition with also improving its own utility.
Affiliation(s)
- Zhaoyang Cheng
  - Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Beijing, 100190, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, 100190, China
- Guanpu Chen
  - Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Beijing, 100190, China
  - JD Explore Academy, Beijing, 100176, China
- Yiguang Hong
  - Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Beijing, 100190, China
  - Department of Control Science and Engineering, Tongji University, Shanghai, 201804, China
9. Quan J, Zhou Y, Ma X, Wang X, Yang JB. Integrating emotion-imitating into strategy learning improves cooperation in social dilemmas with extortion. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.107550] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/26/2022]
10. Peng M, Wang X, Chen W, Chen T, Cai M, Sun X, Wang Y. Cooperate or aggress? An opponent's tendency to cooperate modulates the neural dynamics of interpersonal cooperation. Neuropsychologia 2021; 162:108025. [PMID: 34560141 DOI: 10.1016/j.neuropsychologia.2021.108025] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/06/2021] [Revised: 09/14/2021] [Accepted: 09/15/2021] [Indexed: 11/16/2022]
Abstract
Humans are social animals and need to cooperate to survive. However, individuals are not cooperative in every social interaction, and their cooperation may depend on social context. The present study used a social dilemma game to investigate whether an opponent's tendency to be cooperative over time influenced a player's behavior and neural response to outcomes in the game. University students ("players") thought they were playing against other students ("opponents") in the Chicken Game but were actually playing against a programmed computer. Participants were randomly assigned to play with an opponent who tended to be competitive (cooperative 20% of the time) or who tended to be cooperative (cooperative 80% of the time). The results showed that early in the game, participants in both groups adopted a "tit-for-tat" strategy. However, as the game progressed and the opponent's behavioral tendency became more noticeable, players in the competitive-opponent group became generally more cooperative to limit their losses. ERPs analyses indicated that players had a higher P300 and larger theta power in response to the opponent's aggression but not to the opponent's cooperation when their opponent showed a tendency to be cooperative vs. competitive. The results suggest that people adjust their cooperative behavior based on their opponent's behavior in social interaction, and aggression captures more attention than cooperation in this process.
Affiliation(s)
- Ming Peng
  - Key Laboratory of Adolescent Cyberpsychology and Behavior of the Ministry of Education and School of Psychology, Central China Normal University, Wuhan, China; School of Psychology, Central China Normal University, Wuhan, China
- Xiaohui Wang
  - School of Psychology, Central China Normal University, Wuhan, China
- Wang Chen
  - School of Economics and Management, Fuzhou University, Fuzhou, 350108, China; Institute of Psychological and Cognitive Sciences, Fuzhou University, Fuzhou, China
- Tianlong Chen
  - School of Psychology, Central China Normal University, Wuhan, China
- Mengfei Cai
  - Department of Psychology, Manhattanville College, New York, NY, USA
- Xiaojun Sun
  - Key Laboratory of Adolescent Cyberpsychology and Behavior of the Ministry of Education and School of Psychology, Central China Normal University, Wuhan, China; School of Psychology, Central China Normal University, Wuhan, China
- Yiwen Wang
  - School of Economics and Management, Fuzhou University, Fuzhou, 350108, China; Institute of Psychological and Cognitive Sciences, Fuzhou University, Fuzhou, China
11. Extortion - A voracious prosocial strategy. Curr Opin Psychol 2021; 44:196-201. [PMID: 34710851 DOI: 10.1016/j.copsyc.2021.08.033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2021] [Revised: 08/25/2021] [Accepted: 08/30/2021] [Indexed: 11/20/2022]
Abstract
Recently Press and Dyson have dramatically changed our view on the Prisoner's Dilemma by proposing a new class of strategies that enforce a linear relationship between the two players' scores. Players adopting 'extortion' respond with cooperation to cooperation in most cases, defect in other rounds, but respond to defection with defection. In this way, extortion enforces full cooperation of the partner who accedes to it because he profits from doing so. This unbeatable strategy is nevertheless prosocial because it is mostly cooperative and induces cooperation even though it gains most itself. Experiments show that about 40% of humans choose to use extortion in competitive situations or when they have the power to exchange coplayers. On being punished in egalitarian situations, they use a generous strategy.
12. D'Arcangelo C, Andreozzi L, Faillo M. Human players manage to extort more than the mutual cooperation payoff in repeated social dilemmas. Sci Rep 2021; 11:16820. [PMID: 34413364 PMCID: PMC8377025 DOI: 10.1038/s41598-021-96061-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/27/2021] [Accepted: 07/31/2021] [Indexed: 11/13/2022]
Abstract
Social dilemmas are mixed-motive games. Although the players have a common interest in maintaining cooperation, each may try to obtain a larger payoff by cooperating less than the other. This phenomenon received increased attention after Press and Dyson discovered a class of strategies for the repeated prisoner's dilemma (extortionate strategies) that secure for themselves a payoff that is never smaller, but can be larger, than the opponent's payoff. We conducted an experiment to test whether humans adopt extortionate strategies when playing a social dilemma. Our results reveal that human subjects do try to extort a larger payoff from their opponents. However, they are only successful when extortionate strategies are part of a Nash equilibrium. In settings where extortionate strategies do not appear in any Nash equilibrium, attempts at extortion only result in a breakdown of cooperation. Our subjects recognized the different incentives implied by the two settings, and they were ready to "extort" the opponent when allowed to do so. This suggests that deviations from mutually cooperative equilibria, which are usually attributed to players' impatience, coordination problems, or lack of information, can instead be driven by subjects trying to reach more favorable outcomes.
Affiliation(s)
- Chiara D'Arcangelo
  - Dipartimento di Economia, Università degli Studi G. D'Annunzio Chieti-Pescara, 65127, Pescara, Italy
- Luciano Andreozzi
  - Dipartimento di Economia e Management, Università di Trento, 38122, Trento, Italy
- Marco Faillo
  - Dipartimento di Economia e Management, Università di Trento, 38122, Trento, Italy
13. Conditions for the existence of zero-determinant strategies under observation errors in repeated games. J Theor Biol 2021; 526:110810. [PMID: 34119498 DOI: 10.1016/j.jtbi.2021.110810] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/09/2021] [Revised: 06/04/2021] [Accepted: 06/07/2021] [Indexed: 11/24/2022]
Abstract
Repeated games are useful models to analyze long term interactions of living species and complex social phenomena. Zero-determinant (ZD) strategies in repeated games discovered by Press and Dyson in 2012 enforce a linear payoff relationship between a focal player and the opponent. This linear relationship can be set arbitrarily by a ZD player. Hence, a subclass of ZD strategies can fix the opponent's expected payoff and another subclass of the strategies can exceed the opponent for the expected payoff. Since this discovery, theories for ZD strategies are extended to cope with various natural situations. It is especially important to consider the theory of ZD strategies for repeated games with a discount factor and observation errors because it allows the theory to be applicable in the real world. Recent studies revealed their existence of ZD strategies even in repeated games with both factors. However, the conditions for the existence has not been sufficiently analyzed. Here, we mathematically analyzed the conditions in repeated games with both factors. First, we derived the thresholds of a discount factor and observation errors which ensure the existence of Equalizer and positively correlated ZD (pcZD) strategies, which are well-known subclasses of ZD strategies. We found that ZD strategies exist only when a discount factor remains high as the error rates increase. Next, we derived the conditions for the expected payoff of the opponent enforced by Equalizer as well as the conditions for the slope and base line payoff of linear lines enforced by pcZD. As a result, we found that, as error rates increase or a discount factor decreases, the conditions for the linear line that Equalizer or pcZD can enforce become strict.
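The two ZD subclasses mentioned in this abstract (one that fixes the opponent's payoff, one that outscores the opponent) can be checked numerically in the simplest setting. The sketch below covers only the baseline case with no discounting and no observation errors, so it does not reproduce the paper's analysis. The payoff values 3, 0, 5, 1, the two example strategy vectors (commonly quoted examples for these payoffs), the opponent strategies, and the function name `long_run_payoffs` are all assumptions chosen for illustration.

```python
# Payoffs (R, S, T, P) from the focal player's view; the conventional
# values 3, 0, 5, 1 are an assumption, not taken from the abstract.
R, S, T, P = 3.0, 0.0, 5.0, 1.0

def long_run_payoffs(p, q, steps=5000):
    """Average payoffs (s_x, s_y) when memory-one strategies p and q meet.

    p and q list each player's cooperation probability after the outcomes
    CC, CD, DC, DD, written from that player's own point of view.
    """
    swap = [0, 2, 1, 3]  # X's state CD is DC from Y's point of view
    v = [0.25] * 4       # distribution over states CC, CD, DC, DD (X first)
    for _ in range(steps):  # power iteration towards the stationary state
        nxt = [0.0] * 4
        for i, weight in enumerate(v):
            px, qy = p[i], q[swap[i]]
            nxt[0] += weight * px * qy
            nxt[1] += weight * px * (1 - qy)
            nxt[2] += weight * (1 - px) * qy
            nxt[3] += weight * (1 - px) * (1 - qy)
        v = nxt
    s_x = v[0] * R + v[1] * S + v[2] * T + v[3] * P
    s_y = v[0] * R + v[1] * T + v[2] * S + v[3] * P
    return s_x, s_y

extortioner = (11 / 13, 1 / 2, 7 / 26, 0.0)  # enforces s_x - P = 3 * (s_y - P)
equalizer = (2 / 3, 0.0, 2 / 3, 1 / 3)       # pins the opponent's payoff at 2

# Two arbitrary opponents: the enforced relations should hold for both.
for q in [(0.7, 0.2, 0.6, 0.3), (0.1, 0.9, 0.5, 0.4)]:
    s_x, s_y = long_run_payoffs(extortioner, q)
    print(q, "extortion gap:", (s_x - P) - 3 * (s_y - P))
    print(q, "equalized payoff:", long_run_payoffs(equalizer, q)[1])
```

The printed gap should be zero up to floating-point error for either opponent, which is the sense in which the relation is enforced unilaterally. Power iteration is used here only to avoid a linear-algebra dependency; it assumes the resulting Markov chain has a unique stationary distribution.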
14. Ueda M. Memory-two zero-determinant strategies in repeated games. R Soc Open Sci 2021; 8:202186. [PMID: 34084544 PMCID: PMC8150048 DOI: 10.1098/rsos.202186] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/01/2020] [Accepted: 05/04/2021] [Indexed: 06/12/2023]
Abstract
Repeated games have provided an explanation of how mutual cooperation can be achieved even if defection is more favourable in a one-shot game in the Prisoner's Dilemma situation. Recently found zero-determinant (ZD) strategies have substantially been investigated in evolutionary game theory. The original memory-one ZD strategies unilaterally enforce linear relationships between average pay-offs of players. Here, we extend the concept of ZD strategies to memory-two strategies in repeated games. Memory-two ZD strategies unilaterally enforce linear relationships between correlation functions of pay-offs and pay-offs of the previous round. Examples of memory-two ZD strategy in the repeated Prisoner's Dilemma game are provided, some of which generalize the tit-for-tat strategy to a memory-two case. Extension of ZD strategies to memory-n case with n ≥ 2 is also straightforward.
Affiliation(s)
- Masahiko Ueda
  - Graduate School of Sciences and Technology for Innovation, Yamaguchi University, Yamaguchi 753-8511, Japan
15. Mamiya A, Ichinose G. Zero-determinant strategies under observation errors in repeated games. Phys Rev E 2020; 102:032115. [PMID: 33075945 DOI: 10.1103/physreve.102.032115] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/31/2020] [Accepted: 08/25/2020] [Indexed: 05/08/2023]
Abstract
Zero-determinant (ZD) strategies are a novel class of strategies in the repeated prisoner's dilemma (RPD) game discovered by Press and Dyson. This strategy set enforces a linear payoff relationship between a focal player and the opponent regardless of the opponent's strategy. In the RPD game, games with discounting and observation errors represent an important generalization, because they are better able to capture real life interactions which are often noisy. However, they have not been considered in the original discovery of ZD strategies. In some preceding studies, each of them has been considered independently. Here, we analytically study the strategies that enforce linear payoff relationships in the RPD game considering both a discount factor and observation errors. As a result, we first reveal that the payoffs of two players can be represented by the form of determinants as shown by Press and Dyson even with the two factors. Then, we search for all possible strategies that enforce linear payoff relationships and find that both ZD strategies and unconditional strategies are the only strategy sets to satisfy the condition. We also show that neither Extortion nor Generous strategies, which are subsets of ZD strategies, exist when there are errors. Finally, we numerically derive the threshold values above which the subsets of ZD strategies exist. These results contribute to a deep understanding of ZD strategies in society.
Affiliation(s)
- Azumi Mamiya
  - Department of Mathematical and Systems Engineering, Shizuoka University, Hamamatsu 432-8561, Japan
- Genki Ichinose
  - Department of Mathematical and Systems Engineering, Shizuoka University, Hamamatsu 432-8561, Japan
16. Strategically influencing an uncertain future. Sci Rep 2020; 10:12169. [PMID: 32699305 PMCID: PMC7376051 DOI: 10.1038/s41598-020-69006-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2020] [Accepted: 06/16/2020] [Indexed: 11/08/2022]
Abstract
Many of today's most pressing societal concerns require decisions which take into account a distant and uncertain future. Recent developments in strategic decision-making suggest that individuals, or a small group of individuals, can unilaterally influence the collective outcome of such complex social dilemmas. However, these results do not account for the extent to which decisions are moderated by uncertainty in the probability or timing of future outcomes that characterise the valuation of a (distant) uncertain future. Here we develop a general framework that captures interactions among uncertainty, the resulting time-inconsistent discounting, and their consequences for decision-making processes. In deterministic limits, existing theories can be recovered. More importantly, new insights are obtained into the possibilities for strategic influence when the valuation of the future is uncertain. We show that in order to unilaterally promote and sustain cooperation in social dilemmas, decisions of generous and extortionate strategies should be adjusted to the level of uncertainty. In particular, generous payoff relations cannot be enforced during periods of greater risk (which we term the "generosity gap"), unless the strategic enforcer orients their strategy towards a more distant future by consistently choosing "selfless" cooperative decisions; likewise, the possibilities for extortion are directly limited by the level of uncertainty. Our results have implications for policies that aim to solve societal concerns with consequences for a distant future and provides a theoretical starting point for investigating how collaborative decision-making can help solve long-standing societal dilemmas.
|
17
|
Ueda M, Tanaka T. Linear algebraic structure of zero-determinant strategies in repeated games. PLoS One 2020; 15:e0230973. [PMID: 32240215 PMCID: PMC7117786 DOI: 10.1371/journal.pone.0230973] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 11/12/2019] [Accepted: 03/12/2020] [Indexed: 11/18/2022] Open
Abstract
Zero-determinant (ZD) strategies, a recently discovered class of strategies in repeated games, have attracted much attention in evolutionary game theory. A ZD strategy unilaterally enforces a linear relation between the average payoffs of players. Although the existence and evolutionary stability of ZD strategies have been studied in simple games, their mathematical properties are not yet well understood. For example, what happens when more than one player employs ZD strategies has not been clarified. In this paper, we provide a general framework, in terms of linear algebra, for investigating situations where more than one player employs ZD strategies. First, we theoretically prove that a set of linear relations of average payoffs enforced by ZD strategies always has solutions, which implies that incompatible linear relations are impossible. Second, we prove that linear payoff relations are independent of each other under some conditions. These results hold for general games with public monitoring including perfect-monitoring games. Furthermore, we provide a simple example of a two-player game in which one player can simultaneously enforce two linear relations, that is, simultaneously control her and her opponent's average payoffs. All of these results elucidate general mathematical properties of ZD strategies.
Affiliation(s)
- Masahiko Ueda
- Department of Systems Science, Graduate School of Informatics, Kyoto University, Kyoto, Japan
- Toshiyuki Tanaka
- Department of Systems Science, Graduate School of Informatics, Kyoto University, Kyoto, Japan
|
18
|
Strategies that enforce linear payoff relationships under observation errors in Repeated Prisoner’s Dilemma game. J Theor Biol 2019; 477:63-76. [DOI: 10.1016/j.jtbi.2019.06.009] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 04/02/2019] [Revised: 05/24/2019] [Accepted: 06/11/2019] [Indexed: 11/23/2022]
|
19
|
Becks L, Milinski M. Extortion strategies resist disciplining when higher competitiveness is rewarded with extra gain. Nat Commun 2019; 10:783. [PMID: 30770819 PMCID: PMC6377637 DOI: 10.1038/s41467-019-08671-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 08/31/2018] [Revised: 01/18/2019] [Accepted: 01/24/2019] [Indexed: 11/09/2022] Open
Abstract
Cooperative strategies are predicted for repeated social interactions. The recently described Zero Determinant (ZD) strategies enforce the partner's cooperation because the 'generous' ZD players help their cooperative partners while 'extortionate' ZD players exploit their partners' cooperation. Partners may accede to extortion because it pays them to do so, but the partner can sabotage his own and his extortioner's score by defecting to discipline the extortioner. Thus, extortion is predicted to turn into generosity and disappear. Here, we show with human volunteers that an additional monetary incentive (bonus) paid to the finally competitively superior player maintains extortion. Unexpectedly, extortioners refused to become disciplined, thus forcing partners to accede. Occasional opposition reduced the extortioners' gain so that using extortion paid off only because of the bonus. With no bonus incentive, players used the generous ZD strategy. Our findings suggest that extortion strategies can prevail when higher competitiveness is rewarded with extra gain.
Affiliation(s)
- Lutz Becks
- Community Dynamics Group, Department of Evolutionary Ecology, Max-Planck-Institute for Evolutionary Biology, August-Thienemann-Strasse 2, 24306, Plön, Germany
- University of Konstanz, Mainaustraße 252, 78464, Konstanz, Germany
- Manfred Milinski
- Department of Evolutionary Ecology, Max-Planck-Institute for Evolutionary Biology, August-Thienemann-Strasse 2, 24306, Plön, Germany
|
20
|
Social Closure and the Evolution of Cooperation via Indirect Reciprocity. Sci Rep 2018; 8:11149. [PMID: 30042391 PMCID: PMC6057955 DOI: 10.1038/s41598-018-29290-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 03/02/2018] [Accepted: 07/05/2018] [Indexed: 11/24/2022] Open
Abstract
Direct and indirect reciprocity are good candidates to explain the fundamental problem of evolution of cooperation. We explore the conditions under which different types of reciprocity gain dominance and their performances in sustaining cooperation in the PD played on simple networks. We confirm that direct reciprocity gains dominance over indirect reciprocity strategies also in larger populations, as long as it has no memory constraints. In the absence of direct reciprocity, or when its memory is flawed, different forms of indirect reciprocity strategies are able to dominate and to support cooperation. We show that indirect reciprocity relying on social capital inherent in closed triads is the best competitor among them, outperforming indirect reciprocity that uses information from any source. Results hold in a wide range of conditions with different evolutionary update rules, extent of evolutionary pressure, initial conditions, population size, and density.
|
21
|
Partners and rivals in direct reciprocity. Nat Hum Behav 2018; 2:469-477. [PMID: 31097794 DOI: 10.1038/s41562-018-0320-9] [Citation(s) in RCA: 58] [Impact Index Per Article: 8.3] [Received: 12/10/2017] [Accepted: 02/14/2018] [Indexed: 11/08/2022]
Abstract
Reciprocity is a major factor in human social life and accounts for a large part of cooperation in our communities. Direct reciprocity arises when repeated interactions occur between the same individuals. The framework of iterated games formalizes this phenomenon. Despite being introduced more than five decades ago, the concept keeps offering beautiful surprises. Recent theoretical research driven by new mathematical tools has proposed a remarkable dichotomy among the crucial strategies: successful individuals either act as partners or as rivals. Rivals strive for unilateral advantages by applying selfish or extortionate strategies. Partners aim to share the payoff for mutual cooperation, but are ready to fight back when being exploited. Which of these behaviours evolves depends on the environment. Whereas small population sizes and a limited number of rounds favour rivalry, partner strategies are selected when populations are large and relationships stable. Only partners allow for evolution of cooperation, while the rivals' attempt to put themselves first leads to defection.
|
22
|
Ichinose G, Masuda N. Zero-determinant strategies in finitely repeated games. J Theor Biol 2018; 438:61-77. [DOI: 10.1016/j.jtbi.2017.11.002] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 07/11/2017] [Revised: 11/05/2017] [Accepted: 11/06/2017] [Indexed: 11/25/2022]
|
23
|
Xu X, Rong Z, Wu ZX, Zhou T, Tse CK. Extortion provides alternative routes to the evolution of cooperation in structured populations. Phys Rev E 2017; 95:052302. [PMID: 28618489 DOI: 10.1103/physreve.95.052302] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 09/19/2016] [Indexed: 06/07/2023]
Abstract
In this paper, we study the evolution of cooperation in structured populations (individuals are located on either a regular lattice or a scale-free network) in the context of repeated games involving three types of strategies, namely, unconditional cooperation, unconditional defection, and extortion. The strategy updating of the players is governed by replicator-like dynamics. We find that extortion strategies can act as catalysts to promote the emergence of cooperation in structured populations via different mechanisms. Specifically, on a regular lattice, extortioners behave as both a shield, which can enwrap cooperators inside and keep them away from defectors, and a spear, which can defeat the surrounding defectors with the help of the neighboring cooperators. In particular, the enhancement of cooperation displays a resonance-like behavior, suggesting the existence of an optimal extortion strength that most favors the evolution of cooperation, which is in good agreement with the predictions from the generalized mean-field approximation theory. On a scale-free network, the hubs, which are likely to be occupied by extortioners or defectors at the very beginning, are then prone to being conquered by cooperators on small-degree nodes as time elapses, thus establishing a bottom-up mechanism for the emergence and maintenance of cooperation.
Affiliation(s)
- Xiongrui Xu
- CompleX Lab, Web Sciences Center, University of Electronic Science and Technology of China, Chengdu 611731, China
- Zhihai Rong
- CompleX Lab, Web Sciences Center, University of Electronic Science and Technology of China, Chengdu 611731, China
- Big Data Research Center, University of Electronic Science and Technology of China, Chengdu 611731, China
- Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
- Zhi-Xi Wu
- Institute of Computational Physics and Complex Systems, Lanzhou University, Lanzhou, Gansu 730000, People's Republic of China
- Tao Zhou
- Big Data Research Center, University of Electronic Science and Technology of China, Chengdu 611731, China
- Chi Kong Tse
- Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
|
24
|
McAvoy A, Hauert C. Autocratic strategies for alternating games. Theor Popul Biol 2017; 113:13-22. [DOI: 10.1016/j.tpb.2016.09.004] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 06/18/2016] [Revised: 08/25/2016] [Accepted: 09/10/2016] [Indexed: 11/27/2022]
|
25
|
Hilbe C, Hagel K, Milinski M. Asymmetric Power Boosts Extortion in an Economic Experiment. PLoS One 2016; 11:e0163867. [PMID: 27701427 PMCID: PMC5049762 DOI: 10.1371/journal.pone.0163867] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 06/23/2016] [Accepted: 09/15/2016] [Indexed: 11/19/2022] Open
Abstract
Direct reciprocity is a major mechanism for the evolution of cooperation. Several classical studies have suggested that humans should quickly learn to adopt reciprocal strategies to establish mutual cooperation in repeated interactions. On the other hand, the recently discovered theory of ZD strategies has found that subjects who use extortionate strategies are able to exploit and subdue cooperators. Although such extortioners have been predicted to succeed in any population of adaptive opponents, theoretical follow-up studies questioned whether extortion can evolve in reality. However, most of these studies presumed that individuals have similar strategic possibilities and comparable outside options, whereas asymmetries are ubiquitous in real world applications. Here we show with a model and an economic experiment that extortionate strategies readily emerge once subjects differ in their strategic power. Our experiment combines a repeated social dilemma with asymmetric partner choice. In our main treatment there is one randomly chosen group member who is unilaterally allowed to exchange one of the other group members after every ten rounds of the social dilemma. We find that this asymmetric replacement opportunity generally promotes cooperation, but often the resulting payoff distribution reflects the underlying power structure. Almost half of the subjects in a better strategic position turn into extortioners, who quickly proceed to exploit their peers. By adapting their cooperation probabilities consistent with ZD theory, extortioners force their co-players to cooperate without being similarly cooperative themselves. Comparison to non-extortionate players under the same conditions indicates a substantial net gain to extortion. Our results thus highlight how power asymmetries can endanger mutually beneficial interactions, and transform them into exploitative relationships. 
In particular, our results indicate that the extortionate strategies predicted from ZD theory could play a more prominent role in our daily interactions than previously thought.
Affiliation(s)
- Christian Hilbe
- Program for Evolutionary Dynamics, Department of Organismic and Evolutionary Biology and Department of Mathematics, Harvard University, Cambridge MA, United States of America
- IST Austria, Klosterneuburg, Austria
- Kristin Hagel
- Department of Evolutionary Ecology, Max-Planck-Institute for Evolutionary Biology, Plön, Germany
- Manfred Milinski
- Department of Evolutionary Ecology, Max-Planck-Institute for Evolutionary Biology, Plön, Germany
|