1
Feng T, Ma J, Yang Y, Mi Z. Synergistic effects of air pollution control policies: Evidence from China. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2025; 373:123581. [PMID: 39647296 DOI: 10.1016/j.jenvman.2024.123581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2024] [Revised: 11/09/2024] [Accepted: 12/01/2024] [Indexed: 12/10/2024]
Abstract
To control severe air pollution, multiple policies have been adopted in China. However, the combined effect of these policies on reducing air pollution remains uncertain. By systematically evaluating the complementary and substitution effects of policy combinations across different regions, this study aims to identify the optimal combinations of multiple air pollution control policies and to explore the mechanisms of policy synergy, maximizing improvements in air quality. Using panel data from 334 Chinese cities collected between 2015 and 2019, we assess the effectiveness of fifteen air quality policies across eight categories. Our findings indicate that combining diverse policy measures yields more significant improvements in air quality due to the complementary effects among policies. Conversely, combinations of similar policy types exhibit weaker effects because of substitution effects. A comprehensive policy package, integrating stringent industrial emissions standards, residential energy transition strategies, market-based trading mechanisms, and multi-level supervision, may represent the most effective approach for air quality enhancement.
Affiliation(s)
- Tong Feng
  - School of Public Finance and Administration, Tianjin University of Finance and Economics, Tianjin, 300222, China
- Jie Ma
  - School of Public Finance and Administration, Tianjin University of Finance and Economics, Tianjin, 300222, China
- Yuanjian Yang
  - School of Atmospheric Physics, Nanjing University of Information Science & Technology, Nanjing, China
- Zhifu Mi
  - The Bartlett School of Sustainable Construction, University College London, London, UK
2
Sawicki J, Berner R, Loos SAM, Anvari M, Bader R, Barfuss W, Botta N, Brede N, Franović I, Gauthier DJ, Goldt S, Hajizadeh A, Hövel P, Karin O, Lorenz-Spreen P, Miehl C, Mölter J, Olmi S, Schöll E, Seif A, Tass PA, Volpe G, Yanchuk S, Kurths J. Perspectives on adaptive dynamical systems. CHAOS (WOODBURY, N.Y.) 2023; 33:071501. [PMID: 37486668 DOI: 10.1063/5.0147231] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 02/20/2023] [Accepted: 05/24/2023] [Indexed: 07/25/2023]
Abstract
Adaptivity is a dynamical feature that is omnipresent in nature, socio-economics, and technology. For example, adaptive couplings appear in various real-world systems, such as the power grid, social, and neural networks, and they form the backbone of closed-loop control strategies and machine learning algorithms. In this article, we provide an interdisciplinary perspective on adaptive systems. We reflect on the notion and terminology of adaptivity in different disciplines and discuss what role adaptivity plays in various fields. We highlight common open challenges and give perspectives on future research directions, looking to inspire interdisciplinary approaches.
Affiliation(s)
- Jakub Sawicki
  - Potsdam Institute for Climate Impact Research, Telegrafenberg, 14473 Potsdam, Germany
  - Akademie Basel, Fachhochschule Nordwestschweiz FHNW, Leonhardsstrasse 6, 4009 Basel, Switzerland
- Rico Berner
  - Department of Physics, Humboldt-Universität zu Berlin, Newtonstraße 15, 12489 Berlin, Germany
- Sarah A M Loos
  - DAMTP, University of Cambridge, Wilberforce Road, Cambridge CB3 0WA, United Kingdom
- Mehrnaz Anvari
  - Potsdam Institute for Climate Impact Research, Telegrafenberg, 14473 Potsdam, Germany
  - Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, 53757 Sankt-Augustin, Germany
- Rolf Bader
  - Institute of Systematic Musicology, University of Hamburg, Hamburg, Germany
- Wolfram Barfuss
  - Transdisciplinary Research Area: Sustainable Futures, University of Bonn, 53113 Bonn, Germany
  - Center for Development Research (ZEF), University of Bonn, 53113 Bonn, Germany
- Nicola Botta
  - Potsdam Institute for Climate Impact Research, Telegrafenberg, 14473 Potsdam, Germany
  - Department of Computer Science and Engineering, Chalmers University of Technology, 412 96 Göteborg, Sweden
- Nuria Brede
  - Potsdam Institute for Climate Impact Research, Telegrafenberg, 14473 Potsdam, Germany
  - Department of Computer Science, University of Potsdam, An der Bahn 2, 14476 Potsdam, Germany
- Igor Franović
  - Scientific Computing Laboratory, Center for the Study of Complex Systems, Institute of Physics Belgrade, University of Belgrade, Pregrevica 118, 11080 Belgrade, Serbia
- Daniel J Gauthier
  - Potsdam Institute for Climate Impact Research, Telegrafenberg, 14473 Potsdam, Germany
- Sebastian Goldt
  - Department of Physics, International School of Advanced Studies (SISSA), Trieste, Italy
- Aida Hajizadeh
  - Research Group Comparative Neuroscience, Leibniz Institute for Neurobiology, Magdeburg, Germany
- Philipp Hövel
  - Potsdam Institute for Climate Impact Research, Telegrafenberg, 14473 Potsdam, Germany
- Omer Karin
  - Department of Mathematics, Imperial College London, London SW7 2AZ, United Kingdom
- Philipp Lorenz-Spreen
  - Center for Adaptive Rationality, Max Planck Institute for Human Development, Lentzeallee 94, 14195 Berlin, Germany
- Christoph Miehl
  - Akademie Basel, Fachhochschule Nordwestschweiz FHNW, Leonhardsstrasse 6, 4009 Basel, Switzerland
- Jan Mölter
  - Department of Mathematics, School of Computation, Information and Technology, Technical University of Munich, Boltzmannstraße 3, 85748 Garching bei München, Germany
- Simona Olmi
  - Akademie Basel, Fachhochschule Nordwestschweiz FHNW, Leonhardsstrasse 6, 4009 Basel, Switzerland
- Eckehard Schöll
  - Potsdam Institute for Climate Impact Research, Telegrafenberg, 14473 Potsdam, Germany
  - Akademie Basel, Fachhochschule Nordwestschweiz FHNW, Leonhardsstrasse 6, 4009 Basel, Switzerland
- Alireza Seif
  - Pritzker School of Molecular Engineering, The University of Chicago, Chicago, Illinois 60637, USA
- Peter A Tass
  - Department of Neurosurgery, Stanford University School of Medicine, Stanford, California 94304, USA
- Giovanni Volpe
  - Department of Physics, University of Gothenburg, Gothenburg, Sweden
- Serhiy Yanchuk
  - Potsdam Institute for Climate Impact Research, Telegrafenberg, 14473 Potsdam, Germany
  - Department of Physics, Humboldt-Universität zu Berlin, Newtonstraße 15, 12489 Berlin, Germany
- Jürgen Kurths
  - Potsdam Institute for Climate Impact Research, Telegrafenberg, 14473 Potsdam, Germany
  - Department of Physics, Humboldt-Universität zu Berlin, Newtonstraße 15, 12489 Berlin, Germany
3
Barfuss W, Meylahn JM. Intrinsic fluctuations of reinforcement learning promote cooperation. Sci Rep 2023; 13:1309. [PMID: 36693872 PMCID: PMC9873645 DOI: 10.1038/s41598-023-27672-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2022] [Accepted: 01/05/2023] [Indexed: 01/26/2023] Open
Abstract
In this work, we ask, and answer, what makes classical temporal-difference reinforcement learning with ε-greedy strategies cooperative. Cooperating in social dilemma situations is vital for animals, humans, and machines. While evolutionary theory revealed a range of mechanisms promoting cooperation, the conditions under which agents learn to cooperate are contested. Here, we demonstrate which individual elements of the multi-agent learning setting lead to cooperation, and how. We use the iterated Prisoner's dilemma with one-period memory as a testbed. Each of the two learning agents learns a strategy that conditions the following action choices on both agents' action choices of the last round. We find that, besides a high degree of caring for future rewards, a low exploration rate, and a small learning rate, it is primarily intrinsic stochastic fluctuations of the reinforcement learning process which double the final rate of cooperation to up to 80%. Thus, inherent noise is not a necessary evil of the iterative learning process. It is a critical asset for the learning of cooperation. However, we also point out the trade-off between a high likelihood of cooperative behavior and achieving this in a reasonable amount of time. Our findings are relevant for purposefully designing cooperative algorithms and regulating undesired collusive effects.
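The learning setup described in this abstract can be illustrated with a minimal sketch. It assumes tabular Q-learning as the temporal-difference rule, and the payoff values, parameter defaults, and all function names are hypothetical choices made for illustration; none of them are taken from the article. Each agent keeps one value per (last joint action, own action) pair and explores with probability `epsilon`.

```python
import random

# Hypothetical Prisoner's dilemma payoffs with T > R > P > S (not from the article).
R, S, T, P = 3.0, 0.0, 5.0, 1.0
# Actions: 0 = cooperate, 1 = defect. Values are (payoff agent 0, payoff agent 1).
PAYOFF = {(0, 0): (R, R), (0, 1): (S, T), (1, 0): (T, S), (1, 1): (P, P)}


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a uniformly random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(2)
    return 0 if q_row[0] >= q_row[1] else 1


def run(steps=10_000, alpha=0.05, gamma=0.9, epsilon=0.05, seed=0):
    """Two independent Q-learners with one-period memory; returns (q_tables, cooperation_rate)."""
    rng = random.Random(seed)
    # One table per agent: 4 states (the last joint action) x 2 actions.
    q = [[[0.0, 0.0] for _ in range(4)] for _ in range(2)]
    state = rng.randrange(4)
    cooperative_moves = 0
    for _ in range(steps):
        a0 = epsilon_greedy(q[0][state], epsilon, rng)
        a1 = epsilon_greedy(q[1][state], epsilon, rng)
        r0, r1 = PAYOFF[(a0, a1)]
        next_state = 2 * a0 + a1  # encode the joint action as the next state
        for agent, action, reward in ((0, a0, r0), (1, a1, r1)):
            # Temporal-difference (Q-learning) update toward the bootstrapped target.
            target = reward + gamma * max(q[agent][next_state])
            q[agent][state][action] += alpha * (target - q[agent][state][action])
        state = next_state
        cooperative_moves += (a0 == 0) + (a1 == 0)
    return q, cooperative_moves / (2 * steps)
```

Calling `run()` with different seeds gives different trajectories for identical parameters, which is the kind of run-to-run variation the abstract attributes to intrinsic fluctuations; this sketch makes no claim about reproducing the reported cooperation rates.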
Affiliation(s)
- Wolfram Barfuss
  - Tübingen AI Center, University of Tübingen, Tübingen, Germany
- Janusz M Meylahn
  - Department of Applied Mathematics, University of Twente, Enschede, The Netherlands
  - Dutch Institute of Emergent Phenomena, University of Amsterdam, Amsterdam, The Netherlands
4
Abstract
The core commitment of strong sustainability, SS, is that nature really is different: there are strict limits to the substitutability of natural and other kinds of capital. Initially, the threat to sustainability was perceived as human greed and impatience, and the goal of SS to address resource scarcity was to sustain resource stocks, the flow of environmental services, and/or the harvest for human benefit. For landscapes and ecosystems, the SS goal was preservation, often in a gestalt framing: preserved or not. Two developments beginning around the mid-20th century—increasing awareness of the variability of natural systems, and the revolutionary changes in thinking motivated by the study of complex dynamic systems, CDS—re-oriented SS toward Safety, i.e., minimizing exposure to risk defined as threat of harm. Around 2010, the sustainability agenda for CDS shifted from identifying early warning indicators enabling timely interventions to forestall adverse regime change to promoting resilience by expanding scale and encouraging patchwork patterns of systems in various stages of their adaptive cycles. Nevertheless, the need for natural resources to substitute for depleted exhaustibles suggests a continuing role for commercial agriculture, plantation forestry, and managed fisheries. I conclude with a paradox still to be resolved: the need for continued and increased production from renewable resources to replace depleted exhaustibles suggests SS-motivated management practices that seem obsolete from a CDS perspective.
5
Barfuss W, Mann RP. Modeling the effects of environmental and perceptual uncertainty using deterministic reinforcement learning dynamics with partial observability. Phys Rev E 2022; 105:034409. [PMID: 35428165 DOI: 10.1103/physreve.105.034409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2021] [Accepted: 02/24/2022] [Indexed: 11/07/2022]
Abstract
Assessing the systemic effects of uncertainty that arises from agents' partial observation of the true states of the world is critical for understanding a wide range of scenarios, from navigation and foraging behavior to the provision of renewable resources and public infrastructures. Yet previous modeling work on agent learning and decision-making either lacks a systematic way to describe this source of uncertainty or puts the focus on obtaining optimal policies using complex models of the world that would impose an unrealistically high cognitive demand on real agents. In this work we aim to efficiently describe the emergent behavior of biologically plausible and parsimonious learning agents faced with partially observable worlds. Therefore we derive and present deterministic reinforcement learning dynamics where the agents observe the true state of the environment only partially. We showcase the broad applicability of our dynamics across different classes of partially observable agent-environment systems. We find that partial observability creates unintuitive benefits in several specific contexts, pointing the way to further research on a general understanding of such effects. For instance, partially observant agents can learn better outcomes faster, in a more stable way, and even overcome social dilemmas. Furthermore, our method allows the application of dynamical systems theory to partially observable multiagent learning. In this regard we find the emergence of catastrophic limit cycles, a critical slowing down of the learning processes between reward regimes, and the separation of the learning dynamics into fast and slow directions, all caused by partial observability. Therefore, the presented dynamics have the potential to become a formal, yet practical, lightweight and robust tool for researchers in biology, social science, and machine learning to systematically investigate the effects of interacting partially observant agents.
Affiliation(s)
- Wolfram Barfuss
  - Institute for Theoretical Physics, University of Tübingen, 72076 Tübingen, Germany
  - Department of Statistics, School of Mathematics, University of Leeds, Leeds LS2 9JT, United Kingdom
- Richard P Mann
  - Department of Statistics, School of Mathematics, University of Leeds, Leeds LS2 9JT, United Kingdom
6
Barfuss W. Dynamical systems as a level of cognitive analysis of multi-agent learning: Algorithmic foundations of temporal-difference learning dynamics. Neural Comput Appl 2022; 34:1653-1671. [PMID: 35221541 PMCID: PMC8827307 DOI: 10.1007/s00521-021-06117-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/12/2020] [Accepted: 05/11/2021] [Indexed: 01/02/2023]
Abstract
A dynamical systems perspective on multi-agent learning, based on the link between evolutionary game theory and reinforcement learning, provides an improved, qualitative understanding of the emerging collective learning dynamics. However, confusion exists with respect to how this dynamical systems account of multi-agent learning should be interpreted. In this article, I propose to embed the dynamical systems description of multi-agent learning into different abstraction levels of cognitive analysis. The purpose of this work is to make the connections between these levels explicit in order to gain improved insight into multi-agent learning. I demonstrate the usefulness of this framework with the general and widespread class of temporal-difference reinforcement learning. I find that its deterministic dynamical systems description follows a minimum free-energy principle and unifies a boundedly rational account of game theory with decision-making under uncertainty. I then propose an on-line sample-batch temporal-difference algorithm which is characterized by the combination of applying a memory-batch and separated state-action value estimation. I find that this algorithm serves as a micro-foundation of the deterministic learning equations by showing that its learning trajectories approach the ones of the deterministic learning equations under large batch sizes. Ultimately, this framework of embedding a dynamical systems description into different abstraction levels gives guidance on how to unleash the full potential of the dynamical systems approach to multi-agent learning.
Affiliation(s)
- Wolfram Barfuss
  - School of Mathematics, University of Leeds, Leeds, UK
  - Tübingen AI Center, University of Tübingen, Tübingen, Germany
7
Coordination and equilibrium selection in games: the role of local effects. Sci Rep 2022; 12:3373. [PMID: 35233046 PMCID: PMC8888577 DOI: 10.1038/s41598-022-07195-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/03/2021] [Accepted: 02/11/2022] [Indexed: 01/28/2023] Open
Abstract
We study the role of local effects and finite size effects in reaching coordination and in equilibrium selection in two-player coordination games. We investigate three update rules - the replicator dynamics (RD), the best response (BR), and the unconditional imitation (UI). For the pure coordination game with two equivalent strategies we find a transition from a disordered state to coordination for a critical value of connectivity. The transition is system-size-independent for the BR and RD update rules. For the UI it is system-size-dependent, but coordination can always be reached below the connectivity of a complete graph. We also consider the general coordination game which covers a range of games, such as the stag hunt. For these games there is a payoff-dominant strategy and a risk-dominant strategy with associated states of equilibrium coordination. We analyse equilibrium selection analytically and numerically. For the RD and BR update rules mean-field predictions agree with simulations and the risk-dominant strategy is evolutionarily favoured independently of local effects. When players use the unconditional imitation, however, we observe coordination in the payoff-dominant strategy. Surprisingly, the selection of the payoff-dominant equilibrium only occurs below a critical value of the network connectivity and disappears in complete graphs. As we show, it is a combination of local effects and update rule that allows for coordination on the payoff-dominant strategy.
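The mean-field statement about the replicator dynamics in this abstract can be illustrated with a minimal sketch. It uses the textbook well-mixed replicator equation rather than the networked update rules studied in the article, and the stag-hunt payoff matrix, step size, and function names are hypothetical values chosen for illustration.

```python
# Hypothetical stag-hunt payoffs (row player): strategy A = stag, B = hare.
# A is payoff-dominant (4 > 3); B is risk-dominant (it does better against a 50/50 mix).
STAG_HUNT = ((4.0, 0.0),
             (3.0, 3.0))


def replicator_step(x, payoff, dt=0.01):
    """One Euler step of dx/dt = x (1 - x) (f_A - f_B), where x is the share playing A."""
    (a, b), (c, d) = payoff
    f_a = a * x + b * (1.0 - x)  # expected payoff of A against the population
    f_b = c * x + d * (1.0 - x)  # expected payoff of B against the population
    return x + dt * x * (1.0 - x) * (f_a - f_b)


def integrate(x0, payoff, steps=20_000, dt=0.01):
    """Integrate the replicator equation from the initial share x0."""
    x = x0
    for _ in range(steps):
        x = replicator_step(x, payoff, dt)
    return x


def interior_fixed_point(payoff):
    """Unstable mixed equilibrium that separates the two basins of attraction."""
    (a, b), (c, d) = payoff
    return (d - b) / (a - c + d - b)
```

With these payoffs the interior fixed point lies at a share of 0.75 for A, so an unbiased start at `x0 = 0.5` falls into the basin of B, the risk-dominant strategy, while a start above 0.75 converges to A.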
8
Resource Scarcity and Sustainability—The Shapes Have Shifted but the Stakes Keep Rising. SUSTAINABILITY 2021. [DOI: 10.3390/su13105751] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/17/2022]
Abstract
The objective is to provide an interpretive reading of the literature in resource scarcity and sustainability theory from the nineteenth century to the present time, focusing on shifts that have occurred in problem definition, conceptual framing, research tools applied, findings, and their implications. My reading shows, as one would expect, that the discourse has become more technical and the analysis more sophisticated; special cases have been incorporated into the mainstream of theory; and, where relevant, dynamic formulations have largely supplanted static analysis. However, that is barely scratching the surface. Here, I focus on more fundamental shifts. Exhaustible and renewable resource analyses were incorporated into the mainstream theory of financial and capital markets. Parallels between the resources and environmental spheres were discovered: market failure concepts, fundamental to environmental policy, found applications in the resources sector (e.g., fisheries), and renewable resource management concepts and approaches (e.g., waste assimilation capacity) were adopted in environmental policy. To motivate sustainability theory and assessment, there has been a foundational problem shift from restraining human greed to dealing with risk viewed as chance of harm, and a newfound willingness to look beyond stochastic risk to uncertainty, ambiguity, and gross ignorance. Newtonian dynamics, which seeks a stable equilibrium following a shock, gave way to a new dynamics of complexity that valued resilience in the face of shocks, warned of potential for regime shifts, and focused on the possibility of systemic collapse and recovery, perhaps incomplete. 
New concepts of sustainability (a safe minimum standard of conservation, the precautionary principle, and planetary boundaries) emerged, along with hybrid approaches such as WS-plus which treats weak sustainability (WS) as the default but may impose strong sustainability restrictions on a few essential but threatened resources. The strong sustainability objective has evolved from maintaining baseline flows of resource services to safety defined as minimizing the chance of irreversible collapse. New tools for management and policy (sustainability indicators and downscaled planetary boundaries) have proliferated, and still struggle to keep up with the emerging understanding of complex systems.
9
Yafei W, Jie F, Jiuyi L, Bing-Bing Z, Qiang W. Methodological framework for identifying sustainability intervention priority areas on coastal landscapes and its application in China. THE SCIENCE OF THE TOTAL ENVIRONMENT 2021; 766:142603. [PMID: 33601669 DOI: 10.1016/j.scitotenv.2020.142603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2020] [Revised: 09/17/2020] [Accepted: 09/21/2020] [Indexed: 06/12/2023]
Abstract
In regional sustainability evaluation and policy analysis, the paradigm of safe operating spaces (SOS) has been widely applied. Yet, SOS is not readily useful for informing policy interventions toward sustainability transition. This study reports on a methodological framework that operationalizes SOS at the regional scale for designing spatially targeted sustainability interventions. In particular, this framework accounts for teleology by integrating policy orientations of the place-variant "major function" of development, and provides early warnings by integrating long-term social-environmental trends. The framework we proposed has been applied by the Chinese government in a coastal province (Liaoning) for a landscape sustainability project, which is introduced here step-by-step. The four main steps include: (1) Quantifying SOS status across multiple "what to sustain" dimensions, e.g., land scarcity, water scarcity, pollutant discharge, and ecosystem health for the inland, and coastal exploitation intensity, marine environmental quality, and marine ecosystem biodiversity for the sea. (2) Quantifying SOS status in terms of the place-variant "what to develop" dimensions, e.g., urbanization-oriented, agriculture-stock-oriented, versus conservation-oriented development. (3) Integrating the two as a composite indicator of three ordinal levels to classify the current SOS status. (4) Developing a multi-level sustainability early-warning system by cross-analysis of the SOS status and social-environmental interaction trends (e.g., changes in resource utilization efficiency, pollutant discharge, and eco-environmental quality). The potential use of the framework is demonstrated through the case of Liaoning Province, China, which helps policy-makers to identify priority areas for sustainability interventions. Methodological robustness and future directions of applying this multi-level sustainability early-warning system are further discussed.
Affiliation(s)
- Wang Yafei
  - Key Laboratory of Regional Sustainable Development Modeling, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100049, China
- Fan Jie
  - Key Laboratory of Regional Sustainable Development Modeling, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100049, China
- Li Jiuyi
  - Key Laboratory of Regional Sustainable Development Modeling, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100049, China
- Zhou Bing-Bing
  - School of Sustainability, Arizona State University, Tempe, AZ 85287, USA
- Wang Qiang
  - School of Geographical Sciences, Fujian Normal University, Fuzhou, Fujian 350007, China
10
Monitoring Sustainability and Targeting Interventions: Indicators, Planetary Boundaries, Benefits and Costs. SUSTAINABILITY 2021. [DOI: 10.3390/su13063181] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022]
Abstract
This article shows how sustainability indicators (SIs) which have proliferated, and downscaled planetary boundaries (DPBs) which have recently emerged, can be used to target remedial interventions. I offer an integrative analysis drawing upon the existing literature, challenging, clarifying, and amending it in some ways, and extending it with new insights. The exposition is couched in the example of pollution control, but the analysis also applies to resource management with only modest amendments. Key conclusions are summarized. (i) In a default case where damage is indifferent to location within the problem shed and transactions costs are trivial, minimizing abatement costs requires that all units face the same marginal price of emissions and can be implemented by price setting at the jurisdictional level or cap and trade in pollution reduction credits. Larger geographic scale tends to reduce the average cost of abatement, an argument for coordination at the problem-shed level. Deviations from the default policy may be appropriate for addressing large point sources and local hot spots where damage is concentrated. (ii) A framework winnowing the proliferation of SIs includes the following principles: for quantitative target setting, SIs should address sustainability in its long-term context; SIs should be measured in ratio scale, whereas ordinal-scale SIs are common; and SIs should be selected for their usefulness in mapping the relationships among emissions, ambient concentrations, and damage. (iii) Target setting requires science-based empirical relationships and social values to assess trade-offs between abatement and its opportunity costs and suggest upper limits on tolerable damage. (iv) PBs that address global public goods can usefully be downscaled to set abatement targets. 
The PBs are science based and, in their original form, propose replacing social values with imperatives: violating the PB will doom the planet, which is unacceptable given any plausible value system. Given that PB = ∑DPB over all jurisdictions, global trading of credits would minimize costs of honoring the PB. Trade among a willing subset of jurisdictions could minimize the costs of meeting its aggregate DPB. (v) In contrast to most SI approaches, a cost–benefit (CB) approach can deal with substitutability and complementarity among sustainability objectives and evaluate multi-component policies. Net benefits are maximized when the marginal cost of abatement equals the marginal benefit for all units in the problem shed. This can be attained by price setting at the jurisdictional level or trade in credits. (vi) A major advantage of the CB approach is its well-defined relationship to weak sustainability. However, its value measures over-weight the preferences of the well-off. Equity considerations suggest relief from strict CB criteria in the case of essentials such as human health and nutrition, and subsidization by rich countries of sustainability projects in low-income countries.
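The equal-marginal-price conditions in conclusions (i) and (v) of this abstract can be written out as a short worked derivation. The notation is an assumption introduced here and does not come from the article: $a_i$ is abatement by unit $i$, $C_i$ is its convex abatement cost, $\bar{A}$ is the required total abatement within the problem shed, and $B$ is the concave benefit of total abatement (damage is taken to be indifferent to location, as in the default case). Interior solutions are assumed.

```latex
\begin{align*}
  % Cost minimization subject to an aggregate abatement target
  &\min_{a_1,\dots,a_n} \sum_{i=1}^{n} C_i(a_i)
     \quad \text{s.t.} \quad \sum_{i=1}^{n} a_i \ge \bar{A}, \\
  &\mathcal{L} = \sum_{i=1}^{n} C_i(a_i)
     - \lambda \Big( \sum_{i=1}^{n} a_i - \bar{A} \Big), \\
  &\frac{\partial \mathcal{L}}{\partial a_i} = C_i'(a_i) - \lambda = 0
     \;\Longrightarrow\; C_i'(a_i) = \lambda \quad \text{for all } i. \\[6pt]
  % Net-benefit maximization (cost-benefit approach)
  &\max_{a_1,\dots,a_n} \; B\Big( \sum_{i=1}^{n} a_i \Big) - \sum_{i=1}^{n} C_i(a_i)
     \;\Longrightarrow\; B'\Big( \sum_{j=1}^{n} a_j \Big) = C_i'(a_i) \quad \text{for all } i.
\end{align*}
```

The multiplier $\lambda$ plays the role of the common marginal price of emissions: a uniform price or a credit market clearing at $\lambda$ leads every unit to abate until its marginal cost equals that price. In the cost-benefit case the same price is set equal to the marginal benefit of abatement.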
11
12
Abstract
This article examines sustainability from a policy perspective rooted in environmental economics and environmental ethics. Endorsing the Brundtland Commission stance that each generation should have undiminished opportunity to meet its own needs, I emphasize the foundational status of the intergenerational commitment. The standard concepts of weak and strong sustainability, WS and SS, are sketched and critiqued simply and intuitively, along with the more recent concept of WS-plus. A recently proposed model of a society dependent on a renewable but vulnerable resource (Barfuss et al. 2018) is introduced as an expositional tool, as its authors intended, and used as a platform for thought experiments exploring the role of risk management tools in reducing the need for safety. Key conclusions include: (i) Safety, in this case, the elimination of risk in uncertain production systems, comes at an opportunity cost that is often non-trivial. (ii) Welfare shocks can be cushioned by savings and diversification, which are enhanced by scale. Scale increases with geographic area, diversity of production, organizational complexity, and openness to trade and human migration. (iii) Increasing scale enables enhancement of sustainable welfare via local and regional specialization, and the need for safety and its attendant opportunity costs is reduced. (iv) When generational welfare is stochastic, the intergenerational commitment should not be abandoned but may need to be adapted to uncertainty, e.g., by expecting less from hard-luck generations and correspondingly more from more fortunate ones. (v) Intergenerational commitments must be resolved in the context of intragenerational obligations to each other in the here and now, and compensation of those asked to make sacrifices for sustainability has both ethical and pragmatic virtue. 
(vi) Finally, the normative domains of sustainability and safety can be distinguished—sustainability always, but safety only when facing daunting threats.
13
Caring for the future can turn tragedy into comedy for long-term collective action under risk of collapse. Proc Natl Acad Sci U S A 2020; 117:12915-12922. [PMID: 32434908 DOI: 10.1073/pnas.1916545117] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 11/18/2022] Open
Abstract
We will need collective action to avoid catastrophic climate change, and this will require valuing the long term as well as the short term. Shortsightedness and uncertainty have hindered progress in resolving this collective action problem and have been recognized as important barriers to cooperation among humans. Here, we propose a coupled social-ecological dilemma to investigate the interdependence of three well-identified components of this cooperation problem: 1) timescales of collapse and recovery in relation to time preferences regarding future outcomes, 2) the magnitude of the impact of collapse, and 3) the number of actors in the collective. We find that, under a sufficiently severe and time-distant collapse, how much the actors care for the future can transform the game from a tragedy of the commons into one of coordination, and even into a comedy of the commons in which cooperation dominates. Conversely, we also find conditions under which even strong concern for the future still does not transform the problem from tragedy to comedy. For a large number of participating actors, we find that the critical collapse impact, at which these game regime changes happen, converges to a fixed value of collapse impact per actor that is independent of the enhancement factor of the public good, which is usually regarded as the driver of the dilemma. Our results not only call for experimental testing but also help explain why polarization in beliefs about human-caused climate change can threaten global cooperation agreements.
14
Strnad FM, Barfuss W, Donges JF, Heitzig J. Deep reinforcement learning in World-Earth system models to discover sustainable management strategies. CHAOS (WOODBURY, N.Y.) 2019; 29:123122. [PMID: 31893656 DOI: 10.1063/1.5124673] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/15/2019] [Accepted: 11/20/2019] [Indexed: 06/10/2023]
Abstract
Increasingly complex nonlinear World-Earth system models are used for describing the dynamics of the biophysical Earth system and the socioeconomic and sociocultural World of human societies and their interactions. Identifying pathways toward a sustainable future in these models for informing policymakers and the wider public, e.g., pathways leading to robust mitigation of dangerous anthropogenic climate change, is a challenging and widely investigated task in the field of climate research and broader Earth system science. This problem is particularly difficult when constraints on avoiding transgressions of planetary boundaries and social foundations need to be taken into account. In this work, we propose to combine recently developed machine learning techniques, namely, deep reinforcement learning (DRL), with classical analysis of trajectories in the World-Earth system. Based on the concept of the agent-environment interface, we develop an agent that is generally able to act and learn in variable manageable environment models of the Earth system. We demonstrate the potential of our framework by applying DRL algorithms to two stylized World-Earth system models. Conceptually, we thereby explore the feasibility of finding novel global governance policies leading into a safe and just operating space constrained by certain planetary and socioeconomic boundaries. The artificially intelligent agent learns that the timing of a specific mix of taxing carbon emissions and subsidies on renewables is of crucial relevance for finding World-Earth system trajectories that are sustainable in the long term.
Affiliation(s)
- Felix M Strnad: FutureLab on Game Theory and Networks of Interacting Agents, Research Department 4: Complexity Science, Potsdam Institute for Climate Impact Research, 14473 Potsdam, Germany
- Wolfram Barfuss: FutureLab on Earth Resilience in the Anthropocene, Research Department 1: Earth System Analysis, Potsdam Institute for Climate Impact Research, 14473 Potsdam, Germany
- Jonathan F Donges: FutureLab on Earth Resilience in the Anthropocene, Research Department 1: Earth System Analysis, Potsdam Institute for Climate Impact Research, 14473 Potsdam, Germany
- Jobst Heitzig: FutureLab on Game Theory and Networks of Interacting Agents, Research Department 4: Complexity Science, Potsdam Institute for Climate Impact Research, 14473 Potsdam, Germany
|
15
|
Palos R, Gutiérrez A, Hita I, Castaño P, Thybaut JW, Arandes JM, Bilbao J. Kinetic Modeling of Hydrotreating for Enhanced Upgrading of Light Cycle Oil. Ind Eng Chem Res 2019. [DOI: 10.1021/acs.iecr.9b02095] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 11/29/2022]
Affiliation(s)
- Roberto Palos: Department of Chemical Engineering, University of the Basque Country UPV/EHU, P.O. Box 644, 48080 Bilbao, Spain
- Alazne Gutiérrez: Department of Chemical Engineering, University of the Basque Country UPV/EHU, P.O. Box 644, 48080 Bilbao, Spain
- Idoia Hita: Department of Chemical Engineering, University of the Basque Country UPV/EHU, P.O. Box 644, 48080 Bilbao, Spain
- Pedro Castaño: Department of Chemical Engineering, University of the Basque Country UPV/EHU, P.O. Box 644, 48080 Bilbao, Spain
- Joris W. Thybaut: Laboratory for Chemical Technology, Ghent University, Technologiepark 125, B-9052 Ghent, Belgium
- José M. Arandes: Department of Chemical Engineering, University of the Basque Country UPV/EHU, P.O. Box 644, 48080 Bilbao, Spain
- Javier Bilbao: Department of Chemical Engineering, University of the Basque Country UPV/EHU, P.O. Box 644, 48080 Bilbao, Spain
|
16
|
Barfuss W, Donges JF, Kurths J. Deterministic limit of temporal difference reinforcement learning for stochastic games. Phys Rev E 2019; 99:043305. [PMID: 31108579 DOI: 10.1103/physreve.99.043305] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 09/20/2018] [Indexed: 11/07/2022]
Abstract
Reinforcement learning in multiagent systems has been studied in the fields of economic game theory, artificial intelligence, and statistical physics by developing an analytical understanding of the learning dynamics (often in relation to the replicator dynamics of evolutionary game theory). However, the majority of these analytical studies focus on repeated normal form games, which only have a single environmental state. Environmental dynamics, i.e., changes in the state of an environment affecting the agents' payoffs, have received less attention, and a universal method to obtain deterministic equations from established multistate reinforcement learning algorithms has been lacking. In this work we present a methodological extension, separating the interaction from the adaptation timescale, to derive the deterministic limit of a general class of reinforcement learning algorithms, called temporal difference learning. This form of learning is equipped to function in more realistic multistate environments by using the estimated value of future environmental states to adapt the agent's behavior. We demonstrate the potential of our method with three well-established learning algorithms: Q learning, SARSA learning, and actor-critic learning. Illustrations of their dynamics on two multiagent, multistate environments reveal a wide range of different dynamical regimes, such as convergence to fixed points, limit cycles, and even deterministic chaos.
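As a rough illustration of the temporal difference idea mentioned in this abstract, the sketch below shows the textbook tabular TD errors for Q learning and SARSA. It is not the authors' deterministic-limit derivation; the function names, the toy value table, and the parameter values are all assumptions made for the example.

```python
# Minimal sketch of textbook tabular temporal difference (TD) updates.
# All names and numbers here are illustrative assumptions, not taken from the paper.

def td_error_q(q, s, a, r, s_next, gamma):
    """Q-learning TD error: bootstraps from the best action in the next state."""
    return r + gamma * max(q[s_next]) - q[s][a]


def td_error_sarsa(q, s, a, r, s_next, a_next, gamma):
    """SARSA TD error: bootstraps from the action actually taken in the next state."""
    return r + gamma * q[s_next][a_next] - q[s][a]


def td_update(q, s, a, delta, alpha):
    """Move the value estimate for (s, a) a step of size alpha along the TD error."""
    q[s][a] += alpha * delta
    return q


# Toy table: two environmental states, two actions each (hypothetical values).
q = [[0.0, 0.0], [1.0, 2.0]]

# Same transition (state 0, action 1, reward 1.0, next state 1), evaluated both ways.
delta_q = td_error_q(q, s=0, a=1, r=1.0, s_next=1, gamma=0.5)
delta_s = td_error_sarsa(q, s=0, a=1, r=1.0, s_next=1, a_next=0, gamma=0.5)

td_update(q, s=0, a=1, delta=delta_q, alpha=0.1)
```

Both errors use the estimated value of the next environmental state; they differ only in which next-state action value they bootstrap from.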
Affiliation(s)
- Wolfram Barfuss: Potsdam Institute for Climate Impact Research, 14473 Potsdam, Germany; Department of Physics, Humboldt University Berlin, 12489 Berlin, Germany
- Jonathan F Donges: Potsdam Institute for Climate Impact Research, 14473 Potsdam, Germany; Stockholm Resilience Centre, Stockholm University, 104 05 Stockholm, Sweden
- Jürgen Kurths: Potsdam Institute for Climate Impact Research, 14473 Potsdam, Germany; Department of Physics, Humboldt University Berlin, 12489 Berlin, Germany; Saratov State University, 410012 Saratov, Russia
|