1. Gillman E, Rose DC, Garrahan JP. Combining Reinforcement Learning and Tensor Networks, with an Application to Dynamical Large Deviations. Phys Rev Lett 2024; 132:197301. [PMID: 38804929 DOI: 10.1103/physrevlett.132.197301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Revised: 02/28/2024] [Accepted: 04/04/2024] [Indexed: 05/29/2024]
Abstract
We present a framework to integrate tensor network (TN) methods with reinforcement learning (RL) for solving dynamical optimization tasks. We consider the RL actor-critic method, a model-free approach for solving RL problems, and introduce TNs as the approximators for its policy and value functions. Our "actor-critic with tensor networks" (ACTeN) method is especially well suited to problems with large and factorizable state and action spaces. As an illustration of the applicability of ACTeN we solve the exponentially hard task of sampling rare trajectories in two paradigmatic stochastic models, the East model of glasses and the asymmetric simple exclusion process, the latter being particularly challenging to other methods due to the absence of detailed balance. With substantial potential for further integration with the vast array of existing RL methods, the approach introduced here is promising both for applications in physics and to multi-agent RL problems more generally.
Affiliation(s)
- Edward Gillman
  - School of Physics and Astronomy, University of Nottingham, Nottingham NG7 2RD, United Kingdom
  - Centre for the Mathematics and Theoretical Physics of Quantum Non-Equilibrium Systems, University of Nottingham, Nottingham NG7 2RD, United Kingdom
- Dominic C Rose
  - Department of Physics and Astronomy, University College London, Gower Street, London WC1E 6BT, United Kingdom
- Juan P Garrahan
  - School of Physics and Astronomy, University of Nottingham, Nottingham NG7 2RD, United Kingdom
  - Centre for the Mathematics and Theoretical Physics of Quantum Non-Equilibrium Systems, University of Nottingham, Nottingham NG7 2RD, United Kingdom
2. Causer L, Bañuls MC, Garrahan JP. Optimal Sampling of Dynamical Large Deviations in Two Dimensions via Tensor Networks. Phys Rev Lett 2023; 130:147401. [PMID: 37084432 DOI: 10.1103/physrevlett.130.147401] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/17/2022] [Accepted: 03/20/2023] [Indexed: 05/03/2023]
Abstract
We use projected entangled-pair states (PEPS) to calculate the large deviation statistics of the dynamical activity of the two-dimensional East model, and the two-dimensional symmetric simple exclusion process (SSEP) with open boundaries, in lattices of up to 40×40 sites. We show that at long times both models have phase transitions between active and inactive dynamical phases. For the 2D East model we find that this trajectory transition is of the first order, while for the SSEP we find indications of a second order transition. We then show how the PEPS can be used to implement a trajectory sampling scheme capable of directly accessing rare trajectories. We also discuss how the methods described here can be extended to study rare events at finite times.
Affiliation(s)
- Luke Causer
  - School of Physics and Astronomy, University of Nottingham, Nottingham, NG7 2RD, United Kingdom
  - Centre for the Mathematics and Theoretical Physics of Quantum Non-Equilibrium Systems, University of Nottingham, Nottingham, NG7 2RD, United Kingdom
- Mari Carmen Bañuls
  - Max-Planck-Institut für Quantenoptik, Hans-Kopfermann-Str. 1, D-85748 Garching, Germany
  - Munich Center for Quantum Science and Technology (MCQST), Schellingstr. 4, D-80799 München
- Juan P Garrahan
  - School of Physics and Astronomy, University of Nottingham, Nottingham, NG7 2RD, United Kingdom
  - Centre for the Mathematics and Theoretical Physics of Quantum Non-Equilibrium Systems, University of Nottingham, Nottingham, NG7 2RD, United Kingdom
3. Coghi F, Touchette H. Adaptive power method for estimating large deviations in Markov chains. Phys Rev E 2023; 107:034137. [PMID: 37073072 DOI: 10.1103/physreve.107.034137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Accepted: 01/29/2023] [Indexed: 04/20/2023]
Abstract
We study the performance of a stochastic algorithm based on the power method that adaptively learns the large deviation functions characterizing the fluctuations of additive functionals of Markov processes, used in physics to model nonequilibrium systems. This algorithm was introduced in the context of risk-sensitive control of Markov chains and was recently adapted to diffusions evolving continuously in time. Here we provide an in-depth study of the convergence of this algorithm close to dynamical phase transitions, exploring the speed of convergence as a function of the learning rate and the effect of including transfer learning. We use as a test example the mean degree of a random walk on an Erdős-Rényi random graph, which shows a transition between high-degree trajectories of the random walk evolving in the bulk of the graph and low-degree trajectories evolving in dangling edges of the graph. The results show that the adaptive power method is efficient close to dynamical phase transitions, while having many advantages in terms of performance and complexity compared to other algorithms used to compute large deviation functions.
Affiliation(s)
- Francesco Coghi
  - Nordita, KTH Royal Institute of Technology and Stockholm University, Stockholm, Sweden
- Hugo Touchette
  - Department of Mathematical Sciences, Stellenbosch University, Stellenbosch, South Africa
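As a point of reference for the abstract of entry 3, the following is a minimal sketch of the plain, deterministic power method applied to a tilted transition matrix. It is not the stochastic adaptive algorithm studied by Coghi and Touchette. The tilting convention `P_k[x][y] = P[x][y] * exp(k * f[y])`, the two-state chain in the usage note, and the name `scgf_power_method` are illustrative assumptions.

```python
import math


def scgf_power_method(P, f, k, n_iter=500):
    """Estimate log(dominant eigenvalue) of the tilted matrix
    P_k[x][y] = P[x][y] * exp(k * f[y]) by power iteration.

    P is a row-stochastic matrix (list of lists), f an observable per state,
    and k the tilting (counting-field) parameter. All names are hypothetical.
    """
    n = len(P)
    tilted = [[P[x][y] * math.exp(k * f[y]) for y in range(n)] for x in range(n)]
    r = [1.0] * n  # initial guess for the right eigenvector
    lam = 1.0
    for _ in range(n_iter):
        new_r = [sum(tilted[x][y] * r[y] for y in range(n)) for x in range(n)]
        total = sum(new_r)
        lam = total / sum(r)               # running estimate of the eigenvalue
        r = [v / total for v in new_r]     # renormalise to avoid overflow
    return math.log(lam)
```

For example, with the assumed chain `P = [[0.9, 0.1], [0.2, 0.8]]` and `f = [0.0, 1.0]`, calling `scgf_power_method(P, f, 0.0)` returns zero because the untilted matrix is stochastic. For nonzero `k` the result can be checked against the closed-form largest eigenvalue of a 2×2 matrix.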
4. Causer L, Bañuls MC, Garrahan JP. Finite Time Large Deviations via Matrix Product States. Phys Rev Lett 2022; 128:090605. [PMID: 35302837 DOI: 10.1103/physrevlett.128.090605] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 10/25/2021] [Accepted: 02/15/2022] [Indexed: 06/14/2023]
Abstract
Recent work has shown the effectiveness of tensor network methods for computing large deviation functions in constrained stochastic models in the infinite time limit. Here we show that these methods can also be used to study the statistics of dynamical observables at arbitrary finite time. This is a harder problem because, in contrast to the infinite time case, where only the extremal eigenstate of a tilted Markov generator is relevant, for finite time the whole spectrum plays a role. We show that finite time dynamical partition sums can be computed efficiently and accurately in one dimension using matrix product states and describe how to use such results to generate rare event trajectories on demand. We apply our methods to the Fredrickson-Andersen and East kinetically constrained models and to the symmetric simple exclusion process, unveiling dynamical phase diagrams in terms of counting field and trajectory time. We also discuss extensions of this method to higher dimensions.
Affiliation(s)
- Luke Causer
  - School of Physics and Astronomy, University of Nottingham, Nottingham NG7 2RD, United Kingdom
  - Centre for the Mathematics and Theoretical Physics of Quantum Non-Equilibrium Systems, University of Nottingham, Nottingham NG7 2RD, United Kingdom
- Mari Carmen Bañuls
  - Max-Planck-Institut für Quantenoptik, Hans-Kopfermann-Strasse 1, D-85748 Garching, Germany
  - Munich Center for Quantum Science and Technology (MCQST), Schellingstrasse 4, D-80799 München, Germany
- Juan P Garrahan
  - School of Physics and Astronomy, University of Nottingham, Nottingham NG7 2RD, United Kingdom
  - Centre for the Mathematics and Theoretical Physics of Quantum Non-Equilibrium Systems, University of Nottingham, Nottingham NG7 2RD, United Kingdom
5. Wilkinson JWP, Prosen T, Garrahan JP. Exact solution of the "Rule 150" reversible cellular automaton. Phys Rev E 2022; 105:034124. [PMID: 35428052 DOI: 10.1103/physreve.105.034124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2021] [Accepted: 02/18/2022] [Indexed: 06/14/2023]
Abstract
We study the dynamics and statistics of the Rule 150 reversible cellular automaton (RCA). This is a one-dimensional lattice system of binary variables with synchronous (Floquet) dynamics that corresponds to a bulk deterministic and reversible discretized version of the kinetically constrained "exclusive one-spin facilitated" (XOR) Fredrickson-Andersen (FA) model, where the local dynamics is restricted: A site flips if and only if its adjacent sites are in different states from each other. Similar to other RCA that have been recently studied, such as Rule 54 and Rule 201, the Rule 150 RCA is integrable, however, in contrast is noninteracting: The emergent quasiparticles, which are identified by the domain walls, behave as free fermions. This property allows us to solve the model by means of matrix product ansatz. In particular, we find the exact equilibrium and nonequilibrium stationary states for systems with closed (periodic) and open (stochastic) boundaries, respectively, resolve the full spectrum of the time evolution operator and, therefore, gain access to the relaxation dynamics, and obtain the exact large deviation statistics of dynamical observables in the long-time limit.
Affiliation(s)
- Joseph W P Wilkinson
  - School of Physics and Astronomy, University of Nottingham, Nottingham, NG7 2RD, United Kingdom
  - Centre for the Mathematics and Theoretical Physics of Quantum Non-equilibrium Systems, University of Nottingham, Nottingham, NG7 2RD, United Kingdom
- Tomaž Prosen
  - Department of Physics, Faculty of Mathematics and Physics, University of Ljubljana, SI-1000 Ljubljana, Slovenia
- Juan P Garrahan
  - School of Physics and Astronomy, University of Nottingham, Nottingham, NG7 2RD, United Kingdom
  - Centre for the Mathematics and Theoretical Physics of Quantum Non-equilibrium Systems, University of Nottingham, Nottingham, NG7 2RD, United Kingdom
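The local rule quoted in the abstract of entry 5 (a site flips if and only if its two adjacent sites are in different states) can be sketched in a few lines. The periodic ring of even length, the staggered even-then-odd update order, and the names `update_sublattice` and `floquet_step` are illustrative assumptions rather than details taken from the paper.

```python
def update_sublattice(state, parity):
    """Flip every site of the given parity (0 = even, 1 = odd) whose two
    neighbours differ. Periodic boundaries; an even number of sites is
    assumed so that neighbours always lie on the other sublattice."""
    n = len(state)
    new = list(state)
    for i in range(parity, n, 2):
        left, right = state[i - 1], state[(i + 1) % n]
        # XOR form of the rule: equals (left + state[i] + right) mod 2
        new[i] = state[i] ^ (left ^ right)
    return new


def floquet_step(state):
    """One full period (assumed ordering): even sites first, then odd sites."""
    return update_sublattice(update_sublattice(state, 0), 1)
```

Because a sublattice update leaves the neighbouring sites untouched, applying the same `update_sublattice` call twice restores the original configuration. This is one way to see the reversibility mentioned in the abstract.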
6. Das A, Rose DC, Garrahan JP, Limmer DT. Reinforcement learning of rare diffusive dynamics. J Chem Phys 2021; 155:134105. [PMID: 34624994 DOI: 10.1063/5.0057323] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 11/14/2022] Open
Abstract
We present a method to probe rare molecular dynamics trajectories directly using reinforcement learning. We consider trajectories that are conditioned to transition between regions of configuration space in finite time, such as those relevant in the study of reactive events, and trajectories exhibiting rare fluctuations of time-integrated quantities in the long time limit, such as those relevant in the calculation of large deviation functions. In both cases, reinforcement learning techniques are used to optimize an added force that minimizes the Kullback-Leibler divergence between the conditioned trajectory ensemble and a driven one. Under the optimized added force, the system evolves the rare fluctuation as a typical one, affording a variational estimate of its likelihood in the original trajectory ensemble. Low variance gradients employing value functions are proposed to increase the convergence of the optimal force. The method we develop employing these gradients leads to efficient and accurate estimates of both the optimal force and the likelihood of the rare event for a variety of model systems.
Affiliation(s)
- Avishek Das
  - Department of Chemistry, University of California, Berkeley, California 94609, USA
- Dominic C Rose
  - School of Physics and Astronomy, University of Nottingham, Nottingham NG7 2RD, United Kingdom
- Juan P Garrahan
  - School of Physics and Astronomy, University of Nottingham, Nottingham NG7 2RD, United Kingdom
- David T Limmer
  - Department of Chemistry, University of California, Berkeley, California 94609, USA
7. Whitelam S, Jacobson D, Tamblyn I. Evolutionary reinforcement learning of dynamical large deviations. J Chem Phys 2020; 153:044113. [PMID: 32752661 DOI: 10.1063/5.0015301] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 02/06/2023] Open
Abstract
We show how to bound and calculate the likelihood of dynamical large deviations using evolutionary reinforcement learning. An agent, a stochastic model, propagates a continuous-time Monte Carlo trajectory and receives a reward conditioned upon the values of certain path-extensive quantities. Evolution produces progressively fitter agents, potentially allowing the calculation of a piece of a large-deviation rate function for a particular model and path-extensive quantity. For models with small state spaces, the evolutionary process acts directly on rates, and for models with large state spaces, the process acts on the weights of a neural network that parameterizes the model's rates. This approach shows how path-extensive physics problems can be considered within a framework widely used in machine learning.
Affiliation(s)
- Stephen Whitelam
  - Molecular Foundry, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, California 94720, USA
- Daniel Jacobson
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, USA
- Isaac Tamblyn
  - National Research Council of Canada, Ottawa, Ontario K1N 5A2, Canada
8. Ferré G, Stoltz G. Large deviations of empirical measures of diffusions in weighted topologies. Electron J Probab 2020. [DOI: 10.1214/20-ejp514] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022]
9. Buča B, Garrahan JP, Prosen T, Vanicat M. Exact large deviation statistics and trajectory phase transition of a deterministic boundary driven cellular automaton. Phys Rev E 2019; 100:020103. [PMID: 31574613 DOI: 10.1103/physreve.100.020103] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 01/14/2019] [Indexed: 06/10/2023]
Abstract
We study the statistical properties of the long-time dynamics of the rule 54 reversible cellular automaton (CA), driven stochastically at its boundaries. This CA can be considered as a discrete-time and deterministic version of the Fredrickson-Andersen kinetically constrained model (KCM). By means of a matrix product ansatz, we compute the exact large deviation cumulant generating functions for a wide range of time-extensive observables of the dynamics, together with their associated rate functions and conditioned long-time distributions over configurations. We show that for all instances of boundary driving the CA dynamics occurs at the point of phase coexistence between competing active and inactive dynamical phases, similar to what happens in more standard KCMs. We also find the exact finite size scaling behavior of these trajectory transitions, and provide the explicit "Doob-transformed" dynamics that optimally realizes rare dynamical events.
Affiliation(s)
- Berislav Buča
  - Clarendon Laboratory, University of Oxford, Parks Road, Oxford OX1 3PU, United Kingdom
- Juan P Garrahan
  - School of Physics and Astronomy and Centre for the Mathematics and Theoretical Physics of Quantum Non-Equilibrium Systems, University of Nottingham, Nottingham NG7 2RD, United Kingdom
- Tomaž Prosen
  - Faculty of Mathematics and Physics, University of Ljubljana, Jadranska 19, SI-1000 Ljubljana, Slovenia
- Matthieu Vanicat
  - Faculty of Mathematics and Physics, University of Ljubljana, Jadranska 19, SI-1000 Ljubljana, Slovenia