1
Jurgens AM, Brodu N. Inferring kernel ϵ-machines: Discovering structure in complex systems. CHAOS (WOODBURY, N.Y.) 2025; 35:033162. [PMID: 40163394 DOI: 10.1063/5.0242981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2024] [Accepted: 03/10/2025] [Indexed: 04/02/2025]
Abstract
Previously, we showed that computational mechanics' causal states (predictively equivalent trajectory classes for a stochastic dynamical system) can be cast into a reproducing kernel Hilbert space. The result is a widely applicable method that infers causal structure directly from very different kinds of observations and systems. Here, we expand this method to explicitly introduce the causal diffusion components it produces. These encode the kernel causal state estimates as a set of coordinates in a reduced dimension space. We show how each component extracts predictive features from data and demonstrate their application on four examples: first, a simple pendulum, an exactly solvable system; second, a molecular-dynamic trajectory of n-butane, a high-dimensional system with a well-studied energy landscape; third, the monthly sunspot sequence, the longest-running available time series of direct observations; and fourth, multi-year observations of an active crop field, a set of heterogeneous observations of the same ecosystem taken for over a decade. In this way, we demonstrate that the empirical kernel causal state algorithm robustly discovers predictive structures for systems with widely varying dimensionality and stochasticity.
Affiliation(s)
- Nicolas Brodu
  - INRIA Bordeaux Sud Ouest, 33405 Talence Cedex, France
2
Tsutsumi N, Nakai K, Saiki Y. Data-driven ordinary-differential-equation modeling of high-frequency complex dynamics via a low-frequency dynamics model. Phys Rev E 2025; 111:014212. [PMID: 39972814 DOI: 10.1103/physreve.111.014212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2024] [Accepted: 12/03/2024] [Indexed: 02/21/2025]
Abstract
In our previous paper [N. Tsutsumi et al., Chaos 32, 091101 (2022), doi:10.1063/5.0100166], we proposed a method for constructing a system of differential equations with chaotic behavior from only an observable deterministic time series, which we call the radial function-based regression (RfR) method. However, when the targeted variable's behavior is rather complex, the direct application of the RfR method does not function well. In this study, we propose a method of modeling such dynamics, including the high-frequency intermittent behavior of a fluid flow, by considering another variable (base variable) showing relatively simple, less intermittent behavior. We construct an autonomous joint model composed of two parts: the first is an autonomous system of a base variable, and the second describes the targeted variable, which is affected by a term involving the base variable so as to exhibit complex dynamics. The constructed joint model succeeded in not only inferring a short trajectory but also reconstructing chaotic sets and statistical properties obtained from a long trajectory, such as the density distributions of the actual dynamics.
Affiliation(s)
- Natsuki Tsutsumi
  - Hitotsubashi University, Faculty of Commerce and Management, Tokyo 186-8601, Japan
- Kengo Nakai
  - Okayama University, The Graduate School of Environment, Life, Natural Science and Technology, Okayama 700-0082, Japan
- Yoshitaka Saiki
  - Hitotsubashi University, Graduate School of Business Administration, Tokyo 186-8601, Japan
3
Froyland G, Giannakis D, Luna E, Slawinska J. Revealing trends and persistent cycles of non-autonomous systems with autonomous operator-theoretic techniques. Nat Commun 2024; 15:4268. [PMID: 38769111 PMCID: PMC11106270 DOI: 10.1038/s41467-024-48033-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2023] [Accepted: 04/16/2024] [Indexed: 05/22/2024] Open
Abstract
An important problem in modern applied science is to characterize the behavior of systems with complex internal dynamics subjected to external forcings. Many existing approaches rely on ensembles to generate information from the external forcings, making them unsuitable to study natural systems where only a single realization is observed. A prominent example is climate dynamics, where an objective identification of signals in the observational record attributable to natural variability and climate change is crucial for making climate projections for the coming decades. Here, we show that operator-theoretic techniques previously developed to identify slowly decorrelating observables of autonomous dynamical systems provide a powerful means for identifying nonlinear trends and persistent cycles of non-autonomous systems using data from a single trajectory of the system. We apply our framework to real-world examples from climate dynamics: Variability of sea surface temperature over the industrial era and the mid-Pleistocene transition of Quaternary glaciation cycles.
Affiliation(s)
- Gary Froyland
  - School of Mathematics and Statistics, University of New South Wales, Sydney, NSW, 2052, Australia
- Dimitrios Giannakis
  - Department of Mathematics, Dartmouth College, Hanover, NH, 03755, USA
  - Department of Physics and Astronomy, Dartmouth College, Hanover, NH, 03755, USA
- Edoardo Luna
  - Department of Physics, University of Texas at Austin, Austin, TX, 78712, USA
- Joanna Slawinska
  - Department of Mathematics, Dartmouth College, Hanover, NH, 03755, USA
4
Rydzewski J, Gökdemir T. Learning Markovian dynamics with spectral maps. J Chem Phys 2024; 160:091102. [PMID: 38436438 DOI: 10.1063/5.0189241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 02/05/2024] [Indexed: 03/05/2024] Open
Abstract
The long-time behavior of many complex molecular systems can often be described by Markovian dynamics in a slow subspace spanned by a few reaction coordinates referred to as collective variables (CVs). However, determining CVs poses a fundamental challenge in chemical physics. Depending on intuition or trial and error to construct CVs can lead to non-Markovian dynamics with long memory effects, hindering analysis. To address this problem, we continue to develop a recently introduced deep-learning technique called spectral map [J. Rydzewski, J. Phys. Chem. Lett. 14, 5216-5220 (2023)]. Spectral map learns slow CVs by maximizing a spectral gap of a Markov transition matrix describing anisotropic diffusion. Here, to represent heterogeneous and multiscale free-energy landscapes with spectral map, we implement an adaptive algorithm to estimate transition probabilities. Through a Markov state model analysis, we validate that spectral map learns slow CVs related to the dominant relaxation timescales and discerns between long-lived metastable states.
Affiliation(s)
- Jakub Rydzewski
  - Institute of Physics, Faculty of Physics, Astronomy and Informatics, Nicolaus Copernicus University, Grudziadzka 5, 87-100 Toruń, Poland
- Tuğçe Gökdemir
  - Institute of Physics, Faculty of Physics, Astronomy and Informatics, Nicolaus Copernicus University, Grudziadzka 5, 87-100 Toruń, Poland
5
Lücke M, Winkelmann S, Heitzig J, Molkenthin N, Koltai P. Learning interpretable collective variables for spreading processes on networks. Phys Rev E 2024; 109:L022301. [PMID: 38491651 DOI: 10.1103/physreve.109.l022301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Accepted: 12/28/2023] [Indexed: 03/18/2024]
Abstract
Collective variables (CVs) are low-dimensional projections of high-dimensional system states. They are used to gain insights into complex emergent dynamical behaviors of processes on networks. The relation between CVs and network measures is not well understood and its derivation typically requires detailed knowledge of both the dynamical system and the network topology. In this Letter, we present a data-driven method for algorithmically learning and understanding CVs for binary-state spreading processes on networks of arbitrary topology. We demonstrate our method using four example networks: the stochastic block model, a ring-shaped graph, a random regular graph, and a scale-free network generated by the Albert-Barabási model. Our results deliver evidence for the existence of low-dimensional CVs even in cases that are not yet understood theoretically.
Affiliation(s)
- Marvin Lücke
  - Modeling and Simulation of Complex Processes, Zuse Institute Berlin, 14195 Berlin, Germany
- Stefanie Winkelmann
  - Modeling and Simulation of Complex Processes, Zuse Institute Berlin, 14195 Berlin, Germany
- Jobst Heitzig
  - FutureLab on Game Theory and Networks of Interacting Agents, Potsdam Institute for Climate Impact Research, 14473 Potsdam, Germany and Zuse Institute Berlin, 14195 Berlin, Germany
- Nora Molkenthin
  - Complexity Science Department, Potsdam Institute for Climate Impact Research, 14473 Potsdam, Germany
- Péter Koltai
  - Department of Mathematics, University of Bayreuth, 95447 Bayreuth, Germany
6
Dyballa L, Zucker SW. IAN: Iterated Adaptive Neighborhoods for Manifold Learning and Dimensionality Estimation. Neural Comput 2023; 35:453-524. [PMID: 36746146 DOI: 10.1162/neco_a_01566] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/27/2022] [Accepted: 10/25/2022] [Indexed: 02/08/2023]
Abstract
Invoking the manifold assumption in machine learning requires knowledge of the manifold's geometry and dimension, and theory dictates how many samples are required. However, in most applications, the data are limited, sampling may not be uniform, and the manifold's properties are unknown; this implies that neighborhoods must adapt to the local structure. We introduce an algorithm for inferring adaptive neighborhoods for data given by a similarity kernel. Starting with a locally conservative neighborhood (Gabriel) graph, we sparsify it iteratively according to a weighted counterpart. In each step, a linear program yields minimal neighborhoods globally, and a volumetric statistic reveals neighbor outliers likely to violate manifold geometry. We apply our adaptive neighborhoods to nonlinear dimensionality reduction, geodesic computation, and dimension estimation. A comparison against standard algorithms using, for example, k-nearest neighbors, demonstrates the usefulness of our approach.
Affiliation(s)
- Luciano Dyballa
  - Department of Computer Science, Yale University, New Haven, CT 06511, U.S.A.
- Steven W Zucker
  - Departments of Computer Science and of Biomedical Engineering, Yale University, New Haven, CT 06511, U.S.A.
7
Dietrich F, Makeev A, Kevrekidis G, Evangelou N, Bertalan T, Reich S, Kevrekidis IG. Learning effective stochastic differential equations from microscopic simulations: Linking stochastic numerics to deep learning. CHAOS (WOODBURY, N.Y.) 2023; 33:023121. [PMID: 36859209 DOI: 10.1063/5.0113632] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/24/2022] [Accepted: 11/16/2022] [Indexed: 06/18/2023]
Abstract
We identify effective stochastic differential equations (SDEs) for coarse observables of fine-grained particle- or agent-based simulations; these SDEs then provide useful coarse surrogate models of the fine scale dynamics. We approximate the drift and diffusivity functions in these effective SDEs through neural networks, which can be thought of as effective stochastic ResNets. The loss function is inspired by, and embodies, the structure of established stochastic numerical integrators (here, Euler-Maruyama and Milstein); our approximations can thus benefit from backward error analysis of these underlying numerical schemes. They also lend themselves naturally to "physics-informed" gray-box identification when approximate coarse models, such as mean field equations, are available. Existing numerical integration schemes for Langevin-type equations and for stochastic partial differential equations can also be used for training; we demonstrate this on a stochastically forced oscillator and the stochastic wave equation. Our approach does not require long trajectories, works on scattered snapshot data, and is designed to naturally handle different time steps per snapshot. We consider both the case where the coarse collective observables are known in advance, as well as the case where they must be found in a data-driven manner.
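A loss "inspired by" the Euler-Maruyama integrator can be illustrated with a minimal sketch, assuming a scalar state: one Euler-Maruyama step makes the next snapshot Gaussian with mean x0 + f(x0) h and variance σ(x0)² h, so the negative log-likelihood of observed snapshot pairs can serve as a training loss. The names `em_step_nll` and `em_loss`, the toy data, and the plain Python callables standing in for neural networks are illustrative assumptions, not the authors' implementation.

```python
import math

def em_step_nll(x0, x1, h, drift, diffusivity):
    """Negative log-likelihood of x1 given x0 under one Euler-Maruyama step.

    The step models x1 as Gaussian with mean x0 + drift(x0) * h
    and variance diffusivity(x0)**2 * h.
    """
    mean = x0 + drift(x0) * h
    var = diffusivity(x0) ** 2 * h
    return 0.5 * math.log(2.0 * math.pi * var) + (x1 - mean) ** 2 / (2.0 * var)

def em_loss(snapshots, drift, diffusivity):
    """Average the per-step loss over (x0, x1, h) snapshot triples.

    Each triple carries its own time step h, so scattered snapshots
    with different step sizes can be mixed.
    """
    total = sum(em_step_nll(x0, x1, h, drift, diffusivity) for x0, x1, h in snapshots)
    return total / len(snapshots)

# Hypothetical candidate models; in practice these would be trainable networks.
true_drift = lambda x: -x
wrong_drift = lambda x: x
unit_diffusivity = lambda x: 1.0

# Toy snapshots lying exactly on the Euler-Maruyama mean of the true drift.
snapshots = [(1.0, 0.9, 0.1), (2.0, 1.6, 0.2), (-1.0, -0.95, 0.05)]
loss_true = em_loss(snapshots, true_drift, unit_diffusivity)
loss_wrong = em_loss(snapshots, wrong_drift, unit_diffusivity)
print(loss_true < loss_wrong)
```

Because each snapshot triple carries its own step size, this form of loss needs neither long trajectories nor a uniform time step, which matches the properties the abstract claims for the approach.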
Affiliation(s)
- Felix Dietrich
  - Department of Informatics, School of Computation, Information and Technology, Technical University of Munich, 80333 Munich, Germany
- Alexei Makeev
  - Faculty of Computational Mathematics and Cybernetics, Moscow State University, 119991 Moscow, Russia
- George Kevrekidis
  - Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Nikolaos Evangelou
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Tom Bertalan
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Sebastian Reich
  - Institute of Mathematics, University of Potsdam, 14469 Potsdam, Germany
- Ioannis G Kevrekidis
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
8
Evans L, Cameron MK, Tiwary P. Computing committors via Mahalanobis diffusion maps with enhanced sampling data. J Chem Phys 2022; 157:214107. [PMID: 36511548 DOI: 10.1063/5.0122990] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/03/2022] Open
Abstract
The study of phenomena such as protein folding and conformational changes in molecules is a central theme in chemical physics. Molecular dynamics (MD) simulation is the primary tool for the study of transition processes in biomolecules, but it is hampered by a huge timescale gap between the processes of interest and atomic vibrations that dictate the time step size. Therefore, it is imperative to combine MD simulations with other techniques in order to quantify the transition processes taking place on large timescales. In this work, the diffusion map with Mahalanobis kernel, a meshless approach for approximating the Backward Kolmogorov Operator (BKO) in collective variables, is upgraded to incorporate standard enhanced sampling techniques, such as metadynamics. The resulting algorithm, which we call the target measure Mahalanobis diffusion map (tm-mmap), is suitable for a moderate number of collective variables in which one can approximate the diffusion tensor and free energy. Imposing appropriate boundary conditions allows use of the approximated BKO to solve for the committor function and utilization of transition path theory to find the reactive current delineating the transition channels and the transition rate. The proposed algorithm, tm-mmap, is tested on the two-dimensional Moro-Cardin two-well system with position-dependent diffusion coefficient and on alanine dipeptide in two collective variables where the committor, the reactive current, and the transition rate are compared to those computed by the finite element method (FEM). Finally, tm-mmap is applied to alanine dipeptide in four collective variables where the use of finite elements is infeasible.
Affiliation(s)
- L Evans
  - Department of Mathematics, University of Maryland, College Park, Maryland 20742, USA
- M K Cameron
  - Department of Mathematics, University of Maryland, College Park, Maryland 20742, USA
- P Tiwary
  - Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park, Maryland 20742, USA
9
Maity P, Koltai P, Schumacher J. Large-scale flow in a cubic Rayleigh-Bénard cell: long-term turbulence statistics and Markovianity of macrostate transitions. PHILOSOPHICAL TRANSACTIONS. SERIES A, MATHEMATICAL, PHYSICAL, AND ENGINEERING SCIENCES 2022; 380:20210042. [PMID: 35465712 DOI: 10.1098/rsta.2021.0042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2021] [Accepted: 12/23/2021] [Indexed: 06/14/2023]
Abstract
We investigate the large-scale circulation (LSC) in a turbulent Rayleigh-Bénard convection flow in a cubic closed convection cell by means of direct numerical simulations at a Rayleigh number Ra = 10^{6}. The numerical studies are conducted for single flow trajectories up to 10^{5} convective free-fall times to obtain a sufficient sampling of the four discrete LSC states, which can be summarized to one macrostate, and the two crossover configurations which are taken by the flow in between for short periods. We find that large-scale dynamics depends strongly on the Prandtl number Pr of the fluid which has values of 0.1, 0.7, and 10. Alternatively, we run an ensemble of 3600 short-term direct numerical simulations to study the transition probabilities between the discrete LSC states. This second approach is also used to probe the Markov property of the dynamics. Our ensemble analysis gave strong indication of Markovianity of the transition process from one LSC state to another, even though the data are still accompanied by considerable noise. It is based on the eigenvalue spectrum of the transition probability matrix, further on the distribution of persistence times and the joint distribution of two successive microstate persistence times. This article is part of the theme issue 'Mathematical problems in physical fluid dynamics (part 1)'.
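The idea of estimating transition probabilities between discrete states and probing the Markov property can be sketched as follows. This is a minimal illustration only: it swaps in a Chapman-Kolmogorov comparison (the lag-2τ matrix against the square of the lag-τ matrix), a common generic check, rather than the eigenvalue-spectrum and persistence-time analysis the abstract describes. The function names and the toy state sequence are hypothetical.

```python
def transition_matrix(states, n_states, lag=1):
    """Row-normalized transition counts between integer-labeled states at a given lag."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(states[:-lag], states[lag:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows for states that were never visited are left as zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

def matmul(a, b):
    """Plain-Python product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def chapman_kolmogorov_gap(states, n_states, lag=1):
    """Largest entrywise difference between T(2*lag) and T(lag) squared.

    For a Markov process the two matrices agree up to sampling noise,
    so a small gap is consistent with (but does not prove) Markovianity.
    """
    t1 = transition_matrix(states, n_states, lag)
    t2 = transition_matrix(states, n_states, 2 * lag)
    t1_squared = matmul(t1, t1)
    return max(abs(t2[i][j] - t1_squared[i][j])
               for i in range(n_states) for j in range(n_states))

# Toy two-state sequence that alternates deterministically.
toy_states = [0, 1, 0, 1, 0, 1, 0, 1]
print(transition_matrix(toy_states, 2))
print(chapman_kolmogorov_gap(toy_states, 2))
```

For the four LSC states mentioned in the abstract, `n_states` would be 4 and the sequence would come from labeling each simulation snapshot with its LSC state; that labeling step is outside this sketch.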
Affiliation(s)
- Priyanka Maity
  - Institute of Thermodynamics and Fluid Mechanics, Technische Universität Ilmenau, Postfach 100565, Ilmenau 98684, Germany
- Péter Koltai
  - Department of Mathematics, Freie Universität Berlin, Arnimallee 6, Berlin 14195, Germany
- Jörg Schumacher
  - Institute of Thermodynamics and Fluid Mechanics, Technische Universität Ilmenau, Postfach 100565, Ilmenau 98684, Germany
  - Tandon School of Engineering, New York University, New York, NY 11201, USA
10
Brodu N, Crutchfield JP. Discovering causal structure with reproducing-kernel Hilbert space ε-machines. CHAOS (WOODBURY, N.Y.) 2022; 32:023103. [PMID: 35232043 DOI: 10.1063/5.0062829] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2021] [Accepted: 01/10/2022] [Indexed: 06/14/2023]
Abstract
We merge computational mechanics' definition of causal states (predictively equivalent histories) with reproducing-kernel Hilbert space (RKHS) representation inference. The result is a widely applicable method that infers causal structure directly from observations of a system's behaviors whether they are over discrete or continuous events or time. A structural representation (a finite- or infinite-state kernel ϵ-machine) is extracted by a reduced-dimension transform that gives an efficient representation of causal states and their topology. In this way, the system dynamics are represented by a stochastic (ordinary or partial) differential equation that acts on causal states. We introduce an algorithm to estimate the associated evolution operator. Paralleling the Fokker-Planck equation, it efficiently evolves causal-state distributions and makes predictions in the original data space via an RKHS functional mapping. We demonstrate these techniques, together with their predictive abilities, on discrete-time, discrete-value infinite Markov-order processes generated by finite-state hidden Markov models with (i) finite or (ii) uncountably infinite causal states and (iii) continuous-time, continuous-value processes generated by thermally driven chaotic flows. The method robustly estimates causal structure in the presence of varying external and measurement noise levels and for very high-dimensional data.
Affiliation(s)
- Nicolas Brodu
  - Geostat Team-Geometry and Statistics in Acquisition Data, INRIA Bordeaux Sud Ouest, 200 rue de la Vieille Tour, 33405 Talence Cedex, France
- James P Crutchfield
  - Complexity Sciences Center and Department of Physics and Astronomy, University of California at Davis, One Shields Avenue, Davis, California 95616, USA
11
Froyland G, Giannakis D, Lintner BR, Pike M, Slawinska J. Spectral analysis of climate dynamics with operator-theoretic approaches. Nat Commun 2021; 12:6570. [PMID: 34772916 PMCID: PMC8589855 DOI: 10.1038/s41467-021-26357-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/06/2021] [Accepted: 09/27/2021] [Indexed: 11/13/2022] Open
Abstract
The Earth's climate system is a classical example of a multiscale, multiphysics dynamical system with an extremely large number of active degrees of freedom, exhibiting variability on scales ranging from micrometers and seconds in cloud microphysics, to thousands of kilometers and centuries in ocean dynamics. Yet, despite this dynamical complexity, climate dynamics is known to exhibit coherent modes of variability. A primary example is the El Niño Southern Oscillation (ENSO), the dominant mode of interannual (3-5 yr) variability in the climate system. The objective and robust characterization of this and other important phenomena presents a long-standing challenge in Earth system science, the resolution of which would lead to improved scientific understanding and prediction of climate dynamics, as well as assessment of their impacts on human and natural systems. Here, we show that the spectral theory of dynamical systems, combined with techniques from data science, provides an effective means for extracting coherent modes of climate variability from high-dimensional model and observational data, requiring no frequency prefiltering, but recovering multiple timescales and their interactions. Lifecycle composites of ENSO are shown to improve upon results from conventional indices in terms of dynamical consistency and physical interpretability. In addition, the role of combination modes between ENSO and the annual cycle in ENSO diversity is elucidated.
Affiliation(s)
- Gary Froyland
  - School of Mathematics and Statistics, University of New South Wales, Sydney, NSW, 2052, Australia
- Dimitrios Giannakis
  - Department of Mathematics and Center for Atmosphere Ocean Science, Courant Institute of Mathematical Sciences, New York University, New York, NY, 10012, USA
  - Department of Mathematics, Dartmouth College, Hanover, NH, 03755, USA
- Benjamin R Lintner
  - Department of Environmental Sciences, Rutgers, The State University of New Jersey, New Brunswick, NJ, 08901, USA
- Maxwell Pike
  - Department of Environmental Sciences, Rutgers, The State University of New Jersey, New Brunswick, NJ, 08901, USA
- Joanna Slawinska
  - Center for Climate Physics, Institute for Basic Science (IBS), Busan, South Korea
  - Pusan National University, Busan, South Korea
  - Finnish Center for Artificial Intelligence, Department of Computer Science, University of Helsinki, 00560, Helsinki, Finland
12
Arias Velásquez RM, Mejía Lara JV. Gaussian approach for probability and correlation between the number of COVID-19 cases and the air pollution in Lima. URBAN CLIMATE 2020; 33:100664. [PMID: 32834964 PMCID: PMC7332952 DOI: 10.1016/j.uclim.2020.100664] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 05/29/2020] [Revised: 06/18/2020] [Accepted: 06/22/2020] [Indexed: 05/07/2023]
Abstract
At the end of February 2020, the first cases of pneumonia associated with coronavirus (COVID-19) in Peru were reported in Lima (Rodriguez-Morales et al., 2020). The first week of March began with 72 infected people, and the government published a new law declaring a national crisis due to the COVID-19 pandemic (Vizcarra et al., 2020), with a quarantine in each city of Peru. Our analysis considers March and April 2020 for air quality measurements and infections in Lima, with data collected at 6 meteorological stations for CO (carbon monoxide), NO2 (nitrogen dioxide), O3 (ozone), SO2 (sulfur dioxide), PM10 and PM2.5 (particulate matter with aerodynamic diameter less than 10 and 2.5 µm, respectively). The averages of these concentrations and the hospital information are collected per hour. The analysis covers the quarantine period; an important correlation is found in the zone with the highest infection by COVID-19, NO2 and PM10, even though air pollution in Lima was reduced. In this paper, we propose a classification model based on Reduced-Space Gaussian Process Regression for air pollution and infections, with technological and environmental dynamics and global change associated with COVID-19. An evaluation of zones in Lima shows the influence of industrial activity on air pollution and COVID-19 infections before and after quarantine during the 28 days following the first infection in Peru; the problems relating to data management were validated with a successful classification and cluster analysis for future work on the influence of environmental conditions on COVID-19.
13
Arias Velásquez RM, Mejía Lara JV. Forecast and evaluation of COVID-19 spreading in USA with reduced-space Gaussian process regression. CHAOS, SOLITONS, AND FRACTALS 2020; 136:109924. [PMID: 32501372 PMCID: PMC7242925 DOI: 10.1016/j.chaos.2020.109924] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 04/12/2020] [Revised: 04/17/2020] [Accepted: 05/20/2020] [Indexed: 05/18/2023]
Abstract
In this report, we analyze historical and forecast COVID-19 infections and deaths based on Reduced-Space Gaussian Process Regression associated with chaotic dynamical systems, using information obtained over 82 days with continuous learning, day by day, from January 21, 2020 to April 12. According to the latest results, COVID-19 could be predicted with Gaussian models; mean-field models can be meaningfully used to gather a quantitative picture of the epidemic spreading, with infections, fatality and recovery rate. The forecast places the peak in the USA around July 14, 2020, with a peak number of 132,074 deaths with about 1,157,796 infected individuals, and a number of deaths at the end of the epidemic of about 132,800. In late January, the USA confirmed its first patient with COVID-19, who had recently traveled to China; however, an evaluation of states in the USA shows that the fatality rate in China (4%) is lower than in New York (4.56%) and lower than in Michigan (5.69%). Mean estimates and uncertainty bounds for the USA and its cities and other provinces have increased in the last three months, with focus on New York, New Jersey, Michigan, California, Massachusetts, ... (January to April 12). In addition, the proposed Reduced-Space Gaussian Process Regression model predicts that the epidemic will reach saturation in the USA in July 2020. Our findings suggest that new quarantine actions with more restrictions for containment strategies implemented in the USA could be successful, but in a late period they could generate critical rates of infection and death for the next 2 months.
14
Karn TK, Petrone S, Griffin C. Modeling a recurrent, hidden dynamical system using energy minimization and kernel density estimates. Phys Rev E 2019; 100:042137. [PMID: 31770961 DOI: 10.1103/physreve.100.042137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2019] [Indexed: 11/07/2022]
Abstract
In this paper we develop a kernel density estimation (KDE) approach to modeling and forecasting recurrent trajectories on a suitable manifold. For the purposes of this paper, a trajectory is a sequence of coordinates in a phase space defined by an underlying hidden dynamical system. Our work is inspired by earlier work on the use of KDE to detect shipping anomalies using high-density, high-quality automated information system data as well as our own earlier work in trajectory modeling. We focus specifically on the sparse, noisy trajectory reconstruction problem in which the data are (i) sparsely sampled and (ii) subject to an imperfect observer that introduces noise. Under certain regularity assumptions, we show that the constructed estimator minimizes a specific energy function defined over the trajectory as the number of samples obtained grows.
Affiliation(s)
- Trevor K Karn
  - Applied Research Laboratory, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Steven Petrone
  - Applied Research Laboratory, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Christopher Griffin
  - Applied Research Laboratory, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
15
Abstract
A framework for data assimilation combining aspects of operator-theoretic ergodic theory and quantum mechanics is developed. This framework adapts the Dirac-von Neumann formalism of quantum dynamics and measurement to perform sequential data assimilation (filtering) of a partially observed, measure-preserving dynamical system, using the Koopman operator on the L^{2} space associated with the invariant measure as an analog of the Heisenberg evolution operator in quantum mechanics. In addition, the state of the data assimilation system is represented by a trace-class operator analogous to the quantum mechanical density operator, and the assimilated observables by self-adjoint multiplication operators. An averaging approach is also introduced, rendering the spectrum of the assimilated observables discrete and thus amenable to numerical approximation. We present a data-driven formulation of the quantum mechanical data assimilation approach, utilizing kernel methods from machine learning and delay-coordinate maps of dynamical systems to represent the evolution and measurement operators via matrices in a data-driven basis. The data-driven formulation is structurally similar to its infinite-dimensional counterpart and shown to converge in a limit of large data under mild assumptions. Applications to periodic oscillators and the Lorenz 63 system demonstrate that the framework is able to naturally handle highly non-Gaussian statistics, complex state space geometries, and chaotic dynamics.
Affiliation(s)
- Dimitrios Giannakis
  - Center for Atmosphere Ocean Science, Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, USA
| |
|
16
|
Vincent P, Parr T, Benrimoh D, Friston KJ. With an eye on uncertainty: Modelling pupillary responses to environmental volatility. PLoS Comput Biol 2019; 15:e1007126. [PMID: 31276488 PMCID: PMC6636765 DOI: 10.1371/journal.pcbi.1007126] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 10/07/2018] [Revised: 07/17/2019] [Accepted: 05/23/2019] [Indexed: 01/04/2023] Open
Abstract
Living creatures must accurately infer the nature of their environments. They do this despite being confronted by stochastic and context-sensitive contingencies, and so must constantly update their beliefs regarding their uncertainty about what might come next. In this work, we examine how we deal with uncertainty that evolves over time. This prospective uncertainty (or imprecision) is referred to as volatility and has previously been linked to noradrenergic signals that originate in the locus coeruleus. Using pupillary dilatation as a measure of central noradrenergic signalling, we tested the hypothesis that changes in pupil diameter reflect inferences humans make about environmental volatility. To do so, we collected pupillometry data from participants presented with a stream of numbers. We generated these numbers from a process with varying degrees of volatility. By measuring pupillary dilatation in response to these stimuli, and simulating the inferences made by an ideal Bayesian observer of the same stimuli, we demonstrate that humans update their beliefs about environmental contingencies in a Bayes optimal way. We show this by comparing general linear (convolution) models that formalised competing hypotheses about the causes of pupillary changes. We found greater evidence for models that included Bayes optimal estimates of volatility than those without. We additionally explore the interaction between different causes of pupil dilation and suggest a quantitative approach to characterising a person’s prior beliefs about volatility.
Humans are constantly confronted with surprising events. To navigate such a world, we must understand the chances of an unexpected event occurring at any given point in time. We do this by creating a model of the world around us, in which we allow for these unexpected events to occur by holding beliefs about how volatile our environment is. In this work we explore the way in which we update our beliefs, demonstrating that this updating relies on the number of unexpected events in relation to the expected number. We do this by examining the pupil diameter, since, in controlled environments, changes in pupil diameter reflect our response to unexpected observations. Finally, we show that our methodology is appropriate for assessing the individual participant’s prior expectations about the amount of uncertainty in their environment.
Affiliation(s)
- Peter Vincent
- Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom
| | - Thomas Parr
- Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom
| | - David Benrimoh
- Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom
| | - Karl J Friston
- Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom
| |
|
17
|
Thiede EH, Giannakis D, Dinner AR, Weare J. Galerkin approximation of dynamical quantities using trajectory data. J Chem Phys 2019; 150:244111. [PMID: 31255053 PMCID: PMC6824902 DOI: 10.1063/1.5063730] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 09/30/2018] [Accepted: 05/13/2019] [Indexed: 11/14/2022] Open
Abstract
Understanding chemical mechanisms requires estimating dynamical statistics such as expected hitting times, reaction rates, and committors. Here, we present a general framework for calculating these dynamical quantities by approximating boundary value problems using dynamical operators with a Galerkin expansion. A specific choice of basis set in the expansion corresponds to the estimation of dynamical quantities using a Markov state model. More generally, the boundary conditions impose restrictions on the choice of basis sets. We demonstrate how an alternative basis can be constructed using ideas from diffusion maps. In our numerical experiments, this basis gives results of comparable or better accuracy to Markov state models. Additionally, we show that delay embedding can reduce the information lost when projecting the system's dynamics for model construction; this improves estimates of dynamical statistics considerably over the standard practice of increasing the lag time.
Affiliation(s)
- Erik H Thiede
- Department of Chemistry and James Franck Institute, The University of Chicago, Chicago, Illinois 60637, USA
| | - Dimitrios Giannakis
- Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, USA
| | - Aaron R Dinner
- Department of Chemistry and James Franck Institute, The University of Chicago, Chicago, Illinois 60637, USA
| | - Jonathan Weare
- Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, USA
| |
|
18
|
Wan ZY, Vlachas P, Koumoutsakos P, Sapsis T. Data-assisted reduced-order modeling of extreme events in complex dynamical systems. PLoS One 2018; 13:e0197704. [PMID: 29795631 PMCID: PMC5967742 DOI: 10.1371/journal.pone.0197704] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Received: 03/08/2018] [Accepted: 05/07/2018] [Indexed: 12/05/2022] Open
Abstract
The prediction of extreme events, from avalanches and droughts to tsunamis and epidemics, depends on the formulation and analysis of relevant, complex dynamical systems. Such dynamical systems are characterized by high intrinsic dimensionality with extreme events having the form of rare transitions that are several standard deviations away from the mean. Such systems are not amenable to classical order-reduction methods through projection of the governing equations due to the large intrinsic dimensionality of the underlying attractor as well as the complexity of the transient events. Alternatively, data-driven techniques aim to quantify the dynamics of specific, critical modes by utilizing data-streams and by expanding the dimensionality of the reduced-order model using delayed coordinates. In turn, these methods have major limitations in regions of the phase space with sparse data, which is the case for extreme events. In this work, we develop a novel hybrid framework that complements an imperfect reduced order model, with data-streams that are integrated through a recurrent neural network (RNN) architecture. The reduced order model has the form of projected equations into a low-dimensional subspace that still contains important dynamical information about the system and it is expanded by a long short-term memory (LSTM) regularization. The LSTM-RNN is trained by analyzing the mismatch between the imperfect model and the data-streams, projected to the reduced-order space. The data-driven model assists the imperfect model in regions where data is available, while for locations where data is sparse the imperfect model still provides a baseline for the prediction of the system state. We assess the developed framework on two challenging prototype systems exhibiting extreme events. We show that the blended approach has improved performance compared with methods that use either data streams or the imperfect model alone. Notably, the improvement is more significant in regions associated with extreme events, where data is sparse.
Affiliation(s)
- Zhong Yi Wan
- Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, MA, United States of America
| | | | | | - Themistoklis Sapsis
- Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, MA, United States of America
| |
|
19
|
Ferguson AL. Machine learning and data science in soft materials engineering. JOURNAL OF PHYSICS. CONDENSED MATTER : AN INSTITUTE OF PHYSICS JOURNAL 2018; 30:043002. [PMID: 29111979 DOI: 10.1088/1361-648x/aa98bd] [Citation(s) in RCA: 69] [Impact Index Per Article: 9.9] [Indexed: 05/10/2023]
Abstract
In many branches of materials science it is now routine to generate data sets of such large size and dimensionality that conventional methods of analysis fail. Paradigms and tools from data science and machine learning can provide scalable approaches to identify and extract trends and patterns within voluminous data sets, perform guided traversals of high-dimensional phase spaces, and furnish data-driven strategies for inverse materials design. This topical review provides an accessible introduction to machine learning tools in the context of soft and biological materials by 'de-jargonizing' data science terminology, presenting a taxonomy of machine learning techniques, and surveying the mathematical underpinnings and software implementations of popular tools, including principal component analysis, independent component analysis, diffusion maps, support vector machines, and relative entropy. We present illustrative examples of machine learning applications in soft matter, including inverse design of self-assembling materials, nonlinear learning of protein folding landscapes, high-throughput antimicrobial peptide design, and data-driven materials design engines. We close with an outlook on the challenges and opportunities for the field.
Affiliation(s)
- Andrew L Ferguson
- Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, 1304 West Green Street, Urbana, IL 61801, United States of America. Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, 600 South Mathews Avenue, Urbana, IL 61801, United States of America. Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, IL 61801, United States of America. Frederick Seitz Materials Research Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States of America. Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States of America
| |
|
20
|
Farazmand M, Sapsis TP. Dynamical indicators for the prediction of bursting phenomena in high-dimensional systems. Phys Rev E 2016; 94:032212. [PMID: 27739820 DOI: 10.1103/physreve.94.032212] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 04/07/2016] [Indexed: 11/07/2022]
Abstract
Drawing upon the bursting mechanism in slow-fast systems, we propose indicators for the prediction of such rare extreme events which do not require a priori known slow and fast coordinates. The indicators are associated with functionals defined in terms of optimally time-dependent (OTD) modes. One such functional has the form of the largest eigenvalue of the symmetric part of the linearized dynamics reduced to these modes. In contrast to other choices of subspaces, the proposed modes are flow invariant and therefore a projection onto them is dynamically meaningful. We illustrate the application of these indicators on three examples: a prototype low-dimensional model, a body-forced turbulent fluid flow, and a unidirectional model of nonlinear water waves. We use Bayesian statistics to quantify the predictive power of the proposed indicators.
Affiliation(s)
- Mohammad Farazmand
- Department of Mechanical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02115, USA
| | - Themistoklis P Sapsis
- Department of Mechanical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02115, USA
| |
|
21
|
Kondrashov D, Chekroun MD, Ghil M. Comment on "Nonparametric forecasting of low-dimensional dynamical systems". Phys Rev E 2016; 93:036201. [PMID: 27078490 DOI: 10.1103/physreve.93.036201] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/28/2015] [Indexed: 11/07/2022]
Abstract
The comparison performed in Berry et al. [Phys. Rev. E 91, 032915 (2015)] between the skill in predicting the El Niño-Southern Oscillation climate phenomenon by the prediction method of Berry et al. and the "past-noise" forecasting method of Chekroun et al. [Proc. Natl. Acad. Sci. USA 108, 11766 (2011)] is flawed. Three specific misunderstandings in Berry et al. are pointed out and corrected.
Affiliation(s)
- Dmitri Kondrashov
- Department of Atmospheric and Oceanic Sciences, 405 Hilgard Ave., Box 951565, 7127 Math Sciences Bldg., University of California, Los Angeles, California 90095-1565, USA
| | - Mickaël D Chekroun
- Department of Atmospheric and Oceanic Sciences, 405 Hilgard Ave., Box 951565, 7127 Math Sciences Bldg., University of California, Los Angeles, California 90095-1565, USA
| | - Michael Ghil
- Department of Atmospheric and Oceanic Sciences, 405 Hilgard Ave., Box 951565, 7127 Math Sciences Bldg., University of California, Los Angeles, California 90095-1565, USA
| |
|
22
|
Berry T, Giannakis D, Harlim J. Reply to "Comment on 'Nonparametric forecasting of low-dimensional dynamical systems'". Phys Rev E 2016; 93:036202. [PMID: 27078491 DOI: 10.1103/physreve.93.036202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2016] [Indexed: 11/07/2022]
Abstract
In this Reply we provide additional results which allow a better comparison of the diffusion forecast and the "past-noise" forecasting (PNF) approach for the El Niño index. We remark on some qualitative differences between the diffusion forecast and PNF, and we suggest an alternative use of the diffusion forecast for the purposes of forecasting the probabilities of extreme events.
Affiliation(s)
- Tyrus Berry
- Department of Mathematical Sciences, George Mason University, Fairfax, Virginia 22030, USA
| | - Dimitrios Giannakis
- Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, USA
| | - John Harlim
- Department of Mathematics, The Pennsylvania State University, University Park, Pennsylvania 16802-6400, USA; Department of Meteorology, The Pennsylvania State University, University Park, Pennsylvania 16802-5013, USA
| |
|