1
Leong KH, Xiu Y, Chen B, Chan WK(V). Neural Causal Information Extractor for Unobserved Causes. ENTROPY (BASEL, SWITZERLAND) 2023; 26:46. [PMID: 38248172 PMCID: PMC11154551 DOI: 10.3390/e26010046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Revised: 12/18/2023] [Accepted: 12/22/2023] [Indexed: 01/23/2024]
Abstract
Causal inference aims to faithfully depict the causal relationships between given variables. However, in many practical systems, variables are often partially observed, and some unobserved variables could carry significant information and induce causal effects on a target. Identifying these unobserved causes remains a challenge, and existing works have not considered extracting the unobserved causes while retaining the causes that have already been observed and included. In this work, we aim to construct the implicit variables with a generator-discriminator framework named the Neural Causal Information Extractor (NCIE), which can complement the information of unobserved causes and thus provide a complete set of causes with both observed causes and the representations of unobserved causes. By maximizing the mutual information between the targets and the union of observed causes and implicit variables, the implicit variables we generate could complement the information that the unobserved causes should have provided. The synthetic experiments show that the implicit variables preserve the information and dynamics of the unobserved causes. In addition, extensive real-world time series prediction tasks show improved precision after introducing implicit variables, thus indicating their causality to the targets.
Affiliation(s)
- Keng-Hou Leong
- Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
- Tsinghua-Berkeley Shenzhen Institute, Tsinghua University, Shenzhen 518055, China
- Yuxuan Xiu
- Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
- Tsinghua-Berkeley Shenzhen Institute, Tsinghua University, Shenzhen 518055, China
- Bokui Chen
- Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
- Peng Cheng Laboratory, Shenzhen 518055, China
- Wai Kin (Victor) Chan
- Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
- Tsinghua-Berkeley Shenzhen Institute, Tsinghua University, Shenzhen 518055, China
- International Science and Technology Information Center, Shenzhen 518055, China
2
Hernandez Rodriguez LC, Kumar P. Causal interaction in high frequency turbulence at the biosphere-atmosphere interface: Structure-function coupling. CHAOS (WOODBURY, N.Y.) 2023; 33:073144. [PMID: 37466423 DOI: 10.1063/5.0131469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Accepted: 06/06/2023] [Indexed: 07/20/2023]
Abstract
At the biosphere-atmosphere interface, nonlinear interdependencies among components of an ecohydrological complex system can be inferred using multivariate high frequency time series observations. Information flow among these interacting variables allows us to represent the causal dependencies in the form of a directed acyclic graph (DAG). We use high frequency multivariate data at 10 Hz from an eddy covariance instrument located at 25 m above agricultural land in the Midwestern US to quantify the evolutionary dynamics of this complex system using a sequence of DAGs by examining the structural dependency of information flow and the associated functional response. We investigate whether functional differences correspond to structural differences or if there are no functional variations despite the structural differences. We base our analysis on the hypothesis that causal dependencies are instigated through information flow, and the resulting interactions sustain the dynamics and its functionality. To test our hypothesis, we build upon causal structure analysis in the companion paper to characterize the information flow in similarly clustered DAGs from 3-min non-overlapping contiguous windows in the observational data. We characterize functionality as the nature of interactions as discerned through redundant, unique, and synergistic components of information flow. Through this analysis, we find that in turbulence at the biosphere-atmosphere interface, the variables that control the dynamic character of the atmosphere as well as the thermodynamics are driven by non-local conditions, while the scalar transport associated with CO2 and H2O is mainly driven by short-term local conditions.
Affiliation(s)
- Praveen Kumar
- Department of Civil and Environmental Engineering, University of Illinois at Urbana-Champaign, Champaign, Illinois 61801, USA
- Department of Atmospheric Sciences, University of Illinois at Urbana-Champaign, Champaign, Illinois 61801, USA
3
Kim J, Goldstein AH, Chakraborty R, Jardine K, Weber R, Sorensen PO, Wang S, Faybishenko B, Misztal PK, Brodie EL. Measurement of Volatile Compounds for Real-Time Analysis of Soil Microbial Metabolic Response to Simulated Snowmelt. Front Microbiol 2021; 12:679671. [PMID: 34248891 PMCID: PMC8261151 DOI: 10.3389/fmicb.2021.679671] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/12/2021] [Accepted: 05/31/2021] [Indexed: 11/24/2022]
Abstract
Snowmelt dynamics are a significant determinant of microbial metabolism in soil and regulate global biogeochemical cycles of carbon and nutrients by creating seasonal variations in soil redox and nutrient pools. With an increasing concern that climate change accelerates both snowmelt timing and rate, obtaining an accurate characterization of microbial response to snowmelt is important for understanding biogeochemical cycles intertwined with soil. However, observing microbial metabolism and its dynamics non-destructively remains a major challenge for systems such as soil. Microbial volatile compounds (mVCs) emitted from soil represent information-dense signatures and when assayed non-destructively using state-of-the-art instrumentation such as Proton Transfer Reaction-Time of Flight-Mass Spectrometry (PTR-TOF-MS) provide time resolved insights into the metabolism of active microbiomes. In this study, we used PTR-TOF-MS to investigate the metabolic trajectory of microbiomes from a subalpine forest soil, and their response to a simulated wet-up event akin to snowmelt. Using an information theory approach based on the partitioning of mutual information, we identified mVC metabolite pairs with robust interactions, including those that were non-linear and with time lags. The biological context for these mVC interactions was evaluated by projecting the connections onto the Kyoto Encyclopedia of Genes and Genomes (KEGG) network of known metabolic pathways. Simulated snowmelt resulted in a rapid increase in the production of trimethylamine (TMA) suggesting that anaerobic degradation of quaternary amine osmo/cryoprotectants, such as glycine betaine, may be important contributors to this resource pulse. Unique and synergistic connections between intermediates of methylotrophic pathways such as dimethylamine, formaldehyde and methanol were observed upon wet-up and indicate that the initial pulse of TMA was likely transformed into these intermediates by methylotrophs. 
Increases in ammonia oxidation signatures (transformation of hydroxylamine to nitrite) were observed in parallel, and while the relative role of nitrifiers or methylotrophs cannot be confirmed, the inferred connection to TMA oxidation suggests either a direct or indirect coupling between these processes. Overall, it appears that such mVC time-series from PTR-TOF-MS combined with causal inference represents an attractive approach to non-destructively observe soil microbial metabolism and its response to environmental perturbation.
Affiliation(s)
- Junhyeong Kim
- Lawrence Berkeley National Laboratory, Climate and Ecosystems Sciences, Earth and Environmental Sciences, Berkeley, CA, United States
- Allen H Goldstein
- Department of Environmental Science, Policy and Management, University of California, Berkeley, Berkeley, CA, United States
- Romy Chakraborty
- Lawrence Berkeley National Laboratory, Climate and Ecosystems Sciences, Earth and Environmental Sciences, Berkeley, CA, United States
- Kolby Jardine
- Lawrence Berkeley National Laboratory, Climate and Ecosystems Sciences, Earth and Environmental Sciences, Berkeley, CA, United States
- Robert Weber
- Department of Environmental Science, Policy and Management, University of California, Berkeley, Berkeley, CA, United States
- Patrick O Sorensen
- Lawrence Berkeley National Laboratory, Climate and Ecosystems Sciences, Earth and Environmental Sciences, Berkeley, CA, United States
- Shi Wang
- Lawrence Berkeley National Laboratory, Climate and Ecosystems Sciences, Earth and Environmental Sciences, Berkeley, CA, United States
- Boris Faybishenko
- Lawrence Berkeley National Laboratory, Climate and Ecosystems Sciences, Earth and Environmental Sciences, Berkeley, CA, United States
- Pawel K Misztal
- Department of Environmental Science, Policy and Management, University of California, Berkeley, Berkeley, CA, United States
- Eoin L Brodie
- Lawrence Berkeley National Laboratory, Climate and Ecosystems Sciences, Earth and Environmental Sciences, Berkeley, CA, United States
- Department of Environmental Science, Policy and Management, University of California, Berkeley, Berkeley, CA, United States
4
Jiang P, Kumar P. Bundled Causal History Interaction. ENTROPY 2020; 22:e22030360. [PMID: 33286134 PMCID: PMC7516833 DOI: 10.3390/e22030360] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/19/2020] [Revised: 03/14/2020] [Accepted: 03/18/2020] [Indexed: 11/17/2022]
Abstract
Complex systems arise as a result of the nonlinear interactions between components. In particular, the evolutionary dynamics of a multivariate system encodes the ways in which different variables interact with each other individually or in groups. One fundamental question that remains unanswered is: How do two non-overlapping multivariate subsets of variables interact to causally determine the outcome of a specific variable? Here, we provide an information-based approach to address this problem. We delineate the temporal interactions between the bundles in a probabilistic graphical model. The strength of the interactions, captured by partial information decomposition, then exposes complex behavior of dependencies and memory within the system. The proposed approach successfully illustrated complex dependence between cations and anions as determinants of pH in an observed stream chemistry system. In the studied catchment, the dynamics of pH is a result of both cations and anions through mainly synergistic effects of the two and their individual influences as well. This example demonstrates the potentially broad applicability of the approach, establishing the foundation to study the interaction between groups of variables in a range of complex systems.
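As a rough illustration of what a synergistic effect means in the abstract above, the following sketch computes mutual information for a toy XOR system. It is not the authors' method and not a full partial information decomposition (which requires choosing a redundancy measure); the function name `mutual_information` and the XOR toy data are assumptions made only for this example.

```python
import math
from collections import Counter
from itertools import product


def mutual_information(pairs):
    """Mutual information I(A;B) in bits from equally weighted (a, b) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    marg_a = Counter(a for a, _ in pairs)
    marg_b = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((marg_a[a] / n) * (marg_b[b] / n)))
    return mi


# Hypothetical toy system: the target y is the XOR of two uniform binary sources.
samples = [(x1, x2, x1 ^ x2) for x1, x2 in product([0, 1], repeat=2)]

# Information each source carries about y on its own.
i_x1 = mutual_information([(x1, y) for x1, _, y in samples])
i_x2 = mutual_information([(x2, y) for _, x2, y in samples])

# Information the two sources carry about y when taken together.
i_joint = mutual_information([((x1, x2), y) for x1, x2, y in samples])

# A positive gap means the sources are informative only in combination.
synergy_gap = i_joint - i_x1 - i_x2

print(f"I(x1;y)={i_x1:.3f}  I(x2;y)={i_x2:.3f}  I(x1,x2;y)={i_joint:.3f}")
```

In this toy case neither source alone tells anything about the target, while the pair determines it fully, which is the kind of purely joint contribution that a decomposition would attribute to synergy.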
5
Runge J, Nowack P, Kretschmer M, Flaxman S, Sejdinovic D. Detecting and quantifying causal associations in large nonlinear time series datasets. SCIENCE ADVANCES 2019; 5:eaau4996. [PMID: 31807692 PMCID: PMC6881151 DOI: 10.1126/sciadv.aau4996] [Citation(s) in RCA: 152] [Impact Index Per Article: 25.3] [Received: 06/16/2018] [Accepted: 09/17/2019] [Indexed: 05/07/2023]
Abstract
Identifying causal relationships and quantifying their strength from observational time series data are key problems in disciplines dealing with complex dynamical systems such as the Earth system or the human body. Data-driven causal inference in such systems is challenging since datasets are often high dimensional and nonlinear with limited sample sizes. Here, we introduce a novel method that flexibly combines linear or nonlinear conditional independence tests with a causal discovery algorithm to estimate causal networks from large-scale time series datasets. We validate the method on time series of well-understood physical mechanisms in the climate system and the human heart and using large-scale synthetic datasets mimicking the typical properties of real-world data. The experiments demonstrate that our method outperforms state-of-the-art techniques in detection power, which opens up entirely new possibilities to discover and quantify causal networks from time series across a range of research fields.
Affiliation(s)
- Jakob Runge
- German Aerospace Center, Institute of Data Science, 07745 Jena, Germany
- Grantham Institute, Imperial College, London SW7 2AZ, UK
- Corresponding author.
- Peer Nowack
- Grantham Institute, Imperial College, London SW7 2AZ, UK
- Department of Physics, Blackett Laboratory, Imperial College, London SW7 2AZ, UK
- Data Science Institute, Imperial College, London SW7 2AZ, UK
- Seth Flaxman
- Data Science Institute, Imperial College, London SW7 2AZ, UK
- Department of Mathematics, Imperial College, London SW7 2AZ, UK
- Dino Sejdinovic
- The Alan Turing Institute for Data Science, London NW1 3DB, UK
- Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
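The abstract of this entry refers to linear or nonlinear conditional independence tests as a building block of causal discovery. A minimal sketch of the linear case is partial correlation: two variables that share a common driver look correlated until the driver is conditioned on. This is not the authors' algorithm, and it ignores time lags and significance testing. The helper names `pearson` and `partial_corr` and the synthetic data are assumptions made only for this example.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """Correlation of x and y after linearly removing one conditioning variable z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data: z drives both x and y; x and y have no direct link.
rng = random.Random(0)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + 0.5 * rng.gauss(0, 1) for zi in z]
y = [zi + 0.5 * rng.gauss(0, 1) for zi in z]

r_plain = pearson(x, y)            # large, because of the shared driver
r_partial = partial_corr(x, y, z)  # close to zero once z is conditioned on

print(f"corr(x, y) = {r_plain:.3f}, partial corr given z = {r_partial:.3f}")
```

A causal discovery procedure would apply tests of this kind repeatedly over candidate parents and lags, removing links whose dependence vanishes under conditioning.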