1
|
Learning complex dependency structure of gene regulatory networks from high dimensional microarray data with Gaussian Bayesian networks. Sci Rep 2022; 12:18704. [PMID: 36333425 PMCID: PMC9636198 DOI: 10.1038/s41598-022-21957-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2022] [Accepted: 10/06/2022] [Indexed: 11/06/2022] Open
Abstract
Reconstruction of Gene Regulatory Networks (GRNs) from gene expression data with Probabilistic Network Models (PNMs) is an open problem. Gene expression datasets consist of thousands of genes with relatively small sample sizes (i.e., they are large-p-small-n). Moreover, dependencies of various orders coexist in the datasets. On the one hand, transcription factor encoding genes act like hubs and regulate target genes; on the other hand, target genes show local dependencies. In the field of Undirected Network Models (UNMs), a subclass of PNMs, the Glasso algorithm has been proposed to deal with high dimensional microarray datasets by forcing sparsity. To overcome the problem of the complex structure of interactions, modifications of the default Glasso algorithm have been developed that integrate the expected dependency structure in the UNMs beforehand. In this work we advocate the use of a simple score-based Hill Climbing algorithm (HC) that learns Gaussian Bayesian networks based on directed acyclic graphs. We compare HC with Glasso and variants in the UNM framework based on their capability to reconstruct GRNs from microarray data from the benchmarking synthetic dataset from the DREAM5 challenge and from real-world data from the Escherichia coli genome. We conclude that dependencies in complex data are learned best by the HC algorithm, which represents them most accurately and efficiently while simultaneously modelling strong local and weaker but significant global connections coexisting in the gene expression dataset. The HC algorithm adapts intrinsically to the complex dependency structure of the dataset, without forcing a specific structure in advance.
|
2
|
Wodeyar A, Srinivasan R. Structural connectome constrained graphical lasso for MEG partial coherence. Netw Neurosci 2022; 6:1219-1242. [PMID: 38800455 PMCID: PMC11117092 DOI: 10.1162/netn_a_00267] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/13/2022] [Accepted: 07/06/2022] [Indexed: 05/29/2024] Open
Abstract
Structural connectivity provides the backbone for communication between neural populations. Since axonal transmission occurs on a millisecond time scale, measures of M/EEG functional connectivity sensitive to phase synchronization, such as coherence, are expected to reflect structural connectivity. We develop a model of MEG functional connectivity whose edges are constrained by the structural connectome. The edge strengths are defined by partial coherence, a measure of conditional dependence. We build a new method, the adaptive graphical lasso (AGL), to fit the partial coherence to perform inference on the hypothesis that the structural connectome is reflected in MEG functional connectivity. In simulations, we demonstrate that the structural connectivity's influence on the partial coherence can be inferred using the AGL. Further, we show that fitting the partial coherence is superior to alternative methods at recovering the structural connectome, even after the source localization estimates required to map MEG from sensors to the cortex. Finally, we show how partial coherence can be used to explore how distinct parts of the structural connectome contribute to MEG functional connectivity in different frequency bands. Partial coherence offers better estimates of the strength of direct functional connections and consequently a potentially better estimate of network structure.
Affiliation(s)
- Anirudh Wodeyar
  - Department of Cognitive Sciences, University of California, Irvine, California, USA
  - Department of Statistics, University of California, Irvine, California, USA
  - Department of Biomedical Engineering, University of California, Irvine, California, USA
  - Department of Mathematics and Statistics, Boston University, Boston, Massachusetts, USA
- Ramesh Srinivasan
  - Department of Statistics, University of California, Irvine, California, USA
|
3
|
Sparse precision matrix estimation with missing observations. Comput Stat 2022. [DOI: 10.1007/s00180-022-01265-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
|
4
|
Loske P, Schelter BO. Inferring the underlying multivariate structure from bivariate networks with highly correlated nodes. Sci Rep 2022; 12:12486. [PMID: 35864116 PMCID: PMC9304421 DOI: 10.1038/s41598-022-16296-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2021] [Accepted: 07/07/2022] [Indexed: 11/09/2022] Open
Abstract
Complex systems are often described mathematically as networks. Inferring the actual interactions from observed dynamics of the nodes of the networks is a challenging inverse task. It is crucial to distinguish direct and indirect interactions to allow for a robust identification of the underlying network. If strong and weak links are simultaneously present in the observed network, typical multivariate approaches to address this challenge fail. By means of correlation and partial correlation, we illustrate the challenges that arise and demonstrate how to overcome these. The challenge of strong and weak links translates into ill-conditioned matrices that need to be inverted to obtain the partial correlations, and therefore the correct network topology. Our novel procedure enables robust identification of multivariate network topologies in the presence of highly correlated processes. In applications, this is crucial to avoid erroneous conclusions about network structures and characteristics. Our novel approach applies to other types of interaction measures between processes in a network.
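As a hedged illustration of the partial-correlation step this abstract refers to, the sketch below derives partial correlations from the inverse covariance (precision) matrix using the standard identity rho_ij|rest = -Theta_ij / sqrt(Theta_ii * Theta_jj). It is not the authors' procedure: the function name `partial_correlations` and the three-node chain are assumptions chosen only to show how an indirect link appears in the correlation but vanishes in the partial correlation.

```python
import numpy as np


def partial_correlations(cov):
    """Partial correlations from a covariance matrix via its inverse.

    Hypothetical helper for illustration; it assumes `cov` is invertible,
    which is exactly what fails to hold well for ill-conditioned matrices.
    """
    precision = np.linalg.inv(cov)
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Assumed toy example: a chain 0 - 1 - 2 with no direct link between 0 and 2,
# encoded by a tridiagonal precision matrix.
precision_true = np.array([
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 2.0],
])
cov = np.linalg.inv(precision_true)

# Ordinary (bivariate) correlations, for comparison.
std = np.sqrt(np.diag(cov))
corr = cov / np.outer(std, std)

pcor = partial_correlations(cov)

print("correlation(0, 2):        ", round(float(corr[0, 2]), 3))
print("partial correlation(0, 2):", round(float(pcor[0, 2]), 3))
```

In this toy case nodes 0 and 2 are correlated only through node 1, so the bivariate correlation is nonzero while the partial correlation is numerically zero. The matrix inversion is the step that becomes unstable when the covariance matrix is ill-conditioned.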
Affiliation(s)
- Philipp Loske
  - Aberdeen Biomedical Imaging Center, University of Aberdeen, Foresterhill, Aberdeen, UK.
- Bjoern O Schelter
  - TauRx Therapeutics Ltd., Aberdeen, UK
  - Institute for Complex Systems and Mathematical Biology, University of Aberdeen, Aberdeen, UK
|
5
|
Network structure from a characterization of interactions in complex systems. Sci Rep 2022; 12:11742. [PMID: 35817803 PMCID: PMC9273794 DOI: 10.1038/s41598-022-14397-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/31/2022] [Accepted: 06/06/2022] [Indexed: 11/29/2022] Open
Abstract
Many natural and man-made complex dynamical systems can be represented by networks with vertices representing system units and edges the coupling between vertices. If edges of such a structural network are inaccessible, a widely used approach is to identify them with interactions between vertices, thereby setting up a functional network. However, it is an unsolved issue if and to what extent important properties of a functional network on the global and the local scale match those of the corresponding structural network. We address this issue by deriving functional networks from characterizing interactions in paradigmatic oscillator networks with widely-used time-series-analysis techniques for various factors that alter the collective network dynamics. Surprisingly, we find that particularly key constituents of functional networks—as identified with betweenness and eigenvector centrality—coincide with ground truth to a high degree, while global topological and spectral properties—clustering coefficient, average shortest path length, assortativity, and synchronizability—clearly deviate. We obtain similar concurrences for an empirical network. Our findings are of relevance for various scientific fields and call for conceptual and methodological refinements to further our understanding of the relationship between structure and function of complex dynamical systems.
|
6
|
Huang Y, Kleindessner M, Munishkin A, Varshney D, Guo P, Wang J. Benchmarking of Data-Driven Causality Discovery Approaches in the Interactions of Arctic Sea Ice and Atmosphere. Front Big Data 2021; 4:642182. [PMID: 34505056 PMCID: PMC8421796 DOI: 10.3389/fdata.2021.642182] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/15/2020] [Accepted: 08/02/2021] [Indexed: 11/20/2022] Open
Abstract
The Arctic sea ice has retreated rapidly in the past few decades, which is believed to be driven by various dynamic and thermodynamic processes in the atmosphere. The newly open water resulting from sea ice decline in turn exerts a large influence on the atmosphere. Therefore, this study aims to investigate the causality between multiple atmospheric processes and sea ice variations using three distinct data-driven causality approaches that have been proposed recently: Temporal Causality Discovery Framework, Non-combinatorial Optimization via Trace Exponential and Augmented Lagrangian for Structure learning (NOTEARS), and Directed Acyclic Graph-Graph Neural Networks (DAG-GNN). We apply these three algorithms to 39 years of historical time-series data sets, which include 11 atmospheric variables from the ERA-5 reanalysis product and passive microwave satellite retrieved sea ice extent. A comparison of the causality graphs produced by these approaches with what we summarized from the literature shows that the static graphs produced by NOTEARS and DAG-GNN are relatively reasonable. The results from NOTEARS indicate that relative humidity and precipitation dominate sea ice changes among all variables, while the results from DAG-GNN suggest that the horizontal and meridional wind are more important for driving sea ice variations. However, both approaches produce some unrealistic cause-effect relationships. Additionally, these three methods do not detect well the delayed impact of one variable on another in the Arctic. It also turns out that the results are rather sensitive to the choice of hyperparameters of the three methods. As a pioneer study, this work paves the way to disentangle the complex causal relationships in the Earth system by taking advantage of cutting-edge Artificial Intelligence technologies.
Affiliation(s)
- Yiyi Huang
  - Department of Hydrology and Atmospheric Sciences, University of Arizona, Tucson, AZ, United States
- Matthäus Kleindessner
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, United States
- Alexey Munishkin
  - Department of Computer Science and Engineering, University of California Santa Cruz, Santa Cruz, CA, United States
- Debvrat Varshney
  - Department of Information Systems, University of Maryland, Baltimore, MD, United States
- Pei Guo
  - Department of Information Systems, University of Maryland, Baltimore, MD, United States
- Jianwu Wang
  - Department of Information Systems, University of Maryland, Baltimore, MD, United States
|
7
|
Avagyan V. D-Trace estimation of a precision matrix with eigenvalue control. COMMUN STAT-SIMUL C 2021. [DOI: 10.1080/03610918.2019.1580730] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
Affiliation(s)
- Vahe Avagyan
  - Biometris, Wageningen University and Research, Wageningen, The Netherlands
|
8
|
Wodeyar A, Cassidy JM, Cramer SC, Srinivasan R. Damage to the structural connectome reflected in resting-state fMRI functional connectivity. Netw Neurosci 2021; 4:1197-1218. [PMID: 33409436 PMCID: PMC7781612 DOI: 10.1162/netn_a_00160] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 03/12/2020] [Accepted: 07/21/2020] [Indexed: 11/04/2022] Open
Abstract
The relationship between structural and functional connectivity has been mostly examined in intact brains. Fewer studies have examined how differences in structure as a result of injury alter function. In this study we analyzed the relationship of structure to function across patients with stroke among whom infarcts caused heterogeneous structural damage. We estimated relationships between distinct brain regions of interest (ROIs) from functional MRI in two pipelines. In one analysis pipeline, we measured functional connectivity by using correlation and partial correlation between 114 cortical ROIs. We found fMRI-BOLD partial correlation was altered at more edges as a function of the structural connectome (SC) damage, relative to the correlation. In a second analysis pipeline, we limited our analysis to fMRI correlations between pairs of voxels for which we possess SC information. We found that voxel-level functional connectivity showed the effect of structural damage that we could not see when examining correlations between ROIs. Further, the effects of structural damage on functional connectivity are consistent with a model of functional connectivity, diffusion, which expects functional connectivity to result from activity spreading over multiple edge anatomical paths.
Affiliation(s)
- Anirudh Wodeyar
  - Department of Cognitive Sciences, University of California, Irvine, CA, USA
- Jessica M Cassidy
  - Department of Allied Health Sciences, University of North Carolina, Chapel Hill, NC, USA
- Steven C Cramer
  - Department of Neurology, University of California, Los Angeles, CA, USA
- Ramesh Srinivasan
  - Department of Cognitive Sciences, University of California, Irvine, CA, USA
|
9
|
Lehnertz K, Bröhl T, Rings T. The Human Organism as an Integrated Interaction Network: Recent Conceptual and Methodological Challenges. Front Physiol 2020; 11:598694. [PMID: 33408639 PMCID: PMC7779628 DOI: 10.3389/fphys.2020.598694] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 08/25/2020] [Accepted: 11/30/2020] [Indexed: 12/30/2022] Open
Abstract
The field of Network Physiology aims to advance our understanding of how physiological systems and sub-systems interact to generate a variety of behaviors and distinct physiological states, to optimize the organism's functioning, and to maintain health. Within this framework, which considers the human organism as an integrated network, vertices are associated with organs while edges represent time-varying interactions between vertices. Likewise, vertices may represent networks on smaller spatial scales leading to a complex mixture of interacting homogeneous and inhomogeneous networks of networks. Lacking adequate analytic tools and a theoretical framework to probe interactions within and among diverse physiological systems, current approaches focus on inferring properties of time-varying interactions (namely strength, direction, and functional form) from time-locked recordings of physiological observables. To this end, a variety of bivariate or, in general, multivariate time-series-analysis techniques, which are derived from diverse mathematical and physical concepts, are employed and the resulting time-dependent networks can then be further characterized with methods from network theory. Despite the many promising new developments, there are still problems that evade a satisfactory solution. Here we address several important challenges that could aid in finding new perspectives and inspire the development of theoretic and analytical concepts to deal with these challenges and in studying the complex interactions between physiological systems.
Affiliation(s)
- Klaus Lehnertz
  - Department of Epileptology, University of Bonn Medical Centre, Bonn, Germany
  - Helmholtz Institute for Radiation and Nuclear Physics, University of Bonn, Bonn, Germany
  - Interdisciplinary Center for Complex Systems, University of Bonn, Bonn, Germany
- Timo Bröhl
  - Department of Epileptology, University of Bonn Medical Centre, Bonn, Germany
  - Helmholtz Institute for Radiation and Nuclear Physics, University of Bonn, Bonn, Germany
- Thorsten Rings
  - Department of Epileptology, University of Bonn Medical Centre, Bonn, Germany
  - Helmholtz Institute for Radiation and Nuclear Physics, University of Bonn, Bonn, Germany
|
10
|
Hoang T, Lee J, Kim J. Differences in Dietary Patterns Identified by the Gaussian Graphical Model in Korean Adults With and Without a Self-Reported Cancer Diagnosis. J Acad Nutr Diet 2020; 121:1484-1496.e3. [PMID: 33288494 DOI: 10.1016/j.jand.2020.11.006] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 12/26/2019] [Revised: 11/04/2020] [Accepted: 11/10/2020] [Indexed: 01/02/2023]
Abstract
BACKGROUND The synergistic effect of food groups on health outcomes is better captured by examining dietary patterns (DPs) than single food groups. Regarding this issue, a Gaussian graphical model (GGM) can identify pairwise correlations between food groups and adjust for the remaining items. However, the application of GGMs in the nutritional field has not been widely investigated, especially in Korean adults. OBJECTIVE The aim of this study was to identify the major DPs of Korean adults by using a GGM and to examine the associations between the DP scores and prevalence of self-reported cancer. DESIGN This cross-sectional study used baseline data from the 2007-2019 Cancer Screenee Cohort of the National Cancer Center, Korea. PARTICIPANTS/SETTING In total, 10,777 Korean adults who completed a questionnaire regarding their general medical history, including clinical test results, and a validated food frequency questionnaire were included. MAIN OUTCOME MEASURES The main outcome measure was the prevalence of self-reported cancer at baseline. STATISTICAL ANALYSIS DP networks were identified using a GGM. The GGM-identified networks were scored and categorized into tertiles, and their association with the prevalence of self-reported cancer was investigated using a multivariable logistic regression model. RESULTS The GGM identified the following 4 DP networks: principal, oil-sweet, meat, and fruit. After adjusting for covariates, the odds of moderate and high consumption of foods in the oil-sweet DP for participants who self-reported cancer were 25% and 34% lower than those for participants who did not report a cancer diagnosis (odds ratio [OR] = 0.75, 95% confidence interval [CI] = 0.62-0.90 and OR = 0.66, 95% CI = 0.53-0.81, respectively). Additionally, the odds of meat DP consumption in the self-reported cancer group was 29% lower than in participants who did not report a cancer diagnosis (OR = 0.71 and 95% CI = 0.57-0.88). In contrast, an increase in the odds of fruit DP consumption was observed for self-reported cancer participants (OR = 1.34 and 95% CI = 1.09-1.65). Similar results were observed among the female but not the male subjects. CONCLUSIONS GGM is a novel method that can distinguish the direct pairwise correlation of food groups and control for the indirect effect of other foods. Future large-scale longitudinal population-based studies are needed to build on these findings in general populations.
|
11
|
The probabilistic backbone of data-driven complex networks: an example in climate. Sci Rep 2020; 10:11484. [PMID: 32661248 PMCID: PMC7359351 DOI: 10.1038/s41598-020-67970-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/14/2020] [Accepted: 06/17/2020] [Indexed: 11/08/2022] Open
Abstract
Complex systems often exhibit long-range correlations so that typical observables show statistical dependence across long distances. These teleconnections have a tremendous impact on the dynamics as they provide channels for information transport across the system and are particularly relevant in forecasting, control, and data-driven modeling of complex systems. These statistical interrelations among the very many degrees of freedom are usually represented by the so-called correlation network, constructed by establishing links between variables (nodes) with pairwise correlations above a given threshold. Here, with the climate system as an example, we revisit correlation networks from a probabilistic perspective and show that they unavoidably include much redundant information, resulting in overfitted probabilistic (Gaussian) models. As an alternative, we propose here the use of more sophisticated probabilistic Bayesian networks, developed by the machine learning community, as a data-driven modeling and prediction tool. Bayesian networks are built from data including only the (pairwise and conditional) dependencies among the variables needed to explain the data (i.e., maximizing the likelihood of the underlying probabilistic Gaussian model). This results in much simpler, sparser, non-redundant networks that still encode the complex structure of the dataset as revealed by standard complex measures. Moreover, the networks are capable of generalizing to new data and constitute a truly probabilistic backbone of the system. When applied to climate data, it is shown that Bayesian networks faithfully reveal the various long-range teleconnections relevant in the dataset, in particular those emerging in El Niño periods.
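The correlation-network construction this abstract describes (link two variables when their pairwise correlation exceeds a threshold) can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `correlation_network`, the use of the absolute correlation, the toy data, and the threshold of 0.5 are all assumptions.

```python
import numpy as np


def correlation_network(data, threshold):
    """Boolean adjacency matrix of a thresholded correlation network.

    `data` has one row per observation and one column per variable (node).
    Using the absolute value of the correlation is an assumption made here.
    """
    corr = np.corrcoef(data, rowvar=False)
    adjacency = np.abs(corr) > threshold
    np.fill_diagonal(adjacency, False)  # no self-links
    return adjacency


# Assumed toy data: column 1 is exactly twice column 0, column 2 is
# only weakly (negatively) related to both.
data = np.array([
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 1.0],
    [3.0, 6.0, 4.0],
    [4.0, 8.0, 2.0],
    [5.0, 10.0, 3.0],
])

adjacency = correlation_network(data, threshold=0.5)

# List each undirected link once (upper triangle only).
edges = [(int(i), int(j)) for i, j in np.argwhere(np.triu(adjacency, k=1))]
print(edges)
```

Every pair is tested independently of all other variables, which is why such networks tend to accumulate redundant links as the threshold is lowered.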
|
12
|
Huang WK, Cooley DS, Ebert-Uphoff I, Chen C, Chatterjee S. New Exploratory Tools for Extremal Dependence: χ Networks and Annual Extremal Networks. JOURNAL OF AGRICULTURAL, BIOLOGICAL AND ENVIRONMENTAL STATISTICS 2019. [DOI: 10.1007/s13253-019-00356-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
|
13
|
Cecchini G, Thiel M, Schelter B, Sommerlade L. Improving network inference: The impact of false positive and false negative conclusions about the presence or absence of links. J Neurosci Methods 2018; 307:31-36. [PMID: 29959000 DOI: 10.1016/j.jneumeth.2018.06.011] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 02/19/2018] [Revised: 06/19/2018] [Accepted: 06/19/2018] [Indexed: 11/25/2022]
Abstract
BACKGROUND A reliable inference of networks from data is of key interest in the Neurosciences. Several methods have been suggested in the literature to reliably determine links in a network. To decide about the presence of links, these techniques rely on statistical inference, typically controlling the number of false positives, paying little attention to false negatives. NEW METHOD In this paper, by means of a comprehensive simulation study, we analyse the influence of false positive and false negative conclusions about the presence or absence of links in a network on the network topology. We show that different values to balance false positive and false negative conclusions about links should be used in order to reliably estimate network characteristics. We propose to run careful simulation studies prior to making potentially erroneous conclusions about the network topology. RESULTS Our analysis shows that optimal values to balance false positive and false negative conclusions about links depend on the network topology and the characteristic of interest. COMPARISON WITH EXISTING METHODS Existing methods rely on a choice of the rate for false positive conclusions. They aim to be sure about individual links rather than the entire network. The rate of false negative conclusions is typically not investigated. CONCLUSIONS Our investigation shows that the balance of false positive and false negative conclusions about links in a network has to be tuned for any network topology that is to be estimated. Moreover, within the same network topology, the results are qualitatively the same for each network characteristic, but the actual values leading to reliable estimates of the characteristics are different.
Affiliation(s)
- Gloria Cecchini
  - Institute for Complex Systems and Mathematical Biology, University of Aberdeen, Meston Building, Meston Walk, Aberdeen AB24 3UE, United Kingdom; Institute of Physics and Astronomy, University of Potsdam, Campus Golm, Karl-Liebknecht-Straße 24/25, 14476 Potsdam-Golm, Germany.
- Marco Thiel
  - Institute for Complex Systems and Mathematical Biology, University of Aberdeen, Meston Building, Meston Walk, Aberdeen AB24 3UE, United Kingdom.
- Björn Schelter
  - Institute for Complex Systems and Mathematical Biology, University of Aberdeen, Meston Building, Meston Walk, Aberdeen AB24 3UE, United Kingdom.
- Linda Sommerlade
  - Institute for Complex Systems and Mathematical Biology, University of Aberdeen, Meston Building, Meston Walk, Aberdeen AB24 3UE, United Kingdom.
|
14
|
Sanz-Garcia A, Rings T, Lehnertz K. Impact of type of intracranial EEG sensors on link strengths of evolving functional brain networks. Physiol Meas 2018; 39:074003. [PMID: 29932428 DOI: 10.1088/1361-6579/aace94] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 11/11/2022]
Abstract
OBJECTIVE AND APPROACH Investigating properties of evolving functional brain networks has become a valuable tool to characterize the complex dynamics of the epileptic brain. Such networks are usually derived from electroencephalograms (EEG) recorded with sensors implanted chronically into deeper structures of the brain and/or placed onto the cortex. It is still unclear, however, whether the use of different sensors for an identification of network nodes affects properties of functional brain networks. We address this question by investigating properties of links of such networks that we characterize by assessing interactions in multi-sensor, multi-day EEG data recorded from 49 epilepsy patients during presurgical evaluation. These data allow us to study the impact of different types of sensors together with the impact of various physiologic and pathophysiologic activities on the properties of links. MAIN RESULTS We observe that different types of sensors differently impact on spatial means and temporal fluctuations of link strengths. Moreover, the impact depends on the relative anatomical location of sensors with respect to location and extent of sources of the prevailing activities. SIGNIFICANCE Type and location of sensors should be considered when constructing networks.
Affiliation(s)
- Ancor Sanz-Garcia
  - Instituto de Investigacion Sanitaria, Hospital Universitario De La Princesa, C/Diego de Leon 62, 28006 Madrid, Spain
|
15
|
|
16
|
Barfuss W, Massara GP, Di Matteo T, Aste T. Parsimonious modeling with information filtering networks. Phys Rev E 2016; 94:062306. [PMID: 28085404 DOI: 10.1103/physreve.94.062306] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 06/25/2016] [Indexed: 06/06/2023]
Abstract
We introduce a methodology to construct parsimonious probabilistic models. This method makes use of information filtering networks to produce a robust estimate of the global sparse inverse covariance from a simple sum of local inverse covariances computed on small subparts of the network. Being based on local and low-dimensional inversions, this method is computationally very efficient and statistically robust, even for the estimation of inverse covariance of high-dimensional, noisy, and short time series. Applied to financial data, our method is computationally more efficient than state-of-the-art methodologies such as Glasso, producing, in a fraction of the computation time, models that can have equivalent or better performance but with a sparser inference structure. We also discuss performance with sparse factor models, where we notice that relative performance decreases with the number of factors. The local nature of this approach allows us to perform computations in parallel and provides a tool for dynamical adaptation by partial updating when the properties of some variables change, without the need to recompute the whole model. This makes the approach particularly suitable for handling big data sets with large numbers of variables. Examples of practical application for forecasting, stress testing, and risk allocation in financial systems are also provided.
Affiliation(s)
- Wolfram Barfuss
  - Department of Physics, FAU Erlangen-Nürnberg, Nägelsbachstrasse 49b, 91052 Erlangen, Germany
- Guido Previde Massara
  - Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, United Kingdom
- T Di Matteo
  - Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, United Kingdom
  - Department of Mathematics, King's College London, The Strand, London, WC2R 2LS, United Kingdom
  - Systemic Risk Centre, London School of Economics and Political Sciences, London, WC2A2AE, United Kingdom
- Tomaso Aste
  - Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, United Kingdom
  - Systemic Risk Centre, London School of Economics and Political Sciences, London, WC2A2AE, United Kingdom
|
17
|
Rings T, Lehnertz K. Distinguishing between direct and indirect directional couplings in large oscillator networks: Partial or non-partial phase analyses? CHAOS (WOODBURY, N.Y.) 2016; 26:093106. [PMID: 27781446 DOI: 10.1063/1.4962295] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 06/06/2023]
Abstract
We investigate the relative merit of phase-based methods for inferring directional couplings in complex networks of weakly interacting dynamical systems from multivariate time-series data. We compare the evolution map approach and its partialized extension to each other with respect to their ability to correctly infer the network topology in the presence of indirect directional couplings for various simulated experimental situations using coupled model systems. In addition, we investigate whether the partialized approach allows for additional or complementary indications of directional interactions in evolving epileptic brain networks using intracranial electroencephalographic recordings from an epilepsy patient. For such networks, both direct and indirect directional couplings can be expected, given the brain's connection structure and effects that may arise from limitations inherent to the recording technique. Our findings indicate that particularly in larger networks (number of nodes ≫10), the partialized approach does not provide information about directional couplings extending the information gained with the evolution map approach.
Affiliation(s)
- Thorsten Rings
- Department of Epileptology, University of Bonn, Sigmund-Freud-Straße 25, 53105 Bonn, Germany
- Klaus Lehnertz
- Department of Epileptology, University of Bonn, Sigmund-Freud-Straße 25, 53105 Bonn, Germany
18
Unravelling the community structure of the climate system by using lags and symbolic time-series analysis. Sci Rep 2016; 6:29804. [PMID: 27406342 PMCID: PMC4942694 DOI: 10.1038/srep29804] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 04/06/2016] [Accepted: 06/20/2016] [Indexed: 11/24/2022] Open
Abstract
Many natural systems can be represented by complex networks of dynamical units with modular structure in the form of communities of densely interconnected nodes. Unraveling this community structure from observed data requires the development of appropriate tools, particularly when the nodes are embedded in a regular space grid and the datasets are short and noisy. Here we propose two methods to identify communities, and validate them with the analysis of climate datasets recorded at a regular grid of geographical locations covering the Earth's surface. By identifying mutual lags among time-series recorded at different grid points, and by applying symbolic time-series analysis, we are able to extract meaningful regional communities, which can be interpreted in terms of large-scale climate phenomena. The methods proposed here are valuable tools for the study of other systems represented by networks of dynamical units, allowing the identification of communities through time-series analysis of the observed output signals.
19
Iqbal K, Buijsse B, Wirth J, Schulze MB, Floegel A, Boeing H. Gaussian Graphical Models Identify Networks of Dietary Intake in a German Adult Population. J Nutr 2016; 146:646-52. [PMID: 26817715 DOI: 10.3945/jn.115.221135] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 07/29/2015] [Accepted: 12/17/2015] [Indexed: 11/14/2022] Open
Abstract
BACKGROUND: Data-reduction methods such as principal component analysis are often used to derive dietary patterns. However, such methods do not assess how foods are consumed in relation to each other. Gaussian graphical models (GGMs) are a set of novel methods that can address this issue.
OBJECTIVE: We sought to apply GGMs to derive sex-specific dietary intake networks representing consumption patterns in a German adult population.
METHODS: Dietary intake data from 10,780 men and 16,340 women of the European Prospective Investigation into Cancer and Nutrition (EPIC)-Potsdam cohort were cross-sectionally analyzed to construct dietary intake networks. Food intake for each participant was estimated using a 148-item food-frequency questionnaire that captured the intake of 49 food groups. GGMs were applied to log-transformed intakes (grams per day) of 49 food groups to construct sex-specific food networks. Semiparametric Gaussian copula graphical models (SGCGMs) were used to confirm GGM results.
RESULTS: In men, GGMs identified 1 major dietary network that consisted of intakes of red meat, processed meat, cooked vegetables, sauces, potatoes, cabbage, poultry, legumes, mushrooms, soup, and whole-grain and refined breads. For women, a similar network was identified with the addition of fried potatoes. Other identified networks consisted of dairy products and sweet food groups. SGCGMs yielded results comparable to those of GGMs.
CONCLUSIONS: GGMs are a powerful exploratory method that can be used to construct dietary networks representing dietary intake patterns that reveal how foods are consumed in relation to each other. GGMs indicated an apparent major role of red meat intake in a consumption pattern in the studied population. In the future, identified networks might be transformed into pattern scores for investigating their associations with health outcomes.
Affiliation(s)
- Matthias B Schulze
- Molecular Epidemiology, German Institute of Human Nutrition Potsdam-Rehbruecke, Nuthetal, Germany; and German Center for Diabetes Research, Neuherberg, Germany
20
Runge J. Quantifying information transfer and mediation along causal pathways in complex systems. Physical Review. E, Statistical, Nonlinear, and Soft Matter Physics 2015; 92:062829. [PMID: 26764766 DOI: 10.1103/physreve.92.062829] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 08/10/2015] [Indexed: 06/05/2023]
Abstract
Measures of information transfer have become a popular approach to analyze interactions in complex systems such as the Earth or the human brain from measured time series. Recent work has focused on causal definitions of information transfer aimed at decompositions of predictive information about a target variable, while excluding effects of common drivers and indirect influences. While common drivers clearly constitute a spurious causality, the aim of the present article is to develop measures quantifying different notions of the strength of information transfer along indirect causal paths, based on first reconstructing the multivariate causal network. Another class of novel measures quantifies to what extent different intermediate processes on causal paths contribute to an interaction mechanism to determine pathways of causal information transfer. The proposed framework complements predictive decomposition schemes by focusing more on the interaction mechanism between multiple processes. A rigorous mathematical framework allows for a clear information-theoretic interpretation that can also be related to the underlying dynamics as proven for certain classes of processes. Generally, however, estimates of information transfer remain hard to interpret for nonlinearly intertwined complex systems. But if experiments or mathematical models are not available, then measuring pathways of information transfer within the causal dependency structure allows at least for an abstraction of the dynamics. The measures are illustrated on a climatological example to disentangle pathways of atmospheric flow over Europe.
Affiliation(s)
- Jakob Runge
- Potsdam Institute for Climate Impact Research, P. O. Box 60 12 03, 14412 Potsdam, Germany and Department of Physics, Humboldt University, Newtonstr. 15, 12489 Berlin, Germany
21
Ebert-Uphoff I, Deng Y. Identifying Physical Interactions from Climate Data: Challenges and Opportunities. Comput Sci Eng 2015. [DOI: 10.1109/mcse.2015.129] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022]
22
Networks: On the relation of bi- and multivariate measures. Sci Rep 2015; 5:10805. [PMID: 26042994 PMCID: PMC4455284 DOI: 10.1038/srep10805] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 01/14/2015] [Accepted: 04/28/2015] [Indexed: 12/03/2022] Open
Abstract
A reliable inference of networks from observations of the nodes' dynamics is a major challenge in physics. Interdependence measures such as the correlation coefficient or more advanced methods based on, e.g., analytic phases of signals are employed. For several of these interdependence measures, multivariate counterparts exist that promise to enable distinguishing direct and indirect connections. Here, we demonstrate analytically how bivariate measures relate to the respective multivariate ones; this knowledge will in turn be used to demonstrate the implications of thresholded bivariate measures for network inference. In particular, we show that random networks are falsely identified as small-world networks if observations thereof are treated by bivariate methods. We will employ the correlation coefficient as an example for such an interdependence measure. The results can be readily transferred to all interdependence measures partializing for information of thirds in their multivariate counterparts.
23
Deza JI, Barreiro M, Masoller C. Assessing the direction of climate interactions by means of complex networks and information theoretic tools. Chaos (Woodbury, N.Y.) 2015; 25:033105. [PMID: 25833427 DOI: 10.1063/1.4914101] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 06/04/2023]
Abstract
An estimate of the net direction of climate interactions in different geographical regions is made by constructing a directed climate network from a regular latitude-longitude grid of nodes, using a directionality index (DI) based on conditional mutual information (CMI). Two datasets of surface air temperature anomalies (one monthly averaged and another daily averaged) are analyzed and compared. The network links are interpreted in terms of known atmospheric tropical and extra-tropical variability patterns. Specific and relevant geographical regions are selected, the net direction of propagation of the atmospheric patterns is analyzed, and the direction of the inferred links is validated by recovering some well-known climate variability structures. These patterns are found to be acting at various time-scales, such as atmospheric waves in the extratropics or longer range events in the tropics. This analysis demonstrates the capability of the DI measure to infer the net direction of climate interactions and may contribute to improve the present understanding of climate phenomena and climate predictability. The work presented here also stands out as an application of advanced tools to the analysis of empirical, real-world data.
Affiliation(s)
- J I Deza
- Departament de Física i Enginyeria Nuclear, Universitat Politècnica de Catalunya, Colom 11, E-08222 Terrassa, Barcelona, Spain
- M Barreiro
- Instituto de Física, Facultad de Ciencias, Universidad de la República, Iguá 4225, Montevideo, Uruguay
- C Masoller
- Departament de Física i Enginyeria Nuclear, Universitat Politècnica de Catalunya, Colom 11, E-08222 Terrassa, Barcelona, Spain
24
Porz S, Kiel M, Lehnertz K. Can spurious indications for phase synchronization due to superimposed signals be avoided? Chaos (Woodbury, N.Y.) 2014; 24:033112. [PMID: 25273192 DOI: 10.1063/1.4890568] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Indexed: 06/03/2023]
Abstract
We investigate the relative merit of phase-based methods (mean phase coherence, unweighted and weighted phase lag index) for estimating the strength of interactions between dynamical systems from empirical time series which are affected by common sources and noise. By numerically analyzing the interaction dynamics of coupled model systems, we compare these methods to each other with respect to their ability to distinguish between different levels of coupling for various simulated experimental situations. We complement our numerical studies by investigating consistency and temporal variations of the strength of interactions within and between brain regions using intracranial electroencephalographic recordings from an epilepsy patient. Our findings indicate that the unweighted and weighted phase lag index are less prone to the influence of common sources but that this advantage may lead to constrictions limiting the applicability of these methods.
Affiliation(s)
- Stephan Porz
- Department of Epileptology, University of Bonn, Sigmund-Freud-Str. 25, 53105 Bonn, Germany
- Matthäus Kiel
- Department of Epileptology, University of Bonn, Sigmund-Freud-Str. 25, 53105 Bonn, Germany
- Klaus Lehnertz
- Department of Epileptology, University of Bonn, Sigmund-Freud-Str. 25, 53105 Bonn, Germany