1
Bayesian estimation of directed functional coupling from brain recordings. PLoS One 2017; 12:e0177359. [PMID: 28545066 PMCID: PMC5436686 DOI: 10.1371/journal.pone.0177359] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/26/2016] [Accepted: 04/14/2017] [Indexed: 02/05/2023] Open
Abstract
In many fields of science, there is a need to assess the causal influences among time series. In neuroscience especially, understanding the causal interactions between brain regions is of primary importance. A family of measures has been developed from the parametric implementation of the Granger criteria of causality, based on linear autoregressive modelling of the signals. We propose a new Bayesian method for linear model identification with a structured prior (GMEP), intended as the linear regression method in parametric Granger causal inference. GMEP assumes a Gaussian scale mixture distribution for the group sparsity prior and allows flexible definition of the coefficient groups. Approximate posterior inference is achieved using Expectation Propagation for both the linear coefficients and the hyperparameters. GMEP is investigated on both simulated data and empirical fMRI data, where we show how adding information on the sparsity structure of the coefficients improves the inference process. In the same simulation framework, GMEP is compared with other standard linear regression methods. Moreover, the causal inferences derived from GMEP estimates and from a standard Granger method are compared across simulated datasets of different dimensionality, connection density and noise level. GMEP allows better model identification, and consequently better causal inference, when prior knowledge of the sparsity structure is integrated into the structured prior.
2
Jia G, Stephanopoulos G, Gunawan R. Incremental parameter estimation of kinetic metabolic network models. BMC SYSTEMS BIOLOGY 2012; 6:142. [PMID: 23171810 PMCID: PMC3568022 DOI: 10.1186/1752-0509-6-142] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 06/26/2012] [Accepted: 11/07/2012] [Indexed: 11/10/2022]
Abstract
BACKGROUND An efficient and reliable parameter estimation method is essential for the creation of biological models using ordinary differential equations (ODEs). Most existing estimation methods search for the global minimum of the data-fitting residuals over the entire parameter space simultaneously. Unfortunately, the associated computational requirement often becomes prohibitively high because of the large number of parameters and the lack of complete parameter identifiability (i.e. not all parameters can be uniquely identified). RESULTS In this work, an incremental approach was applied to the parameter estimation of ODE models from concentration time profiles. In particular, the method was developed to address a circumstance commonly encountered in the modeling of metabolic networks, where the number of metabolic fluxes (reaction rates) exceeds the number of metabolites (chemical species). Here, the minimization of model residuals was performed over a subset of the parameter space that is associated with the degrees of freedom in the dynamic flux estimation from the concentration time-slopes. The efficacy of this method was demonstrated using two generalized mass action (GMA) models, where the method significantly outperformed single-step estimations. In addition, an extension of the estimation method to handle missing data is presented. CONCLUSIONS The proposed incremental estimation method is able to address the lack of complete parameter identifiability and to significantly reduce the computational effort of estimating model parameters, which will facilitate kinetic modeling of genome-scale cellular metabolism in the future.
Affiliation(s)
- Gengjie Jia
- Chemical and Pharmaceutical Engineering, Singapore-MIT Alliance, Singapore 117576, Singapore
3
FUJITA ANDRÉ, SATO JOÃO RICARDO, KOJIMA KANAME, GOMES LUCIANA RODRIGUES, NAGASAKI MASAO, SOGAYAR MARI CLEIDE, MIYANO SATORU. IDENTIFICATION OF GRANGER CAUSALITY BETWEEN GENE SETS. J Bioinform Comput Biol 2011. [DOI: 10.1142/s0219720010004860] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/18/2022]
Abstract
Wiener and Granger introduced an intuitive concept of causality (Granger causality) between two variables, based on the idea that an effect never occurs before its cause. Later, Geweke generalized this concept to multivariate Granger causality, i.e. n variables Granger-cause another variable. Although Granger causality is not "effective causality" in the Aristotelian sense, the concept is useful for inferring directionality and information flow in observational data. Granger causality is usually identified using VAR (Vector Autoregressive) models because of their simplicity. In the last few years, several VAR-based models have been presented to model gene regulatory networks. Here, we generalize the multivariate Granger causality concept in order to identify Granger causalities between sets of gene expressions, i.e. whether a set of n genes Granger-causes another set of m genes, aiming to identify the flow of information between gene networks (or pathways). The concept of Granger causality for sets of variables is presented. Moreover, a method for its identification with a bootstrap test is proposed. This method is applied to simulated and to actual biological gene expression data in order to model regulatory networks. This concept may be useful for understanding the complete information flow from one network or pathway to another, mainly in regulatory networks. Linking this concept to graph theory, sink and source can be generalized to node sets. Moreover, hub and centrality for sets of genes can be defined based on total information flow. Another application is in annotation, when the functionality of a set of genes is unknown but this set is Granger-caused by another, well-studied set of genes. This information may therefore be useful to infer or construct hypotheses about the unknown set of genes.
Affiliation(s)
- ANDRÉ FUJITA
- Computational Science Research Program, RIKEN, 2-1, Hirosawa, Wako, Saitama, 351-0198, Japan
- JOÃO RICARDO SATO
- Center of Mathematics, Computation and Cognition, Universidade Federal do ABC, Rua Santa Adélia, 166 – Santo André, Brazil
- KANAME KOJIMA
- Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo, 108-8639, Japan
- LUCIANA RODRIGUES GOMES
- Chemistry Institute, University of São Paulo, Av. Lineu Prestes, 748 – São Paulo, 05508-900, Brazil
- MASAO NAGASAKI
- Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo, 108-8639, Japan
- MARI CLEIDE SOGAYAR
- Chemistry Institute, University of São Paulo, Av. Lineu Prestes, 748 – São Paulo, 05508-900, Brazil
- SATORU MIYANO
- Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo, 108-8639, Japan
4
Krishna R, Li CT, Buchanan-Wollaston V. A temporal precedence based clustering method for gene expression microarray data. BMC Bioinformatics 2010; 11:68. [PMID: 20113513 PMCID: PMC2841598 DOI: 10.1186/1471-2105-11-68] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 08/12/2009] [Accepted: 01/30/2010] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND Time-course microarray experiments can produce useful data which can help in understanding the underlying dynamics of the system. Clustering is an important stage in microarray data analysis where the data is grouped together according to certain characteristics. The majority of clustering techniques are based on distance or visual similarity measures which may not be suitable for clustering of temporal microarray data where the sequential nature of time is important. We present a Granger causality based technique to cluster temporal microarray gene expression data, which measures the interdependence between two time-series by statistically testing if one time-series can be used for forecasting the other time-series or not. RESULTS A gene-association matrix is constructed by testing temporal relationships between pairs of genes using the Granger causality test. The association matrix is further analyzed using a graph-theoretic technique to detect highly connected components representing interesting biological modules. We test our approach on synthesized datasets and real biological datasets obtained for Arabidopsis thaliana. We show the effectiveness of our approach by analyzing the results using the existing biological literature. We also report interesting structural properties of the association network commonly desired in any biological system. CONCLUSIONS Our experiments on synthesized and real microarray datasets show that our approach produces encouraging results. The method is simple in implementation and is statistically traceable at each step. The method can produce sets of functionally related genes which can be further used for reverse-engineering of gene circuits.
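The pipeline described in this abstract (pairwise Granger tests collected into a gene-association matrix, followed by a graph analysis) can be sketched roughly as below. This is an illustrative sketch and not the authors' implementation: the toy data, the helper names `granger_f`, `association_matrix` and `connected_components`, the lag of 1, and the fixed F-statistic cutoff `f_threshold` are all assumptions made here, and plain connected components of the undirected graph stand in for the "highly connected components" step, which the abstract does not specify.

```python
import numpy as np

def granger_f(target, source, lag=1):
    """F statistic for 'source Granger-causes target' (pairwise, OLS-based)."""
    n = len(target)
    y = target[lag:]
    own = np.column_stack([target[lag - k:n - k] for k in range(1, lag + 1)])
    other = np.column_stack([source[lag - k:n - k] for k in range(1, lag + 1)])
    ones = np.ones((n - lag, 1))
    x_restricted = np.hstack([ones, own])            # own past only
    x_full = np.hstack([ones, own, other])           # own past + source past
    rss_r = np.sum((y - x_restricted @ np.linalg.lstsq(x_restricted, y, rcond=None)[0]) ** 2)
    rss_u = np.sum((y - x_full @ np.linalg.lstsq(x_full, y, rcond=None)[0]) ** 2)
    df_den = (n - lag) - x_full.shape[1]
    return ((rss_r - rss_u) / lag) / (rss_u / df_den)

def association_matrix(data, lag=1, f_threshold=15.0):
    """assoc[i, j] is True when series i helps forecast series j.

    f_threshold is a hypothetical fixed cutoff; a real analysis would use a
    p-value with multiple-testing correction instead.
    """
    g = data.shape[1]
    assoc = np.zeros((g, g), dtype=bool)
    for i in range(g):
        for j in range(g):
            if i != j:
                assoc[i, j] = granger_f(data[:, j], data[:, i], lag) > f_threshold
    return assoc

def connected_components(assoc):
    """Connected components of the undirected version of the association graph."""
    undirected = assoc | assoc.T
    unseen = set(range(len(assoc)))
    comps = []
    while unseen:
        start = unseen.pop()
        comp, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for nb in map(int, np.flatnonzero(undirected[node])):
                if nb in unseen:
                    unseen.remove(nb)
                    comp.add(nb)
                    stack.append(nb)
        comps.append(sorted(comp))
    return comps

# Toy data (assumed): series 0 drives series 1, series 2 drives series 3.
rng = np.random.default_rng(1)
n = 400
data = np.zeros((n, 4))
for t in range(1, n):
    e = rng.normal(size=4)
    data[t, 0] = 0.5 * data[t - 1, 0] + e[0]
    data[t, 1] = 0.3 * data[t - 1, 1] + 0.8 * data[t - 1, 0] + e[1]
    data[t, 2] = 0.5 * data[t - 1, 2] + e[2]
    data[t, 3] = 0.3 * data[t - 1, 3] + 0.8 * data[t - 1, 2] + e[3]

assoc = association_matrix(data)
components = connected_components(assoc)
print(assoc.astype(int))
print(components)
```

Treating the directed association matrix as undirected before grouping is a simplification chosen here for brevity; it discards the direction of temporal precedence that the pairwise tests provide.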
Affiliation(s)
- Ritesh Krishna
- Department of Computer Science, Warwick University, Coventry CV4 7AL, UK
- Chang-Tsun Li
- Department of Computer Science, Warwick University, Coventry CV4 7AL, UK
5
Granger Causality in Systems Biology: Modeling Gene Networks in Time Series Microarray Data Using Vector Autoregressive Models. ADVANCES IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY 2010. [DOI: 10.1007/978-3-642-15060-9_2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 12/04/2022]
6
Abstract
Granger causality (GC) and its extensions have been widely used to infer causal relationships from multivariate time series generated by biological systems. GC is ideally suited for causal inference in a bivariate vector autoregressive (VAR) process. A zero upper or lower off-diagonal element in a bivariate VAR indicates the lack of a causal relationship in that direction, resulting in truly acyclic structures. However, in experimental settings, statistical tests such as the F-test, which rely on the ratio of the mean-squared forecast errors, are used to infer significant GC relationships. The present study investigates acyclic approximations within the context of bi-directional two-gene network motifs modeled as a bivariate VAR. It examines how the interplay between the model parameters of the bivariate VAR, namely (i) transcriptional noise variance, (ii) autoregulatory feedback, and (iii) transcriptional coupling strength, can give rise to discrepancies in the ratio of the mean-squared forecast errors. Subsequently, their impact on statistical power is investigated using Monte Carlo simulations. More importantly, it is shown that one can arrive at acyclic approximations even for bi-directional networks for a suitable choice of process parameters, significance level and sample size. While the results are discussed within the framework of transcriptional networks, the analytical treatment provided is generic and likely to be relevant across distinct paradigms.
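To make the F-test on forecast errors concrete, the following minimal sketch simulates a bivariate VAR(1) and computes the Granger F statistic in each direction. It is an illustration under assumptions made here, not the study's code: the function names `simulate_var1` and `granger_f`, the parameter values, the lag of 1 and the sample size are all hypothetical choices. The coefficient matrix entries play the roles named in the abstract: the diagonal terms are autoregulatory feedback, the off-diagonal terms are coupling strengths, and `noise_sd` sets the noise variance.

```python
import numpy as np

def simulate_var1(a11, a12, a21, a22, n, noise_sd=1.0, seed=0):
    """Simulate z[t] = A z[t-1] + noise, with z = (x, y).

    a21 is the coupling from x to y, a12 the coupling from y to x.
    """
    rng = np.random.default_rng(seed)
    A = np.array([[a11, a12], [a21, a22]])
    z = np.zeros((n, 2))
    for t in range(1, n):
        z[t] = A @ z[t - 1] + rng.normal(0.0, noise_sd, size=2)
    return z

def granger_f(target, source, lag=1):
    """F statistic comparing forecasts of target with and without source's past."""
    n = len(target)
    y = target[lag:]
    own = np.column_stack([target[lag - k:n - k] for k in range(1, lag + 1)])
    other = np.column_stack([source[lag - k:n - k] for k in range(1, lag + 1)])
    ones = np.ones((n - lag, 1))
    x_restricted = np.hstack([ones, own])      # restricted model: own past only
    x_full = np.hstack([ones, own, other])     # full model: own past + source past
    rss_r = np.sum((y - x_restricted @ np.linalg.lstsq(x_restricted, y, rcond=None)[0]) ** 2)
    rss_u = np.sum((y - x_full @ np.linalg.lstsq(x_full, y, rcond=None)[0]) ** 2)
    df_num = lag
    df_den = (n - lag) - x_full.shape[1]
    return ((rss_r - rss_u) / df_num) / (rss_u / df_den)

# Assumed example: x drives y (a21 = 0.5), y does not drive x (a12 = 0).
z = simulate_var1(a11=0.5, a12=0.0, a21=0.5, a22=0.5, n=500)
x, y = z[:, 0], z[:, 1]
f_x_to_y = granger_f(target=y, source=x)
f_y_to_x = granger_f(target=x, source=y)
print("F (x -> y):", f_x_to_y)
print("F (y -> x):", f_y_to_x)
```

Each statistic would then be compared against a critical value of the F distribution with `df_num` and `df_den` degrees of freedom. Setting `a12` to a small nonzero value and varying `noise_sd`, the diagonal terms or `n` is one way to explore when a weak reverse coupling fails to reach significance, which is the kind of acyclic approximation the abstract describes.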