1
Tian ZQK, Chen K, Li S, McLaughlin DW, Zhou D. Causal connectivity measures for pulse-output network reconstruction: Analysis and applications. Proc Natl Acad Sci U S A 2024; 121:e2305297121. [PMID: 38551842 PMCID: PMC10998614 DOI: 10.1073/pnas.2305297121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Accepted: 03/03/2024] [Indexed: 04/08/2024]
Abstract
The causal connectivity of a network is often inferred to understand network function. It is arguably acknowledged that the inferred causal connectivity relies on the causality measure one applies, and it may differ from the network's underlying structural connectivity. However, the interpretation of causal connectivity remains to be fully clarified, in particular, how causal connectivity depends on causality measures and how causal connectivity relates to structural connectivity. Here, we focus on nonlinear networks with pulse signals as measured output, e.g., neural networks with spike output, and address the above issues based on four commonly utilized causality measures, i.e., time-delayed correlation coefficient, time-delayed mutual information, Granger causality, and transfer entropy. We theoretically show how these causality measures are related to one another when applied to pulse signals. Taking a simulated Hodgkin-Huxley network and a real mouse brain network as two illustrative examples, we further verify the quantitative relations among the four causality measures and demonstrate that the causal connectivity inferred by any of the four well coincides with the underlying network structural connectivity, therefore illustrating a direct link between the causal and structural connectivity. We stress that the structural connectivity of pulse-output networks can be reconstructed pairwise without conditioning on the global information of all other nodes in a network, thus circumventing the curse of dimensionality. Our framework provides a practical and effective approach for pulse-output network reconstruction.
Affiliation(s)
- Zhong-qi K. Tian
  - School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai 200240, China
  - Institute of Natural Sciences, Shanghai Jiao Tong University, Shanghai 200240, China
  - Ministry of Education Key Laboratory of Scientific and Engineering Computing, Shanghai Jiao Tong University, Shanghai 200240, China
- Kai Chen
  - School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai 200240, China
  - Institute of Natural Sciences, Shanghai Jiao Tong University, Shanghai 200240, China
  - Ministry of Education Key Laboratory of Scientific and Engineering Computing, Shanghai Jiao Tong University, Shanghai 200240, China
- Songting Li
  - School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai 200240, China
  - Institute of Natural Sciences, Shanghai Jiao Tong University, Shanghai 200240, China
  - Ministry of Education Key Laboratory of Scientific and Engineering Computing, Shanghai Jiao Tong University, Shanghai 200240, China
- David W. McLaughlin
  - Courant Institute of Mathematical Sciences, New York University, New York, NY 10012
  - Center for Neural Science, New York University, New York, NY 10012
  - Institute of Mathematical Sciences, New York University Shanghai, Shanghai 200122, China
  - Neuroscience Institute of New York University Langone Health, New York University, New York, NY 10016
- Douglas Zhou
  - School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai 200240, China
  - Institute of Natural Sciences, Shanghai Jiao Tong University, Shanghai 200240, China
  - Ministry of Education Key Laboratory of Scientific and Engineering Computing, Shanghai Jiao Tong University, Shanghai 200240, China
  - Shanghai Frontier Science Center of Modern Analysis, Shanghai Jiao Tong University, Shanghai 200240, China
2
Das P, Babadi B. Non-Asymptotic Guarantees for Reliable Identification of Granger Causality via the LASSO. IEEE TRANSACTIONS ON INFORMATION THEORY 2023; 69:7439-7460. [PMID: 38646067 PMCID: PMC11025718 DOI: 10.1109/tit.2023.3296336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
Granger causality is among the widely used data-driven approaches for causal analysis of time series data with applications in various areas including economics, molecular biology, and neuroscience. Two of the main challenges of this methodology are: 1) over-fitting as a result of limited data duration, and 2) correlated process noise as a confounding factor, both leading to errors in identifying the causal influences. Sparse estimation via the LASSO has successfully addressed these challenges for parameter estimation. However, the classical statistical tests for Granger causality resort to asymptotic analysis of ordinary least squares, which require long data duration to be useful and are not immune to confounding effects. In this work, we address this disconnect by introducing a LASSO-based statistic and studying its non-asymptotic properties under the assumption that the true models admit sparse autoregressive representations. We establish fundamental limits for reliable identification of Granger causal influences using the proposed LASSO-based statistic. We further characterize the false positive error probability and test power of a simple thresholding rule for identifying Granger causal effects and provide two methods to set the threshold in a data-driven fashion. We present simulation studies and application to real data to compare the performance of our proposed method to ordinary least squares and existing LASSO-based methods in detecting Granger causal influences, which corroborate our theoretical results.
Affiliation(s)
- Proloy Das
  - Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Behtash Babadi
  - Department of Electrical and Computer Engineering and the Institute for Systems Research, University of Maryland, College Park, MD 20742, USA
3
Banerjee A, Chandra S, Ott E. Network inference from short, noisy, low time-resolution, partial measurements: Application to C. elegans neuronal calcium dynamics. Proc Natl Acad Sci U S A 2023; 120:e2216030120. [PMID: 36927154 PMCID: PMC10041139 DOI: 10.1073/pnas.2216030120] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/22/2022] [Accepted: 02/04/2023] [Indexed: 03/18/2023]
Abstract
Network link inference from measured time series data of the behavior of dynamically interacting network nodes is an important problem with wide-ranging applications, e.g., estimating synaptic connectivity among neurons from measurements of their calcium fluorescence. Network inference methods typically begin by using the measured time series to assign to any given ordered pair of nodes a numerical score reflecting the likelihood of a directed link between those two nodes. In typical cases, the measured time series data may be subject to limitations, including limited duration, low sampling rate, observational noise, and partial nodal state measurement. However, it is unknown how the performance of link inference techniques on such datasets depends on these experimental limitations of data acquisition. Here, we utilize both synthetic data generated from coupled chaotic systems and experimental data obtained from Caenorhabditis elegans neural activity to systematically assess the influence of data limitations on the character of scores reflecting the likelihood of a directed link between a given node pair. We do this for three network inference techniques: Granger causality, transfer entropy, and a machine learning-based method. Furthermore, we assess the ability of appropriate surrogate data to determine statistical confidence levels associated with the results of link-inference techniques.
Affiliation(s)
- Amitava Banerjee
  - Department of Physics, University of Maryland, College Park, MD 20742
  - Institute for Research in Electronics and Applied Physics, University of Maryland, College Park, MD 20742
- Sarthak Chandra
  - Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
  - McGovern Institute, Massachusetts Institute of Technology, Cambridge, MA 02139
- Edward Ott
  - Department of Physics, University of Maryland, College Park, MD 20742
  - Institute for Research in Electronics and Applied Physics, University of Maryland, College Park, MD 20742
  - Department of Electrical and Computer Engineering, University of Maryland, College Park, MD 20742
4
Sattari F, Srinivasan K, Puliyanda A, Prasad V. Data Fusion-Based Approach for the Investigation of Reaction Networks in Hydrous Pyrolysis of Biomass. Ind Eng Chem Res 2023. [DOI: 10.1021/acs.iecr.2c04309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2023]
Affiliation(s)
- Fereshteh Sattari
  - Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta T6G 1H9, Canada
- Karthik Srinivasan
  - Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta T6G 1H9, Canada
- Anjana Puliyanda
  - Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta T6G 1H9, Canada
- Vinay Prasad
  - Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta T6G 1H9, Canada
5
Sharma K, Dwivedi YK, Metri B. Incorporating causality in energy consumption forecasting using deep neural networks. ANNALS OF OPERATIONS RESEARCH 2022:1-36. [PMID: 35967838 PMCID: PMC9362444 DOI: 10.1007/s10479-022-04857-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/22/2022] [Indexed: 06/15/2023]
Abstract
Forecasting energy demand has been a critical process in various decision support systems regarding consumption planning, distribution strategies, and energy policies. Traditionally, forecasting energy consumption or demand methods included trend analyses, regression, and auto-regression. With advancements in machine learning methods, algorithms such as support vector machines, artificial neural networks, and random forests became prevalent. In recent times, with an unprecedented improvement in computing capabilities, deep learning algorithms are increasingly used to forecast energy consumption/demand. In this contribution, a relatively novel approach is employed to use long-term memory. Weather data was used to forecast the energy consumption from three datasets, with an additional piece of information in the deep learning architecture. This additional information carries the causal relationships between the weather indicators and energy consumption. This architecture with the causal information is termed as entangled long short term memory. The results show that the entangled long short term memory outperforms the state-of-the-art deep learning architecture (bidirectional long short term memory). The theoretical and practical implications of these results are discussed in terms of decision-making and energy management systems.
Affiliation(s)
- Kshitij Sharma
  - Department of Computer Science, Norwegian University of Science and Technology, Trondheim, Norway
- Yogesh K. Dwivedi
  - Emerging Markets Research Centre (EMaRC), School of Management, Swansea University, Room #323, Bay Campus, Fabian Bay, Swansea, SA1 8EN Wales, UK
  - Department of Management, Symbiosis Institute of Business Management, Pune & Symbiosis International (Deemed University), Pune, Maharashtra, India
6
Dong F, Li K, Li Y, Liu Y, Zheng L. Factors influencing public support for banning gasoline vehicles in newly industrialized countries for the sake of environmental improvement: a case study of China. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2022; 29:43942-43954. [PMID: 35122648 DOI: 10.1007/s11356-022-18884-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2021] [Accepted: 01/21/2022] [Indexed: 06/14/2023]
Abstract
In recent years, various countries have put forward Banning Gasoline Vehicle Sales Policy (BGVSP), and the degree of public support for BGVSP is crucial to its design and implementation. Taking China as an example, this study built a policy support index using network crawler technology and natural language processing technology. Then, multi-spatial convergence cross-mapping technology was used to study the interaction between public support and air pollution, electric vehicle (EV) infrastructure, EV technology, and use cost. The results showed that air pollution has a significant impact on public support; public support has a significant impact on the construction of the EV infrastructure and the level of EV technological research, and the use cost of traditional gasoline vehicles has a significant impact on public support. This study investigated the correlations between public support and the factors influencing public support, and the results can be used as a reference for the design and implementation of BGVSP in newly industrialized countries.
Affiliation(s)
- Feng Dong
  - School of Economics and Management, China University of Mining and Technology, Xuzhou, 221116, People's Republic of China
- Kun Li
  - School of Economics and Management, China University of Mining and Technology, Xuzhou, 221116, People's Republic of China
- Yangfan Li
  - School of Economics and Management, China University of Mining and Technology, Xuzhou, 221116, People's Republic of China
- Yajie Liu
  - School of Economics and Management, China University of Mining and Technology, Xuzhou, 221116, People's Republic of China
- Lu Zheng
  - School of Economics and Management, China University of Mining and Technology, Xuzhou, 221116, People's Republic of China
7
Yue T, Zhang T. Bayesian network-based missing mechanism identification (BN-MMI) method in medical research. BMC Med Inform Decis Mak 2021; 21:316. [PMID: 34772422 PMCID: PMC8588712 DOI: 10.1186/s12911-021-01677-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2021] [Accepted: 11/03/2021] [Indexed: 11/10/2022]
Abstract
Background: Traditional approaches to identify missing mechanisms are usually based on the hypothesis test and confronted with both theoretical and practical challenges. It has been proved that the Bayesian network is powerful in integrating, analyzing and visualizing information, and some previous research has verified the promising features of the Bayesian network to deal with the aforementioned challenges in missing mechanism identification. Based on the above reasons, this paper explores the application of the Bayesian network to the identification of missing mechanisms for the first time, and proposes a new method, the Bayesian network-based missing mechanism identification (BN-MMI) method, to identify missing mechanisms in medical research.
Methods: The procedure of the BN-MMI method consists of three easy-to-implement steps: estimating the missing data structure by the Bayesian network; assessing the credibility of the estimated missing data structure; and identifying the missing mechanism from the estimated missing data structure. The BN-MMI method is verified by simulation research and empirical research.
Results: The simulation study verified the validity, consistency and robustness of the BN-MMI method, and indicated its outperformance in contrast to the traditional logistic regression method. In addition, the empirical study illustrated the applicability of the BN-MMI method in the real world by an example of medical record data.
Conclusions: It was confirmed that the BN-MMI method itself, together with human knowledge and expertise, could identify the missing mechanisms according to the probabilistic dependence/independence relations among variables of interest. At the same time, our research shed light upon the potential application of the BN-MMI method to a broader range of missing data issues in medical studies.
Affiliation(s)
- Tingyan Yue
  - West China Second University Hospital, Sichuan University, Chengdu, China
- Tao Zhang
  - West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu, China
8
Nicola G, Cerchiello P, Aste T. Information Network Modeling for U.S. Banking Systemic Risk. ENTROPY 2020; 22:e22111331. [PMID: 33266514 PMCID: PMC7711443 DOI: 10.3390/e22111331] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/17/2020] [Revised: 11/15/2020] [Accepted: 11/16/2020] [Indexed: 11/24/2022]
Abstract
In this work we investigate whether information theory measures like mutual information and transfer entropy, extracted from a bank network, Granger cause financial stress indexes like LIBOR-OIS (London Interbank Offered Rate-Overnight Index Swap) spread, STLFSI (St. Louis Fed Financial Stress Index) and USD/CHF (USA Dollar/Swiss Franc) exchange rate. The information theory measures are extracted from a Gaussian Graphical Model constructed from daily stock time series of the top 74 listed US banks. The graphical model is calculated with a recently developed algorithm (LoGo) which provides a very fast inference model that allows us to update the graphical model each market day. We can therefore generate daily time series of mutual information and transfer entropy for each bank of the network. The Granger causality between the bank-related measures and the financial stress indexes is investigated with both standard Granger causality and partial Granger causality conditioned on control measures representative of general economic conditions.
Affiliation(s)
- Giancarlo Nicola
  - Department of Economics and Management, University of Pavia, 27100 Pavia, Italy
- Paola Cerchiello
  - Department of Economics and Management, University of Pavia, 27100 Pavia, Italy
- Tomaso Aste
  - Department of Computer Science, University College London, London WC1E 6EA, UK
  - Systemic Risk Centre, London School of Economics, London WC2A 2AE, UK
9
Almpanis E, Siettos C. Construction of functional brain connectivity networks from fMRI data with driving and modulatory inputs: an extended conditional Granger causality approach. AIMS Neurosci 2020; 7:66-88. [PMID: 32607412 PMCID: PMC7321769 DOI: 10.3934/neuroscience.2020005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/06/2020] [Accepted: 03/25/2020] [Indexed: 11/29/2022]
Abstract
We propose a numerical-based approach extending the conditional MVAR Granger causality (MVGC) analysis for the construction of directed connectivity networks in the presence of both exogenous/stimuli and modulatory inputs. The performance of the proposed scheme is validated using both synthetic stochastic data considering also the influence of haemodynamics latencies and a benchmark fMRI dataset related to the role of attention in the perception of visual motion. The particular fMRI dataset has been used in many studies to evaluate alternative model hypotheses using the Dynamic Causal Modelling (DCM) approach. Based on the use of the Bayes factor, we show that the obtained GC connectivity network compares well to a reference model that has been selected through DCM analysis among other candidate models. Thus, our findings suggest that the proposed scheme can be successfully used as a stand-alone or complementary to DCM approach to find directed causal connectivity patterns in task-related fMRI studies.
Affiliation(s)
- Evangelos Almpanis
  - Section of Condensed Matter Physics, National and Kapodistrian University of Athens, Greece
  - Institute of Nanoscience and Nanotechnology, NCSR "Demokritos," Athens, Greece
- Constantinos Siettos
  - Dipartimento di Matematica e Applicazioni "Renato Caccioppoli", Università degli Studi di Napoli Federico II, Italy
10
Cirrone J, Brooks MD, Bonneau R, Coruzzi GM, Shasha DE. OutPredict: multiple datasets can improve prediction of expression and inference of causality. Sci Rep 2020; 10:6804. [PMID: 32321967 PMCID: PMC7176633 DOI: 10.1038/s41598-020-63347-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/30/2019] [Accepted: 03/26/2020] [Indexed: 01/09/2023]
Abstract
The ability to accurately predict the causal relationships from transcription factors to genes would greatly enhance our understanding of transcriptional dynamics. This could lead to applications in which one or more transcription factors could be manipulated to effect a change in genes leading to the enhancement of some desired trait. Here we present a method called OutPredict that constructs a model for each gene based on time series (and other) data and that predicts the gene's expression at a previously unseen subsequent time point. The model also infers causal relationships based on the most important transcription factors for each gene model, some of which have been validated from previous physical experiments. The method benefits from known network edges and steady-state data to enhance predictive accuracy. Our results across B. subtilis, Arabidopsis, E. coli, Drosophila and the DREAM4 simulated in silico dataset show improved predictive accuracy ranging from 40% to 60% over other state-of-the-art methods. We find that gene expression models can benefit from the addition of steady-state data to predict expression values of time series. Finally, we validate, based on limited available data, that the influential edges we infer correspond to known relationships significantly more than expected by chance or by state-of-the-art methods.
Affiliation(s)
- Jacopo Cirrone
  - Courant Institute of Mathematical Sciences, Department of Computer Science, New York University, New York, NY, 10012, USA
- Matthew D Brooks
  - Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, 10003, USA
- Richard Bonneau
  - Courant Institute of Mathematical Sciences, Department of Computer Science, New York University, New York, NY, 10012, USA
  - Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, 10003, USA
  - Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, 10010, USA
- Gloria M Coruzzi
  - Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, 10003, USA
- Dennis E Shasha
  - Courant Institute of Mathematical Sciences, Department of Computer Science, New York University, New York, NY, 10012, USA
11
Qiu X, Rahimzamani A, Wang L, Ren B, Mao Q, Durham T, McFaline-Figueroa JL, Saunders L, Trapnell C, Kannan S. Inferring Causal Gene Regulatory Networks from Coupled Single-Cell Expression Dynamics Using Scribe. Cell Syst 2020; 10:265-274.e11. [PMID: 32135093 PMCID: PMC7223477 DOI: 10.1016/j.cels.2020.02.003] [Citation(s) in RCA: 63] [Impact Index Per Article: 15.8] [Received: 09/12/2018] [Revised: 06/08/2019] [Accepted: 02/05/2020] [Indexed: 01/13/2023]
Abstract
Here, we present Scribe (https://github.com/aristoteleo/Scribe-py), a toolkit for detecting and visualizing causal regulatory interactions between genes and explore the potential for single-cell experiments to power network reconstruction. Scribe employs restricted directed information to determine causality by estimating the strength of information transferred from a potential regulator to its downstream target. We apply Scribe and other leading approaches for causal network reconstruction to several types of single-cell measurements and show that there is a dramatic drop in performance for "pseudotime"-ordered single-cell data compared with true time-series data. We demonstrate that performing causal inference requires temporal coupling between measurements. We show that methods such as "RNA velocity" restore some degree of coupling through an analysis of chromaffin cell fate commitment. These analyses highlight a shortcoming in experimental and computational methods for analyzing gene regulation at single-cell resolution and suggest ways of overcoming it.
Affiliation(s)
- Xiaojie Qiu
  - Molecular & Cellular Biology Program, University of Washington, Seattle, WA, USA
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Arman Rahimzamani
  - Department of Electrical Engineering, University of Washington, Seattle, WA, USA
- Li Wang
  - Department of Mathematics, University of Texas at Arlington, Arlington, TX, USA
- Bingcheng Ren
  - College of Information Science and Engineering, Hunan Normal University, Changsha, China
- Qi Mao
  - HERE company, Chicago, IL 60606, USA
- Timothy Durham
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Lauren Saunders
  - Molecular & Cellular Biology Program, University of Washington, Seattle, WA, USA
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Cole Trapnell
  - Molecular & Cellular Biology Program, University of Washington, Seattle, WA, USA
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Brotman-Baty Institute for Precision Medicine, Seattle, WA, USA
- Sreeram Kannan
  - Department of Electrical Engineering, University of Washington, Seattle, WA, USA
12
Zhu MJ, Dong CY, Chen XY, Ren JW, Zhao XY. Identifying the pulsed neuron networks' structures by a nonlinear Granger causality method. BMC Neurosci 2020; 21:7. [PMID: 32050908 PMCID: PMC7017568 DOI: 10.1186/s12868-020-0555-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/06/2019] [Accepted: 02/03/2020] [Indexed: 11/26/2022]
Abstract
Background: It is a crucial task of brain science research to explore functional connective maps of Biological Neural Networks (BNN). The maps help to deeply study the dominant relationship between the structures of the BNNs and their network functions.
Results: In this study, the ideas of linear Granger causality modeling and causality identification are extended to those of nonlinear Granger causality modeling and network structure identification. We employed Radial Basis Functions to fit the nonlinear multivariate dynamical responses of BNNs with neuronal pulse firing. By introducing the contributions from presynaptic neurons and detecting whether the predictions for postsynaptic neurons’ pulse firing signals are improved or not, we can reveal the distribution of information flows in BNNs. Thus, the functional connections from presynaptic neurons can be identified from the obtained network information flows. To verify the effectiveness of the proposed method, the Nonlinear Granger Causality Identification Method (NGCIM) is applied to the network structure discovery processes of Spiking Neural Networks (SNN). SNN is a simulation model based on an Integrate-and-Fire mechanism. By network simulations, the multi-channel neuronal pulse sequence data of the SNNs can be used to reversely identify the synaptic connections and strengths of the SNNs.
Conclusions: The identification results show: for 2–6 nodes small-scale neural networks, 20 nodes medium-scale neural networks, and 100 nodes large-scale neural networks, the identification accuracy of NGCIM with the Gaussian kernel function was 100%, 99.64%, 98.64%, 98.37%, 98.31%, 84.87% and 80.56%, respectively. The identification accuracies were significantly higher than those of a traditional Linear Granger Causality Identification Method with the same network sizes. Thus, with an accumulation of the data obtained by the existing measurement methods, such as Electroencephalography, functional Magnetic Resonance Imaging, and Multi-Electrode Array, the NGCIM can be a promising network modeling method to infer the functional connective maps of BNNs.
Affiliation(s)
- Mei-Jia Zhu
  - School of Electric Power, Inner Mongolia University of Technology, Hohhot, 010080, China
  - Inner Mongolia Key Laboratory of Mechanical and Electrical Control, Hohhot, 010051, China
- Chao-Yi Dong
  - School of Electric Power, Inner Mongolia University of Technology, Hohhot, 010080, China
  - Inner Mongolia Key Laboratory of Mechanical and Electrical Control, Hohhot, 010051, China
- Xiao-Yan Chen
  - School of Electric Power, Inner Mongolia University of Technology, Hohhot, 010080, China
  - Inner Mongolia Key Laboratory of Mechanical and Electrical Control, Hohhot, 010051, China
- Jing-Wen Ren
  - School of Electric Power, Inner Mongolia University of Technology, Hohhot, 010080, China
  - Inner Mongolia Key Laboratory of Mechanical and Electrical Control, Hohhot, 010051, China
- Xiao-Yi Zhao
  - School of Electric Power, Inner Mongolia University of Technology, Hohhot, 010080, China
  - Inner Mongolia Key Laboratory of Mechanical and Electrical Control, Hohhot, 010051, China
13
Affiliation(s)
- Marco Scutari
  - Istituto Dalle Molle di Studi sull'Intelligenza Artificiale (IDSIA), Manno, Switzerland
14
Muldoon JJ, Yu JS, Fassia MK, Bagheri N. Network inference performance complexity: a consequence of topological, experimental and algorithmic determinants. Bioinformatics 2019; 35:3421-3432. [PMID: 30932143 PMCID: PMC6748731 DOI: 10.1093/bioinformatics/btz105] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 09/12/2018] [Revised: 01/24/2019] [Accepted: 02/11/2019] [Indexed: 12/21/2022]
Abstract
Motivation: Network inference algorithms aim to uncover key regulatory interactions governing cellular decision-making, disease progression and therapeutic interventions. Having an accurate blueprint of this regulation is essential for understanding and controlling cell behavior. However, the utility and impact of these approaches are limited because the ways in which various factors shape inference outcomes remain largely unknown.
Results: We identify and systematically evaluate determinants of performance (including network properties, experimental design choices and data processing) by developing new metrics that quantify confidence across algorithms in comparable terms. We conducted a multifactorial analysis that demonstrates how stimulus target, regulatory kinetics, induction and resolution dynamics, and noise differentially impact widely used algorithms in significant and previously unrecognized ways. The results show how even if high-quality data are paired with high-performing algorithms, inferred models are sometimes susceptible to giving misleading conclusions. Lastly, we validate these findings and the utility of the confidence metrics using realistic in silico gene regulatory networks. This new characterization approach provides a way to more rigorously interpret how algorithms infer regulation from biological datasets.
Availability and implementation: Code is available at http://github.com/bagherilab/networkinference/.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Joseph J Muldoon
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
  - Interdisciplinary Biological Sciences Program, Northwestern University, Evanston, IL, USA
- Jessica S Yu
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
- Mohammad-Kasim Fassia
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
  - Department of Biomedical Engineering, Northwestern University, Evanston, IL, USA
- Neda Bagheri
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
  - Interdisciplinary Biological Sciences Program, Northwestern University, Evanston, IL, USA
  - Center for Synthetic Biology, Northwestern University, Evanston, IL, USA
  - Chemistry of Life Processes Institute, Northwestern University, Evanston, IL, USA
  - Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA
|
15
|
Duren Z, Wang Y, Wang J, Zhao XM, Lv L, Li X, Liu J, Zhu XG, Chen L, Wang Y. Hierarchical graphical model reveals HFR1 bridging circadian rhythm and flower development in Arabidopsis thaliana. NPJ Syst Biol Appl 2019; 5:28. [PMID: 31428455 PMCID: PMC6690920 DOI: 10.1038/s41540-019-0106-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/26/2018] [Accepted: 07/23/2019] [Indexed: 01/02/2023]
Abstract
To study systems-level properties of the cell, it is necessary to go beyond individual regulators and target genes to study the regulatory network among transcription factors (TFs). However, it is difficult to directly dissect the TF-mediated genome-wide gene regulatory network (GRN) by experiment. Here, we propose a hierarchical graphical model that estimates TF activity from mRNA expression by building TF complexes with protein cofactors and simultaneously inferring each TF's downstream regulatory network. We then applied our model to flower development and circadian rhythm processes in Arabidopsis thaliana. The computational results show that the sequence-specific bHLH family TF HFR1 recruits the chromatin regulator HAC1 to the flower development master regulator TF AG and further activates AG's expression by histone acetylation. Both independent data and experimental results supported this discovery. We also found a flower tissue-specific H3K27ac ChIP-seq peak at the AG gene body and an HFR1 motif in the center of this H3K27ac peak. Furthermore, we verified by yeast two-hybrid experiment that HFR1 physically interacts with HAC1. This HFR1-HAC1-AG triplet relationship may imply that flower development and circadian rhythm are bridged by epigenetic regulation, and it enriches the classical ABC model of flower development. In addition, our TF activity network can serve as a general method to elucidate molecular mechanisms in other complex biological regulatory processes.
Affiliation(s)
- Zhana Duren
  - CEMS, NCMIS, MDIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190 China
  - University of Chinese Academy of Sciences, Beijing, 100049 China
- Yaling Wang
  - State Key Laboratory of Molecular Plant Sciences and Center of Excellence for Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, 200032 China
- Jiguang Wang
  - Division of Life Science, Department of Chemical and Biological Engineering, Center of Systems Biology and Human Health, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Hong Kong, China
- Xing-Ming Zhao
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, 200433 China
  - Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Ministry of Education, Shanghai, China
- Le Lv
  - Bayer U.S. – Crop Science, Monsanto Legal Entity, St. Louis, MO 63156 USA
- Xiaobo Li
  - Department of Plant Biology, Carnegie Institution for Science, 260 Panama Street, Stanford, CA 94305 USA
- Jingdong Liu
  - Bayer U.S. – Crop Science, Monsanto Legal Entity, St. Louis, MO 63156 USA
- Xin-Guang Zhu
  - State Key Laboratory of Molecular Plant Sciences and Center of Excellence for Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, 200032 China
- Luonan Chen
  - Key Laboratory of Systems Biology, Center for Excellence in Molecular Cell Science, Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, 200031 China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223 China
  - School of Life Science and Technology, ShanghaiTech University, Shanghai, 201210 China
  - Research Center for Brain Science and Brain-Inspired Intelligence, 201210 Shanghai, China
- Yong Wang
  - CEMS, NCMIS, MDIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190 China
  - University of Chinese Academy of Sciences, Beijing, 100049 China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223 China
|
16
|
Zhang T, Ma Y, Xiao X, Lin Y, Zhang X, Yin F, Li X. Dynamic Bayesian network in infectious diseases surveillance: a simulation study. Sci Rep 2019; 9:10376. [PMID: 31316113 PMCID: PMC6637193 DOI: 10.1038/s41598-019-46737-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 04/06/2018] [Accepted: 07/04/2019] [Indexed: 11/09/2022]
Abstract
The surveillance of infectious diseases relies on the identification of dynamic relations between the infectious diseases and corresponding influencing factors. However, the identification task faces two practical challenges: small sample size and delayed effect. To overcome both challenges and improve the identification results, this study evaluated the performance of the dynamic Bayesian network (DBN) in infectious diseases surveillance. Specifically, the evaluation was conducted through two simulations. The first simulation evaluated the performance of DBN by comparing it with the Granger causality test and the least absolute shrinkage and selection operator (LASSO) method; the second assessed how DBN could improve the forecasting of infectious diseases. To make both simulations as close to the real-world situation as possible, their scenarios were adapted from real-world studies, and practical issues such as nonlinearity and nuisance variables were also considered. The main simulation results were: ① When the sample size was large (n = 340), the true positive rates (TPRs) of DBN (≥98%) were slightly higher than those of the Granger causality method and approximately the same as those of the LASSO method; the false positive rates (FPRs) of DBN were on average 46% lower than those of the Granger causality test, and 22% lower than those of the LASSO method. ② When the sample size was small, the main problem was low TPR, which was further aggravated by nonlinearity and nuisance variables. In the worst situation (i.e., small sample size, nonlinearity and existence of nuisance variables), the TPR of DBN declined to 43.30%. It is worth noting, however, that a similar decline was also found in the corresponding results of the Granger causality test and the LASSO method. ③ Sample size was important for identifying the dynamic relations among multiple variables; in this case, at least three years of weekly historical data were needed to guarantee the quality of infectious diseases surveillance. ④ DBN could improve the forecasting results by reducing forecasting errors by 7%. According to the above results, DBN is recommended to improve the quality of infectious diseases surveillance.
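The true positive and false positive rates quoted in this abstract can be illustrated with a small sketch. It assumes networks are given as square 0/1 adjacency matrices with self-loops ignored; the function name `edge_rates` and the toy matrices are hypothetical and are not taken from the study.

```python
def edge_rates(true_adj, inferred_adj):
    """Return (TPR, FPR) over directed edges, skipping the diagonal.

    Hypothetical helper for illustration: both arguments are square
    lists of lists where a truthy entry [i][j] means an edge i -> j.
    """
    n = len(true_adj)
    tp = fp = fn = tn = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # self-loops are not scored in this sketch
            actual = bool(true_adj[i][j])
            predicted = bool(inferred_adj[i][j])
            if actual and predicted:
                tp += 1
            elif predicted:
                fp += 1
            elif actual:
                fn += 1
            else:
                tn += 1
    # TPR = TP / (TP + FN); FPR = FP / (FP + TN)
    tpr = tp / (tp + fn) if tp + fn else float("nan")
    fpr = fp / (fp + tn) if fp + tn else float("nan")
    return tpr, fpr


# Toy example: the true network has edges 0->1 and 1->2;
# the inferred network recovers 0->1 and adds a spurious 0->2.
true_adj = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
inferred_adj = [[0, 1, 1], [0, 0, 0], [0, 0, 0]]
tpr, fpr = edge_rates(true_adj, inferred_adj)
```

In the toy case one of two true edges is recovered and one of four absent edges is wrongly reported, so the sketch yields a TPR of 0.5 and an FPR of 0.25.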
Affiliation(s)
- Tao Zhang
  - Department of Epidemiology and Health Statistics, West China School of Public Health and West China Fourth Hospital, Sichuan University, Sichuan, China
- Yue Ma
  - Department of Epidemiology and Health Statistics, West China School of Public Health and West China Fourth Hospital, Sichuan University, Sichuan, China
- Xiong Xiao
  - Department of Epidemiology and Health Statistics, West China School of Public Health and West China Fourth Hospital, Sichuan University, Sichuan, China
- Yun Lin
  - Department of Epidemiology and Health Statistics, West China School of Public Health and West China Fourth Hospital, Sichuan University, Sichuan, China
- Xingyu Zhang
  - Department of Systems, Populations and Leadership, University of Michigan, School of Nursing, Ann Arbor, USA
- Fei Yin
  - Department of Epidemiology and Health Statistics, West China School of Public Health and West China Fourth Hospital, Sichuan University, Sichuan, China
- Xiaosong Li
  - Department of Epidemiology and Health Statistics, West China School of Public Health and West China Fourth Hospital, Sichuan University, Sichuan, China
|
17
|
Garren JM, Kim J. Bootstrapping Time-Course Gene Expression Data for Gene Networks: Application to Gene Relevance Networks. J Comput Biol 2018; 25:1374-1384. [PMID: 30133320 DOI: 10.1089/cmb.2018.0029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Identification of gene regulatory networks (GRNs) is a fundamental step toward understanding the molecular role of each gene, and it helps in developing treatments and cures for disease. To identify GRNs, time-course gene expression data are widely used. However, the identification is hampered by intrinsic attributes of the data such as small sample size, a large number of variables, and complex error structures with high variation. In this situation, most GRN inference methods utilize point estimators or make numerous assumptions that are often incompatible with the experimental data. Moreover, different inference methods often provide inconsistent results. The bootstrap method can alleviate this problem because it provides more reliable outcomes by integrating results from multiple bootstrap samples without any distributional assumptions. In this study, we propose a bootstrap method for dependent time-course gene expression data, focusing mainly on its application to gene relevance networks. The proposed method is applied to gene networks for zebrafish retina.
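As a rough illustration of resampling dependent time-course data, the sketch below uses a moving-block bootstrap, a standard technique for serially correlated series. This is an assumption made for illustration and is not necessarily the procedure proposed in the paper; the function name, block length and toy series are all hypothetical.

```python
import random


def moving_block_bootstrap(series, block_len, rng):
    """Resample a time series by concatenating randomly chosen
    contiguous blocks, so short-range dependence inside each block
    is preserved (hypothetical helper, not the paper's method)."""
    n = len(series)
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(series)")
    starts = range(n - block_len + 1)  # every valid block start index
    resampled = []
    while len(resampled) < n:
        s = rng.choice(starts)
        resampled.extend(series[s:s + block_len])
    return resampled[:n]  # trim the last block to the original length


# Toy expression profile for one gene over eight time points.
rng = random.Random(0)  # seeded so the sketch is reproducible
series = [0.1, 0.4, 0.9, 1.3, 1.1, 0.7, 0.2, 0.0]
resample = moving_block_bootstrap(series, 3, rng)
```

Repeating the resampling many times and re-estimating the network on each replicate gives an empirical distribution for each edge statistic, which is the general idea behind integrating results over bootstrap samples.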
Affiliation(s)
- Jaejik Kim
  - Department of Statistics, Sungkyunkwan University, Seoul, Korea
|
18
|
Ruan X, Wülfing C, Murphy RF. Image-based spatiotemporal causality inference for protein signaling networks. Bioinformatics 2017; 33:i217-i224. [PMID: 28881992 PMCID: PMC5870542 DOI: 10.1093/bioinformatics/btx258] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/15/2022]
Abstract
Motivation: Efforts to model how signaling and regulatory networks work in cells have largely either not considered spatial organization or have used compartmental models with minimal spatial resolution. Fluorescence microscopy provides the ability to monitor the spatiotemporal distribution of many molecules during signaling events, but as of yet no methods have been described for large scale image analysis to learn a complex protein regulatory network. Here we present and evaluate methods for identifying how changes in concentration in one cell region influence concentration of other proteins in other regions. Results: Using 3D confocal microscope movies of GFP-tagged T cells undergoing costimulation, we learned models containing putative causal relationships among 12 proteins involved in T cell signaling. The models included both relationships consistent with current knowledge and novel predictions deserving further exploration. Further, when these models were applied to the initial frames of movies of T cells that had been only partially stimulated, they predicted the localization of proteins at later times with statistically significant accuracy. The methods, consisting of spatiotemporal alignment, automated region identification, and causal inference, are anticipated to be applicable to a number of biological systems. Availability and implementation: The source code and data are available as a Reproducible Research Archive at http://murphylab.cbd.cmu.edu/software/2017_TcellCausalModels/
Affiliation(s)
- Xiongtao Ruan
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Christoph Wülfing
  - School of Cellular and Molecular Medicine, University of Bristol, Bristol BS, UK
- Robert F Murphy
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
  - Departments of Biological Sciences, Biomedical Engineering, and Machine Learning, Carnegie Mellon University, Pittsburgh, PA, USA
  - Freiburg Institute for Advanced Studies and Faculty of Biology, Albert Ludwig University of Freiburg, Freiburg im Breisgau, Baden-Württemberg, Germany
|
19
|
Liang Y, Kelemen A. Computational dynamic approaches for temporal omics data with applications to systems medicine. BioData Min 2017. [PMID: 28638442 PMCID: PMC5473988 DOI: 10.1186/s13040-017-0140-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 12/13/2022]
Abstract
Modeling and predicting biological dynamic systems while simultaneously estimating their kinetic, structural and functional parameters are extremely important in systems and computational biology. This is key to understanding the complexity of human health, drug response, disease susceptibility and pathogenesis for systems medicine. Temporal omics data used to measure dynamic biological systems are essential for discovering complex biological interactions, clinical mechanisms and causation. However, delineating the possible associations and causalities among genes, proteins, metabolites, cells and other biological entities from high-throughput time-course omics data is challenging, and conventional experimental techniques are not suited to this task in the big omics era. In this paper, we present various recently developed dynamic trajectory and causal network approaches for temporal omics data, which are extremely useful for researchers who want to start working in this challenging area. Moreover, we present applications to various biological systems, health conditions and disease status, along with examples that summarize state-of-the-art performance on different specific mining tasks. We critically discuss the merits, drawbacks and limitations of the approaches, and the main associated challenges for the years ahead. The most recent computing tools and software for analyzing specific problem types, associated platform resources, and other potential uses of the dynamic trajectory and interaction methods are also presented and discussed in detail.
Affiliation(s)
- Yulan Liang
  - Department of Family and Community Health, University of Maryland, Baltimore, MD 21201 USA
- Arpad Kelemen
  - Department of Organizational Systems and Adult Health, University of Maryland, Baltimore, MD 21201 USA
|
20
|
Yang G, Wang L, Wang X. Reconstruction of Complex Directional Networks with Group Lasso Nonlinear Conditional Granger Causality. Sci Rep 2017; 7:2991. [PMID: 28592807 PMCID: PMC5462833 DOI: 10.1038/s41598-017-02762-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 02/08/2017] [Accepted: 04/18/2017] [Indexed: 12/19/2022]
Abstract
Reconstruction of networks underlying complex systems is one of the most crucial problems in many areas of engineering and science. In this paper, rather than identifying parameters of complex systems governed by pre-defined models or taking some polynomial and rational functions as a priori information for subsequent model selection, we put forward a general framework for nonlinear causal network reconstruction from time series with limited observations. By obtaining multi-source datasets based on the data-fusion strategy, we propose a novel method to handle the nonlinearity and directionality of complex networked systems, namely group lasso nonlinear conditional Granger causality. Specifically, our method can exploit different sets of radial basis functions to approximate the nonlinear interactions between each pair of nodes and integrate sparsity into grouped variable selection. The performance of our approach is first assessed with two types of simulated datasets, from a nonlinear vector autoregressive model and from nonlinear dynamic models, and then verified on the benchmark datasets from DREAM3 Challenge 4. Effects of data size and noise intensity are also discussed. All of the results demonstrate that the proposed method performs better, achieving a higher area under the precision-recall curve.
Affiliation(s)
- Guanxue Yang
  - Department of Automation, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, P. R. China
- Lin Wang
  - Department of Automation, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, P. R. China
- Xiaofan Wang
  - Department of Automation, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, P. R. China
|
21
|
Ghodsi Z, Huang X, Hassani H. Causality analysis detects the regulatory role of maternal effect genes in the early Drosophila embryo. Genomics Data 2017; 11:20-38. [PMID: 27924281 PMCID: PMC5129166 DOI: 10.1016/j.gdata.2016.11.013] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 05/01/2016] [Revised: 10/28/2016] [Accepted: 11/10/2016] [Indexed: 11/28/2022]
Abstract
In developmental studies, inferring the regulatory interactions of the segmentation genetic network plays a vital role in unveiling the mechanism of pattern formation. As such, there is a timely demand for theoretical developments and new mathematical models that can give a more accurate picture of this genetic network. Accordingly, this paper seeks to extract the meaningful regulatory role of the maternal effect genes using a variety of causality detection techniques and to explore whether these methods can suggest a new analytical view of gene regulatory networks. We evaluate three powerful and widely used models, representing time-domain and frequency-domain Granger causality and the convergent cross mapping technique, with the results thoroughly evaluated for statistical significance. Our findings show that the regulatory role of maternal effect genes is detectable in different time classes, and that the approach is therefore applicable to inferring the possible regulatory interactions among the other genes of this network.
Affiliation(s)
- Zara Ghodsi
  - Statistical Research Centre, Bournemouth University, 89 Holdenhurst Road, Bournemouth BH8 8EB, UK
  - Translational Genetics Group, Bournemouth University, Fern Barrow, Poole BH12 5BB, UK
- Xu Huang
  - Statistical Research Centre, Bournemouth University, 89 Holdenhurst Road, Bournemouth BH8 8EB, UK
- Hossein Hassani
  - Institute for International Energy Studies (IIES), Tehran 1967743 711, Iran
|
22
|
Tefera DT, Yañez Jaramillo LM, Ranjan R, Li C, de Klerk A, Prasad V. A Bayesian Learning Approach to Modeling Pseudoreaction Networks for Complex Reacting Systems: Application to the Mild Visbreaking of Bitumen. Ind Eng Chem Res 2017. [DOI: 10.1021/acs.iecr.6b04437] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
Affiliation(s)
- Dereje Tamiru Tefera
  - Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta T6G 1H9, Canada
- Lina Maria Yañez Jaramillo
  - Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta T6G 1H9, Canada
- Rajesh Ranjan
  - Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta T6G 1H9, Canada
- Chaoqun Li
  - Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta T6G 1H9, Canada
- Arno de Klerk
  - Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta T6G 1H9, Canada
- Vinay Prasad
  - Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta T6G 1H9, Canada
|
23
|
Sriyudthsak K, Mejia RF, Arita M, Hirai MY. PASMet: a web-based platform for prediction, modelling and analyses of metabolic systems. Nucleic Acids Res 2016; 44:W205-11. [PMID: 27174940 PMCID: PMC4987946 DOI: 10.1093/nar/gkw415] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/12/2016] [Accepted: 05/01/2016] [Indexed: 12/29/2022]
Abstract
PASMet (Prediction, Analysis and Simulation of Metabolic networks) is a web-based platform for proposing and verifying mathematical models to understand the dynamics of metabolism. The advantages of PASMet include user-friendliness and accessibility, which enable biologists and biochemists to easily perform mathematical modelling. PASMet offers a series of user-functions to handle the time-series data of metabolite concentrations. The functions are organised into four steps: (i) Prediction of a probable metabolic pathway and its regulation; (ii) Construction of mathematical models; (iii) Simulation of metabolic behaviours; and (iv) Analysis of metabolic system characteristics. Each function contains various statistical and mathematical methods that can be used independently. Users who may not have enough knowledge of computing or programming can easily and quickly analyse their local data without software downloads, updates or installations. Users only need to upload their files in comma-separated values (CSV) format or enter their model equations directly into the website. Once the time-series data or mathematical equations are uploaded, PASMet automatically performs computation on server-side. Then, users can interactively view their results and directly download them to their local computers. PASMet is freely available with no login requirement at http://pasmet.riken.jp/ from major web browsers on Windows, Mac and Linux operating systems.
Affiliation(s)
- Masanori Arita
  - RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa 230-0045, Japan
  - National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
- Masami Yokota Hirai
  - RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa 230-0045, Japan
|
24
|
Sriyudthsak K, Shiraishi F, Hirai MY. Mathematical Modeling and Dynamic Simulation of Metabolic Reaction Systems Using Metabolome Time Series Data. Front Mol Biosci 2016; 3:15. [PMID: 27200361 PMCID: PMC4853375 DOI: 10.3389/fmolb.2016.00015] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 12/30/2015] [Accepted: 04/12/2016] [Indexed: 01/05/2023]
Abstract
The high-throughput acquisition of metabolome data is greatly anticipated for the complete understanding of cellular metabolism in living organisms. A variety of analytical technologies have been developed to acquire large-scale metabolic profiles under different biological or environmental conditions. Time series data are useful for predicting the most likely metabolic pathways because they provide important information regarding the accumulation of metabolites, which implies causal relationships in the metabolic reaction network. Considerable effort has been undertaken to utilize these data for constructing a mathematical model merging system properties and quantitatively characterizing a whole metabolic system in toto. However, there are technical difficulties between benchmarking the provision and utilization of data. Although hundreds of metabolites can be measured, providing information on the metabolic reaction system, simultaneous measurement of thousands of metabolites is still challenging. In addition, it is nontrivial to logically predict the dynamic behaviors of unmeasurable metabolite concentrations without sufficient information on the metabolic reaction network. Consolidating the advantages of advancements in both metabolomics and mathematical modeling thus remains to be accomplished. This review outlines the conceptual basis of and recent advances in technologies in both research fields. It also highlights the potential for constructing a large-scale mathematical model by estimating model parameters from time series metabolome data in order to comprehensively understand metabolism at the systems level.
Affiliation(s)
- Fumihide Shiraishi
  - Department of Bioscience and Biotechnology, Graduate School of Bioresource and Bioenvironmental Science, Kyushu University, Fukuoka, Japan
|
25
|
Yao S, Yoo S, Yu D. Prior knowledge driven Granger causality analysis on gene regulatory network discovery. BMC Bioinformatics 2015; 16:273. [PMID: 26316173 PMCID: PMC4551367 DOI: 10.1186/s12859-015-0710-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 03/20/2015] [Accepted: 08/17/2015] [Indexed: 12/20/2022]
Abstract
Background: Our study focuses on discovering gene regulatory networks from time series gene expression data using the Granger causality (GC) model. However, the number of available time points (T) is usually much smaller than the number of target genes (n) in biological datasets. The widely applied pairwise GC model (PGC) and other regularization strategies can lead to a significant number of false identifications when n>>T. Results: In this study, we proposed a new method, viz., CGC-2SPR (CGC using two-step prior Ridge regularization), to resolve the problem by incorporating prior biological knowledge about a target gene data set. In our simulation experiments, the proposed CGC-2SPR methodology showed significant improvement in accuracy over other widely used GC modeling (PGC, Ridge and Lasso) and MI-based (MRNET and ARACNE) methods. In addition, we applied CGC-2SPR to a real biological dataset, i.e., the yeast metabolic cycle, and discovered more true positive edges with CGC-2SPR than with the other existing methods. Conclusions: In our research, we noticed a "1+1>2" effect when we combined prior knowledge and gene expression data to discover regulatory networks. Based on causality networks, we made a functional prediction that the Abm1 gene (whose functions were previously unknown) might be related to the yeast's responses to different levels of glucose. Our research improves causality modeling by combining heterogeneous knowledge, which is well aligned with the future direction of systems biology. Furthermore, we proposed a method of Monte Carlo significance estimation (MCSE) to calculate edge significances, which provide statistical meaning to the discovered causality networks. All of our data and source codes will be available under the link https://bitbucket.org/dtyu/granger-causality/wiki/Home.
Affiliation(s)
- Shun Yao
  - Department of Biochemistry and Cell Biology, Stony Brook University, Stony Brook, 11790, NY, USA
  - Computational Science Center, Brookhaven National Laboratory, Upton, 11793, NY, USA
- Shinjae Yoo
  - Computational Science Center, Brookhaven National Laboratory, Upton, 11793, NY, USA
- Dantong Yu
  - Computational Science Center, Brookhaven National Laboratory, Upton, 11793, NY, USA
|
26
|
Inferring Broad Regulatory Biology from Time Course Data: Have We Reached an Upper Bound under Constraints Typical of In Vivo Studies? PLoS One 2015; 10:e0127364. [PMID: 25984725 PMCID: PMC4435750 DOI: 10.1371/journal.pone.0127364] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/08/2014] [Accepted: 04/13/2015] [Indexed: 12/21/2022]
Abstract
There is a growing appreciation for the network biology that regulates the coordinated expression of molecular and cellular markers; however, questions persist regarding the identifiability of these networks. Here we explore some of the issues relevant to recovering directed regulatory networks from time course data collected under experimental constraints typical of in vivo studies. NetSim simulations of sparsely connected biological networks were used to evaluate two simple feature selection techniques used in the construction of linear Ordinary Differential Equation (ODE) models, namely truncation of terms versus latent vector projection. Performance was compared with ODE-based Time Series Network Identification (TSNI) integral, and the information-theoretic Time-Delay ARACNE (TD-ARACNE). Projection-based techniques and TSNI integral outperformed truncation-based selection and TD-ARACNE on aggregate networks with edge densities of 10-30%, i.e. transcription factor, protein-protein cliques and immune signaling networks. All were more robust to noise than truncation-based feature selection. Performance was comparable on the in silico 10-node DREAM 3 network, a 5-node Yeast synthetic network designed for In vivo Reverse-engineering and Modeling Assessment (IRMA) and a 9-node human HeLa cell cycle network of similar size and edge density. Performance was more sensitive to the number of time courses than to sample frequency and extrapolated better to larger networks by grouping experiments. In all cases performance declined rapidly in larger networks with lower edge density. The limited recovery and high false positive rates obtained overall bring into question our ability to generate informative time course data rather than the design of any particular reverse engineering algorithm.
|
27
|
He F, Wei HL, Billings SA, Sarrigiannis PG. A nonlinear generalization of spectral Granger causality. IEEE Trans Biomed Eng 2015; 61:1693-701. [PMID: 24845279 DOI: 10.1109/tbme.2014.2300636] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 11/10/2022]
Abstract
Spectral measures of linear Granger causality have been widely applied to study the causal connectivity between time series data in neuroscience, biology, and economics. Traditional Granger causality measures are based on linear autoregressive with exogenous inputs (ARX) models of time series data, which cannot truly reveal nonlinear effects in the data, especially in the frequency domain. In this study, it is shown that the classical Geweke's spectral causality measure can be explicitly linked with the output spectra of corresponding restricted and unrestricted time-domain models. The latter representation is then generalized to nonlinear bivariate signals and, for the first time, nonlinear causality analysis in the frequency domain is introduced. This is achieved by using nonlinear ARX (NARX) modeling of signals and a decomposition of the recently defined output frequency response function, which is related to the NARX model.
|
28
|
Detangling complex relationships in forensic data: principles and use of causal networks and their application to clinical forensic science. Int J Legal Med 2015; 129:1163-72. [DOI: 10.1007/s00414-015-1164-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 06/27/2014] [Accepted: 02/27/2015] [Indexed: 10/23/2022]
|
29
|
Lim N, d’Alché-Buc F, Auliac C, Michailidis G. Operator-valued kernel-based vector autoregressive models for network inference. Mach Learn 2014. [DOI: 10.1007/s10994-014-5479-3] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Indexed: 11/29/2022]
|
30
|
Acerbi E, Zelante T, Narang V, Stella F. Gene network inference using continuous time Bayesian networks: a comparative study and application to Th17 cell differentiation. BMC Bioinformatics 2014; 15:387. [PMID: 25495206 PMCID: PMC4267461 DOI: 10.1186/s12859-014-0387-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 07/11/2014] [Accepted: 11/17/2014] [Indexed: 12/17/2022] Open
Abstract
Background: Dynamic aspects of gene regulatory networks are typically investigated by measuring system variables at multiple time points. Current state-of-the-art computational approaches for reconstructing gene networks directly build on such data, making a strong assumption that the system evolves in a synchronous fashion at fixed points in time. However, nowadays omics data are being generated with increasing time course granularity. Thus, modellers now have the possibility to represent the system as evolving in continuous time and to improve the models’ expressiveness.
Results: Continuous time Bayesian networks are proposed as a new approach for gene network reconstruction from time course expression data. Their performance was compared to two state-of-the-art methods: dynamic Bayesian networks and Granger causality analysis. On simulated data, the methods comparison was carried out for networks of increasing size, for measurements taken at different time granularity densities and for measurements unevenly spaced over time. Continuous time Bayesian networks outperformed the other methods in terms of the accuracy of regulatory interactions learnt from data for all network sizes. Furthermore, their performance degraded smoothly as the size of the network increased. Continuous time Bayesian networks were significantly better than dynamic Bayesian networks for all time granularities tested and better than Granger causality for dense time series. Both continuous time Bayesian networks and Granger causality performed robustly for unevenly spaced time series, with no significant loss of performance compared to the evenly spaced case, while the same did not hold true for dynamic Bayesian networks. The comparison included the IRMA experimental datasets which confirmed the effectiveness of the proposed method. Continuous time Bayesian networks were then applied to elucidate the regulatory mechanisms controlling murine T helper 17 (Th17) cell differentiation and were found to be effective in discovering well-known regulatory mechanisms, as well as new plausible biological insights.
Conclusions: Continuous time Bayesian networks were effective on networks of both small and large size and were particularly feasible when the measurements were not evenly distributed over time. Reconstruction of the murine Th17 cell differentiation network using continuous time Bayesian networks revealed several autocrine loops, suggesting that Th17 cells may be auto regulating their own differentiation process.
Affiliation(s)
- Enzo Acerbi
- Singapore Immunology Network (SIgN), A*STAR, 8A Biomedical Grove, Immunos Building, Level 4 138648, Singapore.
|
31
|
Duan P, Chen T, Shah SL, Yang F. Methods for root cause diagnosis of plant-wide oscillations. AIChE J 2014. [DOI: 10.1002/aic.14391] [Citation(s) in RCA: 64] [Impact Index Per Article: 6.4] [Indexed: 11/07/2022]
Affiliation(s)
- Ping Duan
- Dept. of Electrical and Computer Engineering; University of Alberta; Edmonton AB Canada T6G 2V4
- Tongwen Chen
- Dept. of Electrical and Computer Engineering; University of Alberta; Edmonton AB Canada T6G 2V4
- Sirish L. Shah
- Dept. of Chemical and Materials Engineering; University of Alberta; Edmonton AB Canada T6G 2G6
- Fan Yang
- Tsinghua National Laboratory for Information Science and Technology and; Dept. of Automation, Tsinghua University; Beijing 100084 China
|
32
|
Hartung T, Hoffmann S, Stephens M. Mechanistic validation. ALTEX-ALTERNATIVES TO ANIMAL EXPERIMENTATION 2013; 30:119-30. [PMID: 23665802 DOI: 10.14573/altex.2013.2.119] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Indexed: 11/23/2022]
Abstract
Validation of new approaches in regulatory toxicology is commonly defined as the independent assessment of the reproducibility and relevance (the scientific basis and predictive capacity) of a test for a particular purpose. In large ring trials, the emphasis to date has been mainly on reproducibility and predictive capacity (comparison to the traditional test) with less attention given to the scientific or mechanistic basis. Assessing predictive capacity is difficult for novel approaches (which are based on mechanism), such as pathways of toxicity or the complex networks within the organism (systems toxicology). This is highly relevant for implementing Toxicology for the 21st Century, either by high-throughput testing in the ToxCast/Tox21 project or omics-based testing in the Human Toxome Project. This article explores the mostly neglected assessment of a test's scientific basis, which moves mechanism and causality to the foreground when validating/qualifying tests. Such mechanistic validation faces the problem of establishing causality in complex systems. However, pragmatic adaptations of the Bradford Hill criteria, as well as bioinformatic tools, are emerging. As critical infrastructures of the organism are perturbed by a toxic mechanism we argue that by focusing on the target of toxicity and its vulnerability, in addition to the way it is perturbed, we can anchor the identification of the mechanism and its verification.
Affiliation(s)
- Thomas Hartung
- Center for Alternatives to Animal Testing (CAAT), Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA.
|
33
|
Intra- and inter-brain synchronization during musical improvisation on the guitar. PLoS One 2013; 8:e73852. [PMID: 24040094 PMCID: PMC3769391 DOI: 10.1371/journal.pone.0073852] [Citation(s) in RCA: 82] [Impact Index Per Article: 7.5] [Received: 12/18/2012] [Accepted: 07/25/2013] [Indexed: 01/23/2023] Open
Abstract
Humans interact with the environment through sensory and motor acts. Some of these interactions require synchronization among two or more individuals. Multiple-trial designs, which we have used in past work to study interbrain synchronization in the course of joint action, constrain the range of observable interactions. To overcome the limitations of multiple-trial designs, we conducted single-trial analyses of electroencephalography (EEG) signals recorded from eight pairs of guitarists engaged in musical improvisation. We identified hyper-brain networks based on a complex interplay of different frequencies. The intra-brain connections primarily involved higher frequencies (e.g., beta), whereas inter-brain connections primarily operated at lower frequencies (e.g., delta and theta). The topology of hyper-brain networks was frequency-dependent, with a tendency to become more regular at higher frequencies. We also found hyper-brain modules that included nodes (i.e., EEG electrodes) from both brains. Some of the observed network properties were related to musical roles during improvisation. Our findings replicate and extend earlier work and point to mechanisms that enable individuals to engage in temporally coordinated joint action.
|
34
|
McFarlin DR, Kerr DL, Nitschke JB. Upsampling to 400-ms resolution for assessing effective connectivity in functional magnetic resonance imaging data with Granger causality. Brain Connect 2013; 3:61-71. [PMID: 23134194 PMCID: PMC3621314 DOI: 10.1089/brain.2012.0093] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022] Open
Abstract
Granger causality analysis of functional magnetic resonance imaging (fMRI) blood-oxygen-level-dependent signal data allows one to infer the direction and magnitude of influence that brain regions exert on one another. We employed a method for upsampling the time resolution of fMRI data that does not require additional interpolation beyond the interpolation that is regularly used for slice-timing correction. The mathematics for this new method are provided, and simulations demonstrate its viability. Using fMRI, 17 snake phobics and 19 healthy controls viewed snake, disgust, and neutral fish video clips preceded by anticipatory cues. Multivariate Granger causality models at the native 2-sec resolution and at the upsampled 400-ms resolution assessed directional associations of fMRI data among 13 anatomical regions of interest identified in prior research on anxiety and emotion. Superior sensitivity was observed for the 400-ms model, both for connectivity within each group and for group differences in connectivity. Context-dependent analyses for the 400-ms multivariate Granger causality model revealed the specific trial types showing group differences in connectivity. This is the first demonstration of effective connectivity of fMRI data using a method for achieving 400-ms resolution without sacrificing accuracy available at 2-sec resolution.
Affiliation(s)
- Daniel R McFarlin
- Waisman Laboratory for Brain Imaging and Behavior, University of Wisconsin-Madison, Madison, Wisconsin 53705, USA.
|
35
|
Abstract
Network inference approaches are now widely used in biological applications to probe regulatory relationships between molecular components such as genes or proteins. Many methods have been proposed for this setting, but the connections and differences between their statistical formulations have received less attention. In this paper, we show how a broad class of statistical network inference methods, including a number of existing approaches, can be described in terms of variable selection for the linear model. This reveals some subtle but important differences between the methods, including the treatment of time intervals in discretely observed data. In developing a general formulation, we also explore the relationship between single-cell stochastic dynamics and network inference on averages over cells. This clarifies the link between biochemical networks as they operate at the cellular level and network inference as carried out on data that are averages over populations of cells. We present empirical results, comparing thirty-two network inference methods that are instances of the general formulation we describe, using two published dynamical models. Our investigation sheds light on the applicability and limitations of network inference and provides guidance for practitioners and suggestions for experimental design.
Affiliation(s)
- C J Oates
- Centre for Complexity Science, University of Warwick, CV4 7AL, UK ; Department of Statistics, University of Warwick, CV4 7AL, UK ; Netherlands Cancer Institute, 1066 CX, Amsterdam, The Netherlands
|
36
|
Cheng W, Ji X, Zhang J, Feng J. Individual classification of ADHD patients by integrating multiscale neuroimaging markers and advanced pattern recognition techniques. Front Syst Neurosci 2012; 6:58. [PMID: 22888314 PMCID: PMC3412279 DOI: 10.3389/fnsys.2012.00058] [Citation(s) in RCA: 61] [Impact Index Per Article: 5.1] [Received: 03/31/2012] [Accepted: 07/19/2012] [Indexed: 11/01/2022] Open
Abstract
Accurate classification or prediction of the brain state across individual subjects, i.e., healthy or with brain disorders, is generally a more difficult task than merely finding group differences. The former must be approached with highly informative and sensitive biomarkers as well as effective pattern classification/feature selection approaches. In this paper, we propose a systematic methodology to discriminate attention deficit hyperactivity disorder (ADHD) patients from healthy controls on the individual level. Multiple neuroimaging markers that have proved to be sensitive features are identified, which include multiscale characteristics extracted from blood oxygenation level dependent (BOLD) signals, such as regional homogeneity (ReHo) and amplitude of low-frequency fluctuations. Functional connectivity derived from Pearson, partial, and spatial correlation is also utilized to reflect the abnormal patterns of functional integration, or dysconnectivity syndromes, in the brain. These neuroimaging markers are calculated on either the voxel or the regional level. An advanced feature selection approach is then designed, including a brain-wise association study (BWAS). Using identified features and proper feature integration, a support vector machine (SVM) classifier can achieve a cross-validated classification accuracy of 76.15% across individuals from a large dataset consisting of 141 healthy controls and 98 ADHD patients, with the sensitivity being 63.27% and the specificity being 85.11%. Our results show that the most discriminative features for classification are primarily associated with the frontal and cerebellar regions. The proposed methodology is expected to improve clinical diagnosis and evaluation of treatment for ADHD patients, and to have wider applications in diagnosis of general neuropsychiatric disorders.
Affiliation(s)
- Wei Cheng
- Centre for Computational Systems Biology, Fudan University Shanghai, P.R. China
|
37
|
Zhang X, Kendrick KM, Zhou H, Zhan Y, Feng J. A computational study on altered theta-gamma coupling during learning and phase coding. PLoS One 2012; 7:e36472. [PMID: 22737207 PMCID: PMC3380897 DOI: 10.1371/journal.pone.0036472] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 02/21/2011] [Accepted: 04/09/2012] [Indexed: 11/19/2022] Open
Abstract
There is considerable interest in the role of coupling between theta and gamma oscillations in the brain in the context of learning and memory. Here we have used a neural network model which is capable of producing coupling of theta phase to gamma amplitude, firstly to explore its ability to reproduce reported learning changes and secondly to examine memory-span and phase coding effects. The spiking neural network incorporates two kinetically different GABAA receptor-mediated currents to generate both theta and gamma rhythms, and we have found that by selective alteration of both NMDA receptors and GABAA,slow receptors it can reproduce learning-related changes in the strength of coupling between theta and gamma either with or without coincident changes in theta amplitude. When the model was used to explore the relationship between theta and gamma oscillations, working memory capacity and phase coding, it showed that the potential storage capacity of short term memories, in terms of nested gamma-subcycles, coincides with the maximal theta power. Increasing theta power is also related to the precision of theta phase, which functions as a potential timing clock for neuronal firing in the cortex or hippocampus.
Affiliation(s)
- Xuejuan Zhang
- Mathematical Department, Zhejiang Normal University, Jinhua, P.R. China
- Keith M. Kendrick
- Key Laboratory for Neuroinformation, Social Cognition and Affective Neuroscience Group, School of Life Sciences and Technology, University of Electronic Science and Technology of China, Chengdu, P.R. China
- Haifu Zhou
- Mathematical Department, Zhejiang Normal University, Jinhua, P.R. China
- Yang Zhan
- Mouse Biology Unit, European Molecular Biology Laboratory, Monterotondo, Italy
- EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Jianfeng Feng
- Mathematical Department, Zhejiang Normal University, Jinhua, P.R. China
- Center for Computational System Biology, Fudan University, Shanghai, P.R. China
- Department of Computer Science, Warwick University, Coventry, United Kingdom
|
38
|
|
39
|
Abstract
Estimating conditional dependence between two random variables given the knowledge of a third random variable is essential in neuroscientific applications to understand the causal architecture of a distributed network. However, existing methods of assessing conditional dependence, such as the conditional mutual information, are computationally expensive, involve free parameters, and are difficult to understand in the context of realizations. In this letter, we discuss a novel approach to this problem and develop a computationally simple and parameter-free estimator. The difference between the proposed approach and the existing ones is that the former expresses conditional dependence in terms of a finite set of realizations, whereas the latter use random variables, which are not available in practice. We call this approach conditional association, since it is based on a generalization of the concept of association to arbitrary metric spaces. We also discuss a novel and computationally efficient approach of generating surrogate data for evaluating the significance of the acquired association value.
Affiliation(s)
- Sohan Seth
- Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32608, USA.
|
40
|
Castro-Melchor M, Le H, Hu WS. Transcriptome data analysis for cell culture processes. ADVANCES IN BIOCHEMICAL ENGINEERING/BIOTECHNOLOGY 2012; 127:27-70. [PMID: 22194060 DOI: 10.1007/10_2011_116] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/31/2023]
Abstract
In the past decade, DNA microarrays have fundamentally changed the way we study complex biological systems. By measuring the expression levels of thousands of transcripts, the paradigm of studying organisms has shifted from focusing on the local phenomena of a few genes to surveying the whole genome. DNA microarrays are used in a variety of ways, from simple comparisons between two samples to more intricate time-series studies. With the large number of genes being studied, the dimensionality of the problem is inevitably high. The analysis of microarray data thus requires specific approaches. In the case of time-series microarray studies, data analysis is further complicated by the correlation between successive time points in a series. In this review, we survey the methodologies used in the analysis of static and time-series microarray data, covering data pre-processing, identification of differentially expressed genes, profile pattern recognition, pathway analysis, and network reconstruction. When available, examples of their use in mammalian cell cultures are presented.
|
41
|
Kleinberg S, Hripcsak G. A review of causal inference for biomedical informatics. J Biomed Inform 2011; 44:1102-12. [PMID: 21782035 PMCID: PMC3219814 DOI: 10.1016/j.jbi.2011.07.001] [Citation(s) in RCA: 100] [Impact Index Per Article: 7.7] [Received: 01/25/2011] [Revised: 05/16/2011] [Accepted: 07/04/2011] [Indexed: 11/26/2022]
Abstract
Causality is an important concept throughout the health sciences and is particularly vital for informatics work such as finding adverse drug events or risk factors for disease using electronic health records. While philosophers and scientists working for centuries on formalizing what makes something a cause have not reached a consensus, new methods for inference show that we can make progress in this area in many practical cases. This article reviews core concepts in understanding and identifying causality and then reviews current computational methods for inference and explanation, focusing on inference from large-scale observational data. While the problem is not fully solved, we show that graphical models and Granger causality provide useful frameworks for inference and that a more recent approach based on temporal logic addresses some of the limitations of these methods.
Affiliation(s)
- Samantha Kleinberg
- Department of Biomedical Informatics, Columbia University, New York, NY 10032, United States.
|
42
|
Benz HL, Zhang H, Bezerianos A, Acharya S, Crone NE, Zheng X, Thakor NV. Connectivity analysis as a novel approach to motor decoding for prosthesis control. IEEE Trans Neural Syst Rehabil Eng 2011; 20:143-52. [PMID: 22084052 DOI: 10.1109/tnsre.2011.2175309] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Indexed: 11/09/2022]
Abstract
The use of neural signals for prosthesis control is an emerging frontier of research to restore lost function to amputees and the paralyzed. Electrocorticography (ECoG) brain-machine interfaces (BMI) are an alternative to EEG and neural spiking and local field potential BMI approaches. Conventional ECoG BMIs rely on spectral analysis at specific electrode sites to extract signals for controlling prostheses. We compare traditional features with information about the connectivity of an ECoG electrode network. We use time-varying dynamic Bayesian networks (TV-DBN) to determine connectivity between ECoG channels in humans during a motor task. We show that, on average, TV-DBN connectivity decreases from baseline preceding movement and then becomes negative, indicating an alteration in the phase relationship between electrode pairs. In some subjects, this change occurs preceding and during movement, before changes in low or high frequency power. We tested TV-DBN output in a hand kinematic decoder and obtained an average correlation coefficient (r²) between actual and predicted joint angle of 0.40, and as high as 0.66 in one subject. This result compares favorably with spectral feature decoders, for which the average correlation coefficient was 0.13. This work introduces a new feature set based on connectivity and demonstrates its potential to improve ECoG BMI accuracy.
Affiliation(s)
- Heather L Benz
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205, USA.
|
43
|
Valdes-Sosa PA, Roebroeck A, Daunizeau J, Friston K. Effective connectivity: influence, causality and biophysical modeling. Neuroimage 2011; 58:339-61. [PMID: 21477655 PMCID: PMC3167373 DOI: 10.1016/j.neuroimage.2011.03.058] [Citation(s) in RCA: 252] [Impact Index Per Article: 19.4] [Received: 09/22/2010] [Revised: 03/15/2011] [Accepted: 03/23/2011] [Indexed: 11/30/2022] Open
Abstract
This is the final paper in a Comments and Controversies series dedicated to "The identification of interacting networks in the brain using fMRI: Model selection, causality and deconvolution". We argue that discovering effective connectivity depends critically on state-space models with biophysically informed observation and state equations. These models have to be endowed with priors on unknown parameters and afford checks for model identifiability. We consider the similarities and differences among Dynamic Causal Modeling, Granger Causal Modeling and other approaches. We establish links between past and current statistical causal modeling, in terms of Bayesian dependency graphs and Wiener-Akaike-Granger-Schweder influence measures. We show that some of the challenges faced in this field have promising solutions and speculate on future developments.
Affiliation(s)
- Pedro A Valdes-Sosa
- Cuban Neuroscience Center, Ave 25 #15202 esquina 158, Cubanacan, Playa, Cuba.
|
44
|
Ge T, Feng J, Grabenhorst F, Rolls ET. Componential Granger causality, and its application to identifying the source and mechanisms of the top-down biased activation that controls attention to affective vs sensory processing. Neuroimage 2011; 59:1846-58. [PMID: 21888980 DOI: 10.1016/j.neuroimage.2011.08.047] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.4] [Received: 05/19/2011] [Revised: 07/13/2011] [Accepted: 08/15/2011] [Indexed: 11/29/2022] Open
Abstract
We describe a new measure of Granger causality, componential Granger causality, and show how it can be applied to the identification of the directionality of influences between brain areas with functional neuroimaging data. Componential Granger causality measures the effect of y on x, but allows interaction effects between y and x to be measured. In addition, the terms in componential Granger causality sum to 1, allowing causal effects to be directly compared between systems. We show using componential Granger causality analysis applied to an fMRI investigation that there is a top-down attentional effect from the anterior dorsolateral prefrontal cortex to the orbitofrontal cortex when attention is paid to the pleasantness of a taste, and that this effect depends on the activity in the orbitofrontal cortex as shown by the interaction term. Correspondingly there is a top-down attentional effect from the posterior dorsolateral prefrontal cortex to the insular primary taste cortex when attention is paid to the intensity of a taste, and this effect depends on the activity of the insular primary taste cortex as shown by the interaction term. Componential Granger causality thus not only can reveal the directionality of effects between areas (and these can be bidirectional), but also allows the mechanisms to be understood in terms of whether the causal influence of one system on another depends on the state of the system being causally influenced. Componential Granger causality measures the full effects of second order statistics by including variance and covariance effects between each time series, thus allowing interaction effects to be measured, and also provides a systematic framework within which to measure the effects of cross, self, and noise contributions to causality. The findings reveal some of the mechanisms involved in a biased activation theory of selective attention.
Affiliation(s)
- Tian Ge
- Centre for Computational Systems Biology, School of Mathematical Sciences, Fudan University, Shanghai, China
|
45
|
Penfold CA, Wild DL. How to infer gene networks from expression profiles, revisited. Interface Focus 2011; 1:857-70. [PMID: 23226586 DOI: 10.1098/rsfs.2011.0053] [Citation(s) in RCA: 136] [Impact Index Per Article: 10.5] [Received: 05/31/2011] [Accepted: 07/12/2011] [Indexed: 01/17/2023] Open
Abstract
Inferring the topology of a gene-regulatory network (GRN) from genome-scale time-series measurements of transcriptional change has proved useful for disentangling complex biological processes. To address the challenges associated with this inference, a number of competing approaches have previously been used, including examples from information theory, Bayesian and dynamic Bayesian networks (DBNs), and ordinary differential equations (ODEs) or stochastic differential equations. The performance of these competing approaches has previously been assessed using a variety of in silico and in vivo datasets. Here, we revisit this work by assessing the performance of more recent network inference algorithms, including a novel non-parametric learning approach based upon nonlinear dynamical systems. For larger GRNs, containing hundreds of genes, these non-parametric approaches more accurately infer network structures than do traditional approaches, but at significant computational cost. For smaller systems, DBNs are competitive with the non-parametric approaches with respect to computational time and accuracy, and both of these approaches appear to be more accurate than Granger causality-based methods and those using simple ODE models.
|
46
|
Granger causality with signal-dependent noise. Neuroimage 2011; 57:1422-9. [PMID: 21645623 DOI: 10.1016/j.neuroimage.2011.05.054] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 03/08/2011] [Revised: 05/05/2011] [Accepted: 05/17/2011] [Indexed: 01/28/2023] Open
Abstract
It is generally believed that the noise variance in in vivo neuronal data exhibits time-varying volatility, particularly signal-dependent noise. Despite being a widely used and powerful tool to detect causal influences in various data sources, Granger causality has not been well tailored for time-varying volatility models. In this technical note, a unified treatment of the causal influences in both mean and variance is proposed for models with signal-dependent noise in both time and frequency domains. The approach is first systematically validated on toy models, and then applied to the physiological data collected from Parkinson patients, where a clear advantage over the classical Granger causality is demonstrated.
|
47
|
Hu S, Dai G, Worrell GA, Dai Q, Liang H. Causality analysis of neural connectivity: critical examination of existing methods and advances of new methods. IEEE Trans Neural Netw 2011; 22:829-44. [PMID: 21511564 DOI: 10.1109/tnn.2011.2123917] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Indexed: 11/08/2022]
Abstract
Granger causality (GC) is one of the most popular measures for revealing causal influence between time series and has been widely applied in economics and neuroscience. In particular, its frequency-domain counterpart, spectral GC, as well as other Granger-like causality measures, have recently been applied to study causal interactions between brain areas in different frequency ranges during cognitive and perceptual tasks. In this paper, we show that: 1) GC in the time domain cannot correctly determine how strongly one time series influences the other when there is directional causality between two time series, and 2) spectral GC and other Granger-like causality measures have inherent shortcomings and/or limitations because they use the transfer function (or its inverse matrix) and only partial information from the linear regression model. We therefore propose two novel causality measures (in the time and frequency domains) for the linear regression model, called new causality and new spectral causality, respectively, which are more reasonable and understandable than GC or Granger-like measures. Using one simple example, we point out that, in the time domain, both new causality and GC adopt the concept of proportion, but they are defined on two different equations, where the equation for GC is only part of the equation for new causality. Thus, new causality is a natural extension of GC with a sound conceptual/theoretical basis, whereas GC does not measure the desired causal influence. Several examples confirm that the new causality measures have distinct advantages over GC and Granger-like measures.
Finally, we conduct event-related potential causality analysis for a subject with intracranial depth electrodes undergoing evaluation for epilepsy surgery and show that, in the frequency domain, all measures reveal significant directional event-related causality, but the result from new spectral causality is consistent with event-related time-frequency power spectrum activity. Spectral GC and the other Granger-like measures are shown to generate misleading results. The proposed new causality measures may have wide potential applications in economics and neuroscience.
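For orientation, the conventional time-domain GC that the abstract criticizes can be sketched as the log ratio of residual variances from a restricted and a full autoregressive fit. The snippet below is a minimal illustration under stated assumptions, not the authors' code and not their "new causality" measure. The one-lag, zero-mean, no-intercept model, the toy coupling values, and the names `granger_lag1` and `simulate` are all hypothetical choices made for this example.

```python
import math
import random


def granger_lag1(x, y):
    """Textbook time-domain Granger causality from x to y (one lag, no intercept).

    Restricted model: y[t] = a * y[t-1] + e1[t]
    Full model:       y[t] = a * y[t-1] + b * x[t-1] + e2[t]
    Returns ln(RSS_restricted / RSS_full), the log ratio of residual variances.
    """
    target = y[1:]
    y_past = y[:-1]
    x_past = x[:-1]

    # Sums that appear in the least-squares normal equations.
    syy = sum(p * p for p in y_past)
    sxx = sum(q * q for q in x_past)
    sxy = sum(p * q for p, q in zip(y_past, x_past))
    sty = sum(t * p for t, p in zip(target, y_past))
    stx = sum(t * q for t, q in zip(target, x_past))

    # Restricted fit: a single regressor.
    a_r = sty / syy
    rss_r = sum((t - a_r * p) ** 2 for t, p in zip(target, y_past))

    # Full fit: solve the 2x2 normal equations with Cramer's rule.
    det = syy * sxx - sxy * sxy
    a_f = (sty * sxx - stx * sxy) / det
    b_f = (stx * syy - sty * sxy) / det
    rss_f = sum((t - a_f * p - b_f * q) ** 2
                for t, p, q in zip(target, y_past, x_past))

    return math.log(rss_r / rss_f)


def simulate(n=5000, coupling=0.5, seed=0):
    """Toy system (an assumption of this sketch): x drives y, y does not drive x."""
    rng = random.Random(seed)
    x, y = [0.0], [0.0]
    for _ in range(n - 1):
        x_new = 0.5 * x[-1] + rng.gauss(0.0, 1.0)
        y_new = 0.5 * y[-1] + coupling * x[-1] + rng.gauss(0.0, 1.0)
        x.append(x_new)
        y.append(y_new)
    return x, y


x, y = simulate()
gc_xy = granger_lag1(x, y)  # direction x -> y, where coupling exists
gc_yx = granger_lag1(y, x)  # direction y -> x, where no coupling exists
```

In this toy system `gc_xy` should come out clearly larger than `gc_yx`. The value summarizes how much the prediction error shrinks when the other series' past is added, which is the "proportion" notion the abstract refers to; it does not by itself report the coupling coefficient.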
Affiliation(s)
- Sanqing Hu
- College of Computer Science, Hangzhou Dianzi University, Hangzhou, China.
48
Yuan Y, Li CT, Windram O. Directed partial correlation: inferring large-scale gene regulatory network through induced topology disruptions. PLoS One 2011; 6:e16835. [PMID: 21494330 PMCID: PMC3071805 DOI: 10.1371/journal.pone.0016835] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 08/24/2010] [Accepted: 01/11/2011] [Indexed: 11/19/2022] Open
Abstract
Inferring regulatory relationships among many genes based on their temporal variation in transcript abundance has been a popular research topic. Due to the nature of microarray experiments, classical tools for time series analysis lose power because the number of variables far exceeds the number of samples. In this paper, we describe some of the existing multivariate inference techniques that are applicable to hundreds of variables and show the potential challenges for small-sample, large-scale data. We propose a directed partial correlation (DPC) method as an efficient and effective solution to regulatory network inference using these data. The proposed method is designed specifically for large-scale genomic datasets. It combines the efficiency of partial correlation for setting up network topology by testing conditional independence with the concept of Granger causality to assess topology change with induced interruptions. The idea is that when a transcription factor is induced artificially within a gene network, the disruption of the network by the induction signifies a gene's role in transcriptional regulation. The benchmarking results using GeneNetWeaver, the simulator for the DREAM challenges, provide strong evidence of the outstanding performance of the proposed DPC method. When applied to real biological data, the inferred starch metabolism network in Arabidopsis reveals many biologically meaningful network modules worthy of further investigation. These results collectively suggest DPC is a versatile tool for genomics research. The R package DPC is available for download (http://code.google.com/p/dpcnet/).
Affiliation(s)
- Yinyin Yuan
- Cancer Research UK, Cambridge Research Institute, Cambridge, United Kingdom.
49
Shojaie A, Michailidis G. Discovering graphical Granger causality using the truncating lasso penalty. Bioinformatics 2010; 26:i517-23. [PMID: 20823316 PMCID: PMC2935442 DOI: 10.1093/bioinformatics/btq377] [Citation(s) in RCA: 103] [Impact Index Per Article: 7.4] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Components of biological systems interact with each other in order to carry out vital cell functions. Such information can be used to improve estimation and inference, and to obtain better insights into the underlying cellular mechanisms. Discovering regulatory interactions among genes is therefore an important problem in systems biology. Whole-genome expression data over time provides an opportunity to determine how the expression levels of genes are affected by changes in transcription levels of other genes, and can therefore be used to discover regulatory interactions among genes. RESULTS: In this article, we propose a novel penalization method, called truncating lasso, for estimation of causal relationships from time-course gene expression data. The proposed penalty can correctly determine the order of the underlying time series, and improves the performance of the lasso-type estimators. Moreover, the resulting estimate provides information on the time lag between activation of transcription factors and their effects on regulated genes. We provide an efficient algorithm for estimation of model parameters, and show that the proposed method can consistently discover causal relationships in the large p, small n setting. The performance of the proposed model is evaluated favorably in simulated, as well as real, data examples. AVAILABILITY: The proposed truncating lasso method is implemented in the R-package 'grangerTlasso' and is freely available at http://www.stat.lsa.umich.edu/~shojaie/.
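As a rough illustration of the lasso-type estimators that the abstract builds on, the sketch below fits a plain L1-penalized lag-1 vector autoregression by coordinate descent and reads Granger-style edges from the nonzero coefficients. This is not the truncating lasso penalty and not the 'grangerTlasso' package. The single lag, the penalty value, the toy coefficient matrix `A_TRUE`, and every function name are assumptions made for this example.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=50):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    resid = list(y)  # residual y - X b, kept up to date after each update
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that excludes column j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def simulate_var1(A, n, seed=0):
    """Zero-mean VAR(1): A[i][k] is the effect of variable k at t-1 on variable i at t."""
    rng = random.Random(seed)
    p = len(A)
    series = [[0.0] * p]
    for _ in range(n - 1):
        prev = series[-1]
        series.append([sum(A[i][k] * prev[k] for k in range(p))
                       + rng.gauss(0.0, 1.0) for i in range(p)])
    return series


def lasso_granger(series, lam):
    """One lasso regression per target variable on the lagged values of all variables."""
    X = series[:-1]
    p = len(series[0])
    return [lasso_cd(X, [row[i] for row in series[1:]], lam) for i in range(p)]


# Toy network (hypothetical): variable 0 drives variable 1; variable 2 is isolated.
A_TRUE = [[0.5, 0.0, 0.0],
          [0.6, 0.3, 0.0],
          [0.0, 0.0, 0.4]]
data = simulate_var1(A_TRUE, n=2000)
coef = lasso_granger(data, lam=0.15)  # coef[i][k] estimates A_TRUE[i][k]
```

Here `coef[1][0]` should stay well away from zero while entries such as `coef[1][2]` are shrunk to or near zero. The surviving coefficients are biased toward zero by the penalty, so they indicate which edges are present rather than giving unbiased strengths.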
Affiliation(s)
- Ali Shojaie
- Department of Statistics, University of Michigan, Ann Arbor, Michigan 48109, USA.
50
Küffner R, Petri T, Windhager L, Zimmer R. Petri Nets with Fuzzy Logic (PNFL): reverse engineering and parametrization. PLoS One 2010; 5. [PMID: 20862218 PMCID: PMC2942832 DOI: 10.1371/journal.pone.0012807] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Received: 03/31/2010] [Accepted: 06/18/2010] [Indexed: 12/31/2022] Open
Abstract
Background: The recent DREAM4 blind assessment provided a particularly realistic and challenging setting for network reverse engineering methods. The in silico part of DREAM4 solicited the inference of cycle-rich gene regulatory networks from heterogeneous, noisy expression data including time courses as well as knockout, knockdown and multifactorial perturbations. Methodology and Principal Findings: We inferred and parametrized simulation models based on Petri Nets with Fuzzy Logic (PNFL). This completely automated approach correctly reconstructed networks with cycles as well as oscillating network motifs. PNFL was evaluated as the best performer on DREAM4 in silico networks of size 10 with an area under the precision-recall curve (AUPR) of 81%. Besides topology, we inferred a range of additional mechanistic details with good reliability, e.g. distinguishing activation from inhibition as well as dependent from independent regulation. Our models also performed well on new experimental conditions such as double knockout mutations that were not included in the provided datasets. Conclusions: The inference of biological networks substantially benefits from methods that are expressive enough to deal with diverse datasets in a unified way. At the same time, overly complex approaches could generate multiple different models that explain the data equally well. PNFL appears to strike the balance between expressive power and complexity. This also applies to the intuitive representation of PNFL models combining a straightforward graphical notation with colloquial fuzzy parameters.
Affiliation(s)
- Robert Küffner
- Institut für Informatik, Ludwig-Maximilians-Universität, München, Germany.