1
Theorell A, Jadebeck JF, Wiechert W, McFadden J, Nöh K. Rethinking 13C-metabolic flux analysis - The Bayesian way of flux inference. Metab Eng 2024; 83:137-149. [PMID: 38582144 DOI: 10.1016/j.ymben.2024.03.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 03/22/2024] [Accepted: 03/23/2024] [Indexed: 04/08/2024]
Abstract
Metabolic reaction rates (fluxes) play a crucial role in comprehending cellular phenotypes and are essential in areas such as metabolic engineering, biotechnology, and biomedical research. The state-of-the-art technique for estimating fluxes is metabolic flux analysis using isotopic labelling (13C-MFA), which uses a dataset-model combination to determine the fluxes. Bayesian statistical methods are gaining popularity in the field of life sciences, but the use of 13C-MFA is still dominated by conventional best-fit approaches. The slow take-up of Bayesian approaches is, at least partly, due to the unfamiliarity of Bayesian methods to metabolic engineering researchers. To address this unfamiliarity, we here outline similarities and differences between the two approaches and highlight particular advantages of the Bayesian way of flux analysis. With a real-life example, re-analysing a moderately informative labelling dataset of E. coli, we identify situations in which Bayesian methods are advantageous and more informative, pointing to potential pitfalls of current 13C-MFA evaluation approaches. We propose the use of Bayesian model averaging (BMA) for flux inference as a means of overcoming the problem of model uncertainty through its tendency to assign low probabilities both to models that are unsupported by data and to models that are overly complex. In this capacity, BMA resembles a tempered Ockham's razor. With the tempered razor as a guide, BMA-based 13C-MFA alleviates the problem of model selection uncertainty and is thereby capable of becoming a game changer for metabolic engineering by uncovering new insights and inspiring novel approaches.
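As a rough illustration of the model-averaging idea mentioned in this abstract, the sketch below turns per-model log marginal likelihoods into posterior model probabilities and uses them to average a flux estimate. It is a generic textbook form of BMA, not the authors' implementation; the function names, the uniform model prior, and all numbers are hypothetical. The penalty on overly complex models is carried by the marginal likelihoods themselves, which this sketch takes as given inputs.

```python
import math

def bma_posterior_weights(log_evidences, log_priors=None):
    """Posterior model probabilities from log marginal likelihoods.

    Assumes a uniform prior over models when log_priors is None.
    """
    n = len(log_evidences)
    if log_priors is None:
        log_priors = [-math.log(n)] * n
    log_joint = [le + lp for le, lp in zip(log_evidences, log_priors)]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_joint)
    unnorm = [math.exp(v - m) for v in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def bma_mean(weights, per_model_means):
    """Model-averaged posterior mean of a quantity shared by all models."""
    return sum(w * mu for w, mu in zip(weights, per_model_means))

# Hypothetical inputs: three candidate network models, each with a log
# marginal likelihood and a posterior mean for the same flux.
log_evidences = [-120.0, -118.5, -125.0]
flux_means = [1.2, 0.9, 1.6]

weights = bma_posterior_weights(log_evidences)
flux_bma = bma_mean(weights, flux_means)
print(weights, flux_bma)
```

The averaged estimate is dominated by the models with the highest evidence, while poorly supported models still contribute in proportion to their (small) posterior probability.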
Affiliation(s)
- Axel Theorell
- Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
- Johann F Jadebeck
- Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany; Computational Systems Biotechnology (AVT.CSB), RWTH Aachen University, 52062 Aachen, Germany
- Wolfgang Wiechert
- Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany; Computational Systems Biotechnology (AVT.CSB), RWTH Aachen University, 52062 Aachen, Germany
- Johnjoe McFadden
- Department of Microbial and Cellular Sciences, University of Surrey, GU2 7XH Guildford, United Kingdom
- Katharina Nöh
- Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany.
2
Loskot P. A query-response causal analysis of reaction events in biochemical reaction networks. Comput Biol Chem 2024; 108:107995. [PMID: 38039799 DOI: 10.1016/j.compbiolchem.2023.107995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Revised: 11/16/2023] [Accepted: 11/27/2023] [Indexed: 12/03/2023]
Abstract
The stochastic kinetics of biochemical reaction networks is described by a chemical master equation (CME) and the underlying laws of mass action. Assuming network-free simulations of the rule-based models of biochemical reaction networks (BRNs), this paper departs from the usual analysis of network dynamics as the time-dependent distributions of chemical species counts, and instead considers statistically evaluating the sequences of reaction events generated from the stochastic simulations. The reaction event-time series can be used for reaction clustering, identifying rare events, and recognizing the periods of increased or steady-state activity. However, the main aim of this paper is to devise an effective method for identifying causally and anti-causally related sub-sequences of reaction events using their empirical probabilities. This allows discovering some of the causal dynamics of BRNs as well as uncovering their short-term deterministic behaviors. In particular, it is proposed that the reaction sub-sequences that are conditionally nearly certain or nearly uncertain can be considered as being causally related. Moreover, since the time-ordering of reaction events is locally irrelevant, the reaction sub-sequences can be transformed into the reaction sets or multi-sets. The distance metrics can be then used to define the equivalences among the reaction events. The proposed method for identifying the causally related reaction sub-sequences has been implemented as a computationally efficient query-response mechanism. The method was evaluated for five models of genetic networks in seven defined numerical experiments. The models were simulated in BioNetGen using the open-source network-free simulator NFsim. This simulator had to be modified first to allow recording the traces of reaction events, and it is available in the GitHub repository, ploskot/nfsim_1.20. The generated event time-series were analyzed with Python and Matlab scripts. The whole process of data generation, analysis and visualization has been nearly fully automated using shell scripts. This demonstrates the opportunities for substantially increasing the research productivity by creating automated data generation and processing pipelines.
Affiliation(s)
- Pavel Loskot
- ZJU-UIUC Institute, 314400, Haining, Zhejiang, China.
3
Chastain E. Formal autopoiesis: Solutions of the classical and extended functional closure equations. Biosystems 2023; 226:104872. [PMID: 36921792 DOI: 10.1016/j.biosystems.2023.104872] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Revised: 02/27/2023] [Accepted: 02/28/2023] [Indexed: 03/16/2023]
Abstract
Formalization of autopoiesis is an ongoing effort among theoretical biologists. In this field, Letelier and co-authors proposed that Robert Rosen's (M,R)-systems theory be used as a formalism for autopoiesis. In (M,R)-systems theory, Rosen proposes that one solve a set of functional closure equations (FCEs) which account for all of the components of the system as coming from within the system itself. A key part of the functional closure equations is the repair of the metabolism component of the system. Rosen's theory gives the organizational closure of the components as well as their products, as found in autopoiesis. However, according to Razeto-Barry (M,R)-systems leaves out some of the messiness and approximation that we find in autopoiesis as he reformulates it. A related problem is that though FCEs have a long history, they are difficult in practice to solve due to their mathematical formulation. In this paper we give a novel exact solution for the FCEs for continuous real vector-valued functions which is nevertheless difficult to compute. In addition we propose an extended form of FCEs which both captures more of the messiness of autopoiesis and also helps to make the FCEs more solvable. Finally, we use our solution for the extended FCEs to give an extended repair function for a metabolism taken from a representative class of biological dynamics for gene expression (the repressilator). More generally we show that one can use our solution for the extended FCEs to get an extended repair function for continuous real vector-valued functions.
Affiliation(s)
- Erick Chastain
- Departments of Mathematics and Computer Science, University of Dallas, United States of America.
4
Mao G, Zeng R, Peng J, Zuo K, Pang Z, Liu J. Reconstructing gene regulatory networks of biological function using differential equations of multilayer perceptrons. BMC Bioinformatics 2022; 23:503. [PMID: 36434499 PMCID: PMC9700916 DOI: 10.1186/s12859-022-05055-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2022] [Accepted: 11/14/2022] [Indexed: 11/26/2022] Open
Abstract
BACKGROUND Building biological networks with a certain function is a challenge in systems biology. For the functionality of small (less than ten nodes) biological networks, most methods are implemented by exhausting all possible network topological spaces. This exhaustive approach is difficult to scale to large-scale biological networks. And regulatory relationships are complex and often nonlinear or non-monotonic, which makes inference using linear models challenging. RESULTS In this paper, we propose a multi-layer perceptron-based differential equation method, which operates by training a fully connected neural network (NN) to simulate the transcription rate of genes in traditional differential equations. We verify whether the regulatory network constructed by the NN method can continue to achieve the expected biological function by verifying the degree of overlap between the regulatory network discovered by NN and the regulatory network constructed by the Hill function. And we validate our approach by adapting to noise signals, regulator knockout, and constructing large-scale gene regulatory networks using link-knockout techniques. We apply a real dataset (the mesoderm inducer Xenopus Brachyury expression) to construct the core topology of the gene regulatory network and find that Xbra is only strongly expressed at moderate levels of activin signaling. CONCLUSION We have demonstrated from the results that this method has the ability to identify the underlying network topology and functional mechanisms, and can also be applied to larger and more complex gene network topologies.
Affiliation(s)
- Guo Mao
- Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Deya Road, Changsha, 410073 China
- Ruigeng Zeng
- Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Deya Road, Changsha, 410073 China
- Jintao Peng
- Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Deya Road, Changsha, 410073 China
- Ke Zuo
- Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Deya Road, Changsha, 410073 China
- Zhengbin Pang
- Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Deya Road, Changsha, 410073 China
- Jie Liu
- Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Deya Road, Changsha, 410073 China; Laboratory of Software Engineering for Complex System, National University of Defense Technology, Deya Road, Changsha, 410073 China
5
Liu W, Sun X, Yang L, Li K, Yang Y, Fu X. NSCGRN: a network structure control method for gene regulatory network inference. Brief Bioinform 2022; 23:6585392. [PMID: 35554485 DOI: 10.1093/bib/bbac156] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 01/31/2022] [Revised: 03/27/2022] [Accepted: 04/06/2022] [Indexed: 01/18/2023] Open
Abstract
Accurate inference of gene regulatory networks (GRNs) is an essential premise for understanding pathogenesis and curing diseases. Various computational methods have been developed for GRN inference, but the identification of redundant regulation remains a challenge faced by researchers. Although combining global and local topology can identify and reduce redundant regulations, the topologies' specific forms and cooperation modes are unclear and real regulations may be sacrificed. Here, we propose a network structure control method [network-structure-controlling-based GRN inference method (NSCGRN)] that stipulates the global and local topology's specific forms and cooperation mode. The method is carried out in a cooperative mode of 'global topology dominates and local topology refines'. Global topology requires layering and sparseness of the network, and local topology requires consistency of the subgraph association pattern with the network motifs (fan-in, fan-out, cascade and feedforward loop). Specifically, an ordered gene list is obtained by network topology centrality sorting. A Bernaola-Galvan mutation detection algorithm applied to the list gives the hierarchy of GRNs to control the upstream and downstream regulations within the global scope. Finally, four network motifs are integrated into the hierarchy to optimize local complex regulations and form a cooperative mode where global and local topologies play the dominant and refined roles, respectively. NSCGRN is compared with state-of-the-art methods on three different datasets (six networks in total), and it achieves the highest F1 and Matthews correlation coefficient. Experimental results show its unique advantages in GRN inference.
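The two evaluation scores named in this abstract, F1 and the Matthews correlation coefficient (MCC), can be computed from edge-level confusion counts. The sketch below uses the standard definitions; the edge-set representation, the helper names, and the toy three-gene network are assumptions for illustration and are not taken from the paper.

```python
import math

def confusion_counts(true_edges, predicted_edges, candidate_edges):
    """Edge-level confusion counts; both edge sets must lie in candidate_edges."""
    tp = len(true_edges & predicted_edges)
    fp = len(predicted_edges - true_edges)
    fn = len(true_edges - predicted_edges)
    tn = len(candidate_edges) - tp - fp - fn
    return tp, fp, fn, tn

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical toy network on three genes; directed edges, no self-loops.
genes = ["A", "B", "C"]
candidates = {(u, v) for u in genes for v in genes if u != v}
true_edges = {("A", "B"), ("B", "C")}
predicted_edges = {("A", "B"), ("A", "C")}

tp, fp, fn, tn = confusion_counts(true_edges, predicted_edges, candidates)
print(f1_score(tp, fp, fn), mcc(tp, fp, fn, tn))
```

Unlike F1, MCC also accounts for true negatives, which matters for sparse networks where most candidate edges are absent.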
Affiliation(s)
- Wei Liu
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China; School of Computer Science, Xiangtan University, Xiangtan, 411105, China
- Xingen Sun
- School of Computer Science, Xiangtan University, Xiangtan, 411105, China; Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
- Li Yang
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
- Kaiwen Li
- Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
- Yu Yang
- School of Computer Science, Xiangtan University, Xiangtan, 411105, China; Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
- Xiangzheng Fu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410000, China
6
7
Gene regulatory network inference from sparsely sampled noisy data. Nat Commun 2020; 11:3493. [PMID: 32661225 PMCID: PMC7359369 DOI: 10.1038/s41467-020-17217-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 10/07/2019] [Accepted: 06/11/2020] [Indexed: 12/16/2022] Open
Abstract
The complexity of biological systems is encoded in gene regulatory networks. Unravelling this intricate web is a fundamental step in understanding the mechanisms of life and eventually developing efficient therapies to treat and cure diseases. The major obstacle in inferring gene regulatory networks is the lack of data. While time series data are nowadays widely available, they are typically noisy, with low sampling frequency and overall small number of samples. This paper develops a method called BINGO to specifically deal with these issues. Benchmarked with both real and simulated time-series data covering many different gene regulatory networks, BINGO clearly and consistently outperforms state-of-the-art methods. The novelty of BINGO lies in a nonparametric approach featuring statistical sampling of continuous gene expression profiles. BINGO's superior performance and ease of use, even by non-specialists, make gene regulatory network inference available to any researcher, helping to decipher the complex mechanisms of life.
8
Invergo BM, Petursson B, Akhtar N, Bradley D, Giudice G, Hijazi M, Cutillas P, Petsalaki E, Beltrao P. Prediction of Signed Protein Kinase Regulatory Circuits. Cell Syst 2020; 10:384-396.e9. [DOI: 10.1016/j.cels.2020.04.005] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 07/15/2019] [Revised: 01/24/2020] [Accepted: 04/20/2020] [Indexed: 01/18/2023]
9
Shafiee Kamalabad M, Grzegorczyk M. Non-homogeneous dynamic Bayesian networks with edge-wise sequentially coupled parameters. Bioinformatics 2020; 36:1198-1207. [PMID: 31504191 PMCID: PMC7703764 DOI: 10.1093/bioinformatics/btz690] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/08/2019] [Revised: 08/02/2019] [Accepted: 09/02/2019] [Indexed: 01/09/2023] Open
Abstract
MOTIVATION Non-homogeneous dynamic Bayesian networks (NH-DBNs) are a popular tool for learning networks with time-varying interaction parameters. A multiple changepoint process is used to divide the data into disjoint segments and the network interaction parameters are assumed to be segment-specific. The objective is to infer the network structure along with the segmentation and the segment-specific parameters from the data. The conventional (uncoupled) NH-DBNs do not allow for information exchange among segments, and the interaction parameters have to be learned separately for each segment. More advanced coupled NH-DBN models allow the interaction parameters to vary but enforce them to stay similar over time. As the enforced similarity of the network parameters can have counter-productive effects, we propose a new consensus NH-DBN model that combines features of the uncoupled and the coupled NH-DBN. The new model infers for each individual edge whether its interaction parameter stays similar over time (and should be coupled) or if it changes from segment to segment (and should stay uncoupled). RESULTS Our new model yields higher network reconstruction accuracies than state-of-the-art models for synthetic and yeast network data. For gene expression data from A. thaliana our new model infers a plausible network topology and yields hypotheses about the light-dependencies of the gene interactions. AVAILABILITY AND IMPLEMENTATION Data are available from earlier publications. Matlab code is available at Bioinformatics online. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mahdi Shafiee Kamalabad
- Bernoulli Institute, Department of Mathematics, Faculty of Science and Engineering, Groningen University, Groningen 9747 AG, The Netherlands
- Marco Grzegorczyk
- Bernoulli Institute, Department of Mathematics, Faculty of Science and Engineering, Groningen University, Groningen 9747 AG, The Netherlands
10
Pfister N, Bauer S, Peters J. Learning stable and predictive structures in kinetic systems. Proc Natl Acad Sci U S A 2019; 116:25405-25411. [PMID: 31776252 PMCID: PMC6925987 DOI: 10.1073/pnas.1905688116] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 11/18/2022] Open
Abstract
Learning kinetic systems from data is one of the core challenges in many fields. Identifying stable models is essential for the generalization capabilities of data-driven inference. We introduce a computationally efficient framework, called CausalKinetiX, that identifies structure from discrete time, noisy observations, generated from heterogeneous experiments. The algorithm assumes the existence of an underlying, invariant kinetic model, a key criterion for reproducible research. Results on both simulated and real-world examples suggest that learning the structure of kinetic systems benefits from a causal perspective. The identified variables and models allow for a concise description of the dynamics across multiple experimental settings and can be used for prediction in unseen experiments. We observe significant improvements compared to well-established approaches focusing solely on predictive performance, especially for out-of-sample generalization.
Affiliation(s)
- Niklas Pfister
- Seminar for Statistics, Eidgenössische Technische Hochschule Zürich, 8092 Zürich, Switzerland
- Stefan Bauer
- Empirical Inference, Max-Planck-Institute for Intelligent Systems, 72076 Tübingen, Germany
- Jonas Peters
- Department of Mathematical Sciences, University of Copenhagen, 2100 Copenhagen, Denmark
11
Theorell A, Nöh K. Reversible jump MCMC for multi-model inference in Metabolic Flux Analysis. Bioinformatics 2019; 36:232-240. [DOI: 10.1093/bioinformatics/btz500] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 10/18/2018] [Revised: 03/15/2019] [Accepted: 06/12/2019] [Indexed: 02/06/2023] Open
Abstract
Motivation
The validity of model based inference, as used in systems biology, depends on the underlying model formulation. Often, a vast number of competing models is available, that are built on different assumptions, all consistent with the existing knowledge about the studied biological phenomenon. As a remedy for this, Bayesian Model Averaging (BMA) facilitates parameter and structural inferences based on multiple models simultaneously. However, in fields where a vast number of alternative, high-dimensional and non-linear models are involved, the BMA-based inference task is computationally very challenging.
Results
Here we use BMA in the complex setting of Metabolic Flux Analysis (MFA) to infer whether potentially reversible reactions proceed uni- or bidirectionally, using 13C labeling data and metabolic networks. BMA is applied on a large set of candidate models with differing directionality settings, using a tailored multi-model Markov Chain Monte Carlo (MCMC) approach. The applicability of our algorithm is shown by inferring the in vivo probability of reaction bidirectionalities in a realistic network setup, thereby extending the scope of 13C MFA from parameter to structural inference.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Axel Theorell
- Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, Jülich 52428, Germany
- Katharina Nöh
- Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, Jülich 52428, Germany
12
Grzegorczyk M, Aderhold A, Husmeier D. Overview and Evaluation of Recent Methods for Statistical Inference of Gene Regulatory Networks from Time Series Data. Methods Mol Biol 2019; 1883:49-94. [PMID: 30547396 DOI: 10.1007/978-1-4939-8882-2_3] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 02/14/2023]
Abstract
A challenging problem in systems biology is the reconstruction of gene regulatory networks from postgenomic data. A variety of reverse engineering methods from machine learning and computational statistics have been proposed in the literature. However, deciding on the best method to adopt for a particular application or data set might be a confusing task. The present chapter provides a broad overview of state-of-the-art methods with an emphasis on conceptual understanding rather than a deluge of mathematical details, and the pros and cons of the various approaches are discussed. Guidance on practical applications with pointers to publicly available software implementations is included. The chapter concludes with a comprehensive comparative benchmark study on simulated data and a real-world application taken from current plant systems biology.
Affiliation(s)
- Marco Grzegorczyk
- Johann Bernoulli Institute, University of Groningen, Groningen, The Netherlands
- Andrej Aderhold
- Center for Computer Science, Universidade Federal do Rio Grande, Rio Grande, Brazil
- Dirk Husmeier
- School of Mathematics and Statistics, University of Glasgow, Glasgow, UK.
13
Leguia MG, Martínez CGB, Malvestio I, Campo AT, Rocamora R, Levnajić Z, Andrzejak RG. Inferring directed networks using a rank-based connectivity measure. Phys Rev E 2019; 99:012319. [PMID: 30780311 DOI: 10.1103/physreve.99.012319] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 07/29/2018] [Indexed: 11/07/2022]
Abstract
Inferring the topology of a network using the knowledge of the signals of each of the interacting units is key to understanding real-world systems. One way to address this problem is using data-driven methods like cross-correlation or mutual information. However, these measures lack the ability to distinguish the direction of coupling. Here, we use a rank-based nonlinear interdependence measure originally developed for pairs of signals. This measure not only allows one to measure the strength but also the direction of the coupling. Our results for a system of coupled Lorenz dynamics show that we are able to consistently infer the underlying network for a subrange of the coupling strength and link density. Furthermore, we report that the addition of dynamical noise can benefit the reconstruction. Finally, we show an application to multichannel electroencephalographic recordings from an epilepsy patient.
Affiliation(s)
- Marc G Leguia
- Faculty of Information Studies, 8000 Novo Mesto, Slovenia; Department of Communication and Information Technologies, Universitat Pompeu Fabra, 08018 Barcelona, Spain
- Cristina G B Martínez
- Department of Communication and Information Technologies, Universitat Pompeu Fabra, 08018 Barcelona, Spain
- Irene Malvestio
- Department of Communication and Information Technologies, Universitat Pompeu Fabra, 08018 Barcelona, Spain; Department of Physics and Astronomy, University of Florence, 50119 Sesto Fiorentino, Italy; Institute for Complex Systems, CNR, 50119 Sesto Fiorentino, Italy
- Adrià Tauste Campo
- Center for Brain and Cognition, Department of Information and Communication Technologies, Universitat Pompeu Fabra, 08018 Barcelona, Spain; Epilepsy Unit, Department of Neurology, IMIM Hospital del Mar, Universitat Pompeu Fabra, 08003 Barcelona, Spain; Barcelonaβeta Brain Research Center, Pasqual Maragall Foundation, 08005 Barcelona, Spain
- Rodrigo Rocamora
- Epilepsy Unit, Department of Neurology, IMIM Hospital del Mar, Universitat Pompeu Fabra, 08003 Barcelona, Spain; Faculty of Health and Life Sciences, Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Zoran Levnajić
- Faculty of Information Studies, 8000 Novo Mesto, Slovenia; Institute Jozef Stefan, 1000 Ljubljana, Slovenia
- Ralph G Andrzejak
- Department of Communication and Information Technologies, Universitat Pompeu Fabra, 08018 Barcelona, Spain; Institute for Bioengineering of Catalonia (IBEC), The Barcelona Institute of Science and Technology, Baldiri Reixac 10-12, 08028 Barcelona, Spain
14
Dondelinger F, Mukherjee S. Statistical Network Inference for Time-Varying Molecular Data with Dynamic Bayesian Networks. Methods Mol Biol 2019; 1883:25-48. [PMID: 30547395 DOI: 10.1007/978-1-4939-8882-2_2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 03/24/2023]
Abstract
In this chapter, we review the problem of network inference from time-course data, focusing on a class of graphical models known as dynamic Bayesian networks (DBNs). We discuss the relationship of DBNs to models based on ordinary differential equations, and consider extensions to nonlinear time dynamics. We provide an introduction to time-varying DBN models, which allow for changes to the network structure and parameters over time. We also discuss causal perspectives on network inference, including issues around model semantics that can arise due to missing variables. We present a case study of applying time-varying DBNs to gene expression measurements over the life cycle of Drosophila melanogaster. We finish with a discussion of future perspectives, including possible applications of time-varying network inference to single-cell gene expression data.
Affiliation(s)
- Sach Mukherjee
- German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany
15
Causal Queries from Observational Data in Biological Systems via Bayesian Networks: An Empirical Study in Small Networks. Methods Mol Biol 2018. [PMID: 30547398 DOI: 10.1007/978-1-4939-8882-2_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2023]
Abstract
Biological networks are a very convenient modeling and visualization tool to discover knowledge from modern high-throughput genomics and post-genomics data sets. Indeed, biological entities are not isolated but are components of complex multilevel systems. We go one step further and advocate for the consideration of causal representations of the interactions in living systems. We present the causal formalism and bring it out in the context of biological networks, when the data is observational. We also discuss its ability to decipher the causal information flow as observed in gene expression. We also illustrate our exploration by experiments on small simulated networks as well as on a real biological data set.
16
Reconstructing phosphorylation signalling networks from quantitative phosphoproteomic data. Essays Biochem 2018; 62:525-534. [PMID: 30072490 PMCID: PMC6204553 DOI: 10.1042/ebc20180019] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 05/02/2018] [Revised: 06/25/2018] [Accepted: 06/26/2018] [Indexed: 12/25/2022]
Abstract
Cascades of phosphorylation between protein kinases comprise a core mechanism in the integration and propagation of intracellular signals. Although we have accumulated a wealth of knowledge around some such pathways, this is subject to study biases and much remains to be uncovered. Phosphoproteomics, the identification and quantification of phosphorylated proteins on a proteomic scale, provides a high-throughput means of interrogating the state of intracellular phosphorylation, both at the pathway level and at the whole-cell level. In this review, we discuss methods for using human quantitative phosphoproteomic data to reconstruct the underlying signalling networks that generated it. We address several challenges imposed by the data on such analyses and we consider promising advances towards reconstructing unbiased, kinome-scale signalling networks.
17
Abstract
Analysis of Rosen's theories with a focus on their mathematical content. Provides links of Rosen's work with most recent research into mathematical modelling. Possible implementations of ‘closure to efficient causation’ in models are discussed. Critical analysis of Rosen's use of category theory.
The theoretical biologist Robert Rosen developed a highly original approach for investigating the question “What is life?”, the most fundamental problem of biology. Considering that Rosen made extensive use of mathematics it might seem surprising that his ideas have only rarely been implemented in mathematical models. On the one hand, Rosen propagates relational models that neglect underlying structural details of the components and focus on relationships between the elements of a biological system, according to the motto “throw away the physics, keep the organisation”. Rosen's strong rejection of mechanistic models that he implicitly associates with a strong form of reductionism might have deterred mathematical modellers from adopting his ideas for their own work. On the other hand Rosen's presentation of his modelling framework, (M, R) systems, is highly abstract which makes it hard to appreciate how this approach could be applied to concrete biological problems. In this article, both the mathematics as well as those aspects of Rosen's work are analysed that relate to his philosophical ideas. It is shown that Rosen's relational models are a particular type of mechanistic model with specific underlying assumptions rather than a fundamentally different approach that excludes mechanistic models. The strengths and weaknesses of relational models are investigated by comparison with current network biology literature. Finally, it is argued that Rosen's definition of life, “organisms are closed to efficient causation”, should be considered as a hypothesis to be tested and ideas how this postulate could be implemented in mathematical models are presented.
|
18
|
Inferring a nonlinear biochemical network model from a heterogeneous single-cell time course data. Sci Rep 2018; 8:6790. [PMID: 29717206 PMCID: PMC5931614 DOI: 10.1038/s41598-018-25064-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 10/10/2017] [Accepted: 04/09/2018] [Indexed: 12/30/2022] Open
Abstract
Mathematical modeling and analysis of biochemical reaction networks are key routines in computational systems biology and biophysics; however, it remains difficult to choose the most valid model. Here, we propose a computational framework for data-driven and systematic inference of a nonlinear biochemical network model. The framework is based on the expectation-maximization algorithm combined with particle smoother and sparse regularization techniques. In this method, a “redundant” model consisting of an excessive number of nodes and regulatory paths is iteratively updated by eliminating unnecessary paths, resulting in an inference of the most likely model. Using artificial single-cell time-course data showing heterogeneous oscillatory behaviors, we demonstrated that this algorithm successfully inferred the true network without any prior knowledge of network topology or parameter values. Furthermore, we showed that both the regulatory paths among nodes and the optimal number of nodes in the network could be systematically determined. The method presented in this study provides a general framework for inferring a nonlinear biochemical network model from heterogeneous single-cell time-course data.
|
19
|
Choi K. Robust Approaches to Generating Reliable Predictive Models in Systems Biology. RNA TECHNOLOGIES 2018. [DOI: 10.1007/978-3-319-92967-5_15] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/14/2022]
|
20
|
Babtie AC, Stumpf MPH. How to deal with parameters for whole-cell modelling. J R Soc Interface 2017; 14:20170237. [PMID: 28768879 PMCID: PMC5582120 DOI: 10.1098/rsif.2017.0237] [Citation(s) in RCA: 52] [Impact Index Per Article: 7.4] [Received: 03/30/2017] [Accepted: 06/22/2017] [Indexed: 11/12/2022] Open
Abstract
Dynamical systems describing whole cells are on the verge of becoming a reality. But as models of reality, they are only useful if we have realistic parameters for the molecular reaction rates and cell physiological processes. There is currently no suitable framework to reliably estimate hundreds, let alone thousands, of reaction rate parameters. Here, we map out the relative weaknesses and promises of different approaches aimed at redressing this issue. While suitable procedures for estimation or inference of the whole (vast) set of parameters will, in all likelihood, remain elusive, some hope can be drawn from the fact that much of the cellular behaviour may be explained in terms of smaller sets of parameters. Identifying such parameter sets and assessing their behaviour is now becoming possible even for very large systems of equations, and we expect such methods to become central tools in the development and analysis of whole-cell models.
Affiliation(s)
- Ann C Babtie
- Department of Life Sciences, Imperial College London, London, UK
|
21
|
Fribourg M, Logothetis DE, González-Maeso J, Sealfon SC, Galocha-Iragüen B, Las-Heras Andrés F, Brezina V. Elucidation of molecular kinetic schemes from macroscopic traces using system identification. PLoS Comput Biol 2017; 13:e1005376. [PMID: 28192423 PMCID: PMC5330533 DOI: 10.1371/journal.pcbi.1005376] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/10/2016] [Revised: 02/28/2017] [Accepted: 01/21/2017] [Indexed: 12/28/2022] Open
Abstract
Overall cellular responses to biologically-relevant stimuli are mediated by networks of simpler lower-level processes. Although information about some of these processes can now be obtained by visualizing and recording events at the molecular level, this is still possible only in especially favorable cases. Therefore the development of methods to extract the dynamics and relationships between the different lower-level (microscopic) processes from the overall (macroscopic) response remains a crucial challenge in the understanding of many aspects of physiology. Here we have devised a hybrid computational-analytical method to accomplish this task, the SYStems-based MOLecular kinetic scheme Extractor (SYSMOLE). SYSMOLE utilizes system-identification input-output analysis to obtain a transfer function between the stimulus and the overall cellular response in the Laplace-transformed domain. It then derives a Markov-chain state molecular kinetic scheme uniquely associated with the transfer function by means of a classification procedure and an analytical step that imposes general biological constraints. We first tested SYSMOLE with synthetic data and evaluated its performance in terms of its rate of convergence to the correct molecular kinetic scheme and its robustness to noise. We then examined its performance on real experimental traces by analyzing macroscopic calcium-current traces elicited by membrane depolarization. SYSMOLE derived the correct, previously known molecular kinetic scheme describing the activation and inactivation of the underlying calcium channels and correctly identified the accepted mechanism of action of nifedipine, a calcium-channel blocker clinically used in patients with cardiovascular disease. Finally, we applied SYSMOLE to study the pharmacology of a new class of glutamate antipsychotic drugs and their crosstalk mechanism through a heteromeric complex of G protein-coupled receptors. 
Our results indicate that our methodology can be successfully applied to accurately derive molecular kinetic schemes from experimental macroscopic traces, and we anticipate that it may be useful in the study of a wide variety of biological systems.
Unraveling the lower-level (microscopic) processes underlying the overall (macroscopic) cell response to a given stimulus is a challenging problem in cell physiology. This has been a classic problem in biophysics, where the ability to record the activity of single ion channels that generate a macroscopic ion current has allowed a measure of direct access to the underlying microscopic processes. These classic studies have demonstrated that very different groupings of the microscopic processes can yield extremely similar macroscopic responses. Biologists in fields other than biophysics are frequently confronted with the same macroscopic-to-microscopic problem, usually, however, without any direct access to the microscopic processes. Thus, the development of computational methods to deduce from the available macroscopic measurements the nature of the underlying microscopic processes can be expected to substantially advance the study of many areas of cell physiology. Toward that aim, here we have derived and tested a hybrid computational-analytical method to extract information about the microscopic processes that is hidden in macroscopic experimental traces. Our method is independent of the particular system under study, and thus can be applied to new as well as previously-recorded macroscopic traces obtained in a wide variety of biological systems.
Affiliation(s)
- Miguel Fribourg
- Department of Neurology and Center for Translational Systems Biology, Icahn School Of Medicine at Mount Sinai, New York, New York, United States of America
- Diomedes E. Logothetis
- Department of Pharmaceutical Sciences, Northeastern University, Boston, Massachusetts, United States of America
- Javier González-Maeso
- Department of Physiology and Biophysics, Virginia Commonwealth University School of Medicine, Richmond, Virginia, United States of America
- Stuart C. Sealfon
- Department of Neurology and Center for Translational Systems Biology, Icahn School Of Medicine at Mount Sinai, New York, New York, United States of America
- Belén Galocha-Iragüen
- Department of Signals Systems and Radiocommunications, Universidad Politécnica de Madrid, Madrid, Spain
- Vladimir Brezina
- Department of Neuroscience, Icahn School of Medicine, New York, New York, United States of America
|
22
|
Varga M, Prokop A, Csukas B. Biosystem models, generated from a complex rule/reaction/influence network and from two functionality prototypes. Biosystems 2017; 152:24-43. [PMID: 28062323 DOI: 10.1016/j.biosystems.2016.12.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 07/28/2016] [Accepted: 12/23/2016] [Indexed: 12/24/2022]
Abstract
In this work we have further developed the Direct Computer Mapping (DCM) based modelling and simulation methodology. A unified, transition-based representation of complex rule, reaction and influence networks has been introduced and two prototypes (one general state- and another general transition-prototype) have been developed for the unified functional modelling of the state and transition nodes. Starting from the network and from the functional prototypes, an automatic generation method of the graphically editable and extensible GraphML description of biosystem models has been elaborated. The new developments have been implemented in the improved kernel of DCM models. The applied knowledge representation makes possible the unified generation and execution of the balance-based quantitative and influence- or rule-based qualitative, as well as optionally time-driven, multiscale biosystem models. Application of the developed methodology has been illustrated by the improved implementation of the formerly studied and upgraded example biosystem model for combining the detailed, quantitative p53/miR34a signalling system with the pathological model through an extended rule-based coupling model.
Affiliation(s)
- M Varga
- Research Group on Process Network Engineering, Kaposvar University, 40 Guba S, 7400, Kaposvar, Hungary.
- A Prokop
- Department of Chemical & Biomolecular Engineering, Vanderbilt University, Nashville, TN, USA
- B Csukas
- Research Group on Process Network Engineering, Kaposvar University, 40 Guba S, 7400, Kaposvar, Hungary
|
23
|
Abstract
Molecular profiling of proteins and phosphoproteins using a reverse phase protein array (RPPA) platform, with a panel of target-specific antibodies, enables the parallel, quantitative proteomic analysis of many biological samples in a microarray format. Hence, RPPA analysis can generate a high volume of multidimensional data that must be effectively interrogated and interpreted. A range of computational techniques for data mining can be applied to detect and explore data structure and to form functional predictions from large datasets. Here, two approaches for the computational analysis of RPPA data are detailed: the identification of similar patterns of protein expression by hierarchical cluster analysis and the modeling of protein interactions and signaling relationships by network analysis. The protocols use freely available, cross-platform software, are easy to implement, and do not require any programming expertise. Serving as data-driven starting points for further in-depth analysis, validation, and biological experimentation, these and related bioinformatic approaches can accelerate the functional interpretation of RPPA data.
Affiliation(s)
- Adam Byron
- Cancer Research UK Edinburgh Centre, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, EH4 2XR, UK.
|
24
|
Klimovskaia A, Ganscha S, Claassen M. Sparse Regression Based Structure Learning of Stochastic Reaction Networks from Single Cell Snapshot Time Series. PLoS Comput Biol 2016; 12:e1005234. [PMID: 27923064 PMCID: PMC5140059 DOI: 10.1371/journal.pcbi.1005234] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 06/08/2016] [Accepted: 11/02/2016] [Indexed: 11/29/2022] Open
Abstract
Stochastic chemical reaction networks constitute a model class to quantitatively describe dynamics and cell-to-cell variability in biological systems. The topology of these networks typically is only partially characterized due to experimental limitations. Current approaches for refining network topology are based on the explicit enumeration of alternative topologies and are therefore restricted to small problem instances with almost complete knowledge. We propose the reactionet lasso, a computational procedure that derives a stepwise sparse regression approach on the basis of the Chemical Master Equation, enabling large-scale structure learning for reaction networks by implicitly accounting for billions of topology variants. We have assessed the structure learning capabilities of the reactionet lasso on synthetic data for the complete TRAIL-induced apoptosis signaling cascade comprising 70 reactions. We find that the reactionet lasso is able to efficiently recover the structure of these reaction systems, ab initio, with high sensitivity and specificity. With only < 1% false discoveries, the reactionet lasso is able to recover 45% of all true reactions ab initio among > 6000 possible reactions and over 10^2000 network topologies. In conjunction with information-rich single cell technologies such as single cell RNA sequencing or mass cytometry, the reactionet lasso will enable large-scale structure learning, particularly in areas with partial network structure knowledge, such as cancer biology, and thereby enable the detection of pathological alterations of reaction networks. We provide software to allow for wide applicability of the reactionet lasso.
Virtually all biological processes are driven by biochemical reactions. However, their quantitative description in terms of stochastic chemical reaction networks is often precluded by the computational difficulty of structure learning, i.e. the identification of biologically active reaction networks among the combinatorially many possible topologies. This work describes the reactionet lasso, a structure learning approach that takes advantage of novel, information-rich single cell data and a tractable problem formulation to achieve structure learning for problem instances hundreds of orders of magnitude larger than previously reported. This approach opens the prospect of obtaining quantitative and predictive reaction models in many areas of biology and medicine, and in particular areas such as cancer biology, which are characterized by significant system alterations and many unknown reactions.
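The sparse-regression idea behind this kind of structure learning can be illustrated with a plain lasso solved by coordinate descent. The sketch below is a generic lasso, not the reactionet lasso procedure itself; the design matrix, response values, penalty `lam`, and function names are hypothetical and chosen only so the result is easy to check by hand.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; return 0.0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, sweeps=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0  # correlation of column j with the residual excluding column j
            z = 0.0    # squared norm of column j
            for i in range(n):
                partial = sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            b[j] = soft_threshold(rho / n, lam) / (z / n)
    return b


# Hypothetical toy problem: three candidate "reactions" (columns), four observations.
X = [[1, 1, 1],
     [1, -1, 1],
     [1, 1, -1],
     [1, -1, -1]]
y = [2.1, 2.1, 1.9, 1.9]  # generated as 2.0*column0 + 0.1*column2
coefficients = lasso_coordinate_descent(X, y, lam=0.5)
print(coefficients)
```

With this penalty the weak third coefficient is shrunk to zero, so only the first candidate column stays in the model. This is the sense in which a sparse penalty selects a small set of terms without enumerating every candidate topology.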
Affiliation(s)
- Anna Klimovskaia
- Institute for Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Zurich, Switzerland
- Life Science Zurich Graduate School, Zurich, Switzerland
- Stefan Ganscha
- Institute for Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Zurich, Switzerland
- Life Science Zurich Graduate School, Zurich, Switzerland
- Manfred Claassen
- Institute for Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Zurich, Switzerland
|
25
|
McGoff KA, Guo X, Deckard A, Kelliher CM, Leman AR, Francey LJ, Hogenesch JB, Haase SB, Harer JL. The Local Edge Machine: inference of dynamic models of gene regulation. Genome Biol 2016; 17:214. [PMID: 27760556 PMCID: PMC5072315 DOI: 10.1186/s13059-016-1076-z] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 06/13/2016] [Accepted: 10/03/2016] [Indexed: 12/31/2022] Open
Abstract
We present a novel approach, the Local Edge Machine, for the inference of regulatory interactions directly from time-series gene expression data. We demonstrate its performance, robustness, and scalability on in silico datasets with varying behaviors, sizes, and degrees of complexity. Moreover, we demonstrate its ability to incorporate biological prior information and make informative predictions on a well-characterized in vivo system using data from budding yeast that have been synchronized in the cell cycle. Finally, we use an atlas of transcription data in a mammalian circadian system to illustrate how the method can be used for discovery in the context of large complex networks.
Affiliation(s)
- Kevin A McGoff
- Department of Mathematics and Statistics, UNC Charlotte, 9201 University City Blvd., Charlotte, 28269, NC, USA.
- Xin Guo
- Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, China
- Adam R Leman
- Department of Biology, Duke University, Durham, NC, USA
- Lauren J Francey
- Department of Molecular and Cellular Physiology, University of Cincinnati, Cincinnati, OH, USA
- John B Hogenesch
- Department of Molecular and Cellular Physiology, University of Cincinnati, Cincinnati, OH, USA
- John L Harer
- Department of Mathematics, Duke University, Durham, NC, USA
|
26
|
Aderhold A, Husmeier D, Grzegorczyk M. Approximate Bayesian inference in semi-mechanistic models. STATISTICS AND COMPUTING 2016; 27:1003-1040. [PMID: 32226236 PMCID: PMC7089672 DOI: 10.1007/s11222-016-9668-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 05/11/2015] [Accepted: 05/05/2016] [Indexed: 06/10/2023]
Abstract
Inference of interaction networks represented by systems of differential equations is a challenging problem in many scientific disciplines. In the present article, we follow a semi-mechanistic modelling approach based on gradient matching. We investigate the extent to which key factors, including the kinetic model, statistical formulation and numerical methods, impact upon performance at network reconstruction. We emphasize general lessons for computational statisticians when faced with the challenge of model selection, and we assess the accuracy of various alternative paradigms, including recent widely applicable information criteria and different numerical procedures for approximating Bayes factors. We conduct the comparative evaluation with a novel inferential pipeline that systematically disambiguates confounding factors via an ANOVA scheme.
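Gradient matching, as a general technique, avoids solving the differential equations numerically: derivatives are estimated from the data and the model's right-hand side is fitted to them. The following is a minimal sketch under strong assumptions: a one-parameter decay model dx/dt = -k*x, noise-free hypothetical data, and central finite differences standing in for whatever smoothing scheme the authors use. All function names are illustrative.

```python
import math


def central_differences(values, dt):
    """Estimate derivatives at the interior points of an evenly sampled series."""
    return [(values[i + 1] - values[i - 1]) / (2.0 * dt)
            for i in range(1, len(values) - 1)]


def fit_decay_rate(times, values):
    """Least-squares k for dx/dt = -k*x, matching model and estimated gradients."""
    dt = times[1] - times[0]
    slopes = central_differences(values, dt)
    states = values[1:-1]  # states aligned with the interior slope estimates
    numerator = sum(x * dx for x, dx in zip(states, slopes))
    denominator = sum(x * x for x in states)
    # Minimising sum((dx + k*x)^2) over k gives k = -sum(x*dx) / sum(x*x).
    return -numerator / denominator


# Hypothetical noise-free observations of x(t) = exp(-0.5 * t).
times = [0.1 * i for i in range(51)]
values = [math.exp(-0.5 * t) for t in times]
k_hat = fit_decay_rate(times, values)
print(round(k_hat, 3))
```

With noisy measurements, raw finite differences amplify the noise, which is why practical gradient-matching pipelines smooth or interpolate the data first.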
Affiliation(s)
- Andrej Aderhold
- School of Mathematics and Statistics, Glasgow University, Glasgow, UK
- Dirk Husmeier
- School of Mathematics and Statistics, Glasgow University, Glasgow, UK
- Marco Grzegorczyk
- Johann Bernoulli Institute (JBI), Groningen University, Groningen, The Netherlands
|
27
|
Mangan NM, Brunton SL, Proctor JL, Kutz JN. Inferring Biological Networks by Sparse Identification of Nonlinear Dynamics. IEEE Trans Mol Biol Multi-Scale Commun 2016. [DOI: 10.1109/tmbmc.2016.2633265] [Citation(s) in RCA: 172] [Impact Index Per Article: 21.5] [Indexed: 11/08/2022]
|
28
|
Chasman D, Fotuhi Siahpirani A, Roy S. Network-based approaches for analysis of complex biological systems. Curr Opin Biotechnol 2016; 39:157-166. [PMID: 27115495 DOI: 10.1016/j.copbio.2016.04.007] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 10/24/2015] [Revised: 04/04/2016] [Accepted: 04/05/2016] [Indexed: 12/22/2022]
Abstract
Cells function and respond to changes in their environment by the coordinated activity of their molecular components, including mRNAs, proteins and metabolites. At the heart of proper cellular function are molecular networks connecting these components to process extra-cellular environmental signals and drive dynamic, context-specific cellular responses. Network-based computational approaches aim to systematically integrate measurements from high-throughput experiments to gain a global understanding of cellular function under changing environmental conditions. We provide an overview of recent methodological developments toward solving two major computational problems within this field in the past two years (2013-2015): network reconstruction and network-based interpretation. Looking forward, we envision development of methods that can predict phenotypes with high accuracy as well as provide biologically plausible mechanistic hypotheses.
Affiliation(s)
- Deborah Chasman
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, United States
- Alireza Fotuhi Siahpirani
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, United States; Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, United States; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, United States
- Sushmita Roy
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, United States; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, United States; Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, United States
|
29
|
Tanevski J, Todorovski L, Džeroski S. Learning stochastic process-based models of dynamical systems from knowledge and data. BMC SYSTEMS BIOLOGY 2016; 10:30. [PMID: 27005698 PMCID: PMC4802653 DOI: 10.1186/s12918-016-0273-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 12/01/2015] [Accepted: 03/06/2016] [Indexed: 01/02/2023]
Abstract
Background: Identifying a proper model structure, using methods that address both structural and parameter uncertainty, is a crucial problem within the systems approach to biology. And yet, it has a marginal presence in the recent literature. While many existing approaches integrate methods for simulation and parameter estimation of a single model to address parameter uncertainty, only a few of them address structural uncertainty at the same time. The methods for handling structural uncertainty often oversimplify the problem by allowing the human modeler to explicitly enumerate a relatively small number of alternative model structures. On the other hand, process-based modeling methods provide flexible modular formalisms for specifying large classes of plausible model structures, but their scope is limited to deterministic models. Here, we aim at extending the scope of process-based modeling methods to inductively learn stochastic models from knowledge and data. Results: We combine the flexibility of process-based modeling in terms of addressing structural uncertainty with the benefits of stochastic modeling. The proposed method combines search through the space of plausible model structures, the parsimony principle and parameter estimation to identify a model with optimal structure and parameters. We illustrate the utility of the proposed method on four stochastic modeling tasks in two domains: gene regulatory networks and epidemiology. Within the first domain, using synthetically generated data, the method successfully recovers the structure and parameters of known regulatory networks from simulations. In the epidemiology domain, the method successfully reconstructs previously established models of epidemic outbreaks from real, sparse and noisy measurement data. Conclusions: The method represents a unified approach to modeling dynamical systems that allows for flexible formalization of the space of candidate model structures, deterministic and stochastic interpretation of model dynamics, and automated induction of model structure and parameters from data. The method is able to reconstruct models of dynamical systems from synthetic and real data.
Affiliation(s)
- Jovan Tanevski
- Jožef Stefan Institute, Jamova cesta 39, Ljubljana, 1000, Slovenia; Jožef Stefan International Postgraduate School, Jamova cesta 39, Ljubljana, 1000, Slovenia
- Ljupčo Todorovski
- University of Ljubljana, Gosarjeva ulica 5, Ljubljana, 1000, Slovenia
- Sašo Džeroski
- Jožef Stefan Institute, Jamova cesta 39, Ljubljana, 1000, Slovenia; Jožef Stefan International Postgraduate School, Jamova cesta 39, Ljubljana, 1000, Slovenia
|
30
|
Ocone A, Haghverdi L, Mueller NS, Theis FJ. Reconstructing gene regulatory dynamics from high-dimensional single-cell snapshot data. Bioinformatics 2015; 31:i89-96. [PMID: 26072513 PMCID: PMC4765871 DOI: 10.1093/bioinformatics/btv257] [Citation(s) in RCA: 104] [Impact Index Per Article: 11.6] [Indexed: 12/21/2022] Open
Abstract
Motivation: High-dimensional single-cell snapshot data are becoming widespread in the systems biology community, as a means to understand biological processes at the cellular level. However, as temporal information is lost with such data, mathematical models have been limited to capturing only static features of the underlying cellular mechanisms. Results: Here, we present a modular framework which allows recovery of the temporal behaviour from single-cell snapshot data and reverse engineering of the dynamics of gene expression. The framework combines a dimensionality reduction method with a cell time-ordering algorithm to generate pseudo time-series observations. These are in turn used to learn transcriptional ODE models and do model selection on structural network features. We apply it to synthetic data and then to real hematopoietic stem cell data, to reconstruct gene expression dynamics during differentiation pathways and infer the structure of a key gene regulatory network. Availability and implementation: C++ and Matlab code available at https://www.helmholtz-muenchen.de/fileadmin/ICB/software/inferenceSnapshot.zip. Contact: fabian.theis@helmholtz-muenchen.de Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Andrea Ocone
- Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany and Department of Mathematics, Technical University Munich, 85747 Garching, Germany
- Laleh Haghverdi
- Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany and Department of Mathematics, Technical University Munich, 85747 Garching, Germany
- Nikola S Mueller
- Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany and Department of Mathematics, Technical University Munich, 85747 Garching, Germany
- Fabian J Theis
- Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany and Department of Mathematics, Technical University Munich, 85747 Garching, Germany
|
31
|
Korkut A, Wang W, Demir E, Aksoy BA, Jing X, Molinelli EJ, Babur Ö, Bemis DL, Onur Sumer S, Solit DB, Pratilas CA, Sander C. Perturbation biology nominates upstream-downstream drug combinations in RAF inhibitor resistant melanoma cells. eLife 2015; 4:e04640. [PMID: 26284497 PMCID: PMC4539601 DOI: 10.7554/elife.04640] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.9] [Received: 09/10/2014] [Accepted: 07/07/2015] [Indexed: 01/16/2023] Open
Abstract
Resistance to targeted cancer therapies is an important clinical problem. The discovery of anti-resistance drug combinations is challenging as resistance can arise by diverse escape mechanisms. To address this challenge, we improved and applied the experimental-computational perturbation biology method. Using statistical inference, we build network models from high-throughput measurements of molecular and phenotypic responses to combinatorial targeted perturbations. The models are computationally executed to predict the effects of thousands of untested perturbations. In RAF-inhibitor resistant melanoma cells, we measured 143 proteomic/phenotypic entities under 89 perturbation conditions and predicted c-Myc as an effective therapeutic co-target with BRAF or MEK. Experiments using the BET bromodomain inhibitor JQ1 affecting the level of c-Myc protein and protein kinase inhibitors targeting the ERK pathway confirmed the prediction. In conclusion, we propose an anti-cancer strategy of co-targeting a specific upstream alteration and a general downstream point of vulnerability to prevent or overcome resistance to targeted drugs.
Affiliation(s)
- Anil Korkut
- Computational Biology Center, Memorial Sloan Kettering Cancer Center, New York, United States
- Weiqing Wang
- Computational Biology Center, Memorial Sloan Kettering Cancer Center, New York, United States
- Emek Demir
- Computational Biology Center, Memorial Sloan Kettering Cancer Center, New York, United States
- Bülent Arman Aksoy
- Computational Biology Center, Memorial Sloan Kettering Cancer Center, New York, United States
- Tri-Institutional Training Program in Computational Biology and Medicine, New York, United States
- Xiaohong Jing
- Computational Biology Center, Memorial Sloan Kettering Cancer Center, New York, United States
- Evan J Molinelli
- Computational Biology Center, Memorial Sloan Kettering Cancer Center, New York, United States
- Özgün Babur
- Computational Biology Center, Memorial Sloan Kettering Cancer Center, New York, United States
- Debra L Bemis
- Computational Biology Center, Memorial Sloan Kettering Cancer Center, New York, United States
- Selcuk Onur Sumer
- Computational Biology Center, Memorial Sloan Kettering Cancer Center, New York, United States
- David B Solit
- Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, United States
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, United States
- Christine A Pratilas
- The Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins University, Baltimore, United States
- Chris Sander
- Computational Biology Center, Memorial Sloan Kettering Cancer Center, New York, United States
|
32
|
Computational and experimental single cell biology techniques for the definition of cell type heterogeneity, interplay and intracellular dynamics. Curr Opin Biotechnol 2015; 34:9-15. [DOI: 10.1016/j.copbio.2014.10.010] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 09/16/2014] [Revised: 10/21/2014] [Accepted: 10/22/2014] [Indexed: 12/31/2022]
|
33
|
Oates CJ, Amos R, Spencer SEF. Quantifying the multi-scale performance of network inference algorithms. Stat Appl Genet Mol Biol 2015; 13:611-31. [PMID: 25153244 DOI: 10.1515/sagmb-2014-0012] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 01/18/2023]
Abstract
Graphical models are widely used to study complex multivariate biological systems. Network inference algorithms aim to reverse-engineer such models from noisy experimental data. It is common to assess such algorithms using techniques from classifier analysis. These metrics, based on ability to correctly infer individual edges, possess a number of appealing features including invariance to rank-preserving transformation. However, regulation in biological systems occurs on multiple scales and existing metrics do not take into account the correctness of higher-order network structure. In this paper novel performance scores are presented that share the appealing properties of existing scores, whilst capturing ability to uncover regulation on multiple scales. Theoretical results confirm that performance of a network inference algorithm depends crucially on the scale at which inferences are to be made; in particular strong local performance does not guarantee accurate reconstruction of higher-order topology. Applying these scores to a large corpus of data from the DREAM5 challenge, we undertake a data-driven assessment of estimator performance. We find that the "wisdom of crowds" network, that demonstrated superior local performance in the DREAM5 challenge, is also among the best performing methodologies for inference of regulation on multiple length scales.
|
34
|
Daniels BC, Nemenman I. Efficient inference of parsimonious phenomenological models of cellular dynamics using S-systems and alternating regression. PLoS One 2015; 10:e0119821. [PMID: 25806510 PMCID: PMC4373916 DOI: 10.1371/journal.pone.0119821] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 10/01/2014] [Accepted: 01/16/2015] [Indexed: 11/18/2022] Open
Abstract
The nonlinearity of dynamics in systems biology makes it hard to infer them from experimental data. Simple linear models are computationally efficient, but cannot incorporate these important nonlinearities. An adaptive method based on the S-system formalism, which is a sensible representation of nonlinear mass-action kinetics typically found in cellular dynamics, maintains the efficiency of linear regression. We combine this approach with adaptive model selection to obtain efficient and parsimonious representations of cellular dynamics. The approach is tested by inferring the dynamics of yeast glycolysis from simulated data. With little computing time, it produces dynamical models with high predictive power and with structural complexity adapted to the difficulty of the inference problem.
Affiliation(s)
- Bryan C. Daniels: Center for Complexity and Collective Computation, Wisconsin Institute for Discovery, University of Wisconsin, Madison, WI 53715, USA
- Ilya Nemenman: Departments of Physics and Biology, Emory University, Atlanta, GA 30322, USA
|
35
|
Abstract
Mathematical models of natural systems are abstractions of much more complicated processes. Developing informative and realistic models of such systems typically involves suitable statistical inference methods, domain expertise, and a modicum of luck. Except for cases where physical principles provide sufficient guidance, it will also be generally possible to come up with a large number of potential models that are compatible with a given natural system and any finite amount of data generated from experiments on that system. Here we develop a computational framework to systematically evaluate potentially vast sets of candidate differential equation models in light of experimental and prior knowledge about biological systems. This topological sensitivity analysis enables us to evaluate quantitatively the dependence of model inferences and predictions on the assumed model structures. Failure to consider the impact of structural uncertainty introduces biases into the analysis and potentially gives rise to misleading conclusions.
|