1
Diamataris IG, Peristeras LD, Papavasileiou KD, Melissas VS, Boulougouris GC. Statistical Inference of Rate Constants in Chemical and Biochemical Reaction Networks Using an "Inverse" Event-Driven Kinetic Monte Carlo Method. J Phys Chem B 2023; 127:9132-9143. [PMID: 37823789 DOI: 10.1021/acs.jpcb.3c03649] [Indexed: 10/13/2023]
Abstract
Rate models for networks of stochastic reactions are frequently used to comprehend the macroscopically observed dynamic properties of finite-size reactive systems as well as their relationship to the underlying molecular events. This approach usually stumbles on the derivation of parameters associated with stochastic kinetics, a quite demanding procedure. The present study introduces a novel algorithm that infers kinetic parameters from the system's time evolution, manifested as changes in molecular species populations. The proposed methodology reconstructs the distributions required to infer the kinetic parameters of a stochastic process from either simulation or experimental data. The suggested approach accurately reproduces the rate constants of stochastic reaction networks that have been evolved over time by event-driven Monte Carlo (MC) simulations using the Gillespie algorithm. Furthermore, our approach has been successfully used to estimate rate constants of association and dissociation events between molecular species occurring during molecular dynamics (MD) simulations. We believe that our method will be helpful for relating the macroscopic characteristics of stochastic physical and biological processes to their molecular roots.
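As a rough illustration of inferring rate constants from an event-driven trajectory, the sketch below simulates a birth-death network with the Gillespie algorithm and then recovers the two rate constants with the standard maximum-likelihood estimator for a fully observed jump process (event count divided by the time-integrated propensity factor). This is not the "inverse" kinetic Monte Carlo method of the paper; the network, the function names, and the parameter values are all assumptions chosen for the example.

```python
import random


def gillespie_birth_death(k_birth, k_death, x0, t_end, rng):
    """Simulate 0 -> X (propensity k_birth) and X -> 0 (propensity k_death * x).

    Returns the number of birth and death events together with the time
    integrals of the propensity factors (1 for birth, x for death) on [0, t_end].
    Hypothetical helper written for this illustration only.
    """
    t, x = 0.0, x0
    n_birth = n_death = 0
    int_one = int_x = 0.0
    while True:
        a_birth = k_birth
        a_death = k_death * x
        a_total = a_birth + a_death
        dt = rng.expovariate(a_total)  # waiting time to the next event
        if t + dt >= t_end:
            # No further event before t_end: account for the remaining time.
            int_one += t_end - t
            int_x += x * (t_end - t)
            break
        int_one += dt
        int_x += x * dt
        t += dt
        # Choose which reaction fires, proportionally to its propensity.
        if rng.random() * a_total < a_birth:
            x += 1
            n_birth += 1
        else:
            x -= 1
            n_death += 1
    return n_birth, n_death, int_one, int_x


def estimate_rates(n_birth, n_death, int_one, int_x):
    """Maximum-likelihood estimates for a fully observed trajectory."""
    return n_birth / int_one, n_death / int_x


if __name__ == "__main__":
    rng = random.Random(1)
    stats = gillespie_birth_death(5.0, 0.5, 10, 2000.0, rng)
    k_birth_hat, k_death_hat = estimate_rates(*stats)
    print(f"k_birth ~ {k_birth_hat:.3f}, k_death ~ {k_death_hat:.3f}")
```

With a long enough trajectory the estimates should land close to the values used in the simulation (5.0 and 0.5 here); shorter trajectories give noisier estimates because fewer events are observed.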
Affiliation(s)
- Ioannis G Diamataris
  - Laboratory of Computational Physical-Chemistry, Department of Molecular Biology and Genetics, University of Thrace, Alexandroupoulis GR-681 00, Greece
- Loukas D Peristeras
  - Institute of Nanoscience and Nanotechnology, Molecular Thermodynamics and Modelling of Materials Laboratory, National Center for Scientific Research "Demokritos", Attikis, Agia Paraskevi GR-153 10, Greece
- Konstantinos D Papavasileiou
  - Department of ChemoInformatics, NovaMechanics Ltd., Nicosia CY-1070, Cyprus
  - Division of Data Driven Innovation, Entelos Institute, Larnaca CY-6059, Cyprus
  - Department of ChemoInformatics, NovaMechanics MIKE., Piraeus GR-185 45, Greece
- Georgios C Boulougouris
  - Laboratory of Computational Physical-Chemistry, Department of Molecular Biology and Genetics, University of Thrace, Alexandroupoulis GR-681 00, Greece
2
Li L, Hu Y, Xu Y, Tang S. Mathematical modeling the order of driver gene mutations in colorectal cancer. PLoS Comput Biol 2023; 19:e1011225. [PMID: 37368936 DOI: 10.1371/journal.pcbi.1011225] [Received: 10/08/2022] [Accepted: 05/29/2023] [Indexed: 06/29/2023] Open
Abstract
Tumor heterogeneity is a major obstacle for cancer study and treatment. Different cancer patients may carry different combinations of gene mutations or distinct regulatory pathways that induce tumor progression. Investigating the pathways of gene mutations that can cause tumor formation can provide a basis for the personalized treatment of cancer. Studies have suggested that KRAS, APC and TP53 are the most significant driver genes for colorectal cancer. However, the detailed mutation order of these genes in the development of colorectal cancer is still an open issue. For this purpose, we analyze a mathematical model considering all orders of mutations in the oncogene KRAS and the tumor suppressor genes APC and TP53, and fit it to data describing the incidence rates of colorectal cancer at different ages from the Surveillance, Epidemiology, and End Results registry in the United States for the years 1973-2013. The specific orders that can induce the development of colorectal cancer are identified by the model fitting. The fitting results indicate that the mutation orders KRAS → APC → TP53, APC → TP53 → KRAS and APC → KRAS → TP53 explain the age-specific risk of colorectal cancer very well. Furthermore, eleven pathways of gene mutations can be accepted for the mutation orders KRAS → APC → TP53, APC → TP53 → KRAS and APC → KRAS → TP53, and the alteration of APC acts as the initiating or promoting event in colorectal cancer. The estimated mutation rates of cells in the different pathways demonstrate that genetic instability must exist in colorectal cancer with alterations of the genes KRAS, APC and TP53.
Affiliation(s)
- Lingling Li
  - School of Mathematics and Statistics, Shaanxi Normal University, Xi'an, China
  - School of Science, Xi'an Polytechnic University, Xi'an, China
- Yulu Hu
  - School of Science, Xi'an Polytechnic University, Xi'an, China
- Yunshan Xu
  - Mathematics Department, Faculty of Science and Technology, University of Macau, Taipa, Macau, China
- Sanyi Tang
  - School of Mathematics and Statistics, Shaanxi Normal University, Xi'an, China
3
Wadkin LE, Golightly A, Branson J, Hoppit A, Parker NG, Baggaley AW. Quantifying Invasive Pest Dynamics through Inference of a Two-Node Epidemic Network Model. Diversity 2023. [DOI: 10.3390/d15040496] [Indexed: 03/31/2023]
Abstract
Invasive woodland pests have substantial ecological, economic, and social impacts, harming biodiversity and ecosystem services. Mathematical modelling informed by Bayesian inference can deepen our understanding of the fundamental behaviours of invasive pests and provide predictive tools for forecasting future spread. A key invasive pest of concern in the UK is the oak processionary moth (OPM). OPM was established in the UK in 2006; it is harmful to both oak trees and humans, and its infestation area is continually expanding. Here, we use a computational inference scheme to estimate the parameters for a two-node network epidemic model to describe the temporal dynamics of OPM in two geographically neighbouring parks (Bushy Park and Richmond Park, London). We show the applicability of such a network model to describing invasive pest dynamics and our results suggest that the infestation within Richmond Park has largely driven the infestation within Bushy Park.
4
Makrygiorgos G, Berliner AJ, Shi F, Clark DS, Arkin AP, Mesbah A. Data-driven flow-map models for data-efficient discovery of dynamics and fast uncertainty quantification of biological and biochemical systems. Biotechnol Bioeng 2023; 120:803-818. [PMID: 36453664 DOI: 10.1002/bit.28295] [Received: 02/26/2022] [Revised: 07/27/2022] [Accepted: 10/09/2022] [Indexed: 12/05/2022]
Abstract
Computational models are increasingly used to investigate and predict the complex dynamics of biological and biochemical systems. Nevertheless, governing equations of a biochemical system may not be (fully) known, which would necessitate learning the system dynamics directly from, often limited and noisy, observed data. On the other hand, when expensive models are available, systematic and efficient quantification of the effects of model uncertainties on quantities of interest can be an arduous task. This paper leverages the notion of flow-map (de)compositions to present a framework that can address both of these challenges via learning data-driven models useful for capturing the dynamical behavior of biochemical systems. Data-driven flow-map models seek to directly learn the integration operators of the governing differential equations in a black-box manner, irrespective of structure of the underlying equations. As such, they can serve as a flexible approach for deriving fast-to-evaluate surrogates for expensive computational models of system dynamics, or, alternatively, for reconstructing the long-term system dynamics via experimental observations. We present a data-efficient approach to data-driven flow-map modeling based on polynomial chaos Kriging. The approach is demonstrated for discovery of the dynamics of various benchmark systems and a coculture bioreactor subject to external forcing, as well as for uncertainty quantification of a microbial electrosynthesis reactor. Such data-driven models and analyses of dynamical systems can be paramount in the design and optimization of bioprocesses and integrated biomanufacturing systems.
Affiliation(s)
- Georgios Makrygiorgos
  - Center for the Utilization of Biological Engineering in Space (CUBES), Berkeley, California, USA; Department of Chemical and Biomolecular Engineering, University of California, Berkeley, California, USA
- Aaron J Berliner
  - Center for the Utilization of Biological Engineering in Space (CUBES), Berkeley, California, USA; Department of Bioengineering, University of California, Berkeley, California, USA
- Fengzhe Shi
  - Center for the Utilization of Biological Engineering in Space (CUBES), Berkeley, California, USA; Department of Chemical and Biomolecular Engineering, University of California, Berkeley, California, USA
- Douglas S Clark
  - Center for the Utilization of Biological Engineering in Space (CUBES), Berkeley, California, USA; Department of Chemical and Biomolecular Engineering, University of California, Berkeley, California, USA
- Adam P Arkin
  - Center for the Utilization of Biological Engineering in Space (CUBES), Berkeley, California, USA; Department of Bioengineering, University of California, Berkeley, California, USA
- Ali Mesbah
  - Center for the Utilization of Biological Engineering in Space (CUBES), Berkeley, California, USA; Department of Chemical and Biomolecular Engineering, University of California, Berkeley, California, USA
5
Järvenpää M, Corander J. On predictive inference for intractable models via approximate Bayesian computation. Statistics and Computing 2023; 33:42. [PMID: 36785730 PMCID: PMC9911513 DOI: 10.1007/s11222-022-10163-6] [Received: 03/23/2022] [Accepted: 10/02/2022] [Indexed: 06/18/2023]
Abstract
Approximate Bayesian computation (ABC) is commonly used for parameter estimation and model comparison for intractable simulator-based statistical models whose likelihood function cannot be evaluated. In this paper we instead investigate the feasibility of ABC as a generic approximate method for predictive inference, in particular, for computing the posterior predictive distribution of future observations or missing data of interest. We consider three complementary ABC approaches for this goal, each based on different assumptions regarding which predictive density of the intractable model can be sampled from. The case where only simulation from the joint density of the observed and future data given the model parameters can be used for inference is given particular attention and it is shown that the ideal summary statistic in this setting is minimal predictive sufficient instead of merely minimal sufficient (in the ordinary sense). An ABC prediction approach that takes advantage of a certain latent variable representation is also investigated. We additionally show how common ABC sampling algorithms can be used in the predictive settings considered. Our main results are first illustrated by using simple time-series models that facilitate analytical treatment, and later by using two common intractable dynamic models. Supplementary information: The online version contains supplementary material available at 10.1007/s11222-022-10163-6.
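A minimal sketch of ABC-based prediction, under strong simplifying assumptions: a toy Gaussian model with unknown mean stands in for the intractable simulator, the prior is uniform on [-5, 5], the summary statistic is the sample mean, and plain rejection ABC is used. For every accepted parameter value one future observation is simulated, which gives approximate draws from the posterior predictive distribution. This covers only the simplest case, where future data can be simulated given the parameters, and it does not reproduce the algorithms of the paper; the function names and the tolerance `eps` are hypothetical.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Toy simulator (assumption): n independent draws from N(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def abc_posterior_predictive(observed, n_draws, eps, rng):
    """Rejection ABC followed by one predictive draw per accepted parameter."""
    obs_mean = statistics.fmean(observed)
    n = len(observed)
    predictive = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # draw from the (assumed) uniform prior
        sim_mean = statistics.fmean(simulate(theta, n, rng))
        # Accept theta when the simulated summary is close to the observed one.
        if abs(sim_mean - obs_mean) <= eps:
            # Simulate a future observation given the accepted parameter.
            predictive.append(rng.gauss(theta, 1.0))
    return predictive


if __name__ == "__main__":
    rng = random.Random(0)
    observed = simulate(2.0, 20, rng)
    draws = abc_posterior_predictive(observed, 20000, 0.2, rng)
    print(len(draws), statistics.fmean(draws))
```

Smaller values of `eps` make the approximation tighter at the cost of fewer accepted draws, which is the usual accuracy versus cost trade-off in rejection ABC.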
Affiliation(s)
- Marko Järvenpää
  - Department of Biostatistics, University of Oslo, Oslo, Norway
- Jukka Corander
  - Department of Biostatistics, University of Oslo, Oslo, Norway
  - Department of Mathematics and Statistics, Helsinki Institute of Information Technology (HIIT), University of Helsinki, Helsinki, Finland
  - Wellcome Sanger Institute, Hinxton, Cambridgeshire, UK
6
Linden NJ, Kramer B, Rangamani P. Bayesian parameter estimation for dynamical models in systems biology. PLoS Comput Biol 2022; 18:e1010651. [PMID: 36269772 PMCID: PMC9629650 DOI: 10.1371/journal.pcbi.1010651] [Received: 04/19/2022] [Revised: 11/02/2022] [Accepted: 10/12/2022] [Indexed: 11/07/2022] Open
Abstract
Dynamical systems modeling, particularly via systems of ordinary differential equations, has been used to effectively capture the temporal behavior of different biochemical components in signal transduction networks. Despite the recent advances in experimental measurements, including sensor development and '-omics' studies that have helped populate protein-protein interaction networks in great detail, modeling in systems biology lacks systematic methods to estimate kinetic parameters and quantify associated uncertainties. This is because of multiple reasons, including sparse and noisy experimental measurements, lack of detailed molecular mechanisms underlying the reactions, and missing biochemical interactions. Additionally, the inherent nonlinearities with respect to the states and parameters associated with the system of differential equations further compound the challenges of parameter estimation. In this study, we propose a comprehensive framework for Bayesian parameter estimation and complete quantification of the effects of uncertainties in the data and models. We apply these methods to a series of signaling models of increasing mathematical complexity. Systematic analysis of these dynamical systems showed that parameter estimation depends on data sparsity, noise level, and model structure, including the existence of multiple steady states. These results highlight how focused uncertainty quantification can enrich systems biology modeling and enable additional quantitative analyses for parameter estimation.
Affiliation(s)
- Nathaniel J. Linden
  - Department of Mechanical and Aerospace Engineering, University of California San Diego, San Diego, California, United States of America
- Boris Kramer
  - Department of Mechanical and Aerospace Engineering, University of California San Diego, San Diego, California, United States of America
- Padmini Rangamani
  - Department of Mechanical and Aerospace Engineering, University of California San Diego, San Diego, California, United States of America
7
Biron-Lattes M, Bouchard-Côté A, Campbell T. Pseudo-marginal inference for CTMCs on infinite spaces via monotonic likelihood approximations. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2118750] [Indexed: 10/17/2022]
8
Roy A, Shen L, Balasubramanian K, Ghadimi S. Stochastic zeroth-order discretizations of Langevin diffusions for Bayesian inference. Bernoulli 2022. [DOI: 10.3150/21-bej1400] [Indexed: 11/19/2022]
Affiliation(s)
- Abhishek Roy
  - Department of Statistics, University of California, Davis, Davis, CA 95616, USA
- Lingqing Shen
  - Tepper School of Business, Carnegie Mellon University, Pittsburgh, PA 15213
- Saeed Ghadimi
  - Department of Management Sciences, University of Waterloo, Waterloo, ON N2L 3G1, Canada
9
Öcal K, Gutmann MU, Sanguinetti G, Grima R. Inference and uncertainty quantification of stochastic gene expression via synthetic models. Journal of the Royal Society Interface 2022; 19:20220153. [PMID: 35858045 PMCID: PMC9277240 DOI: 10.1098/rsif.2022.0153] [Indexed: 12/26/2022]
Abstract
Estimating uncertainty in model predictions is a central task in quantitative biology. Biological models at the single-cell level are intrinsically stochastic and nonlinear, creating formidable challenges for their statistical estimation which inevitably has to rely on approximations that trade accuracy for tractability. Despite intensive interest, a sweet spot in this trade-off has not been found yet. We propose a flexible procedure for uncertainty quantification in a wide class of reaction networks describing stochastic gene expression including those with feedback. The method is based on creating a tractable coarse-graining of the model that is learned from simulations, a synthetic model, to approximate the likelihood function. We demonstrate that synthetic models can substantially outperform state-of-the-art approaches on a number of non-trivial systems and datasets, yielding an accurate and computationally viable solution to uncertainty quantification in stochastic models of gene expression.
Affiliation(s)
- Kaan Öcal
  - School of Informatics, University of Edinburgh, Edinburgh EH9 3JH, UK; School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, UK
- Michael U Gutmann
  - School of Informatics, University of Edinburgh, Edinburgh EH9 3JH, UK
- Guido Sanguinetti
  - Scuola Internazionale Superiore di Studi Avanzati, 34136 Trieste, Italy
- Ramon Grima
  - School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, UK
10
Sherlock C, Golightly A. Exact Bayesian inference for discretely observed Markov Jump Processes using finite rate matrices. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2093886] [Indexed: 10/17/2022]
Affiliation(s)
- Chris Sherlock
  - Department of Mathematics and Statistics, Lancaster University, UK
11
Quantifying biochemical reaction rates from static population variability within incompletely observed complex networks. PLoS Comput Biol 2022; 18:e1010183. [PMID: 35731728 PMCID: PMC9216546 DOI: 10.1371/journal.pcbi.1010183] [Received: 09/21/2021] [Accepted: 05/07/2022] [Indexed: 11/19/2022] Open
Abstract
Quantifying biochemical reaction rates within complex cellular processes remains a key challenge of systems biology even as high-throughput single-cell data have become available to characterize snapshots of population variability. That is because complex systems with stochastic and non-linear interactions are difficult to analyze when not all components can be observed simultaneously and systems cannot be followed over time. Instead of using descriptive statistical models, we show that incompletely specified mechanistic models can be used to translate qualitative knowledge of interactions into reaction rate functions from covariability data between pairs of components. This promises to turn a globally intractable problem into a sequence of solvable inference problems to quantify complex interaction networks from incomplete snapshots of their stochastic fluctuations.
12
Explainable Machine Learning for Longitudinal Multi-Omic Microbiome. Mathematics 2022. [DOI: 10.3390/math10121994] [Indexed: 02/06/2023]
Abstract
Over the years, research studies have shown there is a key connection between the microbial community in the gut, genes, and the immune system. Understanding this association may help discover the cause of complex chronic idiopathic disorders such as inflammatory bowel disease. Even though important efforts have been put into the field, the functions, dynamics, and causation of the dysbiosis state in the microbial community remain unclear. Machine learning models can help elucidate important connections and relationships between microbes in the human host. Our study aims to extend the current knowledge of associations between the human microbiome and health and disease through the application of dynamic Bayesian networks to describe the temporal variation of the gut microbiota and dynamic relationships between taxonomic entities and clinical variables. We develop a set of preprocessing steps to clean, filter, select, integrate, and model informative metagenomics, metatranscriptomics, and metabolomics longitudinal data from the Human Microbiome Project. This study produces novel network models with satisfactory predictive performance (accuracy = 0.648) for each inflammatory bowel disease state, validating Bayesian networks as a framework for developing interpretable models to help understand the basic ways the different biological entities (taxa, genes, metabolites) interact with each other in a given environment (human gut) over time. These findings can serve as a starting point to advance the discovery of novel therapeutic approaches and new biomarkers for precision medicine.
13
Chkrebtii OA, García YE, Capistrán MA, Noyola DE. Inference for stochastic kinetic models from multiple data sources for joint estimation of infection dynamics from aggregate reports and virological data. Ann Appl Stat 2022. [DOI: 10.1214/21-aoas1527] [Indexed: 11/19/2022]
Affiliation(s)
- Yury E. García
  - Área de Matemáticas Básicas, Centro de Investigación en Matemáticas
- Daniel E. Noyola
  - Department of Microbiology, Faculty of Medicine, Universidad Autónoma de San Luis Potosí
14
Pieschner S, Hasenauer J, Fuchs C. Identifiability analysis for models of the translation kinetics after mRNA transfection. J Math Biol 2022; 84:56. [PMID: 35577967 PMCID: PMC9110294 DOI: 10.1007/s00285-022-01739-x] [Received: 05/17/2021] [Revised: 03/25/2022] [Accepted: 03/26/2022] [Indexed: 11/12/2022]
Abstract
Mechanistic models are a powerful tool to gain insights into biological processes. The parameters of such models, e.g. kinetic rate constants, usually cannot be measured directly but need to be inferred from experimental data. In this article, we study dynamical models of the translation kinetics after mRNA transfection and analyze their parameter identifiability, that is, whether parameters can be uniquely determined from perfect or realistic data in theory and practice. Previous studies have considered ordinary differential equation (ODE) models of the process, and here we formulate a stochastic differential equation (SDE) model. For both model types, we consider structural identifiability based on the model equations and practical identifiability based on simulated as well as experimental data and find that the SDE model provides better parameter identifiability than the ODE model. Moreover, our analysis shows that even for those parameters of the ODE model that are considered to be identifiable, the obtained estimates are sometimes unreliable. Overall, our study clearly demonstrates the relevance of considering different modeling approaches and that stochastic models can provide more reliable and informative results.
Affiliation(s)
- Susanne Pieschner
  - Institute of Computational Biology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Oberschleißheim, Germany; Department of Mathematics, Technical University Munich, Garching, Germany
- Jan Hasenauer
  - Institute of Computational Biology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Oberschleißheim, Germany; Department of Mathematics, Technical University Munich, Garching, Germany; Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn, Germany
- Christiane Fuchs
  - Institute of Computational Biology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Oberschleißheim, Germany; Department of Mathematics, Technical University Munich, Garching, Germany; Faculty of Business Administration and Economics, Bielefeld University, Bielefeld, Germany
15
Münch JL, Paul F, Schmauder R, Benndorf K. Bayesian inference of kinetic schemes for ion channels by Kalman filtering. eLife 2022; 11:e62714. [PMID: 35506659 PMCID: PMC9342998 DOI: 10.7554/elife.62714] [Received: 09/02/2020] [Accepted: 04/22/2022] [Indexed: 11/16/2022] Open
Abstract
Inferring adequate kinetic schemes for ion channel gating from ensemble currents is a daunting task due to limited information in the data. We address this problem by using a parallelized Bayesian filter to specify hidden Markov models for current and fluorescence data. We demonstrate the flexibility of this algorithm by including different noise distributions. Our generalized Kalman filter outperforms both a classical Kalman filter and a rate equation approach when applied to patch-clamp data exhibiting realistic open-channel noise. The derived generalization also enables inclusion of orthogonal fluorescence data, making unidentifiable parameters identifiable and increasing the accuracy of the parameter estimates by an order of magnitude. By using Bayesian highest credibility volumes, we found that our approach, in contrast to the rate equation approach, yields a realistic uncertainty quantification. Furthermore, the Bayesian filter delivers negligibly biased estimates for a wider range of data quality. For some data sets, it identifies more parameters than the rate equation approach. These results also demonstrate the power of assessing the validity of algorithms by Bayesian credibility volumes in general. Finally, we show that our Bayesian filter is more robust against errors induced by either analog filtering before analog-to-digital conversion or by limited time resolution of fluorescence data than a rate equation approach.
Affiliation(s)
- Jan L Münch
  - Institut für Physiologie II, Universitätsklinikum Jena, Friedrich Schiller University Jena, Jena, Germany
- Fabian Paul
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, United States
- Ralf Schmauder
  - Institut für Physiologie II, Universitätsklinikum Jena, Friedrich Schiller University Jena, Jena, Germany
- Klaus Benndorf
  - Institut für Physiologie II, Universitätsklinikum Jena, Friedrich Schiller University Jena, Jena, Germany
16
Zhang H, Chen J, Tian T. Bayesian Inference of Stochastic Dynamic Models Using Early-Rejection Methods Based on Sequential Stochastic Simulations. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1484-1494. [PMID: 33216717 DOI: 10.1109/tcbb.2020.3039490] [Indexed: 06/11/2023]
Abstract
Stochastic modelling is an important method to investigate the functions of noise in a wide range of biological systems. However, the parameter inference for stochastic models is still a challenging problem partially due to the large computing time required for stochastic simulations. To address this issue, we propose a novel early-rejection method by using sequential stochastic simulations. We first show that a large number of stochastic simulations are required to obtain reliable inference results. Instead of generating a large number of simulations for each parameter sample, we propose to generate these simulations in a number of stages. The simulation process will go to the next stage only if the accuracy of simulations at the current stage satisfies a given error criterion. We propose a formula to determine the error criterion and use a stochastic differential equation model to examine the effects of different criteria. Three biochemical network models are used to evaluate the efficiency and accuracy of the proposed method. Numerical results suggest the proposed early-rejection method achieves substantial improvement in the efficiency for the inference of stochastic models.
17
Parameter inference for stochastic biochemical models from perturbation experiments parallelised at the single cell level. PLoS Comput Biol 2022; 18:e1009950. [PMID: 35303737 PMCID: PMC8967023 DOI: 10.1371/journal.pcbi.1009950] [Received: 08/30/2021] [Revised: 03/30/2022] [Accepted: 02/21/2022] [Indexed: 01/30/2023] Open
Abstract
Understanding and characterising biochemical processes inside single cells requires experimental platforms that allow one to perturb and observe the dynamics of such processes as well as computational methods to build and parameterise models from the collected data. Recent progress with experimental platforms and optogenetics has made it possible to expose each cell in an experiment to an individualised input and automatically record cellular responses over days with fine time resolution. However, methods to infer parameters of stochastic kinetic models from single-cell longitudinal data have generally been developed under the assumption that experimental data is sparse and that responses of cells to at most a few different input perturbations can be observed. Here, we investigate and compare different approaches for calculating parameter likelihoods of single-cell longitudinal data based on approximations of the chemical master equation (CME) with a particular focus on coupling the linear noise approximation (LNA) or moment closure methods to a Kalman filter. We show that, as long as cells are measured sufficiently frequently, coupling the LNA to a Kalman filter allows one to accurately approximate likelihoods and to infer model parameters from data even in cases where the LNA provides poor approximations of the CME. Furthermore, the computational cost of filtering-based iterative likelihood evaluation scales advantageously in the number of measurement times and different input perturbations and is thus ideally suited for data obtained from modern experimental platforms. To demonstrate the practical usefulness of these results, we perform an experiment in which single cells, equipped with an optogenetic gene expression system, are exposed to various different light-input sequences and measured at several hundred time points and use parameter inference based on iterative likelihood evaluation to parameterise a stochastic model of the system. 
A common result for the modelling of cellular processes is that available data is not sufficiently rich to uniquely determine the biological mechanism or even just to ensure identifiability of parameters of a given model. Perturbing cellular processes with informative input stimuli and measuring dynamical responses may alleviate this problem. With the development of novel experimental platforms, we are now in a position to parallelise such perturbation experiments at the single cell level. This raises a plethora of new questions. Is it more informative to diversify input perturbations but to observe only few cells for each input or should we rather ensure that many cells are observed for only few inputs? How can we calculate likelihoods and infer parameters of stochastic kinetic models from data sets in which each cell receives a different input perturbation? How does the computational efficiency of parameter inference methods scale with the number of inputs and the number of measurement times? Are there approaches that are particularly well-suited for such data sets? In this paper, we investigate these questions using the CcaS/CcaR optogenetic system driving the expression of a fluorescent reporter protein as primary case study.
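The iterative, filtering-based likelihood evaluation mentioned in this abstract can be illustrated with a generic Kalman filter. The following is a minimal sketch, assuming a scalar linear-Gaussian state-space model rather than the linear noise approximation of a reaction network; the function name `kalman_loglik` and the parameters `a`, `q`, `r`, `m0`, `p0` are hypothetical and not taken from the paper. It only shows how the log-likelihood accumulates one measurement at a time, so that the cost grows linearly with the number of measurement times.

```python
import math

def kalman_loglik(ys, a, q, r, m0=0.0, p0=1.0):
    """Log-likelihood of observations ys under the (assumed) model
    x_k = a * x_{k-1} + w_k,  w_k ~ N(0, q)
    y_k = x_k + v_k,          v_k ~ N(0, r)
    with prior x_0 ~ N(m0, p0)."""
    m, p = m0, p0
    loglik = 0.0
    for y in ys:
        # Prediction step: propagate mean and variance through the dynamics.
        m_pred = a * m
        p_pred = a * a * p + q
        # The innovation (prediction error) and its variance give the
        # contribution of this measurement to the likelihood.
        innov = y - m_pred
        s = p_pred + r
        loglik += -0.5 * (math.log(2.0 * math.pi * s) + innov * innov / s)
        # Update step: condition the state estimate on the measurement.
        gain = p_pred / s
        m = m_pred + gain * innov
        p = (1.0 - gain) * p_pred
    return loglik

if __name__ == "__main__":
    print(kalman_loglik([0.3, -0.1, 0.4], a=0.5, q=0.1, r=0.2))
```

In a parameter inference loop, a function of this kind would be called once per candidate parameter set and per cell, which is why the per-measurement cost matters for long time series.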
|
18
|
Unosson M, Brancaccio M, Hastings M, Johansen AM, Finkenstädt B. A spatio-temporal model to reveal oscillator phenotypes in molecular clocks: Parameter estimation elucidates circadian gene transcription dynamics in single-cells. PLoS Comput Biol 2021; 17:e1009698. [PMID: 34919546 PMCID: PMC8719734 DOI: 10.1371/journal.pcbi.1009698] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/17/2021] [Revised: 12/31/2021] [Accepted: 11/29/2021] [Indexed: 11/19/2022]
Abstract
We propose a stochastic distributed delay model together with a Markov random field prior and a measurement model for bioluminescence-reporting to analyse spatio-temporal gene expression in intact networks of cells. The model describes the oscillating time evolution of molecular mRNA counts through a negative transcriptional-translational feedback loop encoded in a chemical Langevin equation with a probabilistic delay distribution. The model is extended spatially by means of a multiplicative random effects model with a first order Markov random field prior distribution. Our methodology effectively separates intrinsic molecular noise, measurement noise, and extrinsic noise and phenotypic variation driving cell heterogeneity, while being amenable to parameter identification and inference. Based on the single-cell model we propose a novel computational stability analysis that allows us to infer two key characteristics, namely the robustness of the oscillations, i.e. whether the reaction network exhibits sustained or damped oscillations, and the profile of the regulation, i.e. whether the inhibition occurs over time in a more distributed versus a more direct manner, which affects the cells' ability to phase-shift to new schedules. We show how insight into the spatio-temporal characteristics of the circadian feedback loop in the suprachiasmatic nucleus (SCN) can be gained by applying the methodology to bioluminescence-reported expression of the circadian core clock gene Cry1 across mouse SCN tissue. 
We find that while (almost) all SCN neurons exhibit robust cell-autonomous oscillations, the parameters that are associated with the regulatory transcription profile give rise to a spatial division of the tissue between the central region whose oscillations are resilient to perturbation in the sense that they maintain a high degree of synchronicity, and the dorsal region which appears to phase shift in a more diversified way as a response to large perturbations and thus could be more amenable to entrainment.
Affiliation(s)
- Måns Unosson
  - Department of Statistics, University of Warwick, Coventry, United Kingdom
- Marco Brancaccio
  - UK Dementia Research Institute at Imperial College London, Department of Brain Sciences, Faculty of Medicine, London, United Kingdom
- Michael Hastings
  - MRC Laboratory of Molecular Biology, Division of Neurobiology, Cambridge, United Kingdom
- Adam M. Johansen
  - Department of Statistics, University of Warwick, Coventry, United Kingdom
- Bärbel Finkenstädt
  - Department of Statistics, University of Warwick, Coventry, United Kingdom
  - The Zeeman Institute for Systems Biology & Infectious Disease Epidemiology Research, University of Warwick, Coventry, United Kingdom
|
19
|
Lee W, McCormick TH, Neil J, Sodja C, Cui Y. Anomaly Detection in Large-Scale Networks With Latent Space Models. Technometrics 2021. [DOI: 10.1080/00401706.2021.1952900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Affiliation(s)
- Wesley Lee
  - Department of Statistics, University of Washington, Seattle, WA
- Tyler H. McCormick
  - Department of Statistics and Department of Sociology, University of Washington, Seattle, WA
- Yanran Cui
  - Department of Statistics, University of Washington, Seattle, WA
|
20
|
Sherlock C, Thiery AH, Golightly A. Efficiency of delayed-acceptance random walk Metropolis algorithms. Ann Stat 2021. [DOI: 10.1214/21-aos2068] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/19/2022]
Affiliation(s)
- Chris Sherlock
  - Department of Mathematics and Statistics, Lancaster University
- Alexandre H. Thiery
  - Department of Statistics and Applied Probability, National University of Singapore
- Andrew Golightly
  - School of Mathematics, Statistics and Physics, Newcastle University
|
21
|
Bittner SR, Palmigiano A, Piet AT, Duan CA, Brody CD, Miller KD, Cunningham J. Interrogating theoretical models of neural computation with emergent property inference. eLife 2021; 10:e56265. [PMID: 34323690 PMCID: PMC8321557 DOI: 10.7554/elife.56265] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/21/2020] [Accepted: 06/30/2021] [Indexed: 11/13/2022]
Abstract
A cornerstone of theoretical neuroscience is the circuit model: a system of equations that captures a hypothesized neural mechanism. Such models are valuable when they give rise to an experimentally observed phenomenon -- whether behavioral or a pattern of neural activity -- and thus can offer insights into neural computation. The operation of these circuits, like all models, critically depends on the choice of model parameters. A key step is then to identify the model parameters consistent with observed phenomena: to solve the inverse problem. In this work, we present a novel technique, emergent property inference (EPI), that brings the modern probabilistic modeling toolkit to theoretical neuroscience. When theorizing circuit models, theoreticians predominantly focus on reproducing computational properties rather than a particular dataset. Our method uses deep neural networks to learn parameter distributions with these computational properties. This methodology is introduced through a motivational example of parameter inference in the stomatogastric ganglion. EPI is then shown to allow precise control over the behavior of inferred parameters and to scale in parameter dimension better than alternative techniques. In the remainder of this work, we present novel theoretical findings in models of primary visual cortex and superior colliculus, which were gained through the examination of complex parametric structure captured by EPI. Beyond its scientific contribution, this work illustrates the variety of analyses possible once deep learning is harnessed towards solving theoretical inverse problems.
Affiliation(s)
- Sean R Bittner
  - Department of Neuroscience, Columbia University, New York, United States
- Alex T Piet
  - Princeton Neuroscience Institute, Princeton, United States
  - Princeton University, Princeton, United States
  - Allen Institute for Brain Science, Seattle, United States
- Chunyu A Duan
  - Institute of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- Carlos D Brody
  - Princeton Neuroscience Institute, Princeton, United States
  - Princeton University, Princeton, United States
  - Howard Hughes Medical Institute, Chevy Chase, United States
- Kenneth D Miller
  - Department of Neuroscience, Columbia University, New York, United States
- John Cunningham
  - Department of Statistics, Columbia University, New York, United States
|
22
|
Ion IG, Wildner C, Loukrezis D, Koeppl H, De Gersem H. Tensor-train approximation of the chemical master equation and its application for parameter inference. J Chem Phys 2021; 155:034102. [PMID: 34293878 DOI: 10.1063/5.0045521] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/15/2022]
Abstract
In this work, we perform Bayesian inference tasks for the chemical master equation in the tensor-train format. The tensor-train approximation has been proven to be very efficient in representing high-dimensional data arising from the explicit representation of the chemical master equation solution. An additional advantage of representing the probability mass function in the tensor-train format is that parametric dependency can be easily incorporated by introducing a tensor product basis expansion in the parameter space. Time is treated as an additional dimension of the tensor and a linear system is derived to solve the chemical master equation in time. We exemplify the tensor-train method by performing inference tasks such as smoothing and parameter inference using the tensor-train framework. A very high compression ratio is observed for storing the probability mass function of the solution. Since all linear algebra operations are performed in the tensor-train format, a significant reduction in the computational time is observed as well.
Affiliation(s)
- Ion Gabriel Ion
  - Centre for Computational Engineering, Technische Universität Darmstadt, Darmstadt, Germany
- Christian Wildner
  - Department of Electrical Engineering and Information Technology, Technische Universität Darmstadt, Darmstadt, Germany
- Dimitrios Loukrezis
  - Centre for Computational Engineering, Technische Universität Darmstadt, Darmstadt, Germany
- Heinz Koeppl
  - Centre for Computational Engineering, Technische Universität Darmstadt, Darmstadt, Germany
- Herbert De Gersem
  - Centre for Computational Engineering, Technische Universität Darmstadt, Darmstadt, Germany
|
23
|
Efficient inference for stochastic differential equation mixed-effects models using correlated particle pseudo-marginal algorithms. Comput Stat Data Anal 2021. [DOI: 10.1016/j.csda.2020.107151] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/23/2022]
|
24
|
Sherlock C. Direct statistical inference for finite Markov jump processes via the matrix exponential. Comput Stat 2021; 36:2863-2887. [PMID: 33897113 PMCID: PMC8054858 DOI: 10.1007/s00180-021-01102-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/20/2020] [Accepted: 03/23/2021] [Indexed: 11/27/2022]
Abstract
Given noisy, partial observations of a time-homogeneous, finite-statespace Markov chain, conceptually simple, direct statistical inference is available, in theory, via its rate matrix, or infinitesimal generator, Q, since exp(Qt) is the transition matrix over time t. However, perhaps because of inadequate tools for matrix exponentiation in programming languages commonly used amongst statisticians or a belief that the necessary calculations are prohibitively expensive, statistical inference for continuous-time Markov chains with a large but finite state space is typically conducted via particle MCMC or other relatively complex inference schemes. When, as in many applications, Q arises from a reaction network, it is usually sparse. We describe variations on known algorithms which allow fast, robust and accurate evaluation of the product of a non-negative vector with the exponential of a large, sparse rate matrix. Our implementation uses relatively recently developed, efficient, linear algebra tools that take advantage of such sparsity. We demonstrate the straightforward statistical application of the key algorithm on a model for the mixing of two alleles in a population and on the Susceptible-Infectious-Removed epidemic model.
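The product of a probability vector with the exponential of a sparse rate matrix can be sketched with uniformization, one standard technique for this task. This is an illustrative sketch only and is not claimed to be the algorithm or implementation of the paper; the function name `transient_distribution` and the dictionary-of-rates representation are assumptions made for the example. It writes exp(Qt) as a Poisson-weighted sum of powers of the stochastic matrix P = I + Q/rho, so only sparse vector-matrix products are needed.

```python
import math

def transient_distribution(rates, p0, t, tol=1e-12, max_terms=10_000):
    """Approximate p0 * exp(Q t) for a continuous-time Markov chain.

    rates: dict mapping (i, j) with i != j to the jump rate from i to j
           (a sparse description of the off-diagonal part of Q).
    p0:    list of initial state probabilities.
    Note: exp(-rho * t) underflows for very large rho * t; this sketch is
    meant for small illustrative problems only.
    """
    n = len(p0)
    exit_rate = [0.0] * n
    for (i, _j), r in rates.items():
        exit_rate[i] += r
    rho = max(exit_rate)  # uniformization rate
    if rho == 0.0 or t == 0.0:
        return list(p0)

    def step(v):
        # One application of P = I + Q / rho to a row vector v.
        w = [v[i] * (1.0 - exit_rate[i] / rho) for i in range(n)]
        for (i, j), r in rates.items():
            w[j] += v[i] * r / rho
        return w

    weight = math.exp(-rho * t)  # Poisson probability of zero jumps
    total = weight
    v = list(p0)
    result = [weight * x for x in v]
    k = 0
    # Add Poisson-weighted terms until the neglected mass is below tol.
    while 1.0 - total > tol and k < max_terms:
        k += 1
        v = step(v)
        weight *= rho * t / k
        total += weight
        result = [acc + weight * x for acc, x in zip(result, v)]
    return result

if __name__ == "__main__":
    # Two-state chain: 0 -> 1 at rate 2, 1 -> 0 at rate 1.
    print(transient_distribution({(0, 1): 2.0, (1, 0): 1.0}, [1.0, 0.0], 0.5))
```

Because every term is a product of non-negative quantities, the sketch avoids the cancellation errors that a naive Taylor series of exp(Qt) can suffer from.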
Affiliation(s)
- Chris Sherlock
- Department of Mathematics and Statistics, Lancaster University, Lancaster, UK
|
25
|
Fisher HF, Boys RJ, Gillespie CS, Proctor CJ, Golightly A. Parameter inference for a stochastic kinetic model of expanded polyglutamine proteins. Biometrics 2021; 78:1195-1208. [PMID: 33837525 DOI: 10.1111/biom.13467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2019] [Revised: 03/21/2021] [Accepted: 03/24/2021] [Indexed: 11/30/2022]
Abstract
The presence of protein aggregates in cells is a known feature of many human age-related diseases, such as Huntington's disease. Simulations using fixed parameter values in a model of the dynamic evolution of expanded polyglutamine (PolyQ) proteins in cells have been used to gain a better understanding of the biological system. However, there is considerable uncertainty about the values of some of the parameters governing the system. Currently, appropriate values are chosen by ad hoc attempts to tune the parameters so that the model output matches experimental data. The problem is further complicated by the fact that the data only offer a partial insight into the underlying biological process: the data consist only of the proportions of cell death and of cells with inclusion bodies at a few time points, corrupted by measurement error. Developing inference procedures to estimate the model parameters in this scenario is a significant task. The model probabilities corresponding to the observed proportions cannot be evaluated exactly, and so they are estimated within the inference algorithm by repeatedly simulating realizations from the model. In general such an approach is computationally very expensive, and we therefore construct Gaussian process emulators for the key quantities and reformulate our algorithm around these fast stochastic approximations. We conclude by highlighting appropriate values of the model parameters leading to new insights into the underlying biological processes.
Affiliation(s)
- H F Fisher
  - School of Mathematics, Statistics & Physics, Newcastle University, Newcastle Upon Tyne, UK
  - Population Health Sciences Institute, Newcastle University, Newcastle Upon Tyne, UK
- R J Boys
  - School of Mathematics, Statistics & Physics, Newcastle University, Newcastle Upon Tyne, UK
- C S Gillespie
  - School of Mathematics, Statistics & Physics, Newcastle University, Newcastle Upon Tyne, UK
- C J Proctor
  - Institute of Cellular Medicine, Newcastle University, Newcastle Upon Tyne, UK
- A Golightly
  - School of Mathematics, Statistics & Physics, Newcastle University, Newcastle Upon Tyne, UK
|
26
|
Hansen AS, Zechner C. Promoters adopt distinct dynamic manifestations depending on transcription factor context. Mol Syst Biol 2021; 17:e9821. [PMID: 33595925 PMCID: PMC7888307 DOI: 10.15252/msb.20209821] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/29/2020] [Revised: 08/15/2020] [Accepted: 08/25/2020] [Indexed: 01/22/2023]
Abstract
Cells respond to external signals and stresses by activating transcription factors (TF), which induce gene expression changes. Prior work suggests that signal-specific gene expression changes are partly achieved because different gene promoters exhibit distinct induction dynamics in response to the same TF input signal. Here, using high-throughput quantitative single-cell measurements and a novel statistical method, we systematically analyzed transcriptional responses to a large number of dynamic TF inputs. In particular, we quantified the scaling behavior among different transcriptional features extracted from the measured trajectories such as the gene activation delay or duration of promoter activity. Surprisingly, we found that even the same gene promoter can exhibit qualitatively distinct induction and scaling behaviors when exposed to different dynamic TF contexts. While it was previously known that promoters fall into distinct classes, here we show that the same promoter can switch between different classes depending on context. Thus, promoters can adopt context-dependent "manifestations". Our analysis suggests that the full complexity of signal processing by genetic circuits may be significantly underestimated when studied in only specific contexts.
Affiliation(s)
- Anders S Hansen
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Christoph Zechner
  - Max Planck Institute of Molecular Cell Biology & Genetics, Dresden, Germany
  - Center for Systems Biology Dresden, Dresden, Germany
  - Cluster of Excellence Physics of Life, TU Dresden, Dresden, Germany
|
27
|
Rathinam M, Yu M. State and parameter estimation from exact partial state observation in stochastic reaction networks. J Chem Phys 2021; 154:034103. [PMID: 33499627 DOI: 10.1063/5.0032539] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/14/2022]
Abstract
We consider chemical reaction networks modeled by a discrete state and continuous in time Markov process for the vector copy number of the species and provide a novel particle filter method for state and parameter estimation based on exact observation of some of the species in continuous time. The conditional probability distribution of the unobserved states is shown to satisfy a system of differential equations with jumps. We provide a method of simulating a process that is a proxy for the vector copy number of the unobserved species along with a weight. The resulting weighted Monte Carlo simulation is then used to compute the conditional probability distribution of the unobserved species. We also show how our algorithm can be adapted for a Bayesian estimation of parameters and for the estimation of a past state value based on observations up to a future time.
Affiliation(s)
- Muruhan Rathinam
  - Department of Mathematics and Statistics, University of Maryland Baltimore County, Baltimore, Maryland 21250, USA
- Mingkai Yu
  - Department of Mathematics and Statistics, University of Maryland Baltimore County, Baltimore, Maryland 21250, USA
|
28
|
Analysis of Markov Jump Processes under Terminal Constraints. Tools and Algorithms for the Construction and Analysis of Systems 2021. [PMCID: PMC7979204 DOI: 10.1007/978-3-030-72016-2_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Abstract
Many probabilistic inference problems such as stochastic filtering or the computation of rare event probabilities require model analysis under initial and terminal constraints. We propose a solution to this bridging problem for the widely used class of population-structured Markov jump processes. The method is based on a state-space lumping scheme that aggregates states in a grid structure. The resulting approximate bridging distribution is used to iteratively refine relevant and truncate irrelevant parts of the state-space. This way, the algorithm learns a well-justified finite-state projection yielding guaranteed lower bounds for the system behavior under endpoint constraints. We demonstrate the method’s applicability to a wide range of problems such as Bayesian inference and the analysis of rare events.
|
29
|
Mider M, Schauer M, van der Meulen F. Continuous-discrete smoothing of diffusions. Electron J Stat 2021. [DOI: 10.1214/21-ejs1894] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/19/2022]
Affiliation(s)
- Marcin Mider
  - Trium Analysis Online GmbH, Hohenlindener Str. 1, 81677 München, Germany
- Moritz Schauer
  - Department of Mathematical Sciences, Chalmers University of Technology and University of Gothenburg, Chalmers Tvärgata 3, 41296 Göteborg, Sweden
- Frank van der Meulen
  - Delft Institute of Applied Mathematics (DIAM), Delft University of Technology, Mekelweg 4, 2628CD Delft, The Netherlands
|
30
|
Bae J, Jeong DH, Lee JM. Ranking-Based Parameter Subset Selection for Nonlinear Dynamics with Stochastic Disturbances under Limited Data. Ind Eng Chem Res 2020. [DOI: 10.1021/acs.iecr.0c04219] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
Affiliation(s)
- Jaehan Bae
  - School of Chemical and Biological Engineering, Institute of Chemical Processes, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Korea
- Dong Hwi Jeong
  - School of Chemical Engineering, University of Ulsan, 93, Daehak-ro, Nam-gu, Ulsan 44610, Korea
- Jong Min Lee
  - School of Chemical and Biological Engineering, Institute of Chemical Processes, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Korea
|
31
|
Cąkała T, Miasojedow B, Niemiro W. Particle MCMC With Poisson Resampling: Parallelization and Continuous Time Models. J Comput Graph Stat 2020. [DOI: 10.1080/10618600.2020.1840998] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Affiliation(s)
- Tomasz Cąkała
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
- Błażej Miasojedow
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
- Wojciech Niemiro
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
  - Nicolaus Copernicus University, Torun, Poland
|
32
|
Browning AP, Warne DJ, Burrage K, Baker RE, Simpson MJ. Identifiability analysis for stochastic differential equation models in systems biology. J R Soc Interface 2020; 17:20200652. [PMID: 33323054 PMCID: PMC7811582 DOI: 10.1098/rsif.2020.0652] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 08/11/2020] [Accepted: 11/24/2020] [Indexed: 12/26/2022]
Abstract
Mathematical models are routinely calibrated to experimental data, with goals ranging from building predictive models to quantifying parameters that cannot be measured. Whether or not reliable parameter estimates are obtainable from the available data can easily be overlooked. Such issues of parameter identifiability have important ramifications for both the predictive power of a model, and the mechanistic insight that can be obtained. Identifiability analysis is well-established for deterministic, ordinary differential equation (ODE) models, but there are no commonly adopted methods for analysing identifiability in stochastic models. We provide an accessible introduction to identifiability analysis and demonstrate how existing ideas for analysis of ODE models can be applied to stochastic differential equation (SDE) models through four practical case studies. To assess structural identifiability, we study ODEs that describe the statistical moments of the stochastic process using open-source software tools. Using practically motivated synthetic data and Markov chain Monte Carlo methods, we assess parameter identifiability in the context of available data. Our analysis shows that SDE models can often extract more information about parameters than deterministic descriptions. All code used to perform the analysis is available on Github.
Affiliation(s)
- Alexander P. Browning
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
  - ARC Centre of Excellence for Mathematical and Statistical Frontiers, Queensland University of Technology, Brisbane, Australia
- David J. Warne
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
  - ARC Centre of Excellence for Mathematical and Statistical Frontiers, Queensland University of Technology, Brisbane, Australia
- Kevin Burrage
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
  - ARC Centre of Excellence for Mathematical and Statistical Frontiers, Queensland University of Technology, Brisbane, Australia
  - ARC Centre of Excellence for Plant Success in Nature and Agriculture, Queensland University of Technology, Brisbane, Australia
  - Department of Computer Science, University of Oxford, Oxford, UK
- Ruth E. Baker
  - Mathematical Institute, University of Oxford, Oxford, UK
- Matthew J. Simpson
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
  - ARC Centre of Excellence for Mathematical and Statistical Frontiers, Queensland University of Technology, Brisbane, Australia
|
33
|
Mikelson J, Khammash M. Likelihood-free nested sampling for parameter inference of biochemical reaction networks. PLoS Comput Biol 2020; 16:e1008264. [PMID: 33035218 PMCID: PMC7577508 DOI: 10.1371/journal.pcbi.1008264] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 11/15/2019] [Revised: 10/21/2020] [Accepted: 08/16/2020] [Indexed: 12/03/2022]
Abstract
The development of mechanistic models of biological systems is a central part of Systems Biology. One major challenge in developing these models is the accurate inference of model parameters. In recent years, nested sampling methods have gained increased attention in the Systems Biology community due to the fact that they are parallelizable and provide error estimates with no additional computations. One drawback that severely limits the usability of these methods, however, is that they require the likelihood function to be available, and thus cannot be applied to systems with intractable likelihoods, such as stochastic models. Here we present a likelihood-free nested sampling method for parameter inference which overcomes these drawbacks. This method gives an unbiased estimator of the Bayesian evidence as well as samples from the posterior. We derive a lower bound on the estimator's variance which we use to formulate a novel termination criterion for nested sampling. The presented method enables not only the reliable inference of the posterior of parameters for stochastic systems of a size and complexity that is challenging for traditional methods, but it also provides an estimate of the obtained variance. We illustrate our approach by applying it to several realistically sized models with simulated data as well as recently published biological data. We also compare our developed method with the two most popular other likelihood-free approaches: pMCMC and ABC-SMC. The C++ code of the proposed methods, together with test data, is available at the GitHub web page https://github.com/Mijan/LFNS_paper.
The behaviour of mathematical models of biochemical reactions is governed by model parameters encoding for various reaction rates, molecule concentrations and other biochemical quantities. As the general purpose of these models is to reproduce and predict the true biological response to different stimuli, the inference of these parameters, given experimental observations, is a crucial part of Systems Biology. While plenty of methods have been published for the inference of model parameters, most of them require the availability of the likelihood function and thus cannot be applied to models that do not allow for the computation of the likelihood. Further, most established methods do not provide an estimate of the variance of the obtained estimator. In this paper, we present a novel inference method that accurately approximates the posterior distribution of parameters and does not require the evaluation of the likelihood function. Our method is based on the nested sampling algorithm and approximates the likelihood with a particle filter. We show that the resulting posterior estimates are unbiased and provide a way to estimate not just the posterior distribution, but also an error estimate of the final estimator. We illustrate our method on several stochastic models with simulated data as well as one model of transcription with real biological data.
|
34
|
Pieschner S, Fuchs C. Bayesian inference for diffusion processes: using higher-order approximations for transition densities. Royal Society Open Science 2020; 7:200270. [PMID: 33204444 PMCID: PMC7657901 DOI: 10.1098/rsos.200270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2020] [Accepted: 09/17/2020] [Indexed: 06/11/2023]
Abstract
Modelling random dynamical systems in continuous time, diffusion processes are a powerful tool in many areas of science. Model parameters can be estimated from time-discretely observed processes using Markov chain Monte Carlo (MCMC) methods that introduce auxiliary data. These methods typically approximate the transition densities of the process numerically, both for calculating the posterior densities and proposing auxiliary data. Here, the Euler-Maruyama scheme is the standard approximation technique. However, the MCMC method is computationally expensive. Using higher-order approximations may accelerate it, but the specific implementation and benefit remain unclear. Hence, we investigate the utilization and usefulness of higher-order approximations in the example of the Milstein scheme. Our study demonstrates that the MCMC methods based on the Milstein approximation yield good estimation results. However, they are computationally more expensive and can be applied to multidimensional processes only with impractical restrictions. Moreover, the combination of the Milstein approximation and the well-known modified bridge proposal introduces additional numerical challenges.
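The difference between the Euler-Maruyama and Milstein schemes named in this abstract can be sketched on a simple test equation. The example below is an illustration under stated assumptions, not the authors' code: it uses geometric Brownian motion, dX = mu X dt + sigma X dW, chosen here only because its exact solution is known, and the function name `strong_errors` and all parameter values are hypothetical. Both schemes are driven by the same Brownian increments and compared against the exact endpoint.

```python
import math
import random

def strong_errors(n_paths=200, n_steps=100, t_end=1.0,
                  mu=0.1, sigma=0.5, x0=1.0, seed=0):
    """Mean absolute endpoint error of Euler-Maruyama and Milstein for
    geometric Brownian motion (an assumed test case)."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    em_err = 0.0
    mil_err = 0.0
    for _ in range(n_paths):
        x_em = x0
        x_mil = x0
        w = 0.0  # running Brownian path, needed for the exact solution
        for _ in range(n_steps):
            dw = rng.gauss(0.0, math.sqrt(dt))
            w += dw
            # Euler-Maruyama: drift and diffusion terms only.
            x_em += mu * x_em * dt + sigma * x_em * dw
            # Milstein: adds the correction 0.5 * b * b' * (dW^2 - dt),
            # which for b(x) = sigma * x equals 0.5 * sigma^2 * x * (dW^2 - dt).
            x_mil += (mu * x_mil * dt + sigma * x_mil * dw
                      + 0.5 * sigma ** 2 * x_mil * (dw * dw - dt))
        exact = x0 * math.exp((mu - 0.5 * sigma ** 2) * t_end + sigma * w)
        em_err += abs(x_em - exact)
        mil_err += abs(x_mil - exact)
    return em_err / n_paths, mil_err / n_paths

if __name__ == "__main__":
    em, mil = strong_errors()
    print(f"Euler-Maruyama error: {em:.5f}, Milstein error: {mil:.5f}")
```

For a scalar process the Milstein correction is cheap, as here; for multidimensional processes it generally involves additional stochastic integrals, which is consistent with the restrictions the abstract reports.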
Affiliation(s)
- Susanne Pieschner
- Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany
- Department of Mathematics, Technische Universität München, Boltzmannstrasse 3, 85748 Garching, Germany
- Christiane Fuchs
- Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany
- Department of Mathematics, Technische Universität München, Boltzmannstrasse 3, 85748 Garching, Germany
- Data Science Group, Faculty of Business Administration and Economics, Universität Bielefeld, Postfach 100131, 33501 Bielefeld, Germany
35
Gorin G, Pachter L. Special function methods for bursty models of transcription. Phys Rev E 2020; 102:022409. [PMID: 32942485 DOI: 10.1103/physreve.102.022409] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 04/04/2020] [Accepted: 08/10/2020] [Indexed: 11/07/2022]
Abstract
We explore a Markov model used in the analysis of gene expression, involving the bursty production of pre-mRNA, its conversion to mature mRNA, and its consequent degradation. We demonstrate that the integration used to compute the solution of the stochastic system can be approximated by the evaluation of special functions. Furthermore, the form of the special function solution generalizes to a broader class of burst distributions. In light of the broader goal of biophysical parameter inference from transcriptomics data, we apply the method to simulated data, demonstrating effective control of precision and runtime. Finally, we propose and validate a non-Bayesian approach for parameter estimation based on the characteristic function of the target joint distribution of pre-mRNA and mRNA.
Affiliation(s)
- Gennady Gorin
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, USA
- Lior Pachter
- Division of Biology and Biological Engineering & Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California 91125, USA
36
Harrison JU, Baker RE. An automatic adaptive method to combine summary statistics in approximate Bayesian computation. PLoS One 2020; 15:e0236954. [PMID: 32760106 PMCID: PMC7410215 DOI: 10.1371/journal.pone.0236954] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/02/2020] [Accepted: 07/16/2020] [Indexed: 11/18/2022] Open
Abstract
To infer the parameters of mechanistic models with intractable likelihoods, techniques such as approximate Bayesian computation (ABC) are increasingly being adopted. One of the main disadvantages of ABC in practical situations, however, is that parameter inference must generally rely on summary statistics of the data. This is particularly the case for problems involving high-dimensional data, such as biological imaging experiments. However, some summary statistics contain more information about parameters of interest than others, and it is not always clear how to weight their contributions within the ABC framework. We address this problem by developing an automatic, adaptive algorithm that chooses weights for each summary statistic. Our algorithm aims to maximize the distance between the prior and the approximate posterior by automatically adapting the weights within the ABC distance function. Computationally, we use a nearest neighbour estimator of the distance between distributions. We justify the algorithm theoretically based on properties of the nearest neighbour distance estimator. To demonstrate the effectiveness of our algorithm, we apply it to a variety of test problems, including several stochastic models of biochemical reaction networks, and a spatial model of diffusion, and compare our results with existing algorithms.
Affiliation(s)
- Jonathan U. Harrison
- Mathematical Institute, Mathematical Sciences Building, University of Warwick, Coventry, United Kingdom
- Ruth E. Baker
- Mathematical Institute, Andrew Wiles Building, University of Oxford, Oxford, United Kingdom
37
Deng Z, Zhang X, Tian T. Inference of Model Parameters Using Particle Filter Algorithm and Copula Distributions. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:1231-1240. [PMID: 30418916 DOI: 10.1109/tcbb.2018.2880974] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/09/2023]
Abstract
It is widely accepted that experimental data often include noise because of the limitation in experimental conditions. In addition, biological systems inside the cells also contain uncertainty due to small copy molecular numbers. To address this issue, it was proposed that experimental data include both real system state and a noise term whose variance is a constant. An additional assumption is that the observation data of different variables are independent to each other. However, recent research works showed that noise in experimental data might not be the white noise. In addition, the observed values of different variables may be correlated. This work designs a new algorithm to infer the unknown model parameters based on noisy data. The innovation of this method includes a new noise model, in which the variance of noise is dependent on the system state, and a copula particle filter algorithm that uses the copula density functions to describe the dependence of different variables. The proposed algorithm is evaluated by using two deterministic models for gene networks and a stochastic model. Numerical results show that the accuracy of our proposed method is better than that of the widely used Liu-West filter and copula particle filter algorithms.
38
Lawless C, Greaves L, Reeve AK, Turnbull DM, Vincent AE. The rise and rise of mitochondrial DNA mutations. Open Biol 2020; 10:200061. [PMID: 32428418 PMCID: PMC7276526 DOI: 10.1098/rsob.200061] [Citation(s) in RCA: 68] [Impact Index Per Article: 17.0] [Received: 03/10/2020] [Accepted: 04/23/2020] [Indexed: 12/24/2022] Open
Abstract
How mitochondrial DNA mutations clonally expand in an individual cell is a question that has perplexed mitochondrial biologists for decades. A growing body of literature indicates that mitochondrial DNA mutations play a major role in ageing, metabolic diseases, neurodegenerative diseases, neuromuscular disorders and cancers. Importantly, this process of clonal expansion occurs for both inherited and somatic mitochondrial DNA mutations. To complicate matters further there are fundamental differences between mitochondrial DNA point mutations and deletions, and between mitotic and post-mitotic cells, that impact this pathogenic process. These differences, along with the challenges of investigating a longitudinal process occurring over decades in humans, have so far hindered progress towards understanding clonal expansion. Here we summarize our current understanding of the clonal expansion of mitochondrial DNA mutations in different tissues and highlight key unanswered questions. We then discuss the various existing biological models, along with their advantages and disadvantages. Finally, we explore what has been achieved with mathematical modelling so far and suggest future work to advance this important area of research.
Affiliation(s)
- Doug M. Turnbull
- Wellcome Centre for Mitochondrial Research, Clinical and Translational Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle NE2 4HH, UK
- Amy E. Vincent
- Wellcome Centre for Mitochondrial Research, Clinical and Translational Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle NE2 4HH, UK
39
Gorin G, Wang M, Golding I, Xu H. Stochastic simulation and statistical inference platform for visualization and estimation of transcriptional kinetics. PLoS One 2020; 15:e0230736. [PMID: 32214380 PMCID: PMC7098607 DOI: 10.1371/journal.pone.0230736] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 11/18/2019] [Accepted: 03/06/2020] [Indexed: 12/20/2022] Open
Abstract
Recent advances in single-molecule fluorescent imaging have enabled quantitative measurements of transcription at a single gene copy, yet an accurate understanding of transcriptional kinetics is still lacking due to the difficulty of solving detailed biophysical models. Here we introduce a stochastic simulation and statistical inference platform for modeling detailed transcriptional kinetics in prokaryotic systems, which has not been solved analytically. The model includes stochastic two-state gene activation, mRNA synthesis initiation and stepwise elongation, release to the cytoplasm, and stepwise co-transcriptional degradation. Using the Gillespie algorithm, the platform simulates nascent and mature mRNA kinetics of a single gene copy and predicts fluorescent signals measurable by time-lapse single-cell mRNA imaging, for different experimental conditions. To approach the inverse problem of estimating the kinetic parameters of the model from experimental data, we develop a heuristic optimization method based on the genetic algorithm and the empirical distribution of mRNA generated by simulation. As a demonstration, we show that the optimization algorithm can successfully recover the transcriptional kinetics of simulated and experimental gene expression data. The platform is available as a MATLAB software package at https://data.caltech.edu/records/1287.
Affiliation(s)
- Gennady Gorin
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California, United States of America
- Mengyu Wang
- Department of Physics, Grainger College of Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Center for the Physics of Living Cells, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Ido Golding
- Department of Physics, Grainger College of Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Center for the Physics of Living Cells, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Heng Xu
- School of Physics and Astronomy, Shanghai Jiao Tong University, Minhang District, Shanghai, China
- Institute of Natural Sciences, Shanghai Jiao Tong University, Minhang District, Shanghai, China
40
Warne DJ, Baker RE, Simpson MJ. A practical guide to pseudo-marginal methods for computational inference in systems biology. J Theor Biol 2020; 496:110255. [PMID: 32223995 DOI: 10.1016/j.jtbi.2020.110255] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 01/02/2020] [Revised: 03/11/2020] [Accepted: 03/18/2020] [Indexed: 01/07/2023]
Abstract
For many stochastic models of interest in systems biology, such as those describing biochemical reaction networks, exact quantification of parameter uncertainty through statistical inference is intractable. Likelihood-free computational inference techniques enable parameter inference when the likelihood function for the model is intractable but the generation of many sample paths is feasible through stochastic simulation of the forward problem. The most common likelihood-free method in systems biology is approximate Bayesian computation that accepts parameters that result in low discrepancy between stochastic simulations and measured data. However, it can be difficult to assess how the accuracy of the resulting inferences are affected by the choice of acceptance threshold and discrepancy function. The pseudo-marginal approach is an alternative likelihood-free inference method that utilises a Monte Carlo estimate of the likelihood function. This approach has several advantages, particularly in the context of noisy, partially observed, time-course data typical in biochemical reaction network studies. Specifically, the pseudo-marginal approach facilitates exact inference and uncertainty quantification, and may be efficiently combined with particle filters for low variance, high-accuracy likelihood estimation. In this review, we provide a practical introduction to the pseudo-marginal approach using inference for biochemical reaction networks as a series of case studies. Implementations of key algorithms and examples are provided using the Julia programming language; a high performance, open source programming language for scientific computing (https://github.com/davidwarne/Warne2019_GuideToPseudoMarginal).
Affiliation(s)
- David J Warne
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland 4001, Australia.
- Ruth E Baker
- Mathematical Institute, University of Oxford, Oxford, OX2 6GG, United Kingdom
- Matthew J Simpson
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland 4001, Australia
41
Warne DJ, Baker RE, Simpson MJ. Simulation and inference algorithms for stochastic biochemical reaction networks: from basic concepts to state-of-the-art. J R Soc Interface 2020; 16:20180943. [PMID: 30958205 DOI: 10.1098/rsif.2018.0943] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Indexed: 11/12/2022] Open
Abstract
Stochasticity is a key characteristic of intracellular processes such as gene regulation and chemical signalling. Therefore, characterizing stochastic effects in biochemical systems is essential to understand the complex dynamics of living things. Mathematical idealizations of biochemically reacting systems must be able to capture stochastic phenomena. While robust theory exists to describe such stochastic models, the computational challenges in exploring these models can be a significant burden in practice since realistic models are analytically intractable. Determining the expected behaviour and variability of a stochastic biochemical reaction network requires many probabilistic simulations of its evolution. Using a biochemical reaction network model to assist in the interpretation of time-course data from a biological experiment is an even greater challenge due to the intractability of the likelihood function for determining observation probabilities. These computational challenges have been subjects of active research for over four decades. In this review, we present an accessible discussion of the major historical developments and state-of-the-art computational techniques relevant to simulation and inference problems for stochastic biochemical reaction network models. Detailed algorithms for particularly important methods are described and complemented with Matlab® implementations. As a result, this review provides a practical and accessible introduction to computational methods for stochastic models within the life sciences community.
Affiliation(s)
- David J Warne
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland 4001, Australia
- Ruth E Baker
- Mathematical Institute, University of Oxford, Oxford OX2 6GG, UK
- Matthew J Simpson
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland 4001, Australia
42
Calderazzo S, Brancaccio M, Finkenstädt B. Filtering and inference for stochastic oscillators with distributed delays. Bioinformatics 2020; 35:1380-1387. [PMID: 30202930 PMCID: PMC6477979 DOI: 10.1093/bioinformatics/bty782] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 04/12/2018] [Revised: 08/08/2018] [Accepted: 09/06/2018] [Indexed: 01/30/2023] Open
Abstract
Motivation: The time evolution of molecular species involved in biochemical reaction networks often arises from complex stochastic processes involving many species and reaction events. Inference for such systems is profoundly challenged by the relative sparseness of experimental data, as measurements are often limited to a small subset of the participating species measured at discrete time points. The need for model reduction can be realistically achieved for oscillatory dynamics resulting from negative translational and transcriptional feedback loops by the introduction of probabilistic time-delays. Although this approach yields a simplified model, inference is challenging and subject to ongoing research. The linear noise approximation (LNA) has recently been proposed to address such systems in stochastic form and will be exploited here. Results: We develop a novel filtering approach for the LNA in stochastic systems with distributed delays, which allows the parameter values and unobserved states of a stochastic negative feedback model to be inferred from univariate time-series data. The performance of the methods is tested for simulated data. Results are obtained for real data when the model is fitted to imaging data on Cry1, a key gene involved in the mammalian central circadian clock, observed via a luciferase reporter construct in a mouse suprachiasmatic nucleus. Availability and implementation: Programmes are written in MATLAB and Statistics Toolbox Release 2016b, The MathWorks, Inc., Natick, Massachusetts, USA. Sample code and Cry1 data are available on GitHub https://github.com/scalderazzo/FLNADD. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Silvia Calderazzo
- Department of Statistics, University of Warwick, Coventry, UK
- Division of Biostatistics, German Cancer Research Center, Heidelberg, Germany
- Marco Brancaccio
- Division of Neurobiology, Medical Research Council Laboratory of Molecular Biology, Cambridge, UK
43
44
Panchal V, Linder DF. Reverse engineering gene networks using global-local shrinkage rules. Interface Focus 2019; 10:20190049. [PMID: 31897291 DOI: 10.1098/rsfs.2019.0049] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Accepted: 11/13/2019] [Indexed: 12/26/2022] Open
Abstract
Inferring gene regulatory networks from high-throughput 'omics' data has proven to be a computationally demanding task of critical importance. Frequently, the classical methods break down owing to the curse of dimensionality, and popular strategies to overcome this are typically based on regularized versions of the classical methods. However, these approaches rely on loss functions that may not be robust and usually do not allow for the incorporation of prior information in a straightforward way. Fully Bayesian methods are equipped to handle both of these shortcomings quite naturally, and they offer the potential for improvements in network structure learning. We propose a Bayesian hierarchical model to reconstruct gene regulatory networks from time-series gene expression data, such as those common in perturbation experiments of biological systems. The proposed methodology uses global-local shrinkage priors for posterior selection of regulatory edges and relaxes the common normal likelihood assumption in order to allow for heavy-tailed data, which were shown in several of the cited references to severely impact network inference. We provide a sufficient condition for posterior propriety and derive an efficient Markov chain Monte Carlo via Gibbs sampling in the electronic supplementary material. We describe a novel way to detect multiple scales based on the corresponding posterior quantities. Finally, we demonstrate the performance of our approach in a simulation study and compare it with existing methods on real data from a T-cell activation study.
Affiliation(s)
- Viral Panchal
- Department of Mathematics and Statistics, University of North Carolina Wilmington, Wilmington, NC 28403, USA
- Daniel F Linder
- Medical College of Georgia, Augusta University, Augusta, GA 30912, USA
45
Cheng J, Chan NH. Efficient inference for nonlinear state space models: An automatic sample size selection rule. Comput Stat Data Anal 2019. [DOI: 10.1016/j.csda.2019.03.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
46
de Oliveira LR, Jaqaman K. FISIK: Framework for the Inference of In Situ Interaction Kinetics from Single-Molecule Imaging Data. Biophys J 2019; 117:1012-1028. [PMID: 31443908 DOI: 10.1016/j.bpj.2019.07.050] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 02/06/2019] [Revised: 06/27/2019] [Accepted: 07/22/2019] [Indexed: 12/12/2022] Open
Abstract
Recent experimental and computational developments have been pushing the limits of live-cell single-molecule imaging, enabling the monitoring of intermolecular interactions in their native environment with high spatiotemporal resolution. However, interactions are captured only for the labeled subset of molecules, which tends to be a small fraction. As a result, it has remained a challenge to calculate molecular interaction kinetics, in particular association rates, from live-cell single-molecule tracking data. To overcome this challenge, we developed a mathematical modeling-based Framework for the Inference of in Situ Interaction Kinetics (FISIK) from single-molecule imaging data with substoichiometric labeling. FISIK consists of (I) devising a mathematical model of molecular movement and interactions, mimicking the biological system and data-acquisition setup, and (II) estimating the unknown model parameters, including molecular association and dissociation rates, by fitting the model to experimental single-molecule data. Due to the stochastic nature of the model and data, we adapted the method of indirect inference for model calibration. We validated FISIK using a series of tests in which we simulated trajectories of diffusing molecules that interact with each other, considering a wide range of model parameters, and including resolution limitations, tracking errors, and mismatches between the model and the biological system it mimics. We found that FISIK has the sensitivity to determine association and dissociation rates, with accuracy and precision depending on the labeled fraction of molecules and the extent of molecule tracking errors. For cases where the labeled fraction is too low (e.g., to afford accurate tracking), combining dynamic but sparse single-molecule imaging data with almost-whole population oligomer distribution data improves FISIK's performance. All in all, FISIK is a promising approach for the derivation of molecular interaction kinetics in their native environment from single-molecule imaging data with substoichiometric labeling.
Affiliation(s)
- Khuloud Jaqaman
- Department of Biophysics, UT Southwestern Medical Center, Dallas, Texas; Lyda Hill Department of Bioinformatics, UT Southwestern Medical Center, Dallas, Texas.
47
Golightly A, Bradley E, Lowe T, Gillespie CS. Correlated pseudo-marginal schemes for time-discretised stochastic kinetic models. Comput Stat Data Anal 2019. [DOI: 10.1016/j.csda.2019.01.006] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/17/2022]
48
Loskot P, Atitey K, Mihaylova L. Comprehensive Review of Models and Methods for Inferences in Bio-Chemical Reaction Networks. Front Genet 2019; 10:549. [PMID: 31258548 PMCID: PMC6588029 DOI: 10.3389/fgene.2019.00549] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 02/08/2019] [Accepted: 05/24/2019] [Indexed: 01/30/2023] Open
Abstract
The key processes in biological and chemical systems are described by networks of chemical reactions. From molecular biology to biotechnology applications, computational models of reaction networks are used extensively to elucidate their non-linear dynamics. The model dynamics are crucially dependent on the parameter values which are often estimated from observations. Over the past decade, the interest in parameter and state estimation in models of (bio-) chemical reaction networks (BRNs) grew considerably. The related inference problems are also encountered in many other tasks including model calibration, discrimination, identifiability, and checking, and optimum experiment design, sensitivity analysis, and bifurcation analysis. The aim of this review paper is to examine the developments in literature to understand what BRN models are commonly used, and for what inference tasks and inference methods. The initial collection of about 700 documents concerning estimation problems in BRNs excluding books and textbooks in computational biology and chemistry were screened to select over 270 research papers and 20 graduate research theses. The paper selection was facilitated by text mining scripts to automate the search for relevant keywords and terms. The outcomes are presented in tables revealing the levels of interest in different inference tasks and methods for given models in the literature as well as the research trends are uncovered. Our findings indicate that many combinations of models, tasks and methods are still relatively unexplored, and there are many new research opportunities to explore combinations that have not been considered, perhaps for good reasons. The most common models of BRNs in literature involve differential equations, Markov processes, mass action kinetics, and state space representations whereas the most common tasks are the parameter inference and model identification. The most common methods in literature are Bayesian analysis, Monte Carlo sampling strategies, and model fitting to data using evolutionary algorithms. The new research problems which cannot be directly deduced from the text mining data are also discussed.
Affiliation(s)
- Pavel Loskot
- College of Engineering, Swansea University, Swansea, United Kingdom
- Komlan Atitey
- College of Engineering, Swansea University, Swansea, United Kingdom
- Lyudmila Mihaylova
- Department of Automatic Control and Systems Engineering, University of Sheffield, Sheffield, United Kingdom
49
Mjolsness E. Prospects for Declarative Mathematical Modeling of Complex Biological Systems. Bull Math Biol 2019; 81:3385-3420. [PMID: 31175549 PMCID: PMC6677696 DOI: 10.1007/s11538-019-00628-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 03/31/2018] [Accepted: 05/30/2019] [Indexed: 01/06/2023]
Abstract
Declarative modeling uses symbolic expressions to represent models. With such expressions, one can formalize high-level mathematical computations on models that would be difficult or impossible to perform directly on a lower-level simulation program, in a general-purpose programming language. Examples of such computations on models include model analysis, relatively general-purpose model reduction maps, and the initial phases of model implementation, all of which should preserve or approximate the mathematical semantics of a complex biological model. The potential advantages are particularly relevant in the case of developmental modeling, wherein complex spatial structures exhibit dynamics at molecular, cellular, and organogenic levels to relate genotype to multicellular phenotype. Multiscale modeling can benefit from both the expressive power of declarative modeling languages and the application of model reduction methods to link models across scale. Based on previous work, here we define declarative modeling of complex biological systems by defining the operator algebra semantics of an increasingly powerful series of declarative modeling languages including reaction-like dynamics of parameterized and extended objects; we define semantics-preserving implementation and semantics-approximating model reduction transformations; and we outline a “meta-hierarchy” for organizing declarative models and the mathematical methods that can fruitfully manipulate them.
Affiliation(s)
- Eric Mjolsness
- Department of Computer Science, University of California, Irvine, CA, 92697, USA.
50
Picchini U, Forman JL. Bayesian inference for stochastic differential equation mixed effects models of a tumour xenography study. J R Stat Soc Ser C Appl Stat 2019. [DOI: 10.1111/rssc.12347] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/30/2022]
Affiliation(s)
- Umberto Picchini
- Chalmers University of Technology and University of Gothenburg and Lund University Sweden