1. Corstanje M, van der Meulen F. Guided simulation of conditioned chemical reaction networks. Statistical Inference for Stochastic Processes 2025; 28:8. [PMID: 40416628 PMCID: PMC12101079 DOI: 10.1007/s11203-025-09326-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 04/13/2025] [Indexed: 05/27/2025]
Abstract
Let $X$ be a chemical reaction process, modeled as a multi-dimensional continuous-time jump process. Assume that at given times $0 < t_1 < \cdots < t_n$, linear combinations $v_i = L_i X(t_i)$, $i = 1, \ldots, n$, are observed for given matrices $L_i$. We show how the process that is conditioned on hitting the states $v_1, \ldots, v_n$ is obtained by a change of measure on the law of the unconditioned process. This results in an algorithm for obtaining weighted samples from the conditioned process. Our results are illustrated by numerical simulations.
Affiliation(s)
- Marc Corstanje
  - Department of Mathematics, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Frank van der Meulen
  - Department of Mathematics, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands

2. Worthington J, Feletto E, He E, Wade S, de Graaff B, Nguyen ALT, George J, Canfell K, Caruana M. Evaluating Semi-Markov Processes and Other Epidemiological Time-to-Event Models by Computing Disease Sojourn Density as Partial Differential Equations. Med Decis Making 2025:272989X251333398. [PMID: 40340615 DOI: 10.1177/0272989x251333398] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/10/2025]
Abstract
Introduction: Epidemiological models benefit from incorporating detailed time-to-event data to understand how disease risk evolves. For example, decompensation risk in liver cirrhosis depends on sojourn time spent with cirrhosis. Semi-Markov and related models capture these details by modeling time-to-event distributions based on published survival data. However, implementations of semi-Markov processes rely on Monte Carlo sampling methods, which increase computational requirements and introduce stochastic variability. Explicitly calculating the evolving transition likelihood can avoid these issues and provide fast, reliable estimates.
Methods: We present the sojourn time density framework for computing semi-Markov and related models by calculating the evolving sojourn time probability density as a system of partial differential equations. The framework is parametrized by commonly used hazard and models the distribution of current disease state and sojourn time. We describe the mathematical background, a numerical method for computation, and an example model of liver disease.
Results: Models developed with the sojourn time density framework can directly incorporate time-to-event data and serial events in a deterministic system. This increases the level of potential model detail over Markov-type models, improves parameter identifiability, and reduces computational burden and stochastic uncertainty compared with Monte Carlo methods. The example model of liver disease was able to accurately reproduce targets without extensive calibration or fitting and required minimal computational burden.
Conclusions: Explicitly modeling sojourn time distribution allows us to represent semi-Markov systems using detailed survival data from epidemiological studies without requiring sampling, avoiding the need for calibration, reducing computational time, and allowing for more robust probabilistic sensitivity analyses.
Highlights:
- Time-inhomogeneous semi-Markov models and other time-to-event-based modeling approaches can capture risks that evolve over time spent with a disease.
- We describe an approach to computing these models that represents them as partial differential equations representing the evolution of the sojourn time probability density.
- This sojourn time density framework incorporates complex data sources on competing risks and serial events while minimizing computational complexity.
Affiliation(s)
- Joachim Worthington
  - The Daffodil Centre, The University of Sydney, a joint venture with Cancer Council NSW, Sydney, NSW, Australia
- Eleonora Feletto
  - The Daffodil Centre, The University of Sydney, a joint venture with Cancer Council NSW, Sydney, NSW, Australia
- Emily He
  - The Daffodil Centre, The University of Sydney, a joint venture with Cancer Council NSW, Sydney, NSW, Australia
- Stephen Wade
  - The Daffodil Centre, The University of Sydney, a joint venture with Cancer Council NSW, Sydney, NSW, Australia
- Barbara de Graaff
  - Menzies Institute for Medical Research, The University of Tasmania, Hobart, TAS, Australia
- Anh Le Tuan Nguyen
  - Menzies Institute for Medical Research, The University of Tasmania, Hobart, TAS, Australia
  - WHO Collaborating Centre for Viral Hepatitis, The Peter Doherty Institute for Infection and Immunity
- Jacob George
  - Storr Liver Centre, The Westmead Institute for Medical Research, Westmead Hospital and University of Sydney, Sydney, NSW, Australia
- Karen Canfell
  - School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, Australia
- Michael Caruana
  - The Daffodil Centre, The University of Sydney, a joint venture with Cancer Council NSW, Sydney, NSW, Australia

3. Zolaktaf S, Dannenberg F, Schmidt M, Condon A, Winfree E. Predicting DNA kinetics with a truncated continuous-time Markov chain method. Comput Biol Chem 2023; 104:107837. [PMID: 36858009 DOI: 10.1016/j.compbiolchem.2023.107837] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2022] [Revised: 02/05/2023] [Accepted: 02/21/2023] [Indexed: 03/03/2023]
Abstract
Predicting the kinetics of reactions involving nucleic acid strands is a fundamental task in biology and biotechnology. Reaction kinetics can be modeled as an elementary step continuous-time Markov chain, where states correspond to secondary structures and transitions correspond to base pair formation and breakage. Since the number of states in the Markov chain could be large, rates are determined by estimating the mean first passage time from sampled trajectories. As a result, the cost of kinetic predictions becomes prohibitively expensive for rare events with extremely long trajectories. Also problematic are scenarios where multiple predictions are needed for the same reaction, e.g., under different environmental conditions, or when calibrating model parameters, because a new set of trajectories is needed multiple times. We propose a new method, called pathway elaboration, to handle these scenarios. Pathway elaboration builds a truncated continuous-time Markov chain through both biased and unbiased sampling. The resulting Markov chain has moderate state space size, so matrix methods can efficiently compute reaction rates, even for rare events. Also the transition rates of the truncated Markov chain can easily be adapted when model or environmental parameters are perturbed, making model calibration feasible. We illustrate the utility of pathway elaboration on toehold-mediated strand displacement reactions, show that it well-approximates trajectory-based predictions of unbiased elementary step models on a wide range of reaction types for which such predictions are feasible, and demonstrate that it performs better than alternative truncation-based approaches that are applicable for mean first passage time estimation. Finally, in a small study, we use pathway elaboration to optimize the Metropolis kinetic model of Multistrand, an elementary step simulator, showing that the optimized parameters greatly improve reaction rate predictions. 
Our framework and dataset are available at https://github.com/DNA-and-Natural-Algorithms-Group/PathwayElaboration.
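The abstract's point that a moderately sized truncated chain lets matrix methods compute rates can be illustrated with a small sketch. It computes mean first passage times to a target set by solving the linear system sum_j Q[i][j] * tau[j] = -1 on the non-target states. The function name, the dense list-of-lists rate matrix, and the toy chain are illustrative assumptions; this is not code from the PathwayElaboration repository.

```python
def mean_first_passage_times(Q, target):
    """Mean first passage time from every state to the target set.

    Q is a dense rate matrix (rows sum to zero); target is a set of
    state indices. Solves sum_j Q[i][j] * tau[j] = -1 for non-target i,
    with tau = 0 on the target, by Gauss-Jordan elimination.
    Assumes the target is reachable from every non-target state
    (otherwise a pivot is zero and the division below fails).
    """
    n = len(Q)
    free = [i for i in range(n) if i not in target]
    m = len(free)
    # Augmented matrix [A | b] restricted to the non-target states.
    A = [[float(Q[i][j]) for j in free] + [-1.0] for i in free]
    for col in range(m):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(m):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * p for a, p in zip(A[r], A[col])]
    tau = [0.0] * n
    for k, i in enumerate(free):
        tau[i] = A[k][-1] / A[k][k]
    return tau


# Toy chain (hypothetical): 0 <-> 1 -> 2, all rates equal to 1, target {2}.
Q_toy = [[-1.0, 1.0, 0.0],
         [1.0, -2.0, 1.0],
         [0.0, 0.0, 0.0]]
tau_toy = mean_first_passage_times(Q_toy, {2})
```

A reaction rate estimate is commonly taken as the reciprocal of the mean first passage time from the initial state; for large sparse chains a sparse linear solver would replace the dense elimination used here.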
Affiliation(s)
- Mark Schmidt
  - University of British Columbia, Canada; Canada CIFAR AI Chair (Amii), Canada
- Erik Winfree
  - California Institute of Technology, United States of America

4. Biron-Lattes M, Bouchard-Côté A, Campbell T. Pseudo-marginal inference for CTMCs on infinite spaces via monotonic likelihood approximations. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2118750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]

5. Riva-Palacio A, Mena RH, Walker SG. On the estimation of partially observed continuous-time Markov chains. Comput Stat 2022. [DOI: 10.1007/s00180-022-01273-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]

6. Sherlock C, Golightly A. Exact Bayesian inference for discretely observed Markov Jump Processes using finite rate matrices. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2093886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Affiliation(s)
- Chris Sherlock
  - Department of Mathematics and Statistics, Lancaster University, UK

7. Kilic Z, Sgouralis I, Heo W, Ishii K, Tahara T, Pressé S. Extraction of rapid kinetics from smFRET measurements using integrative detectors. Cell Reports Physical Science 2021; 2:100409. [PMID: 34142102 PMCID: PMC8208598 DOI: 10.1016/j.xcrp.2021.100409] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 05/09/2023]
Abstract
Hidden Markov models (HMMs) are used to learn single-molecule kinetics across a range of experimental techniques. By their construction, HMMs assume that single-molecule events occur on slower timescales than those of data acquisition. To move beyond that HMM limitation and allow for single-molecule events to occur on any timescale, we must treat single-molecule events in continuous time as they occur in nature. We propose a method to learn kinetic rates from single-molecule Förster resonance energy transfer (smFRET) data collected by integrative detectors, even if those rates exceed data acquisition rates. To achieve that, we exploit our recently proposed "hidden Markov jump process" (HMJP), with which we learn transition kinetics from parallel measurements in donor and acceptor channels. HMJPs generalize the HMM paradigm in two critical ways: (1) they deal with physical smFRET systems as they switch between conformational states in continuous time, and (2) they estimate transition rates between conformational states directly without having recourse to transition probabilities or assuming slow dynamics. Our continuous-time treatment learns the transition kinetics and photon emission rates for dynamic regimes that are inaccessible to HMMs, which treat system kinetics in discrete time. We validate our framework's robustness on simulated data and demonstrate its performance on experimental data from FRET-labeled Holliday junctions.
Affiliation(s)
- Zeliha Kilic
  - Center for Biological Physics, Department of Physics, Arizona State University, Tempe, AZ 85287, USA
- Ioannis Sgouralis
  - Department of Mathematics, University of Tennessee, Knoxville, TN 37996, USA
- Wooseok Heo
  - Molecular Spectroscopy Laboratory, RIKEN, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Kunihiko Ishii
  - Molecular Spectroscopy Laboratory, RIKEN, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
  - Ultrafast Spectroscopy Research Team, RIKEN Center for Advanced Photonics (RAP), 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Tahei Tahara
  - Molecular Spectroscopy Laboratory, RIKEN, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
  - Ultrafast Spectroscopy Research Team, RIKEN Center for Advanced Photonics (RAP), 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Steve Pressé
  - Center for Biological Physics, Department of Physics, Arizona State University, Tempe, AZ 85287, USA
  - School of Molecular Sciences, Arizona State University, Tempe, AZ 85287, USA
  - Lead contact

8. Sherlock C. Direct statistical inference for finite Markov jump processes via the matrix exponential. Comput Stat 2021; 36:2863-2887. [PMID: 33897113 PMCID: PMC8054858 DOI: 10.1007/s00180-021-01102-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/20/2020] [Accepted: 03/23/2021] [Indexed: 11/27/2022]
Abstract
Given noisy, partial observations of a time-homogeneous, finite-statespace Markov chain, conceptually simple, direct statistical inference is available, in theory, via its rate matrix, or infinitesimal generator, $\mathsf{Q}$, since $\exp(\mathsf{Q}t)$ is the transition matrix over time $t$. However, perhaps because of inadequate tools for matrix exponentiation in programming languages commonly used amongst statisticians or a belief that the necessary calculations are prohibitively expensive, statistical inference for continuous-time Markov chains with a large but finite state space is typically conducted via particle MCMC or other relatively complex inference schemes. When, as in many applications, $\mathsf{Q}$ arises from a reaction network, it is usually sparse. We describe variations on known algorithms which allow fast, robust and accurate evaluation of the product of a non-negative vector with the exponential of a large, sparse rate matrix. Our implementation uses relatively recently developed, efficient, linear algebra tools that take advantage of such sparsity. We demonstrate the straightforward statistical application of the key algorithm on a model for the mixing of two alleles in a population and on the Susceptible-Infectious-Removed epidemic model.
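To make the vector-times-matrix-exponential idea concrete, here is a minimal sketch using uniformization, one well-known way to evaluate the product of a non-negative row vector with exp(Qt) for a sparse rate matrix. It is not claimed to be the algorithm or the implementation described in the paper; the function name, the dict-of-dicts sparse format, and the tolerance are illustrative assumptions.

```python
import math


def expm_action(v, Q, t, tol=1e-12, max_terms=100_000):
    """Return the row vector v @ exp(Q t) via uniformization.

    Q holds only off-diagonal rates as {i: {j: rate}}; the diagonal is
    implied by rows summing to zero. This naive version is meant for
    moderate lam * t: exp(-lam * t) underflows for very large values.
    """
    n = len(v)
    exit_rate = [sum(Q.get(i, {}).values()) for i in range(n)]
    lam = max(exit_rate)
    if lam == 0.0 or t == 0.0:
        return list(v)

    def step(x):
        # One multiplication by the stochastic matrix P = I + Q / lam.
        y = [x[i] * (1.0 - exit_rate[i] / lam) for i in range(n)]
        for i, row in Q.items():
            for j, rate in row.items():
                y[j] += x[i] * rate / lam
        return y

    weight = math.exp(-lam * t)      # Poisson(lam * t) probability of 0 jumps
    term = list(v)                   # v @ P^k, starting at k = 0
    out = [weight * x for x in term]
    total = weight                   # accumulated Poisson mass
    for k in range(1, max_terms):
        if 1.0 - total <= tol:
            break
        term = step(term)
        weight *= lam * t / k
        out = [o + weight * x for o, x in zip(out, term)]
        total += weight
    return out


# Two-state chain (hypothetical rates): 0 -> 1 at rate 2, 1 -> 0 at rate 3.
p = expm_action([1.0, 0.0], {0: {1: 2.0}, 1: {0: 3.0}}, t=0.7)
```

Because every term in the sum is non-negative, the method avoids the cancellation that a direct Taylor series of exp(Qt) can suffer, and each step touches only the non-zero rates.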
Affiliation(s)
- Chris Sherlock
  - Department of Mathematics and Statistics, Lancaster University, Lancaster, UK

9. Fisher HF, Boys RJ, Gillespie CS, Proctor CJ, Golightly A. Parameter inference for a stochastic kinetic model of expanded polyglutamine proteins. Biometrics 2021; 78:1195-1208. [PMID: 33837525 DOI: 10.1111/biom.13467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2019] [Revised: 03/21/2021] [Accepted: 03/24/2021] [Indexed: 11/30/2022]
Abstract
The presence of protein aggregates in cells is a known feature of many human age-related diseases, such as Huntington's disease. Simulations using fixed parameter values in a model of the dynamic evolution of expanded polyglutamine (PolyQ) proteins in cells have been used to gain a better understanding of the biological system. However, there is considerable uncertainty about the values of some of the parameters governing the system. Currently, appropriate values are chosen by ad hoc attempts to tune the parameters so that the model output matches experimental data. The problem is further complicated by the fact that the data only offer a partial insight into the underlying biological process: the data consist only of the proportions of cell death and of cells with inclusion bodies at a few time points, corrupted by measurement error. Developing inference procedures to estimate the model parameters in this scenario is a significant task. The model probabilities corresponding to the observed proportions cannot be evaluated exactly, and so they are estimated within the inference algorithm by repeatedly simulating realizations from the model. In general such an approach is computationally very expensive, and we therefore construct Gaussian process emulators for the key quantities and reformulate our algorithm around these fast stochastic approximations. We conclude by highlighting appropriate values of the model parameters leading to new insights into the underlying biological processes.
Affiliation(s)
- H F Fisher
  - School of Mathematics, Statistics & Physics, Newcastle University, Newcastle Upon Tyne, UK
  - Population Health Sciences Institute, Newcastle University, Newcastle Upon Tyne, UK
- R J Boys
  - School of Mathematics, Statistics & Physics, Newcastle University, Newcastle Upon Tyne, UK
- C S Gillespie
  - School of Mathematics, Statistics & Physics, Newcastle University, Newcastle Upon Tyne, UK
- C J Proctor
  - Institute of Cellular Medicine, Newcastle University, Newcastle Upon Tyne, UK
- A Golightly
  - School of Mathematics, Statistics & Physics, Newcastle University, Newcastle Upon Tyne, UK

10. Kilic Z, Sgouralis I, Pressé S. Generalizing HMMs to Continuous Time for Fast Kinetics: Hidden Markov Jump Processes. Biophys J 2021; 120:409-423. [PMID: 33421415 PMCID: PMC7896036 DOI: 10.1016/j.bpj.2020.12.022] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 08/19/2020] [Revised: 12/25/2020] [Accepted: 12/30/2020] [Indexed: 12/18/2022]
Abstract
The hidden Markov model (HMM) is a framework for time series analysis widely applied to single-molecule experiments. Although initially developed for applications outside the natural sciences, the HMM has traditionally been used to interpret signals generated by physical systems, such as single molecules, evolving in a discrete state space observed at discrete time levels dictated by the data acquisition rate. Within the HMM framework, transitions between states are modeled as occurring at the end of each data acquisition period and are described using transition probabilities. Yet, whereas measurements are often performed at discrete time levels in the natural sciences, physical systems evolve in continuous time according to transition rates. It then follows that the modeling assumptions underlying the HMM are justified if the transition rates of a physical process from state to state are small as compared to the data acquisition rate. In other words, HMMs apply to slow kinetics. The problem is, because the transition rates are unknown in principle, it is unclear, a priori, whether the HMM applies to a particular system. For this reason, we must generalize HMMs for physical systems, such as single molecules, because these switch between discrete states in "continuous time". We do so by exploiting recent mathematical tools developed in the context of inferring Markov jump processes and propose the hidden Markov jump process. We explicitly show in what limit the hidden Markov jump process reduces to the HMM. Resolving the discrete time discrepancy of the HMM has clear implications: we no longer need to assume that processes, such as molecular events, must occur on timescales slower than data acquisition and can learn transition rates even if these are on the same timescale or otherwise exceed data acquisition rates.
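The slow-kinetics condition described in the abstract can be illustrated with a standard identity for continuous-time Markov chains. This is offered as a sketch and not as the paper's own derivation; the symbols $Q$ (rate matrix), $\Delta t$ (data acquisition period), and $\lambda$ (largest escape rate) are notation assumed here. The transition probabilities over one acquisition period are

```latex
P(\Delta t) = \exp(Q\,\Delta t) = I + Q\,\Delta t + O\!\big((\lambda\,\Delta t)^{2}\big),
\qquad \lambda = \max_i \lvert Q_{ii} \rvert .
```

When $\lambda\,\Delta t \ll 1$, the probability of more than one transition within a period is negligible, so a discrete-time description with transition probabilities $P_{ij} \approx Q_{ij}\,\Delta t$ for $i \neq j$ is a reasonable approximation. When $\lambda\,\Delta t$ is of order one or larger, the higher-order terms matter and the rates cannot be read off from per-period transition probabilities in this simple way.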
Affiliation(s)
- Zeliha Kilic
  - Center for Biological Physics, Department of Physics, Arizona State University, Tempe, Arizona
- Ioannis Sgouralis
  - Department of Mathematics, University of Tennessee, Knoxville, Tennessee
- Steve Pressé
  - Center for Biological Physics, Department of Physics, Arizona State University, Tempe, Arizona
  - School of Molecular Sciences, Arizona State University, Tempe, Arizona