1
Cappello L, Lo WT‘J, Zhang JZ, Xu P, Barrow D, Chopra I, Clark AG, Wells MT, Kim J. Bayesian phylodynamic inference of population dynamics with dormancy. Proc Natl Acad Sci U S A 2025; 122:e2501394122. [PMID: 40314983 PMCID: PMC12067208 DOI: 10.1073/pnas.2501394122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2025] [Accepted: 02/24/2025] [Indexed: 05/03/2025]
Abstract
Many organisms employ reversible dormancy, or seedbank, in response to environmental fluctuations. This life-history strategy alters fundamental ecoevolutionary forces, leading to distinct patterns of genetic diversity. Two models of dormancy have been proposed based on the average duration of dormancy relative to coalescent timescales: weak seedbank, induced by scheduled seasonality (e.g., plants, invertebrates), and strong seedbank, where individuals stochastically switch between active and dormant states (e.g., bacteria, fungi). The weak seedbank coalescent is statistically equivalent to the Kingman coalescent with a scaled mutation rate, allowing the use of existing inference methods. In contrast, the strong seedbank coalescent differs fundamentally, as only active lineages can coalesce, while dormant lineages cannot. Additionally, dormant individuals typically mutate at a slower rate than active ones. Consequently, despite the significant role of dormancy in the ecoevolutionary dynamics of many organisms, no methods currently exist for inferring population dynamics involving dormancy and associated parameters. We present a Bayesian framework for jointly inferring a latent genealogy, seedbank parameters, and evolutionary parameters from molecular sequence data under the strong seedbank coalescent. We derive the exact probability density of genealogies sampled under the strong seedbank coalescent, characterize the corresponding likelihood function, and present efficient computational algorithms for its evaluation based on our theoretical framework. We develop a tailored Markov chain Monte Carlo sampler and implement our inference framework as a package SeedbankTree within BEAST2. Our work provides both a theoretical foundation and practical inference framework for studying the population genetic and genealogical impacts of dormancy.
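The constraint that only active lineages can coalesce can be illustrated with a small backward-in-time simulation of lineage counts. This is a hedged sketch, not the SeedbankTree implementation: the function name `simulate_seedbank_counts` and the rate parameterization (pairwise coalescence at rate 1 among active lineages, active-to-dormant switching at rate `c` per lineage, dormant-to-active switching at rate `c * K` per lineage) are assumptions made for illustration and are not stated in the abstract.

```python
import random


def simulate_seedbank_counts(n_active, n_dormant, c, K, seed=0):
    """Simulate lineage counts backward in time until one lineage remains.

    Assumed rates (illustrative, not taken from the abstract):
      coalescence:  n_active * (n_active - 1) / 2   (active pairs only)
      deactivation: c * n_active
      reactivation: c * K * n_dormant
    Returns a list of (time, n_active, n_dormant) tuples.
    """
    if c <= 0 or K <= 0:
        raise ValueError("c and K must be positive so dormant lineages can wake up")
    rng = random.Random(seed)
    t = 0.0
    history = [(t, n_active, n_dormant)]
    while n_active + n_dormant > 1:
        coal = n_active * (n_active - 1) / 2.0
        deact = c * n_active
        react = c * K * n_dormant
        total = coal + deact + react
        t += rng.expovariate(total)  # waiting time to the next event
        u = rng.random() * total
        if u < coal:
            n_active -= 1  # two active lineages merge into one
        elif u < coal + deact or n_dormant == 0:
            n_active -= 1  # an active lineage becomes dormant
            n_dormant += 1
        else:
            n_active += 1  # a dormant lineage becomes active
            n_dormant -= 1
        history.append((t, n_active, n_dormant))
    return history


if __name__ == "__main__":
    for time, active, dormant in simulate_seedbank_counts(5, 3, c=1.0, K=2.0, seed=1):
        print(f"{time:8.4f}  active={active}  dormant={dormant}")
```

Dormant lineages contribute nothing to the coalescence rate in this sketch, so time spent in the dormant state only delays the approach to the common ancestor.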
Affiliation(s)
- Lorenzo Cappello
  - Departments of Economics and Business, Universitat Pompeu Fabra, Barcelona 08005, Spain
  - Data Science Center, Barcelona School of Economics, Barcelona 08005, Spain
- Wai Tung ‘Jack’ Lo
  - Department of Computational Biology, Cornell University, Ithaca, NY 14850
- Joy Z. Zhang
  - Center for Applied Mathematics, Cornell University, Ithaca, NY 14850
- Peiyu Xu
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14850
- Daniel Barrow
  - Department of Computational Biology, Cornell University, Ithaca, NY 14850
- Ishani Chopra
  - Department of Computational Biology, Cornell University, Ithaca, NY 14850
- Andrew G. Clark
  - Department of Computational Biology, Cornell University, Ithaca, NY 14850
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14850
- Martin T. Wells
  - Department of Statistics and Data Science, Cornell University, Ithaca, NY 14850
- Jaehee Kim
  - Department of Computational Biology, Cornell University, Ithaca, NY 14850
2
Biron-Lattes M, Bouchard-Côté A, Campbell T. Pseudo-marginal inference for CTMCs on infinite spaces via monotonic likelihood approximations. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2118750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
3
Chakraborty A, Ovaskainen O, Dunson DB. Bayesian semiparametric long memory models for discretized event data. Ann Appl Stat 2022; 16:1380-1399. [DOI: 10.1214/21-aoas1546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Otso Ovaskainen
  - Department of Biological and Environmental Science, University of Jyväskylä
4
Riva-Palacio A, Mena RH, Walker SG. On the estimation of partially observed continuous-time Markov chains. Comput Stat 2022. [DOI: 10.1007/s00180-022-01273-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
5
Barone R, Tancredi A. Bayesian inference for discretely observed continuous time multi‐state models. Stat Med 2022; 41:3789-3803. [DOI: 10.1002/sim.9449] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2021] [Revised: 03/21/2022] [Accepted: 05/13/2022] [Indexed: 11/09/2022]
Affiliation(s)
- Rosario Barone
  - Department of Methods and Models for Economics, Territory and Finance, Sapienza University of Rome, Rome, Italy
- Andrea Tancredi
  - Department of Methods and Models for Economics, Territory and Finance, Sapienza University of Rome, Rome, Italy
6
Gonçalves FB, Dutra LM, Silva RWC. Exact and computationally efficient Bayesian inference for generalized Markov modulated Poisson processes. Stat Comput 2022; 32:14. [PMID: 35013655 PMCID: PMC8733934 DOI: 10.1007/s11222-021-10074-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2021] [Accepted: 12/08/2021] [Indexed: 06/14/2023]
Abstract
Statistical modeling of temporal point patterns is an important problem in several areas. The Cox process, a Poisson process where the intensity function is stochastic, is a common model for such data. We present a new class of unidimensional Cox process models in which the intensity function assumes parametric functional forms that switch according to a continuous-time Markov chain. A novel methodology is introduced to perform exact (up to Monte Carlo error) Bayesian inference based on MCMC algorithms. The reliability of the algorithms depends on a variety of specifications which are carefully addressed, resulting in a computationally efficient (in terms of computing time) algorithm and enabling its use with large data sets. Simulated and real examples are presented to illustrate the efficiency and applicability of the methodology. A specific model to fit epidemic curves is proposed and used to analyze data from Dengue Fever in Brazil and COVID-19 in some countries.
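A Cox process whose intensity switches with a continuous-time Markov chain can be sketched by simulation. The example below is a minimal illustration, assuming the simplest special case in which the intensity is constant within each hidden state (a Markov-modulated Poisson process); it is not the authors' inference algorithm, and the function name `simulate_mmpp` and all parameter values are hypothetical.

```python
import random


def simulate_mmpp(Q, rates, T, state=0, seed=0):
    """Simulate a Markov-modulated Poisson process on [0, T).

    Q:     generator matrix of the hidden chain (each row sums to zero).
    rates: Poisson event intensity for each hidden state.
    Returns (events, path): sorted event times and the hidden path as a
    list of (entry_time, state) pairs.
    """
    rng = random.Random(seed)
    t = 0.0
    events, path = [], [(0.0, state)]
    while t < T:
        leave = -Q[state][state]
        # Time the hidden chain spends in the current state.
        hold = rng.expovariate(leave) if leave > 0 else float("inf")
        end = min(t + hold, T)
        # Within one sojourn the events form a homogeneous Poisson process.
        s = t
        while rates[state] > 0:
            s += rng.expovariate(rates[state])
            if s >= end:
                break
            events.append(s)
        t = end
        if t < T:
            # Pick the next state in proportion to the off-diagonal rates.
            weights = [q if j != state else 0.0 for j, q in enumerate(Q[state])]
            state = rng.choices(range(len(Q)), weights=weights)[0]
            path.append((t, state))
    return events, path


if __name__ == "__main__":
    Q = [[-1.0, 1.0], [2.0, -2.0]]  # hypothetical two-state generator
    events, path = simulate_mmpp(Q, rates=[0.5, 5.0], T=10.0, seed=3)
    print(len(events), "events;", len(path), "hidden sojourns")
```

Replacing the constant `rates[state]` with a state-specific function of time would move the sketch closer to the generalized model the abstract describes, at the cost of needing thinning or inversion to draw event times.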
Affiliation(s)
- Flávio B. Gonçalves
  - Universidade Federal de Minas Gerais, Av. Antônio Carlos, 6627 - DEST, ICEx, UFMG, Belo Horizonte, Minas Gerais 31270-901 Brazil
- Lívia M. Dutra
  - Centro Federal de Educação Tecnológica de Minas Gerais, Belo Horizonte, Brazil
- Roger W. C. Silva
  - Universidade Federal de Minas Gerais, Av. Antônio Carlos, 6627 - DEST, ICEx, UFMG, Belo Horizonte, Minas Gerais 31270-901 Brazil
7
Sherlock C, Thiery AH, Golightly A. Efficiency of delayed-acceptance random walk Metropolis algorithms. Ann Stat 2021. [DOI: 10.1214/21-aos2068] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/19/2022]
Affiliation(s)
- Chris Sherlock
  - Department of Mathematics and Statistics, Lancaster University
- Alexandre H. Thiery
  - Department of Statistics and Applied Probability, National University of Singapore
- Andrew Golightly
  - School of Mathematics, Statistics and Physics, Newcastle University
8
9
10
Affiliation(s)
- Boqian Zhang
  - Department of Statistics, Purdue University, West Lafayette, IN
- Vinayak Rao
  - Department of Statistics, Purdue University, West Lafayette, IN
11
Tancredi A. Approximate Bayesian inference for discretely observed continuous-time multi-state models. Biometrics 2019; 75:966-977. [PMID: 30648730 DOI: 10.1111/biom.13019] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 02/12/2018] [Accepted: 12/21/2018] [Indexed: 11/30/2022]
Abstract
Inference for continuous time multi-state models presents considerable computational difficulties when the process is only observed at discrete time points with no additional information about the state transitions. In fact, for general multi-state Markov model, evaluation of the likelihood function is possible only via intensive numerical approximations. Moreover, in real applications, transitions between states may depend on the time since entry into the current state, and semi-Markov models, where the likelihood function is not available in closed form, should be fitted to the data. Approximate Bayesian Computation (ABC) methods, which make use only of comparisons between simulated and observed summary statistics, represent a solution to intractable likelihood problems and provide alternative algorithms when the likelihood calculation is computationally too costly. In this article we investigate the potentiality of ABC techniques for multi-state models both to obtain the posterior distributions of the model parameters and to compare Markov and semi-Markov models. In addition, we will also exploit ABC methods to estimate and compare hidden Markov and semi-Markov models when observed states are subject to classification errors. We illustrate the performance of the ABC methodology both with simulated data and with a real data example.
Affiliation(s)
- Andrea Tancredi
  - Department of Methods and Models for Economics, Territory and Finance, Sapienza University of Rome, Via del Castro Laurenziano 9, 00161, Rome, Italy
12
Ramírez-Cobo P, Lillo RE, Wiper MP. Nonidentifiability of the Two-State Markovian Arrival Process. J Appl Probab 2016. [DOI: 10.1239/jap/1285335400] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Indexed: 11/18/2022]
Abstract
In this paper we consider the problem of identifiability for the two-state Markovian arrival process (MAP2). In particular, we show that the MAP2 is not identifiable, providing the conditions under which two different sets of parameters induce identical stationary laws for the observable process.
13
Aralis HJ, Gorbach PM, Brookmeyer R. Measuring concurrency using a joint multistate and point process model for retrospective sexual history data. Stat Med 2016; 35:4459-4473. [PMID: 27324278 DOI: 10.1002/sim.7013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2015] [Revised: 03/13/2016] [Accepted: 05/16/2016] [Indexed: 11/09/2022]
Abstract
Understanding the impact of concurrency, defined as overlapping sexual partnerships, on the spread of HIV within various communities has been complicated by difficulties in measuring concurrency. Retrospective sexual history data consisting of first and last dates of sexual intercourse for each previous and ongoing partnership is often obtained through use of cross-sectional surveys. Previous attempts to empirically estimate the magnitude and extent of concurrency among these surveyed populations have inadequately accounted for the dependence between partnerships and used only a snapshot of the available data. We introduce a joint multistate and point process model in which states are defined as the number of ongoing partnerships an individual is engaged in at a given time. Sexual partnerships starting and ending on the same date are referred to as one-offs and modeled as discrete events. The proposed method treats each individual's continuation in and transition through various numbers of ongoing partnerships as a separate stochastic process and allows the occurrence of one-offs to impact subsequent rates of partnership formation and dissolution. Estimators for the concurrent partnership distribution and mean sojourn times during which a person has k ongoing partnerships are presented. We demonstrate this modeling approach using epidemiological data collected from a sample of men having sex with men and seeking HIV testing at a Los Angeles clinic. Among this sample, the estimated point prevalence of concurrency was higher among men later diagnosed HIV positive. One-offs were associated with increased rates of subsequent partnership dissolution. Copyright © 2016 John Wiley & Sons, Ltd.
Affiliation(s)
- Hilary J Aralis
  - Department of Biostatistics, UCLA Fielding School of Public Health, University of California, Los Angeles, CA 90095, U.S.A.
- Pamina M Gorbach
  - Department of Epidemiology, UCLA Fielding School of Public Health, University of California, Los Angeles, CA 90095, U.S.A.
- Ron Brookmeyer
  - Department of Biostatistics, UCLA Fielding School of Public Health, University of California, Los Angeles, CA 90095, U.S.A.
14
Miasojedow B, Niemiro W. Geometric ergodicity of Rao and Teh’s algorithm for homogeneous Markov jump processes. Stat Probab Lett 2016. [DOI: 10.1016/j.spl.2016.02.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
15
Lu S. A continuous-time HMM approach to modeling the magnitude-frequency distribution of earthquakes. J Appl Stat 2016. [DOI: 10.1080/02664763.2016.1161736] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 10/22/2022]
Affiliation(s)
- Shaochuan Lu
  - School of Statistics, Beijing Normal University, Beijing, People's Republic of China
16
Lange JM, Hubbard RA, Inoue LYT, Minin VN. A joint model for multistate disease processes and random informative observation times, with applications to electronic medical records data. Biometrics 2014; 71:90-101. [PMID: 25319319 DOI: 10.1111/biom.12252] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 01/01/2014] [Revised: 07/01/2014] [Accepted: 09/01/2014] [Indexed: 12/27/2022]
Abstract
Multistate models are used to characterize individuals' natural histories through diseases with discrete states. Observational data resources based on electronic medical records pose new opportunities for studying such diseases. However, these data consist of observations of the process at discrete sampling times, which may either be pre-scheduled and non-informative, or symptom-driven and informative about an individual's underlying disease status. We have developed a novel joint observation and disease transition model for this setting. The disease process is modeled according to a latent continuous-time Markov chain; and the observation process, according to a Markov-modulated Poisson process with observation rates that depend on the individual's underlying disease status. The disease process is observed at a combination of informative and non-informative sampling times, with possible misclassification error. We demonstrate that the model is computationally tractable and devise an expectation-maximization algorithm for parameter estimation. Using simulated data, we show how estimates from our joint observation and disease transition model lead to less biased and more precise estimates of the disease rate parameters. We apply the model to a study of secondary breast cancer events, utilizing mammography and biopsy records from a sample of women with a history of primary breast cancer.
Affiliation(s)
- Jane M Lange
  - Department of Biostatistics, University of Washington, Seattle, Washington, U.S.A.
- Rebecca A Hubbard
  - Department of Biostatistics, University of Washington, Seattle, Washington, U.S.A.
  - Biostatistics Unit, Group Health Research Institute, Seattle, Washington, U.S.A.
- Lurdes Y T Inoue
  - Department of Biostatistics, University of Washington, Seattle, Washington, U.S.A.
- Vladimir N Minin
  - Departments of Statistics and Biology, University of Washington, Seattle, Washington, U.S.A.
17
Abstract
Phylogenetic stochastic mapping is a method for reconstructing the history of trait changes on a phylogenetic tree relating species/organism carrying the trait. State-of-the-art methods assume that the trait evolves according to a continuous-time Markov chain (CTMC) and works well for small state spaces. The computations slow down considerably for larger state spaces (e.g., space of codons), because current methodology relies on exponentiating CTMC infinitesimal rate matrices-an operation whose computational complexity grows as the size of the CTMC state space cubed. In this work, we introduce a new approach, based on a CTMC technique called uniformization, which does not use matrix exponentiation for phylogenetic stochastic mapping. Our method is based on a new Markov chain Monte Carlo (MCMC) algorithm that targets the distribution of trait histories conditional on the trait data observed at the tips of the tree. The computational complexity of our MCMC method grows as the size of the CTMC state space squared. Moreover, in contrast to competing matrix exponentiation methods, if the rate matrix is sparse, we can leverage this sparsity and increase the computational efficiency of our algorithm further. Using simulated data, we illustrate advantages of our MCMC algorithm and investigate how large the state space needs to be for our method to outperform matrix exponentiation approaches. We show that even on the moderately large state space of codons our MCMC method can be significantly faster than currently used matrix exponentiation methods.
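The uniformization identity that lets transition probabilities be computed without matrix exponentiation can be sketched as follows. This is a minimal illustration of the identity P(t) = sum over n of Poisson(n; mu t) R^n, with R = I + Q / mu and mu at least as large as every exit rate; it is not the MCMC algorithm described in the abstract. The function name `transition_row`, the truncation rule, and the guard on `mu * t` are assumptions of this sketch. Each term needs only one vector-matrix product, so the cost per term grows with the square of the number of states.

```python
import math


def transition_row(Q, start, t, tol=1e-12, max_terms=100000):
    """Row `start` of exp(Q t), computed by uniformization.

    Uses P(t) = sum_n Poisson(n; mu t) R^n with R = I + Q / mu and
    mu = max_i |Q[i][i]|. Intended for moderate values of mu * t.
    """
    size = len(Q)
    mu = max(-Q[i][i] for i in range(size))
    if mu == 0 or t == 0:
        return [1.0 if j == start else 0.0 for j in range(size)]
    if mu * t > 700:
        raise ValueError("mu * t too large: exp(-mu * t) would underflow")
    # Transition matrix of the uniformized discrete-time chain.
    R = [[(1.0 if i == j else 0.0) + Q[i][j] / mu for j in range(size)]
         for i in range(size)]
    v = [1.0 if j == start else 0.0 for j in range(size)]  # row of R^0
    weight = math.exp(-mu * t)  # Poisson probability of zero jumps
    mass = weight
    result = [weight * x for x in v]
    for n in range(1, max_terms + 1):
        if 1.0 - mass <= tol:
            break  # remaining Poisson mass is negligible
        v = [sum(v[i] * R[i][j] for i in range(size)) for j in range(size)]
        weight *= mu * t / n
        mass += weight
        result = [r + weight * x for r, x in zip(result, v)]
    return result


if __name__ == "__main__":
    a, b, t = 1.0, 2.0, 0.7  # hypothetical two-state rates
    row = transition_row([[-a, a], [b, -b]], start=0, t=t)
    exact = b / (a + b) + a / (a + b) * math.exp(-(a + b) * t)
    print(row[0], exact)
```

If `R` is stored in a sparse format, the vector-matrix product touches only the nonzero rates, which is the kind of saving the abstract attributes to sparse rate matrices.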
Affiliation(s)
- Jan Irvahn
  - Department of Statistics, University of Washington, Seattle, Washington
18
Vaughan TG, Kühnert D, Popinga A, Welch D, Drummond AJ. Efficient Bayesian inference under the structured coalescent. Bioinformatics 2014; 30:2272-9. [PMID: 24753484 PMCID: PMC4207426 DOI: 10.1093/bioinformatics/btu201] [Citation(s) in RCA: 97] [Impact Index Per Article: 8.8] [Indexed: 12/28/2022]
Abstract
Motivation: Population structure significantly affects evolutionary dynamics. Such structure may be due to spatial segregation, but may also reflect any other gene-flow-limiting aspect of a model. In combination with the structured coalescent, this fact can be used to inform phylogenetic tree reconstruction, as well as to infer parameters such as migration rates and subpopulation sizes from annotated sequence data. However, conducting Bayesian inference under the structured coalescent is impeded by the difficulty of constructing Markov Chain Monte Carlo (MCMC) sampling algorithms (samplers) capable of efficiently exploring the state space. Results: In this article, we present a new MCMC sampler capable of sampling from posterior distributions over structured trees: timed phylogenetic trees in which lineages are associated with the distinct subpopulation in which they lie. The sampler includes a set of MCMC proposal functions that offer significant mixing improvements over a previously published method. Furthermore, its implementation as a BEAST 2 package ensures maximum flexibility with respect to model and prior specification. We demonstrate the usefulness of this new sampler by using it to infer migration rates and effective population sizes of H3N2 influenza between New Zealand, New York and Hong Kong from publicly available hemagglutinin (HA) gene sequences under the structured coalescent. Availability and implementation: The sampler has been implemented as a publicly available BEAST 2 package that is distributed under version 3 of the GNU General Public License at http://compevol.github.io/MultiTypeTree. Contact:tgvaughan@gmail.com Supplementary information:Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Timothy G Vaughan
  - Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Palmerston North 4442, New Zealand, Institute of Integrative Biology, Swiss Federal Institute of Technology (ETH), Zurich 8092, Switzerland and Department of Computer Science, University of Auckland, Auckland 1142, New Zealand
- Denise Kühnert
  - Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Palmerston North 4442, New Zealand, Institute of Integrative Biology, Swiss Federal Institute of Technology (ETH), Zurich 8092, Switzerland and Department of Computer Science, University of Auckland, Auckland 1142, New Zealand
- Alex Popinga
  - Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Palmerston North 4442, New Zealand, Institute of Integrative Biology, Swiss Federal Institute of Technology (ETH), Zurich 8092, Switzerland and Department of Computer Science, University of Auckland, Auckland 1142, New Zealand
- David Welch
  - Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Palmerston North 4442, New Zealand, Institute of Integrative Biology, Swiss Federal Institute of Technology (ETH), Zurich 8092, Switzerland and Department of Computer Science, University of Auckland, Auckland 1142, New Zealand
- Alexei J Drummond
  - Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Palmerston North 4442, New Zealand, Institute of Integrative Biology, Swiss Federal Institute of Technology (ETH), Zurich 8092, Switzerland and Department of Computer Science, University of Auckland, Auckland 1142, New Zealand
19
Sherlock C, Xifara T, Telfer S, Begon M. A coupled hidden Markov model for disease interactions. J R Stat Soc Ser C Appl Stat 2013; 62:609-627. [PMID: 24223436 PMCID: PMC3813975 DOI: 10.1111/rssc.12015] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]
Abstract
To investigate interactions between parasite species in a host, a population of field voles was studied longitudinally, with presence or absence of six different parasites measured repeatedly. Although trapping sessions were regular, a different set of voles was caught at each session, leading to incomplete profiles for all subjects. We use a discrete time hidden Markov model for each disease with transition probabilities dependent on covariates via a set of logistic regressions. For each disease the hidden states for each of the other diseases at a given time point form part of the covariate set for the Markov transition probabilities from that time point. This allows us to gauge the influence of each parasite species on the transition probabilities for each of the other parasite species. Inference is performed via a Gibbs sampler, which cycles through each of the diseases, first using an adaptive Metropolis-Hastings step to sample from the conditional posterior of the covariate parameters for that particular disease given the hidden states for all other diseases and then sampling from the hidden states for that disease given the parameters. We find evidence for interactions between several pairs of parasites and of an acquired immune response for two of the parasites.
20
Choi B, Rempala GA. Inference for discretely observed stochastic kinetic networks with applications to epidemic modeling. Biostatistics 2011; 13:153-65. [PMID: 21835814 DOI: 10.1093/biostatistics/kxr019] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Indexed: 11/12/2022]
Abstract
We present a new method for Bayesian Markov Chain Monte Carlo-based inference in certain types of stochastic models, suitable for modeling noisy epidemic data. We apply the so-called uniformization representation of a Markov process, in order to efficiently generate appropriate conditional distributions in the Gibbs sampler algorithm. The approach is shown to work well in various data-poor settings, that is, when only partial information about the epidemic process is available, as illustrated on the synthetic data from SIR-type epidemics and the Center for Disease Control and Prevention data from the onset of the H1N1 pandemic in the United States.
Affiliation(s)
- Boseung Choi
  - Department of Computer Science and Statistics, Daegu University, Gyeongbuk 712-714, Republic of Korea
21
Affiliation(s)
- Refik Soyer
  - Department of Decision Sciences, George Washington University, Washington, DC, USA
22
Abstract
In this paper we consider the problem of identifiability for the two-state Markovian arrival process (MAP2). In particular, we show that the MAP2 is not identifiable, providing the conditions under which two different sets of parameters induce identical stationary laws for the observable process.
23
Sherlock C, Fearnhead P, Roberts GO. The Random Walk Metropolis: Linking Theory and Practice Through a Case Study. Stat Sci 2010. [DOI: 10.1214/10-sts327] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.7] [Indexed: 11/19/2022]
24
Hobolth A, Stone EA. Simulation from endpoint-conditioned, continuous-time Markov chains on a finite state space, with applications to molecular evolution. Ann Appl Stat 2009; 3:1204. [PMID: 20148133 PMCID: PMC2818752 DOI: 10.1214/09-aoas247] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Indexed: 11/19/2022]
Abstract
Analyses of serially-sampled data often begin with the assumption that the observations represent discrete samples from a latent continuous-time stochastic process. The continuous-time Markov chain (CTMC) is one such generative model whose popularity extends to a variety of disciplines ranging from computational finance to human genetics and genomics. A common theme among these diverse applications is the need to simulate sample paths of a CTMC conditional on realized data that is discretely observed. Here we present a general solution to this sampling problem when the CTMC is defined on a discrete and finite state space. Specifically, we consider the generation of sample paths, including intermediate states and times of transition, from a CTMC whose beginning and ending states are known across a time interval of length T. We first unify the literature through a discussion of the three predominant approaches: (1) modified rejection sampling, (2) direct sampling, and (3) uniformization. We then give analytical results for the complexity and efficiency of each method in terms of the instantaneous transition rate matrix Q of the CTMC, its beginning and ending states, and the length of sampling time T. In doing so, we show that no method dominates the others across all model specifications, and we give explicit proof of which method prevails for any given Q, T, and endpoints. Finally, we introduce and compare three applications of CTMCs to demonstrate the pitfalls of choosing an inefficient sampler.
Affiliation(s)
- Asger Hobolth
  - Department of Mathematical Sciences, Aarhus University, Denmark
- Eric A. Stone
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina 27695, USA
25
26
Uniformization for sampling realizations of Markov processes: applications to Bayesian implementations of codon substitution models. Bioinformatics 2007; 24:56-62. [DOI: 10.1093/bioinformatics/btm532] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.1] [Indexed: 11/14/2022]