1
Van der Roest BR, Bootsma MCJ, Fischer EAJ, Gröschel MI, Anthony RM, de Zwaan R, Kretzschmar MEE, Klinkenberg D. Phylodynamic assessment of SNP distances from whole genome sequencing for determining Mycobacterium tuberculosis transmission. Sci Rep 2025; 15:10694. [PMID: 40155671 PMCID: PMC11953417 DOI: 10.1038/s41598-025-94646-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2024] [Accepted: 03/17/2025] [Indexed: 04/01/2025] [Open Access]
Abstract
The global tuberculosis (TB) epidemic is driven by primary transmission. Pathogen genome sequencing is increasingly used in molecular epidemiology and outbreak investigations. Based on contact tracing and epidemiological links, Single Nucleotide Polymorphism (SNP) cut-offs, ranging from 3 to 12 SNPs, identify probable transmission clusters or exclude direct transmission. However, contact tracing can be limited by recall bias and inconsistent methodologies across TB settings. We propose phylodynamic models, i.e. methods to infer transmission processes from pathogen genomes and associated epidemiological data, as an alternative reference to infer transmission events. We analyzed 2,008 whole-genome sequences from Dutch TB patients collected from 2015 to 2019. Genetic clusters were defined within a 20-SNP range, and the phylodynamic model phybreak was employed to infer transmission. Probable transmission SNP cut-offs were assessed by the proportion of inferred transmission events with a SNP distance below these cut-offs. A total of 79 clusters were identified, with a median size of 4 isolates (IQR = 3-8). A SNP cut-off of 4 captured 98% of inferred transmission events while reducing pairs without transmission links. A cut-off beyond 12 SNPs effectively excluded transmission. Phylodynamic approaches provide a valuable alternative to contact tracing for defining SNP cut-offs, allowing for a more precise assessment of transmission events.
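The cut-off assessment described in this abstract can be illustrated with a minimal sketch. It assumes each isolate pair carries a SNP distance and a flag saying whether a phylodynamic model inferred a transmission link. The function name `cutoff_summary`, the toy pairs, and the inclusive threshold (distance <= cut-off) are assumptions made here for illustration; they are not taken from the study or from phybreak.

```python
from typing import Iterable, Tuple


def cutoff_summary(pairs: Iterable[Tuple[int, bool]], cutoff: int) -> Tuple[float, float]:
    """Summarise a SNP cut-off against inferred transmission links.

    pairs  : (snp_distance, inferred_transmission) for each isolate pair
    cutoff : pairs with snp_distance <= cutoff count as "within the cut-off"
             (inclusive threshold is an assumption of this sketch)

    Returns (fraction of inferred transmission pairs within the cut-off,
             fraction of non-transmission pairs within the cut-off).
    """
    pairs = list(pairs)
    linked = [d for d, is_link in pairs if is_link]
    unlinked = [d for d, is_link in pairs if not is_link]
    if not linked or not unlinked:
        raise ValueError("need at least one linked and one unlinked pair")
    captured = sum(d <= cutoff for d in linked) / len(linked)
    retained_unlinked = sum(d <= cutoff for d in unlinked) / len(unlinked)
    return captured, retained_unlinked


# Hypothetical toy data, NOT results from the paper.
TOY_PAIRS = [
    (0, True), (1, True), (2, True), (4, True), (7, True),
    (3, False), (5, False), (9, False), (15, False), (20, False),
]

if __name__ == "__main__":
    for cutoff in (4, 12):
        captured, retained = cutoff_summary(TOY_PAIRS, cutoff)
        print(f"cut-off {cutoff}: captured={captured:.2f}, unlinked retained={retained:.2f}")
```

A lower cut-off trades a small loss in captured transmission pairs for fewer unlinked pairs inside the threshold, which is the trade-off the abstract reports for the 4-SNP cut-off.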
Affiliation(s)
- Bastiaan R Van der Roest
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, P.O. Box 85500, Utrecht, The Netherlands
- Martin C J Bootsma
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, P.O. Box 85500, Utrecht, The Netherlands
  - Department of Mathematics, Faculty of Science, University Utrecht, Utrecht, The Netherlands
  - Centre for Complex System Studies (CCSS), University Utrecht, Utrecht, The Netherlands
- Egil A J Fischer
  - Population Health Sciences, Faculty of Veterinary Medicine, University Utrecht, Utrecht, The Netherlands
- Matthias I Gröschel
  - Department of Infectious Diseases, Respiratory and Critical Care Medicine, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Richard M Anthony
  - Tuberculosis Reference Laboratory, Centre for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands
- Rina de Zwaan
  - Tuberculosis Reference Laboratory, Centre for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands
- Mirjam E E Kretzschmar
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, P.O. Box 85500, Utrecht, The Netherlands
  - Centre for Complex System Studies (CCSS), University Utrecht, Utrecht, The Netherlands
  - Institute of Epidemiology and Social Medicine, University of Münster, Münster, Germany
- Don Klinkenberg
  - National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands
2
May MR, Rannala B. Early detection of highly transmissible viral variants using phylogenomics. Sci Adv 2024; 10:eadk7623. [PMID: 39141727 PMCID: PMC11323880 DOI: 10.1126/sciadv.adk7623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Accepted: 07/09/2024] [Indexed: 08/16/2024]
Abstract
As demonstrated by the SARS-CoV-2 pandemic, the emergence of novel viral strains with increased transmission rates poses a serious threat to global health. Statistical models of genome sequence evolution may provide a critical tool for early detection of these strains. Using a novel stochastic model that links transmission rates to the entire viral genome sequence, we study the utility of phylogenetic methods that use a phylogenetic tree relating viral samples versus count-based methods that use case counts of variants over time exclusively to detect increased transmission rates and identify candidate causative mutations. We find that phylogenies in particular can detect novel transmission-enhancing variants very soon after their origin and may facilitate the development of early detection systems for outbreak surveillance.
Affiliation(s)
- Michael R. May
  - Department of Evolution and Ecology, University of California Davis, Davis, CA, USA
3
Tang M, Dudas G, Bedford T, Minin VN. Fitting stochastic epidemic models to gene genealogies using linear noise approximation. Ann Appl Stat 2023. [DOI: 10.1214/21-aoas1583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
Affiliation(s)
- Mingwei Tang
  - Department of Statistics, University of Washington, Seattle
- Gytis Dudas
  - Gothenburg Global Biodiversity Centre (GGBC)
- Trevor Bedford
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center
4
Chao E, Chato C, Vender R, Olabode AS, Ferreira RC, Poon AFY. Molecular source attribution. PLoS Comput Biol 2022; 18:e1010649. [PMID: 36395093 PMCID: PMC9671344 DOI: 10.1371/journal.pcbi.1010649] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022] [Open Access]
Affiliation(s)
- Elisa Chao
  - Department of Pathology and Laboratory Medicine, Western University, London, Ontario, Canada
- Connor Chato
  - Department of Pathology and Laboratory Medicine, Western University, London, Ontario, Canada
- Reid Vender
  - Department of Pathology and Laboratory Medicine, Western University, London, Ontario, Canada
  - School of Medicine, Queen’s University, Kingston, Ontario, Canada
- Abayomi S. Olabode
  - Department of Pathology and Laboratory Medicine, Western University, London, Ontario, Canada
- Roux-Cil Ferreira
  - Department of Pathology and Laboratory Medicine, Western University, London, Ontario, Canada
- Art F. Y. Poon
  - Department of Pathology and Laboratory Medicine, Western University, London, Ontario, Canada
5
Saad-Roy CM, Metcalf CJE, Grenfell BT. Immuno-epidemiology and the predictability of viral evolution. Science 2022; 376:1161-1162. [PMID: 35679395 DOI: 10.1126/science.abn9410] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 11/02/2022]
Abstract
Understanding viral evolution depends on a synthesis of evolutionary biology and immuno-epidemiology.
Affiliation(s)
- Chadi M Saad-Roy
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- C Jessica E Metcalf
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, USA
  - Princeton School of Public and International Affairs, Princeton University, Princeton, NJ, USA
- Bryan T Grenfell
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, USA
  - Princeton School of Public and International Affairs, Princeton University, Princeton, NJ, USA
6
King AA, Lin Q, Ionides EL. Markov genealogy processes. Theor Popul Biol 2022; 143:77-91. [PMID: 34896438 PMCID: PMC8846264 DOI: 10.1016/j.tpb.2021.11.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/26/2021] [Revised: 11/19/2021] [Accepted: 11/22/2021] [Indexed: 02/03/2023]
Abstract
We construct a family of genealogy-valued Markov processes that are induced by a continuous-time Markov population process. We derive exact expressions for the likelihood of a given genealogy conditional on the history of the underlying population process. These lead to a nonlinear filtering equation which can be used to design efficient Monte Carlo inference algorithms. We demonstrate these calculations with several examples. Existing full-information approaches for phylodynamic inference are special cases of the theory.
Affiliation(s)
- Aaron A. King
  - Department of Ecology & Evolutionary Biology, Center for the Study of Complex Systems, Center for Computational Medicine & Biology, and Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI 48109 USA
- Qianying Lin
  - Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI 48109 USA
- Edward L. Ionides
  - Department of Statistics and Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI 48109 USA
7
Ning N, Ionides EL, Ritov Y. Scalable Monte Carlo inference and rescaled local asymptotic normality. Bernoulli 2021. [DOI: 10.3150/20-bej1321] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022]
Affiliation(s)
- Ning Ning
  - Department of Statistics, University of Michigan, Ann Arbor
- Ya’acov Ritov
  - Department of Statistics, University of Michigan, Ann Arbor
8
Wang S, Wang L. Particle Gibbs sampling for Bayesian phylogenetic inference. Bioinformatics 2021; 37:642-649. [PMID: 33045053 DOI: 10.1093/bioinformatics/btaa867] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/16/2020] [Revised: 08/10/2020] [Accepted: 09/24/2020] [Indexed: 11/14/2022] [Open Access]
Abstract
Motivation: The combinatorial sequential Monte Carlo (CSMC) has been demonstrated to be an efficient complementary method to the standard Markov chain Monte Carlo (MCMC) for Bayesian phylogenetic tree inference using biological sequences. It is appealing to combine the CSMC and MCMC in the framework of the particle Gibbs (PG) sampler to jointly estimate the phylogenetic trees and evolutionary parameters. However, the Markov chain of the PG may mix poorly for high-dimensional problems (e.g. phylogenetic trees). Some remedies, including the PG with ancestor sampling and the interacting particle MCMC, have been proposed to improve the PG, but they either cannot be applied to the combinatorial tree space or remain inefficient for it. Results: We introduce a novel CSMC method by proposing a more efficient proposal distribution. It can also be combined into the PG sampler framework to infer parameters in the evolutionary model. The new algorithm can be easily parallelized by allocating samples over different computing cores. We validate that the developed CSMC can sample trees more efficiently in various PG samplers via numerical experiments. Availability and implementation: The implementation of our method and the data underlying this article are available at https://github.com/liangliangwangsfu/phyloPMCMC. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Shijia Wang
  - School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Nankai Qu 300071, China
- Liangliang Wang
  - Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
9
Pokharel G, Deardon R. Emulation-based inference for spatial infectious disease transmission models incorporating event time uncertainty. Scand Stat Theory Appl 2021. [DOI: 10.1111/sjos.12523] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/26/2022]
Affiliation(s)
- Gyanendra Pokharel
  - Mathematics and Statistics, University of Winnipeg, Winnipeg, Manitoba, Canada
- Rob Deardon
  - Production Animal Health & Mathematics and Statistics, University of Calgary, Calgary, Alberta, Canada
10
Henderson D, Zhu S, Cole CB, Lunter G. Demographic inference from multiple whole genomes using a particle filter for continuous Markov jump processes. PLoS One 2021; 16:e0247647. [PMID: 33651801 PMCID: PMC7924771 DOI: 10.1371/journal.pone.0247647] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/04/2020] [Accepted: 02/10/2021] [Indexed: 12/12/2022] [Open Access]
Abstract
Demographic events shape a population's genetic diversity, a process described by the coalescent-with-recombination model that relates demography and genetics by an unobserved sequence of genealogies along the genome. As the space of genealogies over genomes is large and complex, inference under this model is challenging. Formulating the coalescent-with-recombination model as a continuous-time and -space Markov jump process, we develop a particle filter for such processes, and use waypoints that under appropriate conditions allow the problem to be reduced to the discrete-time case. To improve inference, we generalise the Auxiliary Particle Filter for discrete-time models, and use Variational Bayes to model the uncertainty in parameter estimates for rare events, avoiding biases seen with Expectation Maximization. Using real and simulated genomes, we show that past population sizes can be accurately inferred over a larger range of epochs than was previously possible, opening the possibility of jointly analyzing multiple genomes under complex demographic models. Code is available at https://github.com/luntergroup/smcsmc.
Affiliation(s)
- Sha (Joe) Zhu
  - Wellcome Centre for Human Genetics, Oxford, United Kingdom
  - Big Data Institute, Oxford, United Kingdom
- Christopher B. Cole
  - MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, Headington, Oxford, United Kingdom
- Gerton Lunter
  - MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, Headington, Oxford, United Kingdom
  - Department of Epidemiology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
11
Vaughan TG, Leventhal GE, Rasmussen DA, Drummond AJ, Welch D, Stadler T. Estimating Epidemic Incidence and Prevalence from Genomic Data. Mol Biol Evol 2020; 36:1804-1816. [PMID: 31058982 PMCID: PMC6681632 DOI: 10.1093/molbev/msz106] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 11/14/2022] [Open Access]
Abstract
Modern phylodynamic methods interpret an inferred phylogenetic tree as a partial transmission chain providing information about the dynamic process of transmission and removal (where removal may be due to recovery, death, or behavior change). Birth–death and coalescent processes have been introduced to model the stochastic dynamics of epidemic spread under common epidemiological models such as the SIS and SIR models and are successfully used to infer phylogenetic trees together with transmission (birth) and removal (death) rates. These methods either integrate analytically over past incidence and prevalence to infer rate parameters, and thus cannot explicitly infer past incidence or prevalence, or allow such inference only in the coalescent limit of large population size. Here, we introduce a particle filtering framework to explicitly infer prevalence and incidence trajectories along with phylogenies and epidemiological model parameters from genomic sequences and case count data in a manner consistent with the underlying birth–death model. After demonstrating the accuracy of this method on simulated data, we use it to assess the prevalence through time of the early 2014 Ebola outbreak in Sierra Leone.
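The particle-filtering idea mentioned in this abstract can be sketched with a generic bootstrap particle filter. This is not the authors' method, which also conditions on the phylogeny. It is a simplified stand-in that tracks prevalence under a discrete-time birth-death approximation and weights particles by Poisson-distributed case counts. The function names, the parameters (`beta`, `gamma`, `rho`, `i0`), and the example counts are all hypothetical.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw from Poisson(lam) with Knuth's method (fine for the small rates here)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def poisson_logpmf(y, lam):
    """Log of the Poisson probability mass function, handling lam == 0."""
    if lam <= 0.0:
        return 0.0 if y == 0 else -math.inf
    return y * math.log(lam) - lam - math.lgamma(y + 1)


def bootstrap_filter(case_counts, n_particles=500, beta=0.6, gamma=0.4,
                     rho=0.5, i0=10, seed=1):
    """Return (log-likelihood estimate, filtered mean prevalence per time step).

    Latent state : number infected I_t; per step, births ~ Poisson(beta * I_t)
                   and removals ~ Poisson(gamma * I_t), truncated at zero.
    Observation  : case count y_t ~ Poisson(rho * I_t).
    All of these modelling choices are assumptions of this sketch.
    """
    rng = random.Random(seed)
    particles = [i0] * n_particles
    loglik = 0.0
    filtered_means = []
    for y in case_counts:
        # 1. Propagate every particle through the birth-death step.
        proposed = []
        for i in particles:
            births = poisson_sample(rng, beta * i)
            deaths = min(i + births, poisson_sample(rng, gamma * i))
            proposed.append(i + births - deaths)
        # 2. Weight particles by the observation likelihood (in log space).
        logw = [poisson_logpmf(y, rho * i) for i in proposed]
        max_logw = max(logw)
        if max_logw == -math.inf:  # no particle can explain the observation
            return -math.inf, filtered_means
        weights = [math.exp(lw - max_logw) for lw in logw]
        total = sum(weights)
        loglik += max_logw + math.log(total / n_particles)
        filtered_means.append(sum(w * i for w, i in zip(weights, proposed)) / total)
        # 3. Resample in proportion to the weights (multinomial resampling).
        particles = rng.choices(proposed, weights=weights, k=n_particles)
    return loglik, filtered_means


if __name__ == "__main__":
    counts = [6, 7, 9, 10, 13, 15]  # hypothetical weekly case counts
    ll, means = bootstrap_filter(counts)
    print("log-likelihood estimate:", round(ll, 2))
    print("filtered mean prevalence:", [round(m, 1) for m in means])
```

The filtered means give an explicit prevalence trajectory, the kind of quantity that the abstract notes is not available from methods that integrate analytically over past prevalence.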
Affiliation(s)
- Timothy G Vaughan
  - Centre for Computational Evolution, University of Auckland, Auckland, New Zealand
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Gabriel E Leventhal
  - Institute of Integrative Biology, ETH Zürich, Zurich, Switzerland
  - Department of Civil and Environmental Engineering, Massachusetts Institute of Technology (MIT), Cambridge, MA
- David A Rasmussen
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC
- Alexei J Drummond
  - Centre for Computational Evolution, University of Auckland, Auckland, New Zealand
  - School of Computer Science, University of Auckland, Auckland, New Zealand
- David Welch
  - Centre for Computational Evolution, University of Auckland, Auckland, New Zealand
  - School of Computer Science, University of Auckland, Auckland, New Zealand
- Tanja Stadler
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
12
Funk S, King AA. Choices and trade-offs in inference with infectious disease models. Epidemics 2019; 30:100383. [PMID: 32007792 DOI: 10.1016/j.epidem.2019.100383] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 03/25/2019] [Revised: 09/29/2019] [Accepted: 12/11/2019] [Indexed: 12/23/2022] [Open Access]
Abstract
Inference using mathematical models of infectious disease dynamics can be an invaluable tool for the interpretation and analysis of epidemiological data. However, researchers wishing to use this tool are faced with a choice of models and model types, simulation methods, inference methods and software packages. Given the multitude of options, it can be challenging to decide on the best approach. Here, we delineate the choices and trade-offs involved in deciding on an approach for inference, and discuss aspects that might inform this decision. We provide examples of inference with a dataset of influenza cases using the R packages pomp and rbi.
Affiliation(s)
- Sebastian Funk
  - Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London WC1E 7HT, UK
  - Centre for the Mathematical Modelling of Infectious Diseases, London School of Hygiene & Tropical Medicine, London WC1E 7HT, UK
- Aaron A King
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
  - Center for the Study of Complex Systems, University of Michigan, Ann Arbor, MI, USA
  - Department of Mathematics, University of Michigan, Ann Arbor, MI, USA
13
Comparison of catalytic performance of metal-modified SAPO-34: a molecular simulation study. J Mol Model 2019; 25:270. [DOI: 10.1007/s00894-019-4158-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2019] [Accepted: 08/14/2019] [Indexed: 10/26/2022]
14
Bretó C, Ionides EL, King AA. Panel Data Analysis via Mechanistic Models. J Am Stat Assoc 2019; 115:1178-1188. [PMID: 32905476 PMCID: PMC7472993 DOI: 10.1080/01621459.2019.1604367] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 08/14/2018] [Accepted: 03/16/2019] [Indexed: 12/15/2022]
Abstract
Panel data, also known as longitudinal data, consist of a collection of time series. Each time series, which could itself be multivariate, comprises a sequence of measurements taken on a distinct unit. Mechanistic modeling involves writing down scientifically motivated equations describing the collection of dynamic systems giving rise to the observations on each unit. A defining characteristic of panel systems is that the dynamic interaction between units should be negligible. Panel models therefore consist of a collection of independent stochastic processes, generally linked through shared parameters while also having unit-specific parameters. To give the scientist flexibility in model specification, we are motivated to develop a framework for inference on panel data permitting the consideration of arbitrary nonlinear, partially observed panel models. We build on iterated filtering techniques that provide likelihood-based inference on nonlinear partially observed Markov process models for time series data. Our methodology depends on the latent Markov process only through simulation; this plug-and-play property ensures applicability to a large class of models. We demonstrate our methodology on a toy example and two epidemiological case studies. We address inferential and computational issues arising due to the combination of model complexity and dataset size. Supplementary materials for this article are available online.
Affiliation(s)
- Carles Bretó
  - Department of Statistics, University of Michigan, Ann Arbor, MI
  - Departament d’Anàlisi Econòmica, Universitat de València, València, Spain
- Aaron A. King
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI
15
Wang L, Wang S, Bouchard-Côté A. An Annealed Sequential Monte Carlo Method for Bayesian Phylogenetics. Syst Biol 2019; 69:155-183. [DOI: 10.1093/sysbio/syz028] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 06/01/2018] [Revised: 04/12/2019] [Accepted: 04/20/2019] [Indexed: 01/07/2023] [Open Access]
Abstract
We describe an “embarrassingly parallel” method for Bayesian phylogenetic inference, annealed Sequential Monte Carlo (SMC), based on recent advances in the SMC literature such as adaptive determination of annealing parameters. The algorithm provides an approximate posterior distribution over trees and evolutionary parameters as well as an unbiased estimator for the marginal likelihood. This unbiasedness property can be used for the purpose of testing the correctness of posterior simulation software. We evaluate the performance of phylogenetic annealed SMC by reviewing and comparing with other computational Bayesian phylogenetic methods, in particular, different marginal likelihood estimation methods. Unlike previous SMC methods in phylogenetics, our annealed method can utilize standard Markov chain Monte Carlo (MCMC) tree moves and hence benefit from the large inventory of such moves available in the literature. Consequently, the annealed SMC method should be relatively easy to incorporate into existing phylogenetic software packages based on MCMC algorithms. We illustrate our method using simulation studies and real data analysis.
Affiliation(s)
- Liangliang Wang
  - Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada
- Shijia Wang
  - Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada
- Alexandre Bouchard-Côté
  - Department of Statistics, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
16
Volz EM, Siveroni I. Bayesian phylodynamic inference with complex models. PLoS Comput Biol 2018; 14:e1006546. [PMID: 30422979 PMCID: PMC6258546 DOI: 10.1371/journal.pcbi.1006546] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Received: 02/23/2018] [Revised: 11/27/2018] [Accepted: 10/05/2018] [Indexed: 12/20/2022] [Open Access]
Abstract
Population genetic modeling can enhance Bayesian phylogenetic inference by providing a realistic prior on the distribution of branch lengths and times of common ancestry. The parameters of a population genetic model may also have intrinsic importance, and simultaneous estimation of a phylogeny and model parameters has enabled phylodynamic inference of population growth rates, reproduction numbers, and effective population size through time. Phylodynamic inference based on pathogen genetic sequence data has emerged as a useful supplement to epidemic surveillance. However, commonly used mechanistic models that are typically fitted to non-genetic surveillance data are rarely fitted to pathogen genetic data due to a dearth of software tools, and the theory required to conduct such inference has been developed only recently. We present a framework for coalescent-based phylogenetic and phylodynamic inference which enables highly flexible modeling of demographic and epidemiological processes. This approach builds upon previous structured coalescent approaches and includes enhancements for computational speed, accuracy, and stability. A flexible markup language is described for translating parametric demographic or epidemiological models into a structured coalescent model, enabling simultaneous estimation of demographic or epidemiological parameters and time-scaled phylogenies. We demonstrate the utility of these approaches by fitting compartmental epidemiological models to Ebola virus and Influenza A virus sequence data, showing how important features of these epidemics, such as the reproduction number and epidemic curves, can be gleaned from genetic data. These approaches are provided as an open-source package, PhyDyn, for the BEAST2 phylogenetics platform.
Affiliation(s)
- Erik M. Volz
  - Department of Infectious Disease Epidemiology and the MRC Centre for Global Infectious Disease Analysis, Imperial College London, London, United Kingdom
- Igor Siveroni
  - Department of Infectious Disease Epidemiology and the MRC Centre for Global Infectious Disease Analysis, Imperial College London, London, United Kingdom
17
Baele G, Dellicour S, Suchard MA, Lemey P, Vrancken B. Recent advances in computational phylodynamics. Curr Opin Virol 2018; 31:24-32. [PMID: 30248578 DOI: 10.1016/j.coviro.2018.08.009] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 05/14/2018] [Revised: 08/16/2018] [Accepted: 08/20/2018] [Indexed: 01/02/2023]
Abstract
Time-stamped, trait-annotated phylogenetic trees built from virus genome data are increasingly used for outbreak investigation and monitoring ongoing epidemics. This routinely involves reconstructing the spatial and demographic processes from large data sets to help unveil the patterns and drivers of virus spread. Such phylodynamic inferences can however become quite time-consuming as the dimensions of the data increase, which has led to a myriad of approaches that aim to tackle this complexity. To elucidate the current state of the art in the field of phylodynamics, we discuss recent developments in Bayesian inference and accompanying software, highlight methods for improving computational efficiency and relevant visualisation tools. As an alternative to fully Bayesian approaches, we touch upon conditional software pipelines that compromise between statistical coherence and turn-around-time, and we highlight the available software packages. Finally, we outline future directions that may facilitate the large-scale tracking of epidemics in near real time.
Affiliation(s)
- Guy Baele
  - KU Leuven Department of Microbiology and Immunology, Rega Institute, Laboratory of Evolutionary and Computational Virology, Leuven, Belgium
- Simon Dellicour
  - KU Leuven Department of Microbiology and Immunology, Rega Institute, Laboratory of Evolutionary and Computational Virology, Leuven, Belgium
  - Spatial Epidemiology Lab (SpELL), Université Libre de Bruxelles, Bruxelles, Belgium
- Marc A Suchard
  - Department of Biomathematics, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
  - Department of Biostatistics, Fielding School of Public Health, University of California, Los Angeles, CA, USA
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Philippe Lemey
  - KU Leuven Department of Microbiology and Immunology, Rega Institute, Laboratory of Evolutionary and Computational Virology, Leuven, Belgium
- Bram Vrancken
  - KU Leuven Department of Microbiology and Immunology, Rega Institute, Laboratory of Evolutionary and Computational Virology, Leuven, Belgium
18
Abstract
To achieve complete polio eradication, the live oral poliovirus vaccine (OPV) currently used must be phased out after the end of wild poliovirus transmission. However, poorly understood threats may arise when OPV use is stopped. To counter these threats, better models than those currently available are needed. Two articles recently published in BMC Medicine address these issues. Mercer et al. (BMC Med 15:180, 2017) developed a statistical model analysis of polio case data and characteristics of cases occurring in several districts in Pakistan to inform resource allocation decisions. Nevertheless, despite having the potential to accelerate the elimination of polio cases, their analyses are unlikely to advance our understanding of OPV cessation threats. McCarthy et al. (BMC Med 15:175, 2017) explored one such threat, namely the emergence and transmission of serotype 2 circulating vaccine-derived poliovirus (cVDPV2) after OPV2 cessation, and found that the risk of persistent spread of cVDPV2 to new areas increases rapidly 1-5 years after OPV2 cessation. Thus, recently developed models and analysis methods have the potential to guide the required steps to surpass these threats. 'Big data' scientists could help with this; however, datasets covering all eradication efforts should be made readily available. Please see related articles: https://bmcmedicine.biomedcentral.com/articles/10.1186/s12916-017-0937-y and https://bmcmedicine.biomedcentral.com/articles/10.1186/s12916-017-0941-2.
Affiliation(s)
- James S Koopman
  - Department of Epidemiology, 1415 E. Washington Heights, Ann Arbor, MI, 48109, USA
19
Ionides EL, Breto C, Park J, Smith RA, King AA. Monte Carlo profile confidence intervals for dynamic systems. J R Soc Interface 2017; 14:20170126. [PMID: 28679663 PMCID: PMC5550967 DOI: 10.1098/rsif.2017.0126] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 02/20/2017] [Accepted: 06/09/2017] [Indexed: 12/21/2022] [Open Access]
Abstract
Monte Carlo methods to evaluate and maximize the likelihood function enable the construction of confidence intervals and hypothesis tests, facilitating scientific investigation using models for which the likelihood function is intractable. When Monte Carlo error can be made small, by sufficiently exhaustive computation, then the standard theory and practice of likelihood-based inference applies. As datasets become larger, and models more complex, situations arise where no reasonable amount of computation can render Monte Carlo error negligible. We develop profile likelihood methodology to provide frequentist inferences that take into account Monte Carlo uncertainty. We investigate the role of this methodology in facilitating inference for computationally challenging dynamic latent variable models. We present examples arising in the study of infectious disease transmission, demonstrating our methodology for inference on nonlinear dynamic models using genetic sequence data and panel time-series data. We also discuss applicability to nonlinear time-series and spatio-temporal data.
Affiliation(s)
- E L Ionides
  - Department of Statistics, The University of Michigan, Ann Arbor, MI, USA
- C Breto
  - Department of Statistics, The University of Michigan, Ann Arbor, MI, USA
- J Park
  - Department of Statistics, The University of Michigan, Ann Arbor, MI, USA
- R A Smith
  - Department of Bioinformatics, The University of Michigan, Ann Arbor, MI, USA
- A A King
  - Department of Ecology and Evolutionary Biology, The University of Michigan, Ann Arbor, MI, USA
  - Department of Mathematics, The University of Michigan, Ann Arbor, MI, USA