1
Matteson NL, Hassler GW, Kurzban E, Schwab MA, Perkins SA, Gangavarapu K, Levy JI, Parker E, Pride D, Hakim A, De Hoff P, Cheung W, Castro-Martinez A, Rivera A, Veder A, Rivera A, Wauer C, Holmes J, Wilson J, Ngo SN, Plascencia A, Lawrence ES, Smoot EW, Eisner ER, Tsai R, Chacón M, Baer NA, Seaver P, Salido RA, Aigner S, Ngo TT, Barber T, Ostrander T, Fielding-Miller R, Simmons EH, Zazueta OE, Serafin-Higuera I, Sanchez-Alavez M, Moreno-Camacho JL, García-Gil A, Murphy Schafer AR, McDonald E, Corrigan J, Malone JD, Stous S, Shah S, Moshiri N, Weiss A, Anderson C, Aceves CM, Spencer EG, Hufbauer EC, Lee JJ, King AJ, Ramesh KS, Nguyen KN, Saucedo K, Robles-Sikisaka R, Fisch KM, Gonias SL, Birmingham A, McDonald D, Karthikeyan S, Martin NK, Schooley RT, Negrete AJ, Reyna HJ, Chavez JR, Garcia ML, Cornejo-Bravo JM, Becker D, Isaksson M, Washington NL, Lee W, Garfein RS, Luna-Ruiz Esparza MA, Alcántar-Fernández J, Henson B, Jepsen K, Olivares-Flores B, Barrera-Badillo G, Lopez-Martínez I, Ramírez-González JE, Flores-León R, Kingsmore SF, Sanders A, Pradenas A, White B, Matthews G, Hale M, McLawhon RW, Reed SL, Winbush T, McHardy IH, Fielding RA, Nicholson L, Quigley MM, Harding A, Mendoza A, Bakhtar O, Browne SH, Olivas Flores J, Rincon Rodríguez DG, Gonzalez Ibarra M, Robles Ibarra LC, Arellano Vera BJ, Gonzalez Garcia J, Harvey-Vera A, Knight R, Laurent LC, Yeo GW, Wertheim JO, Ji X, Worobey M, Suchard MA, Andersen KG, Campos-Romero A, Wohl S, Zeller M. Genomic surveillance reveals dynamic shifts in the connectivity of COVID-19 epidemics. Cell 2023; 186:5690-5704.e20. [PMID: 38101407 PMCID: PMC10795731 DOI: 10.1016/j.cell.2023.11.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2023] [Revised: 08/21/2023] [Accepted: 11/21/2023] [Indexed: 12/17/2023]
Abstract
The maturation of genomic surveillance in the past decade has enabled tracking of the emergence and spread of epidemics at an unprecedented level. During the COVID-19 pandemic, for example, genomic data revealed that local epidemics varied considerably in the frequency of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) lineage importation and persistence, likely due to a combination of COVID-19 restrictions and changing connectivity. Here, we show that local COVID-19 epidemics are driven by regional transmission, including across international boundaries, but can become increasingly connected to distant locations following the relaxation of public health interventions. By integrating genomic, mobility, and epidemiological data, we find abundant transmission occurring between both adjacent and distant locations, supported by dynamic mobility patterns. We find that changing connectivity significantly influences local COVID-19 incidence. Our findings demonstrate a complex meaning of "local" when investigating connected epidemics and emphasize the importance of collaborative interventions for pandemic prevention and mitigation.
Affiliation(s)
- Gabriel W Hassler
- Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Ezra Kurzban
- Department of Immunology and Microbiology, Scripps Research, La Jolla, CA, USA
- Madison A Schwab
- Department of Immunology and Microbiology, Scripps Research, La Jolla, CA, USA
- Sarah A Perkins
- Department of Immunology and Microbiology, Scripps Research, La Jolla, CA, USA
- Karthik Gangavarapu
- Department of Biomathematics, David Geffen School of Medicine at UCLA, University of California, Los Angeles, Los Angeles, CA, USA; Department of Immunology and Microbiology, Scripps Research, La Jolla, CA, USA
- Joshua I Levy
- Department of Immunology and Microbiology, Scripps Research, La Jolla, CA, USA
- Edyth Parker
- Department of Immunology and Microbiology, Scripps Research, La Jolla, CA, USA
- David Pride
- Department of Pathology, University of California, San Diego, La Jolla, CA, USA; Department of Medicine, University of California, San Diego, La Jolla, CA, USA
- Abbas Hakim
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA; Department of Obstetrics, Gynecology, and Reproductive Sciences, University of California, San Diego, La Jolla, CA, USA; COVID-19 Detection, Investigation, Surveillance, Clinical, and Outbreak Response, California Department of Public Health, Richmond, CA, USA
- Peter De Hoff
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA; Department of Obstetrics, Gynecology, and Reproductive Sciences, University of California, San Diego, La Jolla, CA, USA; COVID-19 Detection, Investigation, Surveillance, Clinical, and Outbreak Response, California Department of Public Health, Richmond, CA, USA
- Willi Cheung
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA; Department of Obstetrics, Gynecology, and Reproductive Sciences, University of California, San Diego, La Jolla, CA, USA; COVID-19 Detection, Investigation, Surveillance, Clinical, and Outbreak Response, California Department of Public Health, Richmond, CA, USA
- Anelizze Castro-Martinez
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA; Department of Obstetrics, Gynecology, and Reproductive Sciences, University of California, San Diego, La Jolla, CA, USA; Sanford Consortium of Regenerative Medicine, University of California, San Diego, La Jolla, CA, USA
- Andrea Rivera
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Anthony Veder
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Ariana Rivera
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Cassandra Wauer
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Jacqueline Holmes
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Jedediah Wilson
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Shayla N Ngo
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Ashley Plascencia
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Elijah S Lawrence
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Elizabeth W Smoot
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Emily R Eisner
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Rebecca Tsai
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Marisol Chacón
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Nathan A Baer
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Phoebe Seaver
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Rodolfo A Salido
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Stefan Aigner
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Toan T Ngo
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Tom Barber
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Tyler Ostrander
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Rebecca Fielding-Miller
- Herbert Wertheim School of Public Health and Human Longevity Science, University of California, San Diego, La Jolla, CA, USA; Division of Infectious Disease and Global Public Health, University of California, San Diego, La Jolla, CA, USA
- Oscar E Zazueta
- Department of Epidemiology, Secretaria de Salud de Baja California, Tijuana, Baja California, Mexico
- Manuel Sanchez-Alavez
- Centro de Diagnostico COVID-19 UABC, Tijuana, Baja California, Mexico; Department of Molecular Medicine, Scripps Research, La Jolla, CA, USA
- Abraham García-Gil
- Clinical Laboratory Department, Salud Digna, A.C, Tijuana, Baja California, Mexico
- Eric McDonald
- County of San Diego Health and Human Services Agency, San Diego, CA, USA
- Jeremy Corrigan
- County of San Diego Health and Human Services Agency, San Diego, CA, USA
- John D Malone
- County of San Diego Health and Human Services Agency, San Diego, CA, USA
- Sarah Stous
- County of San Diego Health and Human Services Agency, San Diego, CA, USA
- Seema Shah
- County of San Diego Health and Human Services Agency, San Diego, CA, USA
- Niema Moshiri
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA
- Alana Weiss
- Department of Immunology and Microbiology, Scripps Research, La Jolla, CA, USA
- Catelyn Anderson
- Department of Immunology and Microbiology, Scripps Research, La Jolla, CA, USA
- Christine M Aceves
- Department of Immunology and Microbiology, Scripps Research, La Jolla, CA, USA
- Emily G Spencer
- Department of Immunology and Microbiology, Scripps Research, La Jolla, CA, USA
- Emory C Hufbauer
- Department of Immunology and Microbiology, Scripps Research, La Jolla, CA, USA
- Justin J Lee
- Department of Immunology and Microbiology, Scripps Research, La Jolla, CA, USA
- Alison J King
- Department of Immunology and Microbiology, Scripps Research, La Jolla, CA, USA
- Karthik S Ramesh
- Department of Immunology and Microbiology, Scripps Research, La Jolla, CA, USA
- Kelly N Nguyen
- Department of Immunology and Microbiology, Scripps Research, La Jolla, CA, USA
- Kieran Saucedo
- Department of Immunology and Microbiology, Scripps Research, La Jolla, CA, USA
- Kathleen M Fisch
- Department of Obstetrics, Gynecology, and Reproductive Sciences, University of California, San Diego, La Jolla, CA, USA; Center for Computational Biology and Bioinformatics, University of California San Diego, La Jolla, CA, USA
- Steven L Gonias
- Department of Pathology, University of California, San Diego, La Jolla, CA, USA
- Amanda Birmingham
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Daniel McDonald
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Smruthi Karthikeyan
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Natasha K Martin
- Division of Infectious Disease and Global Public Health, University of California, San Diego, La Jolla, CA, USA
- Robert T Schooley
- Division of Infectious Disease and Global Public Health, University of California, San Diego, La Jolla, CA, USA
- Agustin J Negrete
- Facultad de Ciencias de la Salud Universidad Autonoma de Baja California Valle de Las Palmas, Tijuana, Baja California, Mexico
- Horacio J Reyna
- Facultad de Ciencias de la Salud Universidad Autonoma de Baja California Valle de Las Palmas, Tijuana, Baja California, Mexico
- Jose R Chavez
- Facultad de Ciencias de la Salud Universidad Autonoma de Baja California Valle de Las Palmas, Tijuana, Baja California, Mexico
- Maria L Garcia
- Facultad de Ciencias de la Salud Universidad Autonoma de Baja California Valle de Las Palmas, Tijuana, Baja California, Mexico
- Jose M Cornejo-Bravo
- Facultad de Ciencias Quimicas e Ingenieria, Universidad Autonoma de Baja California, Tijuana, Baja California, Mexico
- Richard S Garfein
- Herbert Wertheim School of Public Health and Human Longevity Science, University of California, San Diego, La Jolla, CA, USA
- Benjamin Henson
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
- Kristen Jepsen
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
- Beatriz Olivares-Flores
- Instituto de Diagnóstico y Referencia Epidemiológicos (InDRE), Ciudad de México, CDMX, Mexico
- Gisela Barrera-Badillo
- Instituto de Diagnóstico y Referencia Epidemiológicos (InDRE), Ciudad de México, CDMX, Mexico
- Irma Lopez-Martínez
- Instituto de Diagnóstico y Referencia Epidemiológicos (InDRE), Ciudad de México, CDMX, Mexico
- José E Ramírez-González
- Instituto de Diagnóstico y Referencia Epidemiológicos (InDRE), Ciudad de México, CDMX, Mexico
- Rita Flores-León
- Instituto de Diagnóstico y Referencia Epidemiológicos (InDRE), Ciudad de México, CDMX, Mexico
- Alison Sanders
- Return to Learn, University of California, San Diego, La Jolla, CA, USA
- Allorah Pradenas
- Return to Learn, University of California, San Diego, La Jolla, CA, USA
- Benjamin White
- Return to Learn, University of California, San Diego, La Jolla, CA, USA
- Gary Matthews
- Return to Learn, University of California, San Diego, La Jolla, CA, USA
- Matt Hale
- Return to Learn, University of California, San Diego, La Jolla, CA, USA
- Ronald W McLawhon
- Return to Learn, University of California, San Diego, La Jolla, CA, USA
- Sharon L Reed
- Return to Learn, University of California, San Diego, La Jolla, CA, USA
- Terri Winbush
- Return to Learn, University of California, San Diego, La Jolla, CA, USA
- Sara H Browne
- Division of Infectious Disease and Global Public Health, University of California, San Diego, La Jolla, CA, USA; Specialist in Global Health, Encinitas, CA, USA
- Jocelyn Olivas Flores
- Facultad de Ciencias Quimicas e Ingenieria, Universidad Autonoma de Baja California, Tijuana, Baja California, Mexico; University of HealthMx, Tijuana, Baja California, Mexico
- Diana G Rincon Rodríguez
- University of HealthMx, Tijuana, Baja California, Mexico; Facultad de Medicina, Universidad Xochicalco, Tijuana, Baja California, Mexico
- Martin Gonzalez Ibarra
- University of HealthMx, Tijuana, Baja California, Mexico; Facultad de Medicina, Universidad Xochicalco, Tijuana, Baja California, Mexico
- Luis C Robles Ibarra
- University of HealthMx, Tijuana, Baja California, Mexico; Instituto de Seguridad y Servicios Sociales de los Trabajadores del Estado, Tijuana, Baja California, Mexico
- Betsy J Arellano Vera
- University of HealthMx, Tijuana, Baja California, Mexico; Instituto Mexicano del Seguro Social, Tijuana, Baja California, Mexico
- Jonathan Gonzalez Garcia
- University of HealthMx, Tijuana, Baja California, Mexico; SIMNSA, Tijuana, Baja California, Mexico
- Rob Knight
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA; Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA; Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Louise C Laurent
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA; Department of Obstetrics, Gynecology, and Reproductive Sciences, University of California, San Diego, La Jolla, CA, USA; Sanford Consortium of Regenerative Medicine, University of California, San Diego, La Jolla, CA, USA
- Gene W Yeo
- Expedited COVID Identification Environment (EXCITE) Laboratory, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA; Sanford Consortium of Regenerative Medicine, University of California, San Diego, La Jolla, CA, USA; Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA
- Joel O Wertheim
- Department of Medicine, University of California, San Diego, La Jolla, CA, USA
- Xiang Ji
- Department of Mathematics, School of Science and Engineering, Tulane University, New Orleans, LA, USA
- Michael Worobey
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
- Marc A Suchard
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
- Kristian G Andersen
- Department of Immunology and Microbiology, Scripps Research, La Jolla, CA, USA
- Abraham Campos-Romero
- Innovation and Research Department, Salud Digna, A.C, Tijuana, Baja California, Mexico
- Shirlee Wohl
- Department of Immunology and Microbiology, Scripps Research, La Jolla, CA, USA
- Mark Zeller
- Department of Immunology and Microbiology, Scripps Research, La Jolla, CA, USA
2
Van der Roest BR, Bootsma MCJ, Fischer EAJ, Klinkenberg D, Kretzschmar MEE. A Bayesian inference method to estimate transmission trees with multiple introductions; applied to SARS-CoV-2 in Dutch mink farms. PLoS Comput Biol 2023; 19:e1010928. [PMID: 38011266 PMCID: PMC10703282 DOI: 10.1371/journal.pcbi.1010928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 12/07/2023] [Accepted: 11/12/2023] [Indexed: 11/29/2023] Open
Abstract
Knowledge of who infected whom during an outbreak of an infectious disease is important to determine risk factors for transmission and to design effective control measures. Both whole-genome sequencing of pathogens and epidemiological data provide useful information about the transmission events and underlying processes. Existing models to infer transmission trees usually assume that the pathogen is introduced only once from outside into the population of interest. However, this is not always true. For instance, SARS-CoV-2 is suggested to be introduced multiple times in mink farms in the Netherlands from the SARS-CoV-2 pandemic among humans. Here, we developed a Bayesian inference method combining whole-genome sequencing data and epidemiological data, allowing for multiple introductions of the pathogen in the population. Our method does not a priori split the outbreak into multiple phylogenetic clusters, nor does it break the dependency between the processes of mutation, within-host dynamics, transmission, and observation. We implemented our method as an additional feature in the R-package phybreak. On simulated data, our method correctly identifies the number of introductions, with an accuracy depending on the proportion of all observed cases that are introductions. Moreover, when a single introduction was simulated, our method produced similar estimates of parameters and transmission trees as the existing package. When applied to data from a SARS-CoV-2 outbreak in Dutch mink farms, the method provides strong evidence for independent introductions of the pathogen at 13 farms, infecting a total of 63 farms. Using the new feature of the phybreak package, transmission routes of a more complex class of infectious disease outbreaks can be inferred which will aid infection control in future outbreaks.
Affiliation(s)
- Bastiaan R. Van der Roest
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
- Martin C. J. Bootsma
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
- Department of Mathematics, Faculty of Science, Utrecht University, Utrecht, Netherlands
- Egil A. J. Fischer
- Department of Population Health Sciences, Faculty of Veterinary Medicine, Utrecht University, Utrecht, Netherlands
- Don Klinkenberg
- Centre for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, Netherlands
- Mirjam E. E. Kretzschmar
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
- Centre for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, Netherlands
3
Horsfield ST, Tonkin-Hill G, Croucher NJ, Lees JA. Accurate and fast graph-based pangenome annotation and clustering with ggCaller. Genome Res 2023; 33:1622-1637. [PMID: 37620118 PMCID: PMC10620059 DOI: 10.1101/gr.277733.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Accepted: 08/18/2023] [Indexed: 08/26/2023]
Abstract
Bacterial genomes differ in both gene content and sequence mutations, which underlie extensive phenotypic diversity, including variation in susceptibility to antimicrobials or vaccine-induced immunity. To identify and quantify important variants, all genes within a population must be predicted, functionally annotated, and clustered, representing the "pangenome." Despite the volume of genome data available, gene prediction and annotation are currently conducted in isolation on individual genomes, which is computationally inefficient and frequently inconsistent across genomes. Here, we introduce the open-source software graph-gene-caller (ggCaller). ggCaller combines gene prediction, functional annotation, and clustering into a single workflow using population-wide de Bruijn graphs, removing redundancy in gene annotation and resulting in more accurate gene predictions and orthologue clustering. We applied ggCaller to simulated and real-world bacterial data sets containing hundreds or thousands of genomes, comparing it to current state-of-the-art tools. ggCaller has considerable speed-ups with equivalent or greater accuracy, particularly with data sets containing complex sources of error, such as assembly contamination or fragmentation. ggCaller is also an important extension to bacterial genome-wide association studies, enabling querying of annotated graphs for functional analyses. We highlight this application by functionally annotating DNA sequences with significant associations to tetracycline and macrolide resistance in Streptococcus pneumoniae, identifying key resistance determinants that were missed when using only a single reference genome. ggCaller is a novel bacterial genome analysis tool with applications in bacterial evolution and epidemiology.
Affiliation(s)
- Samuel T Horsfield
- MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London W12 0BZ, United Kingdom
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
- Gerry Tonkin-Hill
- Department of Biostatistics, University of Oslo, Blindern, 0372 Oslo, Norway
- Nicholas J Croucher
- MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London W12 0BZ, United Kingdom
- John A Lees
- MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London W12 0BZ, United Kingdom
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
4
Ke Z, Vikalo H. Graph-Based Reconstruction and Analysis of Disease Transmission Networks Using Viral Genomic Data. J Comput Biol 2023. [PMID: 37347892 DOI: 10.1089/cmb.2022.0373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/24/2023] Open
Abstract
Understanding the patterns of viral disease transmissions helps establish public health policies and aids in controlling and ending a disease outbreak. Classical methods for studying disease transmission dynamics that rely on epidemiological data, such as times of sample collection and duration of exposure intervals, struggle to provide desired insight due to limited informativeness of such data. A more precise characterization of disease transmissions may be acquired from sequencing data that reveal genetic distance between viral genomes in patient samples. Indeed, genetic distance between viral strains present in hosts contains valuable information about transmission history, thus motivating the design of methods that rely on genomic data to reconstruct a directed disease transmission network, detect transmission clusters, and identify significant network nodes (e.g., super-spreaders). In this article, we present a novel end-to-end framework for the analysis of viral transmissions utilizing viral genomic (sequencing) data. The proposed framework groups infected hosts into transmission clusters based on the reconstructed viral strains infecting them; the genetic distance between a pair of hosts is calculated using Earth Mover's Distance, and further used to infer transmission direction between the hosts. To quantify the significance of a host in the transmission network, the importance score is calculated by a graph convolutional autoencoder. The viral transmission network is represented by a directed minimum spanning tree utilizing Edmonds' algorithm modified to incorporate constraints on the importance scores of the hosts. The proposed framework outperforms state-of-the-art techniques for the analysis of viral transmission dynamics in several experiments on semiexperimental as well as experimental data.
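As a rough illustration of the directed minimum spanning tree step mentioned in this abstract, the sketch below implements the standard, unmodified Chu-Liu/Edmonds contraction procedure and returns only the total weight of the cheapest arborescence. It is not the authors' code: the importance-score constraints are omitted, and the function name `min_arborescence_cost`, the integer host indices, and the example distances are all hypothetical.

```python
def min_arborescence_cost(n, root, edges):
    """Total weight of a minimum spanning arborescence rooted at `root`.

    Standard Chu-Liu/Edmonds procedure (no importance-score constraints).
    `edges` is a list of (source, target, weight) over nodes 0..n-1.
    Returns None when some node cannot be reached from the root.
    """
    inf = float("inf")
    total = 0
    while True:
        # Step 1: pick the cheapest incoming edge for every node.
        in_w = [inf] * n
        pre = [-1] * n
        for u, v, w in edges:
            if u != v and w < in_w[v]:
                in_w[v], pre[v] = w, u
        if any(v != root and in_w[v] == inf for v in range(n)):
            return None
        in_w[root] = 0

        # Step 2: add those edge weights and look for cycles among them.
        comp = [-1] * n   # contracted-node id for each node
        seen = [-1] * n   # which start node last walked through this node
        count = 0
        for v in range(n):
            total += in_w[v]
            x = v
            while x != root and seen[x] != v and comp[x] == -1:
                seen[x] = v
                x = pre[x]
            if x != root and comp[x] == -1:
                # x lies on a new cycle: give the whole cycle one id.
                y = pre[x]
                while y != x:
                    comp[y] = count
                    y = pre[y]
                comp[x] = count
                count += 1
        if count == 0:
            return total  # chosen edges already form an arborescence

        # Step 3: contract cycles and reduce edge weights, then repeat.
        for v in range(n):
            if comp[v] == -1:
                comp[v] = count
                count += 1
        edges = [(comp[u], comp[v], w - in_w[v])
                 for u, v, w in edges if comp[u] != comp[v]]
        root, n = comp[root], count


# Hypothetical example: host 0 is the assumed index case, and weights
# stand in for pairwise genetic distances between hosts.
example_edges = [(0, 1, 5), (0, 2, 6), (1, 2, 1), (2, 1, 1), (1, 3, 2), (2, 3, 4)]
print(min_arborescence_cost(4, 0, example_edges))
```

In the example, hosts 1 and 2 form a cheap two-node cycle, so the procedure contracts them and then chooses the cheaper way to enter the cycle from host 0.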
Affiliation(s)
- Ziqi Ke
- Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, Texas, USA
- Haris Vikalo
- Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, Texas, USA
5
Tang M, Dudas G, Bedford T, Minin VN. Fitting stochastic epidemic models to gene genealogies using linear noise approximation. Ann Appl Stat 2023. [DOI: 10.1214/21-aoas1583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
Affiliation(s)
- Mingwei Tang
- Department of Statistics, University of Washington, Seattle
- Gytis Dudas
- Gothenburg Global Biodiversity Centre (GGBC)
- Trevor Bedford
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center
6
Beck-Johnson LM, Gorsich EE, Hallman C, Tildesley MJ, Miller RS, Webb CT. An exploration of within-herd dynamics of a transboundary livestock disease: A foot and mouth disease case study. Epidemics 2023; 42:100668. [PMID: 36696830 DOI: 10.1016/j.epidem.2023.100668] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2022] [Revised: 12/20/2022] [Accepted: 01/09/2023] [Indexed: 01/19/2023] Open
Abstract
Transboundary livestock diseases are a high priority for policy makers because of the serious economic burdens associated with infection. In order to make well informed preparedness and response plans, policy makers often utilize mathematical models to understand possible outcomes of different control strategies and outbreak scenarios. Many of these models focus on the transmission between herds and the overall trajectory of the outbreak. While the course of infection within herds has not been the focus of the majority of models, a thorough understanding of within-herd dynamics can provide valuable insight into a disease system by providing information on herd-level biological properties of the infection, which can be used to inform decision making in both endemic and outbreak settings and to inform larger between-herd models. In this study, we develop three stochastic simulation models to study within-herd foot and mouth disease dynamics and the implications of different empirical data-based assumptions about the timing of the onset of infectiousness and clinical signs. We also study the influence of herd size and the proportion of the herd that is initially infected on the outcome of the infection. We find that increasing herd size increases the duration of infectiousness and that the size of the herd plays a more significant role in determining this duration than the number of initially infected cattle in that herd. We also find that the assumptions made regarding the onset of infectiousness and clinical signs, which are based on contradictory empirical findings, can result in the predictions about when infection would be detectable differing by several days. Therefore, the disease progression used to characterize the course of infection in a single bovine host could have significant implications for determining when herds can be detected and subsequently controlled; the timing of which could influence the overall predicted trajectory of outbreaks.
Affiliation(s)
- Erin E Gorsich
- Department of Biology, Colorado State University, United States of America
- Clayton Hallman
- USDA APHIS Veterinary Services, Center for Epidemiology and Animal Health, United States of America
- Michael J Tildesley
- Zeeman Institute for Systems Biology and Infectious Disease Epidemiology Research (SBIDER), School of Life Sciences and Mathematics Institute, University of Warwick, United Kingdom
- Ryan S Miller
- USDA APHIS Veterinary Services, Center for Epidemiology and Animal Health, United States of America
- Colleen T Webb
- Department of Biology, Colorado State University, United States of America
7
Danesh G, Saulnier E, Gascuel O, Choisy M, Alizon S. TiPS: Rapidly simulating trajectories and phylogenies from compartmental models. Methods Ecol Evol 2022. [DOI: 10.1111/2041-210x.14038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/14/2022]
Affiliation(s)
- Gonché Danesh
- MIVEGEC, CNRS, IRD, Université de Montpellier, Montpellier, France
- Emma Saulnier
- MIVEGEC, CNRS, IRD, Université de Montpellier, Montpellier, France
- Marc Choisy
- Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam
- Samuel Alizon
- MIVEGEC, CNRS, IRD, Université de Montpellier, Montpellier, France
- Center for Interdisciplinary Research in Biology (CIRB), College de France, CNRS, INSERM, Université PSL, Paris, France
|
8
|
Zhang Y, Britton T, Zhou X. Monitoring real-time transmission heterogeneity from incidence data. PLoS Comput Biol 2022; 18:e1010078. [PMID: 36455043 DOI: 10.1371/journal.pcbi.1010078] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/07/2022] [Revised: 12/13/2022] [Accepted: 11/16/2022] [Indexed: 12/03/2022] Open
Abstract
The transmission heterogeneity of an epidemic is associated with a complex mixture of host, pathogen and environmental factors. It may indicate superspreading events, which reduce the efficiency of population-level control measures and sustain the epidemic over a larger scale and a longer duration. Methods have been proposed to identify significant transmission heterogeneity in historic epidemics based on several data sources, such as contact history, viral genomes and spatial information, which may not be available and which, more importantly, ignore the temporal trend of transmission heterogeneity. Here we attempted to establish a convenient method to estimate real-time heterogeneity over an epidemic. Within the branching process framework, we introduced an instant-individual heterogeneous infectiousness model to jointly characterize the variation in infectiousness both between individuals and among different times. With this model, we could simultaneously estimate the transmission heterogeneity and the reproduction number from incidence time series. We validated the model with data of both simulated and real outbreaks. Our estimates of the overall and real-time heterogeneities of the six epidemics were consistent with those presented in the literature. Additionally, our model is robust to the ubiquitous bias of under-reporting and misspecification of serial interval. By analyzing recent data from South Africa, we found evidence that Omicron might have more pronounced transmission heterogeneity than Delta. Our model based on incidence data proved reliable in estimating the real-time transmission heterogeneity.
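As an illustration of what transmission heterogeneity means in a branching process, the sketch below draws secondary-case counts from the standard gamma-Poisson (negative binomial) offspring model. This is not the estimator proposed in the paper; the reproduction number `r`, the dispersion parameter `k`, and all function names are assumptions chosen for illustration. A smaller `k` concentrates transmission in fewer cases.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def offspring_counts(n_cases, r, k, seed=0):
    """Secondary cases per case: Poisson rate drawn from Gamma(shape=k, scale=r/k)."""
    rng = random.Random(seed)
    return [poisson(rng, rng.gammavariate(k, r / k)) for _ in range(n_cases)]


def share_from_top(counts, fraction=0.2):
    """Fraction of all transmissions caused by the most infectious `fraction` of cases."""
    ranked = sorted(counts, reverse=True)
    top = ranked[: max(1, int(len(ranked) * fraction))]
    total = sum(ranked)
    return sum(top) / total if total else 0.0
```

Comparing `share_from_top(offspring_counts(20000, 2.0, 0.1))` with the same call at `k = 10.0` shows how strongly the dispersion parameter changes the share of transmission attributable to the most infectious cases, even though the mean number of secondary cases is the same.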
Affiliation(s)
- Yunjun Zhang
- Department of Biostatistics, School of Public Health, Peking University, Beijing, China.,Center for Statistical Science, Peking University, Beijing, China
| | - Tom Britton
- Department of Mathematics, Stockholm University, Stockholm, Sweden
| | - Xiaohua Zhou
- Department of Biostatistics, School of Public Health, Peking University, Beijing, China.,Center for Statistical Science, Peking University, Beijing, China.,Beijing International Center for Mathematical Research, Peking University, Beijing, China.,School of Mathematical Sciences, Peking University, Beijing, China
|
9
|
Optimized phylogenetic clustering of HIV-1 sequence data for public health applications. PLoS Comput Biol 2022; 18:e1010745. [PMID: 36449514 PMCID: PMC9744331 DOI: 10.1371/journal.pcbi.1010745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2022] [Revised: 12/12/2022] [Accepted: 11/17/2022] [Indexed: 12/02/2022] Open
Abstract
Clusters of genetically similar infections suggest rapid transmission and may indicate priorities for public health action or reveal underlying epidemiological processes. However, clusters often require user-defined thresholds and are sensitive to non-epidemiological factors, such as non-random sampling. Consequently, the ideal threshold for public health applications varies substantially across settings. Here, we show a method which selects optimal thresholds for phylogenetic (subset tree) clustering based on population. We evaluated this method on HIV-1 pol datasets (n = 14,221 sequences) from four sites in USA (Tennessee, Washington), Canada (Northern Alberta) and China (Beijing). Clusters were defined by tips descending from an ancestral node (with a minimum bootstrap support of 95%) through a series of branches, each with a length below a given threshold. Next, we used pplacer to graft new cases to the fixed tree by maximum likelihood. We evaluated the effect of varying branch-length thresholds on cluster growth as a count outcome by fitting two Poisson regression models: a null model that predicts growth from cluster size, and an alternative model that includes mean collection date as an additional covariate. The alternative model was favoured by AIC across most thresholds, with optimal (greatest difference in AIC) thresholds ranging from 0.007 to 0.013 across sites. The range of optimal thresholds was more variable when re-sampling 80% of the data by location (IQR 0.008 - 0.016, n = 100 replicates). Our results use prospective phylogenetic cluster growth and suggest that there is more variation in effective thresholds for public health than those typically used in clustering studies.
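The branch-length rule described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the tree is a hypothetical child-to-parent mapping, the bootstrap-support requirement is omitted, and the function and argument names are invented for the example.

```python
from collections import defaultdict


def threshold_clusters(parent, branch_length, tips, threshold):
    """Group tips connected through branches all shorter than `threshold`.

    parent maps each non-root node to its parent; branch_length maps each
    non-root node to the length of the branch above it. Bootstrap support
    is ignored in this sketch.
    """
    def component_root(node):
        # Climb while the branch above this node is short enough to keep.
        while node in parent and branch_length[node] < threshold:
            node = parent[node]
        return node

    groups = defaultdict(list)
    for tip in tips:
        groups[component_root(tip)].append(tip)
    # Tips left on their own are not reported as clusters.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)
```

Sweeping `threshold` over a grid and recording how the clusters change is the first step of the threshold comparison; the growth regression and AIC comparison are a separate stage not shown here.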
|
10
|
Chao E, Chato C, Vender R, Olabode AS, Ferreira RC, Poon AFY. Molecular source attribution. PLoS Comput Biol 2022; 18:e1010649. [PMID: 36395093 PMCID: PMC9671344 DOI: 10.1371/journal.pcbi.1010649] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022] Open
Affiliation(s)
- Elisa Chao
- Department of Pathology and Laboratory Medicine, Western University, London, Ontario, Canada
| | - Connor Chato
- Department of Pathology and Laboratory Medicine, Western University, London, Ontario, Canada
| | - Reid Vender
- Department of Pathology and Laboratory Medicine, Western University, London, Ontario, Canada
- School of Medicine, Queen’s University, Kingston, Ontario, Canada
| | - Abayomi S. Olabode
- Department of Pathology and Laboratory Medicine, Western University, London, Ontario, Canada
| | - Roux-Cil Ferreira
- Department of Pathology and Laboratory Medicine, Western University, London, Ontario, Canada
| | - Art F. Y. Poon
- Department of Pathology and Laboratory Medicine, Western University, London, Ontario, Canada
|
11
|
Skums P, Mohebbi F, Tsyvina V, Baykal PI, Nemira A, Ramachandran S, Khudyakov Y. SOPHIE: Viral outbreak investigation and transmission history reconstruction in a joint phylogenetic and network theory framework. Cell Syst 2022; 13:844-856.e4. [PMID: 36265470 PMCID: PMC9590096 DOI: 10.1016/j.cels.2022.07.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2022] [Revised: 07/05/2022] [Accepted: 07/19/2022] [Indexed: 01/26/2023]
Abstract
Genomic epidemiology is now widely used for viral outbreak investigations. Still, this methodology faces many challenges. First, few methods account for intra-host viral diversity. Second, the maximum parsimony principle continues to be employed for phylogenetic inference of transmission histories, even though maximum likelihood or Bayesian models are usually more consistent. Third, many methods utilize case-specific data, such as sampling times or infection exposure intervals. This impedes the study of persistent infections in vulnerable groups, where such information is of limited use. Finally, most methods implicitly assume that transmission events are independent, although common source outbreaks violate this assumption. We propose a maximum likelihood framework, SOPHIE, based on the integration of phylogenetic and random graph models. It infers transmission networks from viral phylogenies and expected properties of inter-host social networks modeled as random graphs with given expected degree distributions. SOPHIE is scalable, accounts for intra-host diversity, and accurately infers transmissions without case-specific epidemiological data.
Affiliation(s)
- Pavel Skums
- Department of Computer Science, Georgia State University, Atlanta, GA, USA.
| | - Fatemeh Mohebbi
- Department of Computer Science, Georgia State University, Atlanta, GA, USA
| | - Vyacheslav Tsyvina
- Department of Computer Science, Georgia State University, Atlanta, GA, USA
| | - Pelin Icer Baykal
- Department of Biosystems Science & Engineering, ETH Zurich, Basel, Switzerland
| | - Alina Nemira
- Department of Computer Science, Georgia State University, Atlanta, GA, USA
| | - Sumathi Ramachandran
- Division of Viral Hepatitis, Centers for Disease Control and Prevention, Atlanta, GA, USA
| | - Yury Khudyakov
- Division of Viral Hepatitis, Centers for Disease Control and Prevention, Atlanta, GA, USA
|
12
|
Shrestha S, Winglee K, Hill AN, Shaw T, Smith JP, Kammerer JS, Silk BJ, Marks SM, Dowdy D. Model-based Analysis of Tuberculosis Genotype Clusters in the United States Reveals High Degree of Heterogeneity in Transmission and State-level Differences Across California, Florida, New York, and Texas. Clin Infect Dis 2022; 75:1433-1441. [PMID: 35143641 PMCID: PMC9412192 DOI: 10.1093/cid/ciac121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2021] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Reductions in tuberculosis (TB) transmission have been instrumental in lowering TB incidence in the United States. Sustaining and augmenting these reductions are key public health priorities. METHODS We fit mechanistic transmission models to distributions of genotype clusters of TB cases reported to the Centers for Disease Control and Prevention during 2012-2016 in the United States and separately in California, Florida, New York, and Texas. We estimated the mean number of secondary cases generated per infectious case (R0) and individual-level heterogeneity in R0 at state and national levels and assessed how different definitions of clustering affected these estimates. RESULTS In clusters of genotypically linked TB cases that occurred within a state over a 5-year period (reference scenario), the estimated R0 was 0.29 (95% confidence interval [CI], .28-.31) in the United States. Transmission was highly heterogeneous; 0.24% of simulated cases with individual R0 >10 generated 19% of all recent secondary transmissions. R0 estimate was 0.16 (95% CI, .15-.17) when a cluster was defined as cases occurring within the same county over a 3-year period. Transmission varied across states: estimated R0s were 0.34 (95% CI, .3-.4) in California, 0.28 (95% CI, .24-.36) in Florida, 0.19 (95% CI, .15-.27) in New York, and 0.38 (95% CI, .33-.46) in Texas. CONCLUSIONS TB transmission in the United States is characterized by pronounced heterogeneity at the individual and state levels. Improving detection of transmission clusters through incorporation of whole-genome sequencing and identifying the drivers of this heterogeneity will be essential to reducing TB transmission.
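One standard way to relate a cluster size distribution to a reproduction number and a dispersion parameter is the final-size distribution of a subcritical branching process with negative binomial offspring. The sketch below implements that textbook result; it is offered as an assumption about the general approach, not as the authors' fitted model, and the symbols `r` (mean secondary cases) and `k` (dispersion) are illustrative names.

```python
import math


def cluster_size_pmf(j, r, k):
    """P(final cluster size = j) from one index case, for r < 1.

    Offspring counts are negative binomial with mean r and dispersion k.
    Computed on the log scale with lgamma to avoid overflow for large j.
    """
    log_p = (
        math.lgamma(k * j + j - 1)
        - math.lgamma(k * j)
        - math.lgamma(j + 1)
        + (j - 1) * math.log(r / k)
        - (k * j + j - 1) * math.log(1 + r / k)
    )
    return math.exp(log_p)


def mean_cluster_size(r, k, max_size=5000):
    """Numerical mean of the distribution, which should approach 1 / (1 - r)."""
    return sum(j * cluster_size_pmf(j, r, k) for j in range(1, max_size + 1))
```

Fitting `r` and `k` to observed cluster sizes would then amount to maximizing the sum of `log(cluster_size_pmf(size, r, k))` over clusters, under the assumptions that every cluster starts from one index case and is fully observed.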
Affiliation(s)
- Sourya Shrestha
- Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
| | - Kathryn Winglee
- Division of Tuberculosis Elimination, Centers for Disease Control and Prevention, Atlanta, Georgia, USA
| | - Andrew N Hill
- Division of Tuberculosis Elimination, Centers for Disease Control and Prevention, Atlanta, Georgia, USA
| | - Tambi Shaw
- California Department of Public Health, Richmond, California, USA
| | - Jonathan P Smith
- Department of Policy and Administration, Yale University, New Haven, Connecticut, USA
| | - J Steve Kammerer
- Division of Tuberculosis Elimination, Centers for Disease Control and Prevention, Atlanta, Georgia, USA
| | - Benjamin J Silk
- Division of Tuberculosis Elimination, Centers for Disease Control and Prevention, Atlanta, Georgia, USA
| | - Suzanne M Marks
- Division of Tuberculosis Elimination, Centers for Disease Control and Prevention, Atlanta, Georgia, USA
| | - David Dowdy
- Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
|
13
|
Alamil M, Thébaud G, Berthier K, Soubeyrand S. Characterizing viral within-host diversity in fast and non-equilibrium demo-genetic dynamics. Front Microbiol 2022; 13:983938. [PMID: 36274731 PMCID: PMC9581327 DOI: 10.3389/fmicb.2022.983938] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2022] [Accepted: 09/08/2022] [Indexed: 11/13/2022] Open
Abstract
High-throughput sequencing has opened the route for a deep assessment of within-host genetic diversity that can be used, e.g., to characterize microbial communities and to infer transmission links in infectious disease outbreaks. The performance of such characterizations and inferences cannot be analytically assessed in general and is often grounded on computer-intensive evaluations. Being able to simulate within-host genetic diversity across time under various demo-genetic assumptions is therefore paramount to assess the performance of the approaches of interest. In this context, we built an original model that can be simulated to investigate the temporal evolution of genotypes and their frequencies under various demo-genetic assumptions. The model describes the growth and the mutation of genotypes at the nucleotide resolution conditional on an overall within-host viral kinetics, and can be tuned to generate fast non-equilibrium demo-genetic dynamics. We ran simulations of this model and computed classic diversity indices to characterize the temporal variation of within-host genetic diversity (from high-throughput amplicon sequences) of virus populations under three demographic kinetic models of viral infection. Our results highlight how demographic (viral load) and genetic (mutation, selection, or drift) factors drive variations in within-host diversity during the course of an infection. In particular, we observed a non-monotonic relationship between pathogen population size and genetic diversity, and a reduction of the impact of mutation on diversity when a non-specific host immune response is activated. The large variation in the diversity patterns generated in our simulations suggests that the underlying model provides a flexible basis to produce very diverse demo-genetic scenarios and test, for instance, methods for the inference of transmission links during outbreaks.
Affiliation(s)
- Maryam Alamil
- INRAE, BioSP, Avignon, France
- Department of Mathematics and Computer Science, Alfaisal University, Riyadh, Saudi Arabia
- *Correspondence: Maryam Alamil
| | - Gaël Thébaud
- PHIM Plant Health Institute, INRAE, Univ Montpellier, CIRAD, Institut Agro, IRD, Montpellier, France
|
14
|
Lundgren E, Romero-Severson E, Albert J, Leitner T. Combining biomarker and virus phylogenetic models improves HIV-1 epidemiological source identification. PLoS Comput Biol 2022; 18:e1009741. [PMID: 36026480 PMCID: PMC9455879 DOI: 10.1371/journal.pcbi.1009741] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2021] [Revised: 09/08/2022] [Accepted: 08/02/2022] [Indexed: 01/07/2023] Open
Abstract
To identify and stop active HIV transmission chains, new epidemiological techniques are needed. Here, we describe the development of a multi-biomarker augmentation to phylogenetic inference of the underlying transmission history in a local population. HIV biomarkers are measurable biological quantities that have some relationship to the amount of time someone has been infected with HIV. To train our model, we used five biomarkers based on real data from serological assays, HIV sequence data, and target cell counts in longitudinally followed, untreated patients with known infection times. The biomarkers were modeled with a mixed effects framework to allow for patient specific variation and general trends, and fit to patient data using Markov Chain Monte Carlo (MCMC) methods. Subsequently, the density of the unobserved infection time conditional on observed biomarkers was obtained by integrating out the random effects from the model fit. This probabilistic information about infection times was incorporated into the likelihood function for the transmission history and phylogenetic tree reconstruction, informed by the HIV sequence data. To critically test our methodology, we developed a coalescent-based simulation framework that generates phylogenies and biomarkers given a specific or general transmission history. Testing on many epidemiological scenarios showed that biomarker augmented phylogenetics can reach 90% accuracy under idealized situations. Under realistic within-host HIV-1 evolution, involving substantial within-host diversification and frequent transmission of multiple lineages, the average accuracy was at about 50% in transmission clusters involving 5-50 hosts. Realistic biomarker data added on average 16 percentage points over using the phylogeny alone. Using more biomarkers improved the performance.
Shorter temporal spacing between transmission events and increased transmission heterogeneity reduced reconstruction accuracy, but larger clusters were not harder to get right. More sequence data per infected host also improved accuracy. We show that the method is robust to incomplete sampling and that adding biomarkers improves reconstructions of real HIV-1 transmission histories. The technology presented here could allow for better prevention programs by providing data for locally informed and tailored strategies.
Affiliation(s)
- Erik Lundgren
- Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
| | - Ethan Romero-Severson
- Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
| | - Jan Albert
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Department of Clinical Microbiology, Karolinska University Hospital, Stockholm, Sweden
| | - Thomas Leitner
- Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
|
15
|
Abstract
Phylogenetic models have long assumed that lineages diverge independently. Processes of diversification that are of interest in biogeography, epidemiology, and genome evolution violate this assumption by affecting multiple evolutionary lineages. To relax the assumption of independent divergences and infer patterns of divergences predicted by such processes, we introduce a way of conceptualizing, modeling, and inferring phylogenetic trees. We apply the approach to genomic data from geckos distributed across the Philippines and find support for patterns of shared divergences predicted by repeated fragmentation of the archipelago by interglacial rises in sea level. Many processes of biological diversification can simultaneously affect multiple evolutionary lineages. Examples include multiple members of a gene family diverging when a region of a chromosome is duplicated, multiple viral strains diverging at a “super-spreading” event, and a geological event fragmenting whole communities of species. It is difficult to test for patterns of shared divergences predicted by such processes because all phylogenetic methods assume that lineages diverge independently. We introduce a Bayesian phylogenetic approach to relax the assumption of independent, bifurcating divergences by expanding the space of topologies to include trees with shared and multifurcating divergences. This allows us to jointly infer phylogenetic relationships, divergence times, and patterns of divergences predicted by processes of diversification that affect multiple evolutionary lineages simultaneously or lead to more than two descendant lineages. Using simulations, we find that the method accurately infers shared and multifurcating divergence events when they occur and performs as well as current phylogenetic methods when divergences are independent and bifurcating. 
We apply our approach to genomic data from two genera of geckos from across the Philippines to test if past changes to the islands’ landscape caused bursts of speciation. Unlike previous analyses restricted to only pairs of gecko populations, we find evidence for patterns of shared divergences. By generalizing the space of phylogenetic trees in a way that is independent from the likelihood model, our approach opens many avenues for future research into processes of diversification across the life sciences.
|
16
|
Huber JH, Hsiang MS, Dlamini N, Murphy M, Vilakati S, Nhlabathi N, Lerch A, Nielsen R, Ntshalintshali N, Greenhouse B, Perkins TA. Inferring person-to-person networks of Plasmodium falciparum transmission: are analyses of routine surveillance data up to the task? Malar J 2022; 21:58. [PMID: 35189905 PMCID: PMC8860266 DOI: 10.1186/s12936-022-04072-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/17/2021] [Accepted: 01/31/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Inference of person-to-person transmission networks using surveillance data is increasingly used to estimate spatiotemporal patterns of pathogen transmission. Several data types can be used to inform transmission network inferences, yet the sensitivity of those inferences to different data types is not routinely evaluated. METHODS The influence of different combinations of spatial, temporal, and travel-history data on transmission network inferences for Plasmodium falciparum malaria was evaluated. RESULTS The information content of these data types may be limited for inferring person-to-person transmission networks and may lead to an overestimate of transmission. Only when outbreaks were temporally focal or travel histories were accurate was the algorithm able to accurately estimate the reproduction number under control, Rc. Applying this approach to data from Eswatini indicated that inferences of Rc and spatiotemporal patterns therein depend upon the choice of data types and assumptions about travel-history data. CONCLUSIONS These results suggest that transmission network inferences made with routine malaria surveillance data should be interpreted with caution.
Affiliation(s)
- John H Huber
- Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, USA.
| | - Michelle S Hsiang
- Department of Pediatrics, University of Texas Southwestern Medical Center, Dallas, TX, USA.,Malaria Elimination Initiative, Global Health Group, University of California, San Francisco, CA, USA.,Department of Pediatrics, University of California, San Francisco, CA, USA
| | - Nomcebo Dlamini
- National Malaria Elimination Programme, Ministry of Health, Manzini, Eswatini
| | - Maxwell Murphy
- Department of Medicine, University of California, San Francisco, CA, USA
| | | | - Nomcebo Nhlabathi
- National Malaria Elimination Programme, Ministry of Health, Manzini, Eswatini
| | - Anita Lerch
- Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, USA
| | - Rasmus Nielsen
- Department of Integrative Biology and Statistics, University of California, Berkeley, CA, USA
| | | | - Bryan Greenhouse
- Department of Medicine, University of California, San Francisco, CA, USA.,Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - T Alex Perkins
- Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, USA.
|
17
|
Methods Combining Genomic and Epidemiological Data in the Reconstruction of Transmission Trees: A Systematic Review. Pathogens 2022; 11:252. [PMID: 35215195 PMCID: PMC8875843 DOI: 10.3390/pathogens11020252] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2021] [Revised: 02/08/2022] [Accepted: 02/11/2022] [Indexed: 11/17/2022] Open
Abstract
In order to better understand transmission dynamics and appropriately target control and preventive measures, studies have aimed to identify who-infected-whom in actual outbreaks. Numerous reconstruction methods exist, each with its own assumptions, types of data, and inference strategy. Thus, selecting a method can be difficult. Following PRISMA guidelines, we systematically reviewed the literature for methods combining epidemiological and genomic data in transmission tree reconstruction. We identified 22 methods from the 41 selected articles. We defined three families according to how genomic data was handled: a non-phylogenetic family, a sequential phylogenetic family, and a simultaneous phylogenetic family. We discussed methods according to the data needed as well as the underlying sequence mutation, within-host evolution, transmission, and case observation. In the non-phylogenetic family consisting of eight methods, pairwise genetic distances were estimated. In the phylogenetic families, transmission trees were inferred from phylogenetic trees either simultaneously (nine methods) or sequentially (five methods). While a majority of methods (17/22) modeled the transmission process, few (8/22) took into account imperfect case detection. Within-host evolution was generally (7/8) modeled as a coalescent process. These practical and theoretical considerations were highlighted in order to help select the appropriate method for an outbreak.
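The pairwise genetic distances used by the non-phylogenetic family can be as simple as a count of differing sites between aligned sequences. The following is a minimal sketch of that idea, assuming pre-aligned sequences of equal length and treating `N` and gap characters as missing data; it does not reproduce any specific method from the review, and the function names are illustrative.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b, ignore="N-"):
    """Count sites that differ between two aligned sequences, skipping missing data."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        a != b
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a not in ignore and b not in ignore
    )


def pairwise_distances(aligned):
    """Distance for every unordered pair of case identifiers in `aligned`."""
    return {
        (x, y): snp_distance(aligned[x], aligned[y])
        for x, y in combinations(sorted(aligned), 2)
    }
```

For example, `pairwise_distances({"case1": "ACGTAC", "case2": "ACGTTC"})` returns `{("case1", "case2"): 1}`. Methods in this family typically feed such distances into a likelihood or scoring function together with epidemiological data such as onset dates.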
|
18
|
Zarebski AE, du Plessis L, Parag KV, Pybus OG. A computationally tractable birth-death model that combines phylogenetic and epidemiological data. PLoS Comput Biol 2022; 18:e1009805. [PMID: 35148311 PMCID: PMC8903285 DOI: 10.1371/journal.pcbi.1009805] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2021] [Revised: 03/08/2022] [Accepted: 01/05/2022] [Indexed: 11/19/2022] Open
Abstract
Inferring the dynamics of pathogen transmission during an outbreak is an important problem in infectious disease epidemiology. In mathematical epidemiology, estimates are often informed by time series of confirmed cases, while in phylodynamics genetic sequences of the pathogen, sampled through time, are the primary data source. Each type of data provides different, and potentially complementary, insight. Recent studies have recognised that combining data sources can improve estimates of the transmission rate and the number of infected individuals. However, inference methods are typically highly specialised and field-specific and are either computationally prohibitive or require intensive simulation, limiting their real-time utility. We present a novel birth-death phylogenetic model and derive a tractable analytic approximation of its likelihood, the computational complexity of which is linear in the size of the dataset. This approach combines epidemiological and phylodynamic data to produce estimates of key parameters of transmission dynamics and the unobserved prevalence. Using simulated data, we (a) show that the approximation agrees well with existing methods, (b) validate the claim of linear complexity and (c) explore robustness to model misspecification. This approximation facilitates inference on large datasets, which is increasingly important as large genomic sequence datasets become commonplace.
Affiliation(s)
| | - Louis du Plessis
- Department of Zoology, University of Oxford, Oxford, United Kingdom
| | - Kris Varun Parag
- MRC Centre for Global Infectious Disease Analysis, Imperial College London, London, United Kingdom
|
19
|
Gallagher SK, Follmann D. Branching Process Models to Identify Risk Factors for Infectious Disease Transmission. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2021.2000871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
Affiliation(s)
- Shannon K. Gallagher
- Biostatistics Research Branch, National Institute of Allergy and Infectious Diseases, Rockville, MD
| | - Dean Follmann
- Biostatistics Research Branch, National Institute of Allergy and Infectious Diseases, Rockville, MD
|
20
|
Ribado JV, Li NJ, Thiele E, Lyons H, Cotton JA, Weiss A, Tchindebet Ouakou P, Moundai T, Zirimwabagabo H, Guagliardo SAJ, Chabot-Couture G, Proctor JL. Linked surveillance and genetic data uncovers programmatically relevant geographic scale of Guinea worm transmission in Chad. PLoS Negl Trop Dis 2021; 15:e0009609. [PMID: 34310598 PMCID: PMC8341693 DOI: 10.1371/journal.pntd.0009609] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2020] [Revised: 08/05/2021] [Accepted: 06/29/2021] [Indexed: 11/25/2022] Open
Abstract
Background Guinea worm (Dracunculus medinensis) was detected in Chad in 2010 after a supposed ten-year absence, posing a challenge to the global eradication effort. Initiation of a village-based surveillance system in 2012 revealed a substantial number of dogs infected with Guinea worm, raising questions about paratenic hosts and cross-species transmission. Methodology/principal findings We coupled genomic and surveillance case data from 2012-2018 to investigate the modes of transmission between dog and human hosts and the geographic connectivity of worms. Eighty-six variants across four genes in the mitochondrial genome identified 41 genetically distinct worm genotypes. Spatiotemporal modeling revealed worms with the same genotype (‘genetically identical’) were within a median range of 18.6 kilometers of each other, but largely within approximately 50 kilometers. Genetically identical worms varied in their degree of spatial clustering, suggesting there may be different factors that favor or constrain transmission. Each worm was surrounded by five to ten genetically distinct worms within a 50 kilometer radius. As expected, we observed a change in the genetic similarity distribution between pairs of worms using variants across the complete mitochondrial genome in an independent population. Conclusions/significance In the largest study linking genetic and surveillance data to date of Guinea worm cases in Chad, we show genetic identity and modeling can facilitate the understanding of local transmission. The co-occurrence of genetically non-identical worms in quantitatively identified transmission ranges highlights the necessity for genomic tools to link cases. The improved discrimination between pairs of worms from variants identified across the complete mitochondrial genome suggests that expanding the number of genomic markers could link cases at a finer scale. 
These results suggest that scaling up genomic surveillance for Guinea worm may provide additional value for programmatic decision-making critical for monitoring cases and intervention efficacy to achieve elimination. The global eradication effort for Guinea worm disease has dramatically decreased the global burden of the disease and enabled 187 countries to be certified by the World Health Organization to be free of endemic transmission. Despite this progress, several countries continue to have endemic transmission. In Chad, a long absence of reported cases was interrupted with the identification of new Guinea worm cases, prompting a substantial scale up of surveillance and intervention efforts. Here, we study the value of increasing genomic surveillance as a tool for programmatic evaluation of surveillance and intervention efforts in Chad. Linking surveillance and genomic samples, parsimonious spatial models help reveal a consistent geographic clustering of similar genetic sequences across Chad. We also demonstrate that expanding the sequencing can offer better resolution for distinguishing Guinea worm samples. In this retrospective study, we found evidence that scaling up genomic surveillance can be an important monitoring and evaluation tool for the eradication program in Chad.
Affiliation(s)
- Jessica V. Ribado
  - Institute for Disease Modeling, Global Health Division of the Bill and Melinda Gates Foundation, Seattle, Washington, United States of America
- Nancy J. Li
  - Institute for Disease Modeling, Global Health Division of the Bill and Melinda Gates Foundation, Seattle, Washington, United States of America
- Elizabeth Thiele
  - Vassar College, Poughkeepsie, New York, United States of America
- Hil Lyons
  - Institute for Disease Modeling, Global Health Division of the Bill and Melinda Gates Foundation, Seattle, Washington, United States of America
- James A. Cotton
  - Wellcome Sanger Institute, Hinxton, Cambridgeshire, United Kingdom
- Adam Weiss
  - The Carter Center, Atlanta, Georgia, United States of America
- Tchonfienet Moundai
  - National Guinea Worm Eradication Program, Ministry of Public Health, N’Djamena, Chad
- Sarah Anne J. Guagliardo
  - The Carter Center, Atlanta, Georgia, United States of America
  - Centers for Disease Control and Prevention, Atlanta, Georgia, United States of America
- Guillaume Chabot-Couture
  - Institute for Disease Modeling, Global Health Division of the Bill and Melinda Gates Foundation, Seattle, Washington, United States of America
- Joshua L. Proctor
  - Institute for Disease Modeling, Global Health Division of the Bill and Melinda Gates Foundation, Seattle, Washington, United States of America
|
21
|
Robert A, Funk S, Kucharski AJ. o2geosocial: Reconstructing who-infected-whom from routinely collected surveillance data. F1000Res 2021; 10:31. [PMID: 36998981 PMCID: PMC10044721.2 DOI: 10.12688/f1000research.28073.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/28/2021] [Indexed: 11/20/2022] Open
Abstract
Reconstructing the history of individual transmission events between cases is key to understanding what factors facilitate the spread of an infectious disease. Since conducting extended contact-tracing investigations can be logistically challenging and costly, statistical inference methods have been developed to reconstruct transmission trees from onset dates and genetic sequences. However, these methods are not as effective if the mutation rate of the virus is very slow, or if sequencing data is sparse. We developed the package o2geosocial to combine variables from routinely collected surveillance data with a simple transmission process model. The model reconstructs transmission trees when full genetic sequences are unavailable, or uninformative. Our model incorporates the reported age-group, onset date, location and genotype of infected cases to infer probabilistic transmission trees. The package also includes functions to summarise and visualise the inferred cluster size distribution. The results generated by o2geosocial can highlight regions where importations repeatedly caused large outbreaks, which may indicate a higher regional susceptibility to infections. It can also be used to generate the individual number of secondary transmissions, and show the features associated with individuals involved in high transmission events. The package is available for download from the Comprehensive R Archive Network (CRAN) and GitHub.
Affiliation(s)
- Alexis Robert
  - Centre for the Mathematical Modelling of Infectious Diseases, London School of Hygiene & Tropical Medicine, London, WC1E 7HT, UK
  - Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, WC1E 7HT, UK
- Sebastian Funk
  - Centre for the Mathematical Modelling of Infectious Diseases, London School of Hygiene & Tropical Medicine, London, WC1E 7HT, UK
  - Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, WC1E 7HT, UK
- Adam J Kucharski
  - Centre for the Mathematical Modelling of Infectious Diseases, London School of Hygiene & Tropical Medicine, London, WC1E 7HT, UK
  - Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, WC1E 7HT, UK
|
22
|
Dawson D, Rasmussen D, Peng X, Lanzas C. Inferring environmental transmission using phylodynamics: a case-study using simulated evolution of an enteric pathogen. J R Soc Interface 2021; 18:20210041. [PMID: 34102084 DOI: 10.1098/rsif.2021.0041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
Indirect (environmental) and direct (host-host) transmission pathways cannot easily be distinguished when they co-occur in epidemics, particularly when they occur on similar time scales. Phylodynamic reconstruction is a potential approach to this problem that combines epidemiological information (temporal, spatial information) with pathogen whole-genome sequencing data to infer transmission trees of epidemics. However, factors such as differences in mutation and transmission rates between host and non-host environments may obscure phylogenetic inference from these methods. In this study, we used a network-based transmission model that explicitly models pathogen evolution to simulate epidemics with both direct and indirect transmission. Epidemics were simulated according to factorial combinations of direct/indirect transmission proportions, host mutation rates and conditions of environmental pathogen growth. Transmission trees were then reconstructed using the phylodynamic approach SCOTTI (structured coalescent transmission tree inference) and evaluated. We found that although insufficient diversity sets a lower bound on when accurate phylodynamic inferences can be made, transmission routes and assumed pathogen lifestyle affected pathogen population structure and subsequently influenced both reconstruction success and the likelihood of direct versus indirect pathways being reconstructed. We conclude that prior knowledge of the likely ecology and population structure of pathogens in host and non-host environments is critical to fully using phylodynamic techniques.
Affiliation(s)
- Daniel Dawson
  - Department of Population Health and Pathobiology, College of Veterinary Medicine, North Carolina State University, Raleigh, NC, USA
- David Rasmussen
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
  - Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC, USA
- Xinxia Peng
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
  - Department of Molecular Biomedical Sciences, College of Veterinary Medicine, North Carolina State University, Raleigh, NC, USA
- Cristina Lanzas
  - Department of Population Health and Pathobiology, College of Veterinary Medicine, North Carolina State University, Raleigh, NC, USA
|
23
|
Yamamoto T, Sawai K, Nishi T, Fukai K, Kato T, Hayama Y, Murato Y, Shimizu Y, Yamaguchi E. Subgrouping and analysis of relationships between classical swine fever virus identified during the 2018-2020 epidemic in Japan by a novel approach using shared genomic variants. Transbound Emerg Dis 2021; 69:1166-1177. [PMID: 33730417 DOI: 10.1111/tbed.14076] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/14/2021] [Revised: 03/01/2021] [Accepted: 03/15/2021] [Indexed: 11/29/2022]
Abstract
Classical swine fever (CSF) is a worldwide devastating disease of the pig industry caused by classical swine fever virus (CSFV). In September 2018, an outbreak of CSF occurred in Japan where the disease had been eradicated and was officially designated a CSF-free country since 2015. Following the detection of the first 2018 case on a farm in Gifu Prefecture, the disease spread among both farm pigs and wild boars and still continues. Epigenome analysis using whole-genome information is helpful in identifying the infection route, but the current approaches provide an insufficient resolution. In this study, a novel method of using single-nucleotide variants (SNVs) was employed to identify the associations among 158 isolates (65 from farms and 93 from wild boars). The identified groups of CSFV strains were plotted in different colours on a map, identifying the location where each strain was collected. The lack of an SNV set shared between the index case and the other strains suggested the first infection in Japan during the outbreak occurred in wild boars, not at the index farm. For the Atsumi Peninsula outbreaks, where nine farms were found infected within a 10-km radius area, the farm strains were assembled into three groups, suggesting these outbreaks resulted from at least three different infection events in this area. For the infections in the area around Saitama Prefecture, an area remote from the epicentre, strains from both the farms and wild boars were identified as being in the same group, suggesting they resulted from one viral introduction. Likewise, seven infected farms in Okinawa Prefecture, almost 1,500 km from Gifu Prefecture, were identified as being in a common, but separate group. By demonstrating the variety of transmission routes and possibility of long-distance infection, these results will help improve disease control measures.
Affiliation(s)
- Takehisa Yamamoto
  - Epidemiology Unit, Viral Disease and Epidemiology Research Division, National Institute of Animal Health, National Agriculture and Food Research Organization, Ibaraki, Japan
- Kotaro Sawai
  - Epidemiology Unit, Viral Disease and Epidemiology Research Division, National Institute of Animal Health, National Agriculture and Food Research Organization, Ibaraki, Japan
- Tatsuya Nishi
  - Foot and Mouth Disease Unit, Division of Transboundary Animal Diseases, National Institute of Animal Health, National Agriculture and Food Research Organization, Kodaira, Japan
- Katsuhiko Fukai
  - Foot and Mouth Disease Unit, Division of Transboundary Animal Diseases, National Institute of Animal Health, National Agriculture and Food Research Organization, Kodaira, Japan
- Tomoko Kato
  - Foot and Mouth Disease Unit, Division of Transboundary Animal Diseases, National Institute of Animal Health, National Agriculture and Food Research Organization, Kodaira, Japan
- Yoko Hayama
  - Epidemiology Unit, Viral Disease and Epidemiology Research Division, National Institute of Animal Health, National Agriculture and Food Research Organization, Ibaraki, Japan
- Yoshinori Murato
  - Epidemiology Unit, Viral Disease and Epidemiology Research Division, National Institute of Animal Health, National Agriculture and Food Research Organization, Ibaraki, Japan
- Yumiko Shimizu
  - Epidemiology Unit, Viral Disease and Epidemiology Research Division, National Institute of Animal Health, National Agriculture and Food Research Organization, Ibaraki, Japan
- Emi Yamaguchi
  - Epidemiology Unit, Viral Disease and Epidemiology Research Division, National Institute of Animal Health, National Agriculture and Food Research Organization, Ibaraki, Japan
|
24
|
Didelot X, Kendall M, Xu Y, White PJ, McCarthy N. Genomic Epidemiology Analysis of Infectious Disease Outbreaks Using TransPhylo. Curr Protoc 2021; 1:e60. [PMID: 33617114 PMCID: PMC7995038 DOI: 10.1002/cpz1.60] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Indexed: 12/14/2022]
Abstract
Comparing the pathogen genomes from several cases of an infectious disease has the potential to help us understand and control outbreaks. Many methods exist to reconstruct a phylogeny from such genomes, which represents how the genomes are related to one another. However, such a phylogeny is not directly informative about transmission events between individuals. TransPhylo is a software tool implemented as an R package designed to bridge the gap between pathogen phylogenies and transmission trees. TransPhylo is based on a combined model of transmission between hosts and pathogen evolution within each host. It can simulate both phylogenies and transmission trees jointly under this combined model. TransPhylo can also reconstruct a transmission tree based on a dated phylogeny, by exploring the space of transmission trees compatible with the phylogeny. A transmission tree can be represented as a coloring of a phylogeny where each color represents a different host of the pathogen, and TransPhylo provides convenient ways to plot these colorings and explore the results. This article presents the basic protocols that can be used to make the most of TransPhylo. © 2021 The Authors. Basic Protocol 1: First steps with TransPhylo Basic Protocol 2: Simulation of outbreak data Basic Protocol 3: Inference of transmission Basic Protocol 4: Exploring the results of inference.
Affiliation(s)
- Xavier Didelot
  - School of Life Sciences and Department of Statistics, University of Warwick, United Kingdom
- Michelle Kendall
  - School of Life Sciences and Department of Statistics, University of Warwick, United Kingdom
- Yuanwei Xu
  - Center for Computational Biology, Institute of Cancer and Genomic Sciences, University of Birmingham, United Kingdom
- Peter J. White
  - Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, United Kingdom
  - Medical Research Council Centre for Global Infectious Disease Analysis, School of Public Health, Imperial College London, United Kingdom
  - National Institute for Health Research Health Protection Research Unit in Modelling and Health Economics, School of Public Health, Imperial College London, United Kingdom
  - Modelling and Economics Unit, National Infection Service, Public Health England, London, United Kingdom
- Noel McCarthy
  - Warwick Medical School, University of Warwick, United Kingdom
|
25
|
Collienne L, Gavryushkin A. Computing nearest neighbour interchange distances between ranked phylogenetic trees. J Math Biol 2021; 82:8. [PMID: 33492606 PMCID: PMC7835203 DOI: 10.1007/s00285-021-01567-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/20/2020] [Revised: 10/20/2020] [Accepted: 01/07/2021] [Indexed: 11/26/2022]
Abstract
Many popular algorithms for searching the space of leaf-labelled (phylogenetic) trees are based on tree rearrangement operations. Under any such operation, the problem is reduced to searching a graph where vertices are trees and (undirected) edges are given by pairs of trees connected by one rearrangement operation (sometimes called a move). Most popular are the classical nearest neighbour interchange, subtree prune and regraft, and tree bisection and reconnection moves. The problem of computing distances, however, is NP-hard in each of these graphs, making tree inference and comparison algorithms challenging to design in practice. Although ranked phylogenetic trees are one of the central objects of interest in applications such as cancer research, immunology, and epidemiology, the computational complexity of the shortest path problem for these trees remained unsolved for decades. In this paper, we settle this problem for the ranked nearest neighbour interchange operation by establishing that the complexity depends on the weight difference between the two types of tree rearrangements (rank moves and edge moves), and varies from quadratic, which is the lowest possible complexity for this problem, to NP-hard, which is the highest. In particular, our result provides the first example of a phylogenetic tree rearrangement operation for which shortest paths, and hence the distance, can be computed efficiently. Specifically, our algorithm scales to trees with tens of thousands of leaves (and likely hundreds of thousands if implemented efficiently).
Affiliation(s)
- Lena Collienne
  - Department of Computer Science, University of Otago, Dunedin, New Zealand
- Alex Gavryushkin
  - Department of Computer Science, University of Otago, Dunedin, New Zealand
|
26
|
Knyazev S, Hughes L, Skums P, Zelikovsky A. Epidemiological data analysis of viral quasispecies in the next-generation sequencing era. Brief Bioinform 2021; 22:96-108. [PMID: 32568371 PMCID: PMC8485218 DOI: 10.1093/bib/bbaa101] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 12/23/2019] [Revised: 04/24/2020] [Accepted: 05/04/2020] [Indexed: 01/04/2023] Open
Abstract
The unprecedented coverage offered by next-generation sequencing (NGS) technology has facilitated the assessment of the population complexity of intra-host RNA viral populations at an unprecedented level of detail. Consequently, analysis of NGS datasets could be used to extract and infer crucial epidemiological and biomedical information on the levels of both infected individuals and susceptible populations, thus enabling the development of more effective prevention strategies and antiviral therapeutics. Such information includes drug resistance, infection stage, transmission clusters and structures of transmission networks. However, NGS data require sophisticated analysis dealing with millions of error-prone short reads per patient. Prior to the NGS era, epidemiological and phylogenetic analyses were geared toward Sanger sequencing technology; now, they must be redesigned to handle the large-scale NGS datasets and properly model the evolution of heterogeneous rapidly mutating viral populations. Additionally, dedicated epidemiological surveillance systems require big data analytics to handle millions of reads obtained from thousands of patients for rapid outbreak investigation and management. We survey bioinformatics tools analyzing NGS data for (i) characterization of intra-host viral population complexity including single nucleotide variant and haplotype calling; (ii) downstream epidemiological analysis and inference of drug-resistant mutations, age of infection and linkage between patients; and (iii) data collection and analytics in surveillance systems for fast response and control of outbreaks.
|
27
|
Identifying likely transmissions in Mycobacterium bovis infected populations of cattle and badgers using the Kolmogorov Forward Equations. Sci Rep 2020; 10:21980. [PMID: 33319838 PMCID: PMC7738532 DOI: 10.1038/s41598-020-78900-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2020] [Accepted: 11/20/2020] [Indexed: 11/16/2022] Open
Abstract
Established methods for whole-genome-sequencing (WGS) technology allow for the detection of single-nucleotide polymorphisms (SNPs) in the pathogen genomes sourced from host samples. The information obtained can be used to track the pathogen’s evolution in time and potentially identify ‘who-infected-whom’ with unprecedented accuracy. Successful methods include ‘phylodynamic approaches’ that integrate evolutionary and epidemiological data. However, they are typically computationally intensive, require extensive data, and are best applied when there is a strong molecular clock signal and substantial pathogen diversity. To determine how much transmission information can be inferred when pathogen genetic diversity is low and metadata limited, we propose an analytical approach that combines pathogen WGS data and sampling times from infected hosts. It accounts for ‘between-scale’ processes, in particular within-host pathogen evolution and between-host transmission. We applied this to a well-characterised population with an endemic Mycobacterium bovis (the causative agent of bovine/zoonotic tuberculosis, bTB) infection. Our results show that, even with such limited data and low diversity, the computation of the transmission probability between host pairs can help discriminate between likely and unlikely infection pathways and therefore help to identify potential transmission networks. However, the method can be sensitive to assumptions about within-host evolution.
|
28
|
Szabó PM, Szalay D, Kecskeméti S, Molnár T, Szabó I, Bálint Á. Investigations on spreading of PRRSV among swine herds by improved minimum spanning network analysis. Sci Rep 2020; 10:19217. [PMID: 33154401 PMCID: PMC7645787 DOI: 10.1038/s41598-020-75516-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/12/2019] [Accepted: 09/07/2020] [Indexed: 11/09/2022] Open
Abstract
In Hungary, the economic losses caused by porcine reproductive and respiratory syndrome virus (PRRSV) led to the launching of a national PRRSV Eradication Program. An important element of the program was investigating the spread of PRRSV among swine herds and the possible ways of introduction by sequencing of the open reading frame 5 (ORF5) gene. However, the classical phylogenetic tree presentation cannot explain several genetic relationships clearly, while more precise visualization can be represented by network tree diagram. In this paper, we describe a practical and easy-to-follow enriched minimum spanning similarity network application for improved representation of phylogenetic relations among viral strains. This method eliminated the necessity of applying a predefined, arbitrary cut-off or computationally extensive algorithms. The network-based visualization allowed processing and visualizing large amount of data equally for the laboratory, private and official veterinarians, and helped identify the potential connections between different viral sequences that support data-driven decisions in the eradication program. By applying network analysis, previously unknown epidemiological connections between infected herds were identified, and virus spreading was analyzed within short period of time. In our study, we successfully built and applied network analysis tools in the course of the Hungarian PRRSV Eradication Program.
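The core idea behind a minimum spanning network can be sketched as follows. This is a plain minimum spanning tree built with Prim's algorithm over Hamming distances, not the enriched similarity network the abstract describes; the function names and the assumption that the ORF5 sequences are already aligned to equal length are illustrative choices made here.

```python
def hamming(a, b):
    """Number of mismatching positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(seqs):
    """Prim's algorithm over pairwise Hamming distances.

    seqs: dict mapping a sample name to its aligned sequence (hypothetical input).
    Returns a list of (source, target, distance) edges.
    """
    names = list(seqs)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the cheapest edge leaving the current tree; ties break on names.
        dist, u, v = min(
            (hamming(seqs[u], seqs[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, dist))
        in_tree.add(v)
    return edges
```

A spanning tree keeps exactly one of several equally short links, so alternative connections between near-identical strains are lost. That limitation is one reason a network with additional edges may be preferred for display.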
Affiliation(s)
- Péter Márton Szabó
  - Hungarian Academy of Sciences and Semmelweis University, Szigony u. 43., Budapest, 1083, Hungary
- Dóra Szalay
  - Department of Virology, National Food Chain Safety Office Veterinary Diagnostic Directorate, Tabornok u. 2., Budapest, 1143, Hungary
- Sándor Kecskeméti
  - Department of Virology, National Food Chain Safety Office Veterinary Diagnostic Directorate, Tabornok u. 2., Budapest, 1143, Hungary
- Tamás Molnár
  - National PRRS Eradication Committee, Keleti Karoly u. 24., Budapest, 1024, Hungary
- István Szabó
  - National PRRS Eradication Committee, Keleti Karoly u. 24., Budapest, 1024, Hungary
- Ádám Bálint
  - Department of Virology, National Food Chain Safety Office Veterinary Diagnostic Directorate, Tabornok u. 2., Budapest, 1143, Hungary
|
29
|
Montazeri H, Little S, Legha MM, Beerenwinkel N, DeGruttola V. Bayesian reconstruction of transmission trees from genetic sequences and uncertain infection times. Stat Appl Genet Mol Biol 2020; 19:/j/sagmb.ahead-of-print/sagmb-2019-0026/sagmb-2019-0026.xml. [PMID: 33085643 PMCID: PMC8212962 DOI: 10.1515/sagmb-2019-0026] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/20/2019] [Accepted: 09/16/2020] [Indexed: 11/15/2022]
Abstract
Genetic sequence data of pathogens are increasingly used to investigate transmission dynamics in both endemic diseases and disease outbreaks. Such research can aid in the development of appropriate interventions and in the design of studies to evaluate them. Several computational methods have been proposed to infer transmission chains from sequence data; however, existing methods do not generally reliably reconstruct transmission trees because genetic sequence data or inferred phylogenetic trees from such data contain insufficient information for accurate estimation of transmission chains. Here, we show by simulation studies that incorporating infection times, even when they are uncertain, can greatly improve the accuracy of reconstruction of transmission trees. To achieve this improvement, we propose a Bayesian inference method using Markov chain Monte Carlo that directly draws samples from the space of transmission trees under the assumption of complete sampling of the outbreak. The likelihood of each transmission tree is computed by a phylogenetic model by treating its internal nodes as transmission events. By a simulation study, we demonstrate that accuracy of the reconstructed transmission trees depends mainly on the amount of information available on times of infection; we show superiority of the proposed method to two alternative approaches when infection times are known up to specified degrees of certainty. In addition, we illustrate the use of a multiple imputation framework to study features of epidemic dynamics, such as the relationship between characteristics of nodes and average number of outbound edges or inbound edges, signifying possible transmission events from and to nodes. We apply the proposed method to a transmission cluster in San Diego and to a dataset from the 2014 Sierra Leone Ebola virus outbreak and investigate the impact of biological, behavioral, and demographic factors.
Affiliation(s)
- Hesam Montazeri
  - Department of Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Susan Little
  - Department of Medicine, University of California San Diego, California, USA
- Mozhgan Mozaffari Legha
  - Department of Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Niko Beerenwinkel
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
|
30
|
Boskova V, Stadler T. PIQMEE: Bayesian Phylodynamic Method for Analysis of Large Data Sets with Duplicate Sequences. Mol Biol Evol 2020; 37:3061-3075. [PMID: 32492139 PMCID: PMC7530608 DOI: 10.1093/molbev/msaa136] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/25/2022] Open
Abstract
Next-generation sequencing of pathogen quasispecies within a host yields data sets of tens to hundreds of unique sequences. However, the full data set often contains thousands of sequences, because many of those unique sequences have multiple identical copies. Data sets of this size represent a computational challenge for currently available Bayesian phylogenetic and phylodynamic methods. Through simulations, we explore how large data sets with duplicate sequences affect the speed and accuracy of phylogenetic and phylodynamic analysis within BEAST 2. We show that using unique sequences only leads to biases, and using a random subset of sequences yields imprecise parameter estimates. To overcome these shortcomings, we introduce PIQMEE, a BEAST 2 add-on that produces reliable parameter estimates from full data sets with increased computational efficiency as compared with the currently available methods within BEAST 2. The principle behind PIQMEE is to resolve the tree structure of the unique sequences only, while simultaneously estimating the branching times of the duplicate sequences. Distinguishing between unique and duplicate sequences allows our method to perform well even for very large data sets. Although the classic method converges poorly for data sets of 6,000 sequences when allowed to run for 7 days, our method converges in slightly more than 1 day. In fact, PIQMEE can handle data sets of around 21,000 sequences with 20 unique sequences in 14 days. Finally, we apply the method to a real, within-host HIV sequencing data set with several thousand sequences per patient.
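The distinction between unique and duplicate sequences that this abstract relies on starts from a simple bookkeeping step: collapsing the full data set into unique sequences with copy counts. The sketch below shows only that preprocessing idea in plain Python. It is not PIQMEE code and does not reproduce any BEAST 2 interface; the function name is hypothetical.

```python
from collections import Counter


def collapse_duplicates(sequences):
    """Collapse identical sequences into (sequence, copy_count) pairs.

    Pairs are ordered by decreasing copy count, then alphabetically,
    so the output is deterministic.
    """
    counts = Counter(sequences)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


reads = ["ACGT", "ACGT", "ACGA", "ACGT"]
unique_with_counts = collapse_duplicates(reads)
# unique_with_counts == [("ACGT", 3), ("ACGA", 1)]
```

Keeping the counts alongside the unique sequences retains the information that would be lost by analysing unique sequences alone, which the abstract reports leads to biased estimates.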
Affiliation(s)
- Veronika Boskova
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Switzerland
  - Center for Integrative Bioinformatics Vienna, Max Perutz Labs, University of Vienna and Medical University of Vienna, Vienna, Austria
- Tanja Stadler
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Switzerland
|
31
|
Lequime S, Bastide P, Dellicour S, Lemey P, Baele G. nosoi: A stochastic agent-based transmission chain simulation framework in r. Methods Ecol Evol 2020; 11:1002-1007. [PMID: 32983401 PMCID: PMC7496779 DOI: 10.1111/2041-210x.13422] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 03/03/2020] [Accepted: 05/13/2020] [Indexed: 12/22/2022]
Abstract
The transmission process of an infectious agent creates a connected chain of hosts linked by transmission events, known as a transmission chain. Reconstructing transmission chains remains a challenging endeavour, except in rare cases characterized by intense surveillance and epidemiological inquiry. Inference frameworks attempt to estimate or approximate these transmission chains but the accuracy and validity of such methods generally lack formal assessment on datasets for which the actual transmission chain was observed. We here introduce nosoi, an open-source R package that offers a complete, tunable and expandable agent-based framework to simulate transmission chains under a wide range of epidemiological scenarios for single-host and dual-host epidemics. nosoi is accessible through GitHub and CRAN, and is accompanied by extensive documentation, providing help and practical examples to assist users in setting up their own simulations. Once infected, each host or agent can undergo a series of events during each time step, such as moving (between locations) or transmitting the infection, all of these being driven by user-specified rules or data, such as travel patterns between locations. nosoi is able to generate a multitude of epidemic scenarios that can, for example, be used to validate a wide range of reconstruction methods, including epidemic modelling and phylodynamic analyses. nosoi also offers a comprehensive framework to leverage empirically acquired data, allowing the user to explore how variations in parameters can affect epidemic potential. Aside from research questions, nosoi can provide lecturers with a complete teaching tool to offer students a hands-on exploration of the dynamics of epidemiological processes and the factors that impact them. Because the package does not rely on mathematical formalism but uses a more intuitive algorithmic approach, even extensive changes of the entire model can be easily and quickly implemented.
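The per-time-step logic described in this abstract (each infected host may transmit at each step, and every infection records who infected whom) can be illustrated with a toy simulation. The sketch below is written in Python and is not the nosoi API, which is an R package with its own interface. The function name, the parameters, and the uniform contact rule are all assumptions made for illustration.

```python
import random


def simulate_chain(p_transmit, max_contacts, infectious_steps, n_steps, seed=0):
    """Toy agent-based transmission chain (illustrative only, not nosoi).

    Each active host draws between 0 and max_contacts contacts per time step,
    and each contact becomes infected with probability p_transmit.
    Returns a list of host records holding id, infector and infection time.
    """
    rng = random.Random(seed)  # seeded generator for reproducible runs
    hosts = [{"id": 0, "infector": None, "t_infected": 0}]  # index case
    for t in range(1, n_steps + 1):
        # Hosts infected during this step only start transmitting at the next step.
        active = [h for h in hosts if t - h["t_infected"] <= infectious_steps]
        for host in active:
            for _ in range(rng.randint(0, max_contacts)):
                if rng.random() < p_transmit:
                    hosts.append(
                        {"id": len(hosts), "infector": host["id"], "t_infected": t}
                    )
    return hosts
```

Because every record stores its infector, the full transmission chain is known by construction. This is the property that makes simulated data useful for checking reconstruction methods against a known truth.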
Collapse
Affiliation(s)
- Sebastian Lequime
- Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Leuven, Belgium
- Cluster of Microbial Ecology, Groningen Institute for Evolutionary Life Sciences, University of Groningen, Groningen, The Netherlands
| | - Paul Bastide
- Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Leuven, Belgium
- IMAG, CNRS, University of Montpellier, Montpellier, France
| | - Simon Dellicour
- Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Leuven, Belgium
- Spatial Epidemiology Lab (SpELL), Université Libre de Bruxelles, Brussels, Belgium
| | - Philippe Lemey
- Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Leuven, Belgium
| | - Guy Baele
- Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Leuven, Belgium
| |
|
32
|
Abstract
MOTIVATION The combination of genomic and epidemiological data holds the potential to enable accurate pathogen transmission history inference. However, the inference of outbreak transmission histories remains challenging due to various factors such as within-host pathogen diversity and multi-strain infections. Current computational methods ignore within-host diversity and/or multi-strain infections, often failing to accurately infer the transmission history. Thus, there is a need for efficient computational methods for transmission tree inference that accommodate the complexities of real data. RESULTS We formulate the direct transmission inference (DTI) problem for inferring transmission trees that support multi-strain infections given a timed phylogeny and additional epidemiological data. We establish hardness for the decision and counting version of the DTI problem. We introduce Transmission Tree Uniform Sampler (TiTUS), a method that uses SATISFIABILITY to almost uniformly sample from the space of transmission trees. We introduce criteria that prioritize parsimonious transmission trees that we subsequently summarize using a novel consensus tree approach. We demonstrate TiTUS's ability to accurately reconstruct transmission trees on simulated data as well as a documented HIV transmission chain. AVAILABILITY AND IMPLEMENTATION https://github.com/elkebir-group/TiTUS. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Palash Sashittal
- Department of Aerospace Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Mohammed El-Kebir
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| |
|
33
|
Robert A, Kucharski AJ, Gastañaduy PA, Paul P, Funk S. Probabilistic reconstruction of measles transmission clusters from routinely collected surveillance data. J R Soc Interface 2020; 17:20200084. [PMID: 32603651 PMCID: PMC7423430 DOI: 10.1098/rsif.2020.0084] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/06/2020] [Accepted: 06/08/2020] [Indexed: 12/24/2022]
Abstract
Pockets of susceptibility resulting from spatial or social heterogeneity in vaccine coverage can drive measles outbreaks, as cases imported into such pockets are likely to cause further transmission and lead to large transmission clusters. Characterizing the dynamics of transmission is essential for identifying which individuals and regions might be most at risk. As data from detailed contact-tracing investigations are not available in many settings, we developed an R package called o2geosocial to reconstruct the transmission clusters and the importation status of the cases from their age, location, genotype and onset date. We compared our inferred cluster size distributions to 737 transmission clusters identified through detailed contact-tracing in the USA between 2001 and 2016. We were able to reconstruct the importation status of the cases and found good agreement between the inferred and reference clusters. The results were improved when the contact-tracing investigations were used to set the importation status before running the model. Spatial heterogeneity in vaccine coverage is difficult to measure directly. Our approach was able to highlight areas with potential for local transmission using a minimal number of variables and could be applied to assess the intensity of ongoing transmission in a region.
Affiliation(s)
- Alexis Robert
- Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, UK
- Centre for the Mathematical Modelling of Infectious Disease, London School of Hygiene and Tropical Medicine, London, UK
| | - Adam J. Kucharski
- Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, UK
- Centre for the Mathematical Modelling of Infectious Disease, London School of Hygiene and Tropical Medicine, London, UK
| | - Paul A. Gastañaduy
- Division of Viral Diseases, Centers for Disease Control and Prevention, Atlanta, GA, USA
| | - Prabasaj Paul
- Division of Healthcare Quality Promotion, Centers for Disease Control and Prevention, Atlanta, GA, USA
| | - Sebastian Funk
- Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, UK
- Centre for the Mathematical Modelling of Infectious Disease, London School of Hygiene and Tropical Medicine, London, UK
| |
|
34
|
Moshiri N, Ragonnet-Cronin M, Wertheim JO, Mirarab S. FAVITES: simultaneous simulation of transmission networks, phylogenetic trees and sequences. Bioinformatics 2019; 35:1852-1861. [PMID: 30395173 DOI: 10.1093/bioinformatics/bty921] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 06/15/2018] [Revised: 10/29/2018] [Accepted: 11/01/2018] [Indexed: 11/14/2022]
Abstract
MOTIVATION The ability to simulate epidemics as a function of model parameters allows insights that are unobtainable from real datasets. Further, reconstructing transmission networks for fast-evolving viruses like Human Immunodeficiency Virus (HIV) may have the potential to greatly enhance epidemic intervention, but transmission network reconstruction methods have been inadequately studied, largely because it is difficult to obtain 'truth' sets on which to test them and properly measure their performance. RESULTS We introduce FrAmework for VIral Transmission and Evolution Simulation (FAVITES), a robust framework for simulating realistic datasets for epidemics that are caused by fast-evolving pathogens like HIV. FAVITES creates a generative model to produce contact networks, transmission networks, phylogenetic trees and sequence datasets, and to add error to the data. FAVITES is designed to be extensible by dividing the generative model into modules, each of which is expressed as a fixed API that can be implemented using various models. We use FAVITES to simulate HIV datasets and study the realism of the simulated datasets. We then use the simulated data to study the impact of the increased treatment efforts on epidemiological outcomes. We also study two transmission network reconstruction methods and their effectiveness in detecting fast-growing clusters. AVAILABILITY AND IMPLEMENTATION FAVITES is available at https://github.com/niemasd/FAVITES, and a Docker image can be found on DockerHub (https://hub.docker.com/r/niemasd/favites). SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Niema Moshiri
- Bioinformatics and Systems Biology Graduate Program, UC San Diego, La Jolla, USA
| | - Siavash Mirarab
- Department of Electrical and Computer Engineering, UC San Diego, La Jolla, USA
| |
|
35
|
Cassidy R, Kypraios T, O'Neill PD. Modelling, Bayesian inference, and model assessment for nosocomial pathogens using whole-genome-sequence data. Stat Med 2020; 39:1746-1765. [PMID: 32142587 PMCID: PMC7217057 DOI: 10.1002/sim.8510] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/12/2019] [Revised: 01/15/2020] [Accepted: 01/31/2020] [Indexed: 12/28/2022]
Abstract
Whole‐genome sequencing of pathogens in outbreaks of infectious disease provides the potential to reconstruct transmission pathways and enhance the information contained in conventional epidemiological data. In recent years, there have been numerous new methods and models developed to exploit such high‐resolution genetic data. However, corresponding methods for model assessment have been largely overlooked. In this article, we develop both new modelling methods and new model assessment methods, specifically by building on the work of Worby et al. Although the methods are generic in nature, we focus specifically on nosocomial pathogens and analyze a dataset collected during an outbreak of MRSA in a hospital setting.
Affiliation(s)
- Rosanna Cassidy
- School of Mathematical Sciences, University of Nottingham, Nottingham, UK
| | - Theodore Kypraios
- School of Mathematical Sciences, University of Nottingham, Nottingham, UK
| | - Philip D O'Neill
- School of Mathematical Sciences, University of Nottingham, Nottingham, UK
| |
|
36
|
Guimaraes AMS, Zimpel CK. Mycobacterium bovis: From Genotyping to Genome Sequencing. Microorganisms 2020; 8:E667. [PMID: 32375210 PMCID: PMC7285088 DOI: 10.3390/microorganisms8050667] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 02/06/2020] [Revised: 04/17/2020] [Accepted: 04/21/2020] [Indexed: 12/15/2022]
Abstract
Mycobacterium bovis is the main pathogen of bovine, zoonotic, and wildlife tuberculosis. Despite the existence of programs for bovine tuberculosis (bTB) control in many regions, the disease remains a challenge for the veterinary and public health sectors, especially in developing countries and in high-income nations with wildlife reservoirs. Current bTB control programs are mostly based on test-and-slaughter, movement restrictions, and post-mortem inspection measures. In certain settings, contact tracing and surveillance has benefited from M. bovis genotyping techniques. More recently, whole-genome sequencing (WGS) has become the preferential technique to inform outbreak response through contact tracing and source identification for many infectious diseases. As the cost per genome decreases, the application of WGS to bTB control programs is inevitable moving forward. However, there are technical challenges in data analyses and interpretation that hinder the implementation of M. bovis WGS as a molecular epidemiology tool. Therefore, the aim of this review is to describe M. bovis genotyping techniques and discuss current standards and challenges of the use of M. bovis WGS for transmission investigation, surveillance, and global lineages distribution. We compiled a series of associated research gaps to be explored with the ultimate goal of implementing M. bovis WGS in a standardized manner in bTB control programs.
Affiliation(s)
- Ana M. S. Guimaraes
- Laboratory of Applied Research in Mycobacteria, Department of Microbiology, University of São Paulo, São Paulo 01246-904, Brazil
| | - Cristina K. Zimpel
- Laboratory of Applied Research in Mycobacteria, Department of Microbiology, University of São Paulo, São Paulo 01246-904, Brazil
- Department of Preventive Veterinary Medicine and Animal Health, University of São Paulo, São Paulo 01246-904, Brazil
| |
|
37
|
Chaters GL, Johnson PCD, Cleaveland S, Crispell J, de Glanville WA, Doherty T, Matthews L, Mohr S, Nyasebwa OM, Rossi G, Salvador LCM, Swai E, Kao RR. Analysing livestock network data for infectious disease control: an argument for routine data collection in emerging economies. Philos Trans R Soc Lond B Biol Sci 2019; 374:20180264. [PMID: 31104601 DOI: 10.1098/rstb.2018.0264] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Indexed: 11/12/2022]
Abstract
Livestock movements are an important mechanism of infectious disease transmission. Where these are well recorded, network analysis tools have been used to successfully identify system properties, highlight vulnerabilities to transmission, and inform targeted surveillance and control. Here we highlight the main uses of network properties in understanding livestock disease epidemiology and discuss statistical approaches to infer network characteristics from biased or fragmented datasets. We use a 'hurdle model' approach that predicts (i) the probability of movement and (ii) the number of livestock moved to generate synthetic 'complete' networks of movements between administrative wards, exploiting routinely collected government movement permit data from northern Tanzania. We demonstrate that this model captures a significant amount of the observed variation. Combining the cattle movement network with a spatial between-ward contact layer, we create a multiplex, over which we simulate the spread of 'fast' (R0 = 3) and 'slow' (R0 = 1.5) pathogens, and assess the effects of random versus targeted disease control interventions (vaccination and movement ban). The targeted interventions substantially outperform those randomly implemented for both fast and slow pathogens. Our findings provide motivation to encourage routine collection and centralization of movement data to construct representative networks. This article is part of the theme issue 'Modelling infectious disease outbreaks in humans, animals and plants: epidemic forecasting and control'. This theme issue is linked with the earlier issue 'Modelling infectious disease outbreaks in humans, animals and plants: approaches and important themes'.
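The two-part structure of a hurdle model, as named in this abstract, can be sketched in Python. The abstract does not state which distributions the authors fitted, so the choices here are assumptions for illustration only: a fixed probability `p_move` that any movement occurs between two wards, and a zero-truncated Poisson with parameter `rate` for the number of animals moved once the hurdle is crossed. Both function names are hypothetical.

```python
import math


def hurdle_pmf(k, p_move, rate):
    """P(number of animals moved == k) under the assumed hurdle model."""
    if k < 0:
        raise ValueError("k must be non-negative")
    if k == 0:
        # Part (i): the hurdle is not crossed, so no movement occurs.
        return 1.0 - p_move
    # Part (ii): positive counts follow a zero-truncated Poisson.
    poisson_k = math.exp(-rate) * rate**k / math.factorial(k)
    return p_move * poisson_k / (1.0 - math.exp(-rate))


def hurdle_mean(p_move, rate):
    """Expected number of animals moved along one ward-to-ward link."""
    return p_move * rate / (1.0 - math.exp(-rate))
```

Separating the two parts lets the zero mass (links with no recorded movement) be modelled independently of the size of the movements that do occur, which is the usual reason for choosing a hurdle model over a single count distribution.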
Affiliation(s)
- G L Chaters
- Boyd Orr Centre for Population and Ecosystem Health, Institute of Biodiversity, Animal Health and Comparative Medicine, University of Glasgow, Glasgow G12 8QQ, UK
| | - P C D Johnson
- Boyd Orr Centre for Population and Ecosystem Health, Institute of Biodiversity, Animal Health and Comparative Medicine, University of Glasgow, Glasgow G12 8QQ, UK
| | - S Cleaveland
- Boyd Orr Centre for Population and Ecosystem Health, Institute of Biodiversity, Animal Health and Comparative Medicine, University of Glasgow, Glasgow G12 8QQ, UK
| | - J Crispell
- School of Veterinary Medicine, University College Dublin, Dublin, Ireland
| | - W A de Glanville
- Boyd Orr Centre for Population and Ecosystem Health, Institute of Biodiversity, Animal Health and Comparative Medicine, University of Glasgow, Glasgow G12 8QQ, UK
| | - T Doherty
- Royal (Dick) School of Veterinary Studies and Roslin Institute, University of Edinburgh, Easter Bush Campus, Midlothian EH25 9RG, UK
| | - L Matthews
- Boyd Orr Centre for Population and Ecosystem Health, Institute of Biodiversity, Animal Health and Comparative Medicine, University of Glasgow, Glasgow G12 8QQ, UK
| | - S Mohr
- Boyd Orr Centre for Population and Ecosystem Health, Institute of Biodiversity, Animal Health and Comparative Medicine, University of Glasgow, Glasgow G12 8QQ, UK
| | - O M Nyasebwa
- Department of Veterinary Services, Ministry of Livestock and Fisheries, Nelson Mandela Road, Dar Es Salaam, Tanzania
| | - G Rossi
- Royal (Dick) School of Veterinary Studies and Roslin Institute, University of Edinburgh, Easter Bush Campus, Midlothian EH25 9RG, UK
| | - L C M Salvador
- Royal (Dick) School of Veterinary Studies and Roslin Institute, University of Edinburgh, Easter Bush Campus, Midlothian EH25 9RG, UK; Department of Infectious Diseases, University of Georgia, Athens, GA 30602, USA; Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
| | - E Swai
- Department of Veterinary Services, Ministry of Livestock and Fisheries, Nelson Mandela Road, Dar Es Salaam, Tanzania
| | - R R Kao
- Royal (Dick) School of Veterinary Studies and Roslin Institute, University of Edinburgh, Easter Bush Campus, Midlothian EH25 9RG, UK
| |
|
38
|
Polonsky JA, Baidjoe A, Kamvar ZN, Cori A, Durski K, Edmunds WJ, Eggo RM, Funk S, Kaiser L, Keating P, de Waroux OLP, Marks M, Moraga P, Morgan O, Nouvellet P, Ratnayake R, Roberts CH, Whitworth J, Jombart T. Outbreak analytics: a developing data science for informing the response to emerging pathogens. Philos Trans R Soc Lond B Biol Sci 2019; 374:20180276. [PMID: 31104603 PMCID: PMC6558557 DOI: 10.1098/rstb.2018.0276] [Citation(s) in RCA: 81] [Impact Index Per Article: 20.3] [Indexed: 12/16/2022]
Abstract
Despite continued efforts to improve health systems worldwide, emerging pathogen epidemics remain a major public health concern. Effective response to such outbreaks relies on timely intervention, ideally informed by all available sources of data. The collection, visualization and analysis of outbreak data are becoming increasingly complex, owing to the diversity in types of data, questions and available methods to address them. Recent advances have led to the rise of outbreak analytics, an emerging data science focused on the technological and methodological aspects of the outbreak data pipeline, from collection to analysis, modelling and reporting to inform outbreak response. In this article, we assess the current state of the field. After laying out the context of outbreak response, we critically review the most common analytics components, their inter-dependencies, data requirements and the type of information they can provide to inform operations in real time. We discuss some challenges and opportunities and conclude on the potential role of outbreak analytics for improving our understanding of, and response to, outbreaks of emerging pathogens. This article is part of the theme issue ‘Modelling infectious disease outbreaks in humans, animals and plants: epidemic forecasting and control’. This theme issue is linked with the earlier issue ‘Modelling infectious disease outbreaks in humans, animals and plants: approaches and important themes’.
Affiliation(s)
- Jonathan A Polonsky
- Department of Health Emergency Information and Risk Assessment, World Health Organization, Avenue Appia 20, 1211 Geneva, Switzerland; Faculty of Medicine, University of Geneva, 1 rue Michel-Servet, 1211 Geneva, Switzerland
| | - Amrish Baidjoe
- Department of Infectious Disease Epidemiology, School of Public Health, MRC Centre for Global Infectious Disease Analysis, Imperial College London, Medical School Building, St Mary's Campus, Norfolk Place, London W2 1PG, UK
| | - Zhian N Kamvar
- Department of Infectious Disease Epidemiology, School of Public Health, MRC Centre for Global Infectious Disease Analysis, Imperial College London, Medical School Building, St Mary's Campus, Norfolk Place, London W2 1PG, UK
| | - Anne Cori
- Department of Infectious Disease Epidemiology, School of Public Health, MRC Centre for Global Infectious Disease Analysis, Imperial College London, Medical School Building, St Mary's Campus, Norfolk Place, London W2 1PG, UK
| | - Kara Durski
- Department of Infectious Hazard Management, World Health Organization, Avenue Appia 20, 1211 Geneva, Switzerland
| | - W John Edmunds
- Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, UK; Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, UK
| | - Rosalind M Eggo
- Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, UK; Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, UK
| | - Sebastian Funk
- Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, UK; Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, UK
| | - Laurent Kaiser
- Faculty of Medicine, University of Geneva, 1 rue Michel-Servet, 1211 Geneva, Switzerland
| | - Patrick Keating
- Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, UK; UK Public Health Rapid Support Team, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, UK
| | - Olivier le Polain de Waroux
- Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, UK; UK Public Health Rapid Support Team, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, UK; Public Health England, Wellington House, 133-155 Waterloo Road, London SE1 8UG, UK
| | - Michael Marks
- Clinical Research Department, Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, UK
| | - Paula Moraga
- Centre for Health Informatics, Computing and Statistics (CHICAS), Lancaster Medical School, Lancaster University, Lancaster LA1 4YW, UK
| | - Oliver Morgan
- Department of Health Emergency Information and Risk Assessment, World Health Organization, Avenue Appia 20, 1211 Geneva, Switzerland
| | - Pierre Nouvellet
- Department of Infectious Disease Epidemiology, School of Public Health, MRC Centre for Global Infectious Disease Analysis, Imperial College London, Medical School Building, St Mary's Campus, Norfolk Place, London W2 1PG, UK; School of Life Sciences, University of Sussex, Sussex House, Brighton BN1 9RH, UK
| | - Ruwan Ratnayake
- Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, UK; Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, UK
| | - Chrissy H Roberts
- Clinical Research Department, Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, UK
| | - Jimmy Whitworth
- Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, UK; UK Public Health Rapid Support Team, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, UK
| | - Thibaut Jombart
- Department of Infectious Disease Epidemiology, School of Public Health, MRC Centre for Global Infectious Disease Analysis, Imperial College London, Medical School Building, St Mary's Campus, Norfolk Place, London W2 1PG, UK; Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, UK; UK Public Health Rapid Support Team, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, UK
| |
|
39
|
Abstract
PURPOSE OF REVIEW Within-host diversity complicates transmission models because it recognizes that between-host virus phylogenies are not identical to the transmission history among the infected hosts. This review presents the biological and theoretical foundations for recent developments in this field, and shows that modern phylodynamic methods are capable of inferring realistic transmission histories from HIV sequence data. RECENT FINDINGS Transmission of single or multiple genetic variants from a donor's HIV population results in donor-recipient phylogenies with combinations of monophyletic, paraphyletic, and polyphyletic patterns. Large-scale simulations and analyses of many real HIV datasets have established that transmission direction, directness, or common source can often be inferred from HIV sequence data. Phylodynamic reconstructions of HIV transmission that include within-host HIV diversity have recently been established and made available in several software packages. SUMMARY Phylodynamic methods that include realistic features of HIV genetic diversification have come of age, significantly improving inference of key epidemiological parameters. This opens the door to more accurate surveillance and better-informed prevention campaigns.
|
40
|
Han AX, Parker E, Maurer-Stroh S, Russell CA. Inferring putative transmission clusters with Phydelity. Virus Evol 2019; 5:vez039. [PMID: 31616568 PMCID: PMC6785678 DOI: 10.1093/ve/vez039] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 01/08/2023]
Abstract
Current phylogenetic clustering approaches for identifying pathogen transmission clusters are limited by their dependency on arbitrarily defined genetic distance thresholds for within-cluster divergence. Incomplete knowledge of a pathogen’s underlying dynamics often reduces the choice of distance threshold to an exploratory, ad hoc exercise that is difficult to standardise across studies. Phydelity is a new tool for the identification of transmission clusters in pathogen phylogenies. It identifies groups of sequences that are more closely related than the ensemble distribution of the phylogeny under a statistically principled and phylogeny-informed framework, without the introduction of arbitrary distance thresholds. Relative to other distance threshold- and model-based methods, Phydelity outputs clusters with higher purity and lower probability of misclassification in simulated phylogenies. Applying Phydelity to empirical datasets of hepatitis B and C virus infections showed that Phydelity identified clusters with better correspondence to individuals that are more likely to be linked by transmission events relative to other widely used non-parametric phylogenetic clustering methods without the need for parameter calibration. Phydelity is generalisable to any pathogen and can be used to identify putative direct transmission events. Phydelity is freely available at https://github.com/alvinxhan/Phydelity.
Affiliation(s)
- Alvin X Han
- Protein Sequence Analysis Group, Bioinformatics Institute, Agency for Science, Technology and Research (ASTAR), 30 Biopolis Street, 138671 Singapore; NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore (NUS), 21 Lower Kent Ridge, 119077 Singapore; Laboratory of Applied Evolutionary Biology, Department of Medical Microbiology, Academic Medical Centre, Meibergdreef 9, 1105 AZ Amsterdam-Zuidoost, The Netherlands
| | - Edyth Parker
- Laboratory of Applied Evolutionary Biology, Department of Medical Microbiology, Academic Medical Centre, Meibergdreef 9, 1105 AZ Amsterdam-Zuidoost, The Netherlands; Department of Veterinary Medicine, University of Cambridge, Madingley Rd, Cambridge CB3 0ES, UK
| | - Sebastian Maurer-Stroh
- Protein Sequence Analysis Group, Bioinformatics Institute, Agency for Science, Technology and Research (ASTAR), 30 Biopolis Street, 138671 Singapore; Department of Biological Sciences, National University of Singapore, 16 Science Drive 4, 117558 Singapore
| | - Colin A Russell
- Laboratory of Applied Evolutionary Biology, Department of Medical Microbiology, Academic Medical Centre, Meibergdreef 9, 1105 AZ Amsterdam-Zuidoost, The Netherlands
| |
|
41
|
Fujikura Y, Hamamoto T, Kanayama A, Kaku K, Yamagishi J, Kawana A. Bayesian reconstruction of a vancomycin-resistant Enterococcus transmission route using epidemiologic data and genomic variants from whole genome sequencing. J Hosp Infect 2019; 103:395-403. [PMID: 31425718 DOI: 10.1016/j.jhin.2019.08.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/07/2019] [Accepted: 08/12/2019] [Indexed: 11/16/2022]
Abstract
BACKGROUND Outbreaks of vancomycin-resistant enterococcus (VRE) are a serious problem in hospitals. Inferring the transmission route is an important factor to institute appropriate infection control measures; however, the methodology has not been fully established. AIM To reconstruct and evaluate the transmission model using sequence variants extracted from whole genome sequencing (WGS) data and epidemiological information from patients involved in a VRE outbreak. METHODS During a VRE outbreak in our hospital, 23 samples were collected from patients and environmental surfaces and analysed using WGS. By combining genome alignment information with patient epidemiological data, the VRE transmission route was reconstructed using a Bayesian approach. With the transmission model, evaluation and further analyses were performed to identify risk factors that contributed to the outbreak. FINDINGS All VREs were identified as Enterococcus faecium belonging to sequence type 17, which consisted of two VRE genotypes: vanA (N = 8, including one environmental sample) and vanB (N = 15). The reconstruction model using the Bayesian approach showed the transmission direction with posterior probability and revealed transmission through an environmental surface. In addition, some cases acting as VRE spreaders were identified, which can interfere with appropriate infection control. Vancomycin administration was identified as a significant risk factor for spreaders. CONCLUSION A Bayesian approach for transmission route reconstruction using epidemiologic data and genomic variants from WGS can be applied in actual VRE outbreaks. This may contribute to the design and implementation of effective infection control measures.
Affiliation(s)
- Y Fujikura
- Department of Medical Risk Management and Infection Control, National Defense Medical College Hospital, Saitama, Japan; Division of Infectious Diseases and Respiratory Medicine, Department of Internal Medicine, National Defense Medical College, Saitama, Japan.
| | - T Hamamoto
- Department of Clinical Laboratory, National Defense Medical College Hospital, Saitama, Japan
| | - A Kanayama
- Division of Infectious Diseases Epidemiology and Control, National Defense Medical College Research Institute, Saitama, Japan
| | - K Kaku
- Division of Infectious Diseases Epidemiology and Control, National Defense Medical College Research Institute, Saitama, Japan
| | - J Yamagishi
- Research Center for Zoonosis Control, Hokkaido University, Sapporo, Japan
| | - A Kawana
- Division of Infectious Diseases and Respiratory Medicine, Department of Internal Medicine, National Defense Medical College, Saitama, Japan
| |
|
42
|
Theys K, Lemey P, Vandamme AM, Baele G. Advances in Visualization Tools for Phylogenomic and Phylodynamic Studies of Viral Diseases. Front Public Health 2019; 7:208. [PMID: 31428595 PMCID: PMC6688121 DOI: 10.3389/fpubh.2019.00208] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 04/07/2019] [Accepted: 07/12/2019] [Indexed: 01/28/2023]
Abstract
Genomic and epidemiological monitoring have become an integral part of our response to emerging and ongoing epidemics of viral infectious diseases. Advances in high-throughput sequencing, including portable genomic sequencing at reduced costs and turnaround time, are paralleled by continuing developments in methodology to infer evolutionary histories (dynamics/patterns) and to identify factors driving viral spread in space and time. The traditionally static nature of visualizing phylogenetic trees that represent these evolutionary relationships/processes has also evolved, albeit perhaps at a slower rate. Advanced visualization tools with increased resolution assist in drawing conclusions from phylogenetic estimates and may even have potential to better inform public health and treatment decisions, but the design (and choice of what analyses are shown) is hindered by the complexity of information embedded within current phylogenetic models and the integration of available meta-data. In this review, we discuss visualization challenges for the interpretation and exploration of reconstructed histories of viral epidemics that arose from increasing volumes of sequence data and the wealth of additional data layers that can be integrated. We focus on solutions that address joint temporal and spatial visualization but also consider what the future may bring in terms of visualization and how this may become of value for the coming era of real-time digital pathogen surveillance, where actionable results and adequate intervention strategies need to be obtained within days.
Affiliation(s)
- Kristof Theys
  - Department of Microbiology, Immunology and Transplantation, Rega Institute for Medical Research, Clinical and Epidemiological Virology, KU Leuven, Leuven, Belgium
- Philippe Lemey
  - Department of Microbiology, Immunology and Transplantation, Rega Institute for Medical Research, Clinical and Epidemiological Virology, KU Leuven, Leuven, Belgium
- Anne-Mieke Vandamme
  - Department of Microbiology, Immunology and Transplantation, Rega Institute for Medical Research, Clinical and Epidemiological Virology, KU Leuven, Leuven, Belgium
- Guy Baele
  - Department of Microbiology, Immunology and Transplantation, Rega Institute for Medical Research, Clinical and Epidemiological Virology, KU Leuven, Leuven, Belgium
|
43
|
Abstract
One approach to the reconstruction of infectious disease transmission trees from pathogen genomic data has been to use a phylogenetic tree, reconstructed from pathogen sequences, and annotate its internal nodes to provide a reconstruction of which host each lineage was in at each point in time. If only one pathogen lineage can be transmitted to a new host (i.e., the transmission bottleneck is complete), this corresponds to partitioning the nodes of the phylogeny into connected regions, each of which represents evolution in an individual host. These partitions define the possible transmission trees that are consistent with a given phylogenetic tree. However, the mathematical properties of the transmission trees given a phylogeny remain largely unexplored. Here, we describe a procedure to calculate the number of possible transmission trees for a given phylogeny, and we then show how to uniformly sample from these transmission trees. The procedure is outlined for situations where one sample is available from each host and trees do not have branch lengths, and we also provide extensions for incomplete sampling, multiple sampling, and the application to time trees in a situation where limits on the period during which each host could have been infected and infectious are known. The sampling algorithm is available as an R package (STraTUS).
Affiliation(s)
- Matthew D Hall
  - Nuffield Department of Medicine, Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom
- Caroline Colijn
  - Department of Mathematics, Simon Fraser University, Burnaby, Canada
|
44
|
Ciccozzi M, Lai A, Zehender G, Borsetti A, Cella E, Ciotti M, Sagnelli E, Sagnelli C, Angeletti S. The phylogenetic approach for viral infectious disease evolution and epidemiology: An updating review. J Med Virol 2019; 91:1707-1724. [PMID: 31243773 DOI: 10.1002/jmv.25526] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 05/13/2019] [Accepted: 06/24/2019] [Indexed: 12/16/2022]
Abstract
In the last decade, the phylogenetic approach has become recurrent in molecular evolutionary analysis. On 12 May 2019, about 2 296 213 papers were found, but typing "phylogeny" or "epidemiology AND phylogeny" retrieved only 199 804 and 20 133, respectively. Molecular epidemiology in infectious diseases is widely used to define the source of infection as well as the ancestral relationships of individuals sampled from a population. Coalescent theory and phylogeographic analysis have had scientific application in several recent pandemic events and nosocomial outbreaks. Hepatitis viruses and human immunodeficiency virus have been extensively studied. Phylogenetic analysis has recently been applied to polyomaviruses as well as to the more recent outbreaks due to different arboviruses, such as Zika and chikungunya viruses, to discover the source of infection and the geographic spread. Sequence data isolated from the microorganism are essential for applying phylogenetic tools, and research in the field of infectious disease phylodynamics is growing. There is a need to apply molecular phylogenetic and evolutionary methods in areas outside infectious diseases, such as translational genomics and personalized medicine. Lastly, the application of these tools in vaccine strategy as well as in antibiotic and antiviral research is encouraged.
Affiliation(s)
- Massimo Ciccozzi
  - Unit of Medical Statistics and Molecular Epidemiology, University Campus Bio-Medico of Rome, Rome, Italy
- Alessia Lai
  - Department of Biomedical and Clinical Sciences 'L. Sacco', University of Milan, Milan, Italy
- Gianguglielmo Zehender
  - Department of Biomedical and Clinical Sciences 'L. Sacco', University of Milan, Milan, Italy
- Alessandra Borsetti
  - National HIV/AIDS Research Center, Istituto Superiore di Sanità, Roma, Italy
- Eleonora Cella
  - Unit of Medical Statistics and Molecular Epidemiology, University Campus Bio-Medico of Rome, Rome, Italy
- Marco Ciotti
  - Laboratory of Molecular Virology, Polyclinic Tor Vergata Foundation, Rome, Italy
- Evangelista Sagnelli
  - Department of Mental Health and Public Medicine, Section of Infectious Diseases, University of Campania Luigi Vanvitelli, Naples, Italy
- Caterina Sagnelli
  - Department of Mental Health and Public Medicine, Section of Infectious Diseases, University of Campania Luigi Vanvitelli, Naples, Italy
- Silvia Angeletti
  - Unit of Clinical Laboratory Science, University Campus Bio-Medico of Rome, Rome, Italy
|
45
|
Hayama Y, Firestone SM, Stevenson MA, Yamamoto T, Nishi T, Shimizu Y, Tsutsui T. Reconstructing a transmission network and identifying risk factors of secondary transmissions in the 2010 foot-and-mouth disease outbreak in Japan. Transbound Emerg Dis 2019; 66:2074-2086. [PMID: 31131968 DOI: 10.1111/tbed.13256] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/27/2018] [Revised: 05/17/2019] [Accepted: 05/17/2019] [Indexed: 11/27/2022]
Abstract
Research aimed at understanding transmission networks, representing a network of "who infected whom" for an infectious disease outbreak, has been actively conducted in recent years. Transmission network models incorporating epidemiological and genetic data are valuable for elucidating disease transmission pathways. In this study, we reconstructed the transmission network of the foot-and-mouth disease (FMD) epidemic in Japan in 2010, and explored farm-level risk factors associated with increased risk of secondary transmission. A published, systematic Bayesian transmission network model was applied to epidemiological data of 292 infected farms and whole genome sequence data of 104 of the infected farms. This model can make inferences for known infected farms even when they lack genetic data. After estimating the consensus network, the accuracy of the network was examined by comparison with epidemiological data. Then, risk factors for farms inferred to have been sources of secondary transmission were explored using a zero-inflated Poisson regression model. As far as we are aware, this study represents the largest FMD outbreak transmission network to be published by such means combining epidemiological and genetic data. The consensus network reasonably generated the epidemiological links, which were estimated from the actual epidemiological investigation. Among 292 farms, 101 farms (35%) were inferred to have been the sources of secondary transmission, and amongst these farms, the median number of secondary cases was 2 farms (min: 1, max: 18). The farm type (small- and large-sized pig farms), the number of days from onset to notification, and the number of susceptible farms within a 1-km radius were significantly associated with secondary transmission. Transmission network modelling enabled inference of the connections between infected farms during the FMD epidemic and identified important factors for controlling the risk of secondary transmission.
This study demonstrated that the predominant susceptible species held on a farm, farm size, and animal density were associated with increased onwards transmission.
Affiliation(s)
- Yoko Hayama
  - Viral Disease and Epidemiology Research Division, National Institute of Animal Health, National Agriculture Research Organization, Tsukuba, Japan
- Simon M Firestone
  - Faculty of Veterinary and Agricultural Sciences, Melbourne Veterinary School, Asia-Pacific Centre for Animal Health, The University of Melbourne, Parkville, Victoria, Australia
- Mark A Stevenson
  - Faculty of Veterinary and Agricultural Sciences, Melbourne Veterinary School, Asia-Pacific Centre for Animal Health, The University of Melbourne, Parkville, Victoria, Australia
- Takehisa Yamamoto
  - Viral Disease and Epidemiology Research Division, National Institute of Animal Health, National Agriculture Research Organization, Tsukuba, Japan
- Tatsuya Nishi
  - Exotic Disease Research Station, National Institute of Animal Health, National Agriculture and Food Research Organization, Kodaira, Japan
- Yumiko Shimizu
  - Viral Disease and Epidemiology Research Division, National Institute of Animal Health, National Agriculture Research Organization, Tsukuba, Japan
- Toshiyuki Tsutsui
  - Viral Disease and Epidemiology Research Division, National Institute of Animal Health, National Agriculture Research Organization, Tsukuba, Japan
|
46
|
Abstract
HIV is one of the fastest evolving organisms known. It evolves about 1 million times faster than its host, humans. Because HIV establishes chronic infections, with continuous evolution, its divergence within a single infected human surpasses the divergence of the entire humanoid history. Yet, it is still the same virus, infecting the same cell types and using the same replication machinery year after year. Hence, one would think that most mutations that HIV accumulates are neutral. But the picture is more complicated than that. HIV evolution is also a clear example of strong positive selection, that is, mutants have a survival advantage. How do these facts come together?
Affiliation(s)
- Thomas Leitner
  - Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, NM
|
47
|
Gilbertson MLJ, Fountain-Jones NM, Craft ME. Incorporating genomic methods into contact networks to reveal new insights into animal behavior and infectious disease dynamics. Behaviour 2019; 155:759-791. [PMID: 31680698 DOI: 10.1163/1568539x-00003471] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 02/05/2023]
Abstract
Utilization of contact networks has provided opportunities for assessing the dynamic interplay between pathogen transmission and host behavior. Genomic techniques have, in their own right, provided new insight into complex questions in disease ecology, and the increasing accessibility of genomic approaches means more researchers may seek out these tools. The integration of network and genomic approaches provides opportunities to examine the interaction between behavior and pathogen transmission in new ways and with greater resolution. While a number of studies have begun to incorporate both contact network and genomic approaches, a great deal of work has yet to be done to better integrate these techniques. In this review, we give a broad overview of how network and genomic approaches have each been used to address questions regarding the interaction of social behavior and infectious disease, and then discuss current work and future horizons for the merging of these techniques.
Affiliation(s)
- Marie L J Gilbertson
  - Department of Veterinary Population Medicine, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Nicholas M Fountain-Jones
  - Department of Veterinary Population Medicine, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Meggan E Craft
  - Department of Veterinary Population Medicine, University of Minnesota, Minneapolis, Minnesota 55455, USA
|
48
|
Firestone SM, Hayama Y, Bradhurst R, Yamamoto T, Tsutsui T, Stevenson MA. Reconstructing foot-and-mouth disease outbreaks: a methods comparison of transmission network models. Sci Rep 2019; 9:4809. [PMID: 30886211 PMCID: PMC6423326 DOI: 10.1038/s41598-019-41103-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 03/29/2018] [Accepted: 02/28/2019] [Indexed: 12/22/2022] Open
Abstract
A number of transmission network models are available that combine genomic and epidemiological data to reconstruct networks of who infected whom during infectious disease outbreaks. For such models to reliably inform decision-making they must be transparently validated, robust, and capable of producing accurate predictions within the short data collection and inference timeframes typical of outbreak responses. A lack of transparent multi-model comparisons reduces confidence in the accuracy of transmission network model outputs, negatively impacting on their more widespread use as decision-support tools. We undertook a formal comparison of the performance of nine published transmission network models based on a set of foot-and-mouth disease outbreaks simulated in a previously free country, with corresponding simulated phylogenies and genomic samples from animals on infected premises. Of the transmission network models tested, Lau’s systematic Bayesian integration framework was found to be the most accurate for inferring the transmission network and timing of exposures, correctly identifying the source of 73% of the infected premises (with 91% accuracy for sources with model support >0.80). The Structured COalescent Transmission Tree Inference provided the most accurate inference of molecular clock rates. This validation study points to which models might be reliably used to reconstruct similar future outbreaks and how to interpret the outputs to inform control. Further research could involve extending the best-performing models to explicitly represent within-host diversity so they can handle next-generation sequencing data, incorporating additional animal and farm-level covariates and combining predictions using Ensemble methods and other approaches.
Affiliation(s)
- Simon M Firestone
  - Asia-Pacific Centre for Animal Health, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC, 3010, Australia
- Yoko Hayama
  - Viral Disease and Epidemiology Research Division, National Institute of Animal Health, National Agriculture Research Organization, Tsukuba, Ibaraki, 305-0856, Japan
- Richard Bradhurst
  - Centre of Excellence for Biosecurity Risk Analysis, The University of Melbourne, Parkville, VIC, 3010, Australia
- Takehisa Yamamoto
  - Viral Disease and Epidemiology Research Division, National Institute of Animal Health, National Agriculture Research Organization, Tsukuba, Ibaraki, 305-0856, Japan
- Toshiyuki Tsutsui
  - Viral Disease and Epidemiology Research Division, National Institute of Animal Health, National Agriculture Research Organization, Tsukuba, Ibaraki, 305-0856, Japan
- Mark A Stevenson
  - Asia-Pacific Centre for Animal Health, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC, 3010, Australia
|
49
|
Campbell F, Didelot X, Fitzjohn R, Ferguson N, Cori A, Jombart T. outbreaker2: a modular platform for outbreak reconstruction. BMC Bioinformatics 2018; 19:363. [PMID: 30343663 PMCID: PMC6196407 DOI: 10.1186/s12859-018-2330-z] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Reconstructing individual transmission events in an infectious disease outbreak can provide valuable information and help inform infection control policy. Recent years have seen considerable progress in the development of methodologies for reconstructing transmission chains using both epidemiological and genetic data. However, only a few of these methods have been implemented in software packages, and with little consideration for customisability and interoperability. Users are therefore limited to a small number of alternatives, incompatible tools with fixed functionality, or forced to develop their own algorithms at considerable personal effort. RESULTS: Here we present outbreaker2, a flexible framework for outbreak reconstruction. This R package re-implements and extends the original model introduced with outbreaker, but most importantly also provides a modular platform allowing users to specify custom models within an optimised inferential framework. As a proof of concept, we implement the within-host evolutionary model introduced with TransPhylo, which is very distinct from the original genetic model in outbreaker, and demonstrate how even complex model results can be successfully included with minimal effort. CONCLUSIONS: outbreaker2 provides a valuable starting point for future outbreak reconstruction tools, and represents a unifying platform that promotes customisability and interoperability. Implemented in the R software, outbreaker2 joins a growing body of tools for outbreak analysis.
Affiliation(s)
- Finlay Campbell
  - MRC Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, UK
- Xavier Didelot
  - MRC Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, UK
- Rich Fitzjohn
  - MRC Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, UK
- Neil Ferguson
  - MRC Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, UK
- Anne Cori
  - MRC Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, UK
- Thibaut Jombart
  - MRC Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, UK
|
50
|
Skums P, Zelikovsky A, Singh R, Gussler W, Dimitrova Z, Knyazev S, Mandric I, Ramachandran S, Campo D, Jha D, Bunimovich L, Costenbader E, Sexton C, O'Connor S, Xia GL, Khudyakov Y. QUENTIN: reconstruction of disease transmissions from viral quasispecies genomic data. Bioinformatics 2018; 34:163-170. [PMID: 29304222 DOI: 10.1093/bioinformatics/btx402] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Received: 04/07/2017] [Accepted: 06/15/2017] [Indexed: 01/08/2023] Open
Abstract
Motivation: Genomic analysis has become one of the major tools for disease outbreak investigations. However, existing computational frameworks for inference of transmission history from viral genomic data often do not consider intra-host diversity of pathogens and heavily rely on additional epidemiological data, such as sampling times and exposure intervals. This impedes genomic analysis of outbreaks of highly mutable viruses associated with chronic infections, such as human immunodeficiency virus and hepatitis C virus, whose transmissions are often carried out through minor intra-host variants, while the additional epidemiological information often is either unavailable or has a limited use. Results: The proposed framework QUasispecies Evolution, Network-based Transmission INference (QUENTIN) addresses the above challenges by evolutionary analysis of intra-host viral populations sampled by deep sequencing and Bayesian inference using general properties of social networks relevant to infection dissemination. This method allows inference of transmission direction even without supporting case-specific epidemiological information, identification of transmission clusters, and reconstruction of transmission history. QUENTIN was validated on experimental and simulated data, and applied to investigate HCV transmission within a community of hosts with high-risk behavior. It is available at https://github.com/skumsp/QUENTIN. Contact: pskums@gsu.edu or alexz@cs.gsu.edu or rahul@sfsu.edu or yek0@cdc.gov. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Pavel Skums
  - Department of Computer Science, Georgia State University; Centers for Disease Control and Prevention, Division of Viral Hepatitis, Atlanta, GA 30303, USA
- Rahul Singh
  - Department of Computer Science, San Francisco State University, San Francisco, CA 94132, USA
- Walker Gussler
  - Centers for Disease Control and Prevention, Division of Viral Hepatitis, Atlanta, GA 30303, USA
- Zoya Dimitrova
  - Centers for Disease Control and Prevention, Division of Viral Hepatitis, Atlanta, GA 30303, USA
- Sergey Knyazev
  - Department of Computer Science, Georgia State University
- Igor Mandric
  - Department of Computer Science, Georgia State University
- Sumathi Ramachandran
  - Centers for Disease Control and Prevention, Division of Viral Hepatitis, Atlanta, GA 30303, USA
- David Campo
  - Centers for Disease Control and Prevention, Division of Viral Hepatitis, Atlanta, GA 30303, USA
- Deeptanshu Jha
  - Department of Computer Science, San Francisco State University, San Francisco, CA 94132, USA
- Leonid Bunimovich
  - School of Mathematics, Georgia Institute of Technology, Atlanta, GA 30313, USA
- Connie Sexton
  - Centers for Disease Control and Prevention, Division of Viral Hepatitis, Atlanta, GA 30303, USA; Division of Global HIV and TB, Centers for Disease Control and Prevention, Atlanta, GA 30333, USA
- Siobhan O'Connor
  - Centers for Disease Control and Prevention, Division of Viral Hepatitis, Atlanta, GA 30303, USA; Division of HIV/AIDS Prevention, Centers for Disease Control and Prevention, Atlanta, GA 30333, USA
- Guo-Liang Xia
  - Centers for Disease Control and Prevention, Division of Viral Hepatitis, Atlanta, GA 30303, USA
- Yury Khudyakov
  - Centers for Disease Control and Prevention, Division of Viral Hepatitis, Atlanta, GA 30303, USA
|