1
Vaughan TG, Scire J, Nadeau SA, Stadler T. Estimates of early outbreak-specific SARS-CoV-2 epidemiological parameters from genomic data. Proc Natl Acad Sci U S A 2024; 121:e2308125121. [PMID: 38175864 PMCID: PMC10786264 DOI: 10.1073/pnas.2308125121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Accepted: 12/02/2023] [Indexed: 01/06/2024] Open
Abstract
We estimate the basic reproductive number and case counts for 15 distinct severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) outbreaks, distributed across 11 populations (10 countries and one cruise ship), based solely on phylodynamic analyses of genomic data. Our results indicate that, prior to significant public health interventions, the reproductive numbers for 10 (out of 15) of these outbreaks are similar, with median posterior estimates ranging between 1.4 and 2.8. These estimates provide a view which is complementary to that provided by those based on traditional line listing data. The genomic-based view is arguably less susceptible to biases resulting from differences in testing protocols, testing intensity, and import of cases into the community of interest. In the analyses reported here, the genomic data primarily provide information regarding which samples belong to a particular outbreak. We observe that once these outbreaks are identified, the sampling dates carry the majority of the information regarding the reproductive number. Finally, we provide genome-based estimates of the cumulative number of infections for each outbreak. For 7 out of 11 of the populations studied, the number of confirmed cases is much larger than the cumulative number of infections estimated from the sequence data, a possible explanation being the presence of unsequenced outbreaks in these populations.
Affiliation(s)
- Timothy G. Vaughan
  - Department of Biosystems Science and Engineering, Eidgenössische Technische Hochschule Zurich, Basel 4058, Switzerland
  - Computational Evolution Group, Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Jérémie Scire
  - Department of Biosystems Science and Engineering, Eidgenössische Technische Hochschule Zurich, Basel 4058, Switzerland
  - Computational Evolution Group, Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Sarah A. Nadeau
  - Department of Biosystems Science and Engineering, Eidgenössische Technische Hochschule Zurich, Basel 4058, Switzerland
  - Computational Evolution Group, Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Tanja Stadler
  - Department of Biosystems Science and Engineering, Eidgenössische Technische Hochschule Zurich, Basel 4058, Switzerland
  - Computational Evolution Group, Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
2
Volz E. Fitness, growth and transmissibility of SARS-CoV-2 genetic variants. Nat Rev Genet 2023; 24:724-734. [PMID: 37328556 DOI: 10.1038/s41576-023-00610-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Accepted: 04/25/2023] [Indexed: 06/18/2023]
Abstract
The massive scale of the global SARS-CoV-2 sequencing effort created new opportunities and challenges for understanding SARS-CoV-2 evolution. Rapid detection and assessment of new variants has become one of the principal objectives of genomic surveillance of SARS-CoV-2. Because of the pace and scale of sequencing, new strategies have been developed for characterizing fitness and transmissibility of emerging variants. In this Review, I discuss a wide range of approaches that have been rapidly developed in response to the public health threat posed by emerging variants, ranging from new applications of classic population genetics models to contemporary synthesis of epidemiological models and phylodynamic analysis. Many of these approaches can be adapted to other pathogens and will have increasing relevance as large-scale pathogen sequencing becomes a regular feature of many public health systems.
Affiliation(s)
- Erik Volz
  - Department of Infectious Disease Epidemiology, MRC Centre for Global Infectious Disease Analysis, Imperial College London, London, UK.
3
Hollingsworth BD, Grubaugh ND, Lazzaro BP, Murdock CC. Leveraging insect-specific viruses to elucidate mosquito population structure and dynamics. PLoS Pathog 2023; 19:e1011588. [PMID: 37651317 PMCID: PMC10470969 DOI: 10.1371/journal.ppat.1011588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/02/2023] Open
Abstract
Several aspects of mosquito ecology that are important for vectored disease transmission and control have been difficult to measure at epidemiologically important scales in the field. In particular, the ability to describe mosquito population structure and movement rates has been hindered by difficulty in quantifying fine-scale genetic variation among populations. The mosquito virome represents a possible avenue for quantifying population structure and movement rates across multiple spatial scales. Mosquito viromes contain a diversity of viruses, including several insect-specific viruses (ISVs) and "core" viruses that have high prevalence across populations. To date, virome studies have focused on viral discovery and have only recently begun examining viral ecology. While nonpathogenic ISVs may be of little public health relevance themselves, they provide a possible route for quantifying mosquito population structure and dynamics. For example, vertically transmitted viruses could behave as a rapidly evolving extension of the host's genome. It should be possible to apply established analytical methods to appropriate viral phylogenies and incidence data to generate novel approaches for estimating mosquito population structure and dispersal over epidemiologically relevant timescales. By studying the virome through the lens of spatial and genomic epidemiology, it may be possible to investigate otherwise cryptic aspects of mosquito ecology. A better understanding of mosquito population structure and dynamics is key for understanding mosquito-borne disease ecology, and methods based on ISVs could provide a powerful tool for informing mosquito control programs.
Affiliation(s)
- Brandon D Hollingsworth
  - Department of Entomology, Cornell University, Ithaca, New York, United States of America
  - Cornell Institute for Host Microbe Interaction and Disease, Cornell University, Ithaca, New York, United States of America
- Nathan D Grubaugh
  - Yale School of Public Health, New Haven, Connecticut, United States of America
  - Yale University, New Haven, Connecticut, United States of America
- Brian P Lazzaro
  - Department of Entomology, Cornell University, Ithaca, New York, United States of America
  - Cornell Institute for Host Microbe Interaction and Disease, Cornell University, Ithaca, New York, United States of America
- Courtney C Murdock
  - Department of Entomology, Cornell University, Ithaca, New York, United States of America
  - Cornell Institute for Host Microbe Interaction and Disease, Cornell University, Ithaca, New York, United States of America
  - Northeast Regional Center for Excellence in Vector-borne Diseases, Cornell University, Ithaca, New York, United States of America
4
Park Y, Martin MA, Koelle K. Epidemiological inference for emerging viruses using segregating sites. Nat Commun 2023; 14:3105. [PMID: 37248255 DOI: 10.1038/s41467-023-38809-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2022] [Accepted: 05/16/2023] [Indexed: 05/31/2023] Open
Abstract
Epidemiological models are commonly fit to case and pathogen sequence data to estimate parameters and to infer unobserved disease dynamics. Here, we present an inference approach based on sequence data that is well suited for model fitting early on during the expansion of a viral lineage. Our approach relies on a trajectory of segregating sites to infer epidemiological parameters within a Sequential Monte Carlo framework. Using simulated data, we first show that our approach accurately recovers key epidemiological quantities under a single-introduction scenario. We then apply our approach to SARS-CoV-2 sequence data from France, estimating a basic reproduction number of approximately 2.3-2.7 under an epidemiological model that allows for multiple introductions. The approach presented here indicates that inference methods relying on simple population genetic summary statistics can be informative about epidemiological parameters and can be used for reconstructing infectious disease dynamics during the early expansion of a viral lineage.
Affiliation(s)
- Yeongseon Park
  - Graduate Program in Population Biology, Ecology, and Evolution, Emory University, Atlanta, GA, 30322, USA
- Michael A Martin
  - Graduate Program in Population Biology, Ecology, and Evolution, Emory University, Atlanta, GA, 30322, USA
  - Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Katia Koelle
  - Department of Biology, Emory University, Atlanta, GA, 30322, USA.
  - Emory Center of Excellence for Influenza Research and Response (CEIRR), Atlanta, GA, USA.
5
Tang M, Dudas G, Bedford T, Minin VN. Fitting stochastic epidemic models to gene genealogies using linear noise approximation. Ann Appl Stat 2023. [DOI: 10.1214/21-aoas1583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
Affiliation(s)
- Mingwei Tang
  - Department of Statistics, University of Washington, Seattle
- Gytis Dudas
  - Gothenburg Global Biodiversity Centre (GGBC)
- Trevor Bedford
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center
6
Hajjej A, Abdrakhmanova S, Turganbekova A, Almawi WY. HLA allele and haplotype frequencies in Kazakhstani Russians and their relationship with other populations. HLA 2023; 101:249-261. [PMID: 36502279 DOI: 10.1111/tan.14937] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/10/2022] [Revised: 09/12/2022] [Accepted: 12/02/2022] [Indexed: 12/14/2022]
Abstract
HLA class I and class II genotypes from 947 Kazakhstani individuals of Russian origin were analyzed to investigate their most likely origin. The results were compared with similar data from other Russians (East and West), and also worldwide populations, using standard genetic distances, neighbor-joining dendrograms, correspondence and haplotype analysis. Across the five HLA loci genotyped (HLA-A, HLA-C, HLA-B, HLA-DRB1, and HLA-DQB1), 216 HLA alleles were identified. The most frequent alleles were A*02:01 (26.5%), B*07:02 (11.1%), C*04:01 (13.5%) and C*06:02 (12.1%), DRB1*07:01 (13.8%) and DRB1*15:01 (12.2%), and DQB1*03:01 (19.7%). Significant linkage disequilibrium was noted between all HLA pairs. DRB1*15:01 ~ DQB1*06:02 (10.5%), B*07:02 ~ C*07:02 (10.0%), B*07:02 ~ DRB1*15:01 (6.3%), and A*01:01 ~ B*08:01 (4.5%) were the most frequent two-locus haplotypes identified. Subsequent analyses showed that Kazakhstani Russians were closely related to West Russia-residing populations (Northwest Slavic, Vologda, Chelyabinsk, Moscow), East Europeans (Belarus Brest, Ukraine, Poland) and Scandinavians (Swedish, Finns), but distinct from East Russia-residing populations (Tuvians, Siberians from Chukotka, Kamchatka, and Ulchi), East Mediterraneans (Levantines, Turks, North Macedonians, Albanians), and East Asians (Koreans, Japanese, Taiwanese, Mongolians). These results are in accordance with historical data indicating that the Russians of central Asia originate mainly from European Russia during the migratory flow of the 18th and 19th centuries.
Affiliation(s)
- Abdelhafidh Hajjej
  - Department of Immunogenetics, National Blood Transfusion Center, Tunis, Tunisia
- Saniya Abdrakhmanova
  - Research and Production Center of Transfusion, Kazakhstan Ministry of Health, Astana, Kazakhstan
- Aida Turganbekova
  - Research and Production Center of Transfusion, Kazakhstan Ministry of Health, Astana, Kazakhstan
- Wassim Y Almawi
  - Department of Biomedical Sciences, School of Medicine, Nazarbayev University, Astana, Kazakhstan
  - Faculty of Sciences, El-Manar University, Tunis, Tunisia
7
Attwood SW, Hill SC, Aanensen DM, Connor TR, Pybus OG. Phylogenetic and phylodynamic approaches to understanding and combating the early SARS-CoV-2 pandemic. Nat Rev Genet 2022; 23:547-562. [PMID: 35459859 PMCID: PMC9028907 DOI: 10.1038/s41576-022-00483-8] [Citation(s) in RCA: 45] [Impact Index Per Article: 22.5] [Accepted: 03/23/2022] [Indexed: 01/05/2023]
Abstract
Determining the transmissibility, prevalence and patterns of movement of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections is central to our understanding of the impact of the pandemic and to the design of effective control strategies. Phylogenies (evolutionary trees) have provided key insights into the international spread of SARS-CoV-2 and enabled investigation of individual outbreaks and transmission chains in specific settings. Phylodynamic approaches combine evolutionary, demographic and epidemiological concepts and have helped track virus genetic changes, identify emerging variants and inform public health strategy. Here, we review and synthesize studies that illustrate how phylogenetic and phylodynamic techniques were applied during the first year of the pandemic, and summarize their contributions to our understanding of SARS-CoV-2 transmission and control.
Affiliation(s)
- Stephen W Attwood
  - Department of Zoology, University of Oxford, Oxford, UK.
  - Pathogen Genomics Unit, Public Health Wales NHS Trust, Cardiff, UK.
- Sarah C Hill
  - Department of Pathobiology and Population Sciences, Royal Veterinary College, University of London, London, UK
- David M Aanensen
  - Centre for Genomic Pathogen Surveillance, Wellcome Genome Campus, Hinxton, UK
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Thomas R Connor
  - Pathogen Genomics Unit, Public Health Wales NHS Trust, Cardiff, UK
  - School of Biosciences, Cardiff University, Cardiff, UK
- Oliver G Pybus
  - Department of Zoology, University of Oxford, Oxford, UK.
  - Department of Pathobiology and Population Sciences, Royal Veterinary College, University of London, London, UK.
8
Guinat C, Valenzuela Agüí C, Vaughan TG, Scire J, Pohlmann A, Staubach C, King J, Świętoń E, Dán Á, Černíková L, Ducatez MF, Stadler T. Disentangling the role of poultry farms and wild birds in the spread of highly pathogenic avian influenza virus in Europe. Virus Evol 2022; 8:veac073. [PMID: 36533150 PMCID: PMC9752641 DOI: 10.1093/ve/veac073] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/04/2022] [Revised: 07/21/2022] [Accepted: 08/18/2022] [Indexed: 08/12/2023] Open
Abstract
In winter 2016-7, Europe was severely hit by an unprecedented epidemic of highly pathogenic avian influenza viruses (HPAIVs), causing a significant impact on animal health, wildlife conservation, and livestock economic sustainability. By applying phylodynamic tools to virus sequences collected during the epidemic, we investigated when the first infections occurred, how many infections were unreported, which factors influenced virus spread, and how many spillover events occurred. HPAIV was likely introduced into poultry farms during the autumn, in line with the timing of wild birds' migration. In Germany, Hungary, and Poland, the epidemic was dominated by farm-to-farm transmission, showing that understanding of how farms are connected would greatly help control efforts. In the Czech Republic, the epidemic was dominated by wild bird-to-farm transmission, implying that more sustainable prevention strategies should be developed to reduce HPAIV exposure from wild birds. Inferred transmission parameters will be useful to parameterize predictive models of HPAIV spread. None of the predictors related to live poultry trade, poultry census, and geographic proximity were identified as supportive predictors of HPAIV spread between farms across borders. These results are crucial to better understand HPAIV transmission dynamics at the domestic-wildlife interface with the view to reduce the impact of future epidemics.
Affiliation(s)
- Claire Guinat
  - Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse, Basel 4058, Switzerland
  - Swiss Institute of Bioinformatics, Quartier Sorge, Lausanne 1015, Switzerland
- Cecilia Valenzuela Agüí
  - Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse, Basel 4058, Switzerland
  - Swiss Institute of Bioinformatics, Quartier Sorge, Lausanne 1015, Switzerland
- Timothy G Vaughan
  - Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse, Basel 4058, Switzerland
  - Swiss Institute of Bioinformatics, Quartier Sorge, Lausanne 1015, Switzerland
- Jérémie Scire
  - Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse, Basel 4058, Switzerland
  - Swiss Institute of Bioinformatics, Quartier Sorge, Lausanne 1015, Switzerland
- Anne Pohlmann
  - Friedrich-Loeffler-Institut, Suedufer 10, Greifswald – Insel Riems 17489, Germany
- Christoph Staubach
  - Friedrich-Loeffler-Institut, Suedufer 10, Greifswald – Insel Riems 17489, Germany
- Jacqueline King
  - Friedrich-Loeffler-Institut, Suedufer 10, Greifswald – Insel Riems 17489, Germany
- Edyta Świętoń
  - Department of Poultry Diseases, National Veterinary Research Institute, Al. Partyzantow 57, Pulawy 24-100, Poland
- Ádám Dán
  - DaNAm Vet Molbiol, Herman Ottó utca 5, Kőszeg 9730, Hungary
- Lenka Černíková
  - State Veterinary Institute Prague, Sidlistni 136/24, Prague 165 03, Czech Republic
- Mariette F Ducatez
  - IHAP, Université de Toulouse, INRAE, ENVT, 23 chemin des capelles, Toulouse 31076, France
- Tanja Stadler
  - Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse, Basel 4058, Switzerland
  - Swiss Institute of Bioinformatics, Quartier Sorge, Lausanne 1015, Switzerland
9
Featherstone LA, Zhang JM, Vaughan TG, Duchene S. Epidemiological Inference From Pathogen Genomes: A Review of Phylodynamic Models and Applications. Virus Evol 2022; 8:veac045. [PMID: 35775026 PMCID: PMC9241095 DOI: 10.1093/ve/veac045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2021] [Revised: 05/23/2022] [Accepted: 06/02/2022] [Indexed: 11/24/2022] Open
Abstract
Phylodynamics requires an interdisciplinary understanding of phylogenetics, epidemiology, and statistical inference. It has also experienced more intense application than ever before amid the SARS-CoV-2 pandemic. In light of this, we present a review of phylodynamic models beginning with foundational models and assumptions. Our target audience is public health researchers, epidemiologists, and biologists seeking a working knowledge of the links between epidemiology, evolutionary models, and resulting epidemiological inference. We discuss the assumptions linking evolutionary models of pathogen population size to epidemiological models of the infected population size. We then describe statistical inference for phylodynamic models and list how output parameters can be rearranged for epidemiological interpretation. We go on to cover more sophisticated models and finish by highlighting future directions.
Affiliation(s)
- Leo A Featherstone
  - Peter Doherty Institute for Infection and Immunity, University of Melbourne, Australia
- Joshua M Zhang
  - Peter Doherty Institute for Infection and Immunity, University of Melbourne, Australia
- Timothy G Vaughan
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - Swiss Institute of Bioinformatics
- Sebastian Duchene
  - Peter Doherty Institute for Infection and Immunity, University of Melbourne, Australia
10
Andréoletti J, Zwaans A, Warnock RCM, Aguirre-Fernández G, Barido-Sottani J, Gupta A, Stadler T, Manceau M. The Occurrence Birth-Death Process for combined-evidence analysis in macroevolution and epidemiology. Syst Biol 2022; 71:1440-1452. [PMID: 35608305 PMCID: PMC9558841 DOI: 10.1093/sysbio/syac037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2021] [Revised: 05/02/2022] [Accepted: 05/06/2022] [Indexed: 11/28/2022] Open
Abstract
Phylodynamic models generally aim at jointly inferring phylogenetic relationships, model parameters, and more recently, the number of lineages through time, based on molecular sequence data. In the fields of epidemiology and macroevolution, these models can be used to estimate, respectively, the past number of infected individuals (prevalence) or the past number of species (paleodiversity) through time. Recent years have seen the development of “total-evidence” analyses, which combine molecular and morphological data from extant and past sampled individuals in a unified Bayesian inference framework. Even sampled individuals characterized only by their sampling time, that is, lacking morphological and molecular data, which we call occurrences, provide invaluable information to estimate the past number of lineages. Here, we present new methodological developments around the fossilized birth–death process enabling us to (i) incorporate occurrence data in the likelihood function; (ii) consider piecewise-constant birth, death, and sampling rates; and (iii) estimate the past number of lineages, with or without knowledge of the underlying tree. We implement our method in the RevBayes software environment, enabling its use along with a large set of models of molecular and morphological evolution, and validate the inference workflow using simulations under a wide range of conditions. We finally illustrate our new implementation using two empirical data sets stemming from the fields of epidemiology and macroevolution. In epidemiology, we infer the prevalence of the coronavirus disease 2019 outbreak on the Diamond Princess ship, by taking into account jointly the case count record (occurrences) along with viral sequences for a fraction of infected individuals. In macroevolution, we infer the diversity trajectory of cetaceans using molecular and morphological data from extant taxa, morphological data from fossils, as well as numerous fossil occurrences. 
The joint modeling of occurrences and trees holds the promise to further bridge the gap between traditional epidemiology and pathogen genomics, as well as paleontology and molecular phylogenetics. [Birth–death model; epidemiology; fossils; macroevolution; occurrences; phylogenetics; skyline.]
Affiliation(s)
- Jérémy Andréoletti
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Antoine Zwaans
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Rachel C M Warnock
  - GeoZentrum Nordbayern, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
- Joëlle Barido-Sottani
  - Department of Ecology, Evolution and Organismal Biology, Iowa State University, Ames, USA
- Ankit Gupta
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Tanja Stadler
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Marc Manceau
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
11
Cappello L, Kim J, Liu S, Palacios JA. Statistical Challenges in Tracking the Evolution of SARS-CoV-2. Stat Sci 2022; 37:162-182. [PMID: 36034090 PMCID: PMC9409356 DOI: 10.1214/22-sts853] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/19/2022]
Abstract
Genomic surveillance of SARS-CoV-2 has been instrumental in tracking the spread and evolution of the virus during the pandemic. The availability of SARS-CoV-2 molecular sequences isolated from infected individuals, coupled with phylodynamic methods, has provided insights into the origin of the virus, its evolutionary rate, the timing of introductions, the patterns of transmission, and the rise of novel variants that have spread through populations. Despite enormous global efforts of governments, laboratories, and researchers to collect and sequence molecular data, many challenges remain in analyzing and interpreting the data collected. Here, we describe the models and methods currently used to monitor the spread of SARS-CoV-2, discuss long-standing and new statistical challenges, and propose a method for tracking the rise of novel variants during the epidemic.
Affiliation(s)
- Lorenzo Cappello
  - Lorenzo Cappello is Assistant Professor, Departments of Economics and Business, Universitat Pompeu Fabra, 08005, Spain
- Jaehee Kim
  - Jaehee Kim is Assistant Professor, Department of Computational Biology, Cornell University, Ithaca, New York 14853, USA
- Sifan Liu
  - Sifan Liu is a Ph.D. student, Department of Statistics, Stanford University, Stanford, California 94305, USA
- Julia A. Palacios
  - Julia A. Palacios is Assistant Professor, Departments of Statistics and Biomedical Data Sciences, Stanford University, Stanford, California 94305, USA
12
Methods Combining Genomic and Epidemiological Data in the Reconstruction of Transmission Trees: A Systematic Review. Pathogens 2022; 11:252. [PMID: 35215195 PMCID: PMC8875843 DOI: 10.3390/pathogens11020252] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/09/2021] [Revised: 02/08/2022] [Accepted: 02/11/2022] [Indexed: 11/17/2022] Open
Abstract
In order to better understand transmission dynamics and appropriately target control and preventive measures, studies have aimed to identify who-infected-whom in actual outbreaks. Numerous reconstruction methods exist, each with its own assumptions, types of data, and inference strategy. Thus, selecting a method can be difficult. Following PRISMA guidelines, we systematically reviewed the literature for methods combining epidemiological and genomic data in transmission tree reconstruction. We identified 22 methods from the 41 selected articles. We defined three families according to how genomic data was handled: a non-phylogenetic family, a sequential phylogenetic family, and a simultaneous phylogenetic family. We discussed methods according to the data needed as well as the underlying sequence mutation, within-host evolution, transmission, and case observation. In the non-phylogenetic family consisting of eight methods, pairwise genetic distances were estimated. In the phylogenetic families, transmission trees were inferred from phylogenetic trees either simultaneously (nine methods) or sequentially (five methods). While a majority of methods (17/22) modeled the transmission process, few (8/22) took into account imperfect case detection. Within-host evolution was generally (7/8) modeled as a coalescent process. These practical and theoretical considerations were highlighted in order to help select the appropriate method for an outbreak.
13
Zarebski AE, du Plessis L, Parag KV, Pybus OG. A computationally tractable birth-death model that combines phylogenetic and epidemiological data. PLoS Comput Biol 2022; 18:e1009805. [PMID: 35148311 PMCID: PMC8903285 DOI: 10.1371/journal.pcbi.1009805] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/03/2021] [Revised: 03/08/2022] [Accepted: 01/05/2022] [Indexed: 11/19/2022] Open
Abstract
Inferring the dynamics of pathogen transmission during an outbreak is an important problem in infectious disease epidemiology. In mathematical epidemiology, estimates are often informed by time series of confirmed cases, while in phylodynamics genetic sequences of the pathogen, sampled through time, are the primary data source. Each type of data provides different, and potentially complementary, insight. Recent studies have recognised that combining data sources can improve estimates of the transmission rate and the number of infected individuals. However, inference methods are typically highly specialised and field-specific and are either computationally prohibitive or require intensive simulation, limiting their real-time utility. We present a novel birth-death phylogenetic model and derive a tractable analytic approximation of its likelihood, the computational complexity of which is linear in the size of the dataset. This approach combines epidemiological and phylodynamic data to produce estimates of key parameters of transmission dynamics and the unobserved prevalence. Using simulated data, we (a) show that the approximation agrees well with existing methods, (b) validate the claim of linear complexity and (c) explore robustness to model misspecification. This approximation facilitates inference on large datasets, which is increasingly important as large genomic sequence datasets become commonplace.
Affiliation(s)
- Louis du Plessis
  - Department of Zoology, University of Oxford, Oxford, United Kingdom
- Kris Varun Parag
  - MRC Centre for Global Infectious Disease Analysis, Imperial College London, London, United Kingdom
14
King AA, Lin Q, Ionides EL. Markov genealogy processes. Theor Popul Biol 2022; 143:77-91. [PMID: 34896438 PMCID: PMC8846264 DOI: 10.1016/j.tpb.2021.11.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/26/2021] [Revised: 11/19/2021] [Accepted: 11/22/2021] [Indexed: 02/03/2023]
Abstract
We construct a family of genealogy-valued Markov processes that are induced by a continuous-time Markov population process. We derive exact expressions for the likelihood of a given genealogy conditional on the history of the underlying population process. These lead to a nonlinear filtering equation which can be used to design efficient Monte Carlo inference algorithms. We demonstrate these calculations with several examples. Existing full-information approaches for phylodynamic inference are special cases of the theory.
Affiliation(s)
- Aaron A. King
- Department of Ecology & Evolutionary Biology, Center for the Study of Complex Systems, Center for Computational Medicine & Biology, and Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI 48109 USA
- Qianying Lin
- Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI 48109 USA
- Edward L. Ionides
- Department of Statistics and Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI 48109 USA
|
15
|
Glennon EE, Bruijning M, Lessler J, Miller IF, Rice BL, Thompson RN, Wells K, Metcalf CJE. Challenges in modeling the emergence of novel pathogens. Epidemics 2021; 37:100516. [PMID: 34775298 DOI: 10.1016/j.epidem.2021.100516] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2021] [Revised: 09/29/2021] [Accepted: 10/22/2021] [Indexed: 01/24/2023] Open
Abstract
The emergence of infectious agents with pandemic potential presents scientific challenges from detection to data interpretation to understanding determinants of risk and forecasts. Mathematical models could play an essential role in how we prepare for future emergent pathogens. Here, we describe core directions for expansion of the existing tools and knowledge base, including: using mathematical models to identify critical directions and paths for strengthening data collection to detect and respond to outbreaks of novel pathogens; expanding basic theory to identify infectious agents and contexts that present the greatest risks, over both the short and longer term; strengthening estimation tools that make the most use of the likely range and uncertainties in existing data; and ensuring modelling applications are carefully communicated and developed within diverse and equitable collaborations for increased public health benefit.
Affiliation(s)
- Emma E Glennon
- Disease Dynamics Unit, Department of Veterinary Medicine, University of Cambridge, Cambridge CB3 0ES, UK
- Marjolein Bruijning
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA
- Justin Lessler
- Johns Hopkins Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA
- Ian F Miller
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA; Rocky Mountain Biological Laboratory, Crested Butte, CO 81224, USA
- Benjamin L Rice
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA; Madagascar Health and Environmental Research (MAHERY), Maroantsetra, Madagascar
- Robin N Thompson
- Mathematics Institute, University of Warwick, Warwick CV4 7AL, UK; The Zeeman Institute for Systems Biology and Infectious Disease Epidemiology Research, University of Warwick, Warwick CV4 7AL, UK
- Konstans Wells
- Department of Biosciences, Swansea University, Swansea SA2 8PP, UK
- C Jessica E Metcalf
- Disease Dynamics Unit, Department of Veterinary Medicine, University of Cambridge, Cambridge CB3 0ES, UK; Princeton School of Public and International Affairs, Princeton University, Princeton, NJ, USA
|
16
|
Smith MR, Trofimova M, Weber A, Duport Y, Kühnert D, von Kleist M. Rapid incidence estimation from SARS-CoV-2 genomes reveals decreased case detection in Europe during summer 2020. Nat Commun 2021; 12:6009. [PMID: 34650062 PMCID: PMC8517019 DOI: 10.1038/s41467-021-26267-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2021] [Accepted: 09/24/2021] [Indexed: 12/24/2022] Open
Abstract
By October 2021, 230 million SARS-CoV-2 diagnoses had been reported. Yet, a considerable proportion of cases remains undetected. Here, we propose GInPipe, a method that rapidly reconstructs SARS-CoV-2 incidence profiles solely from publicly available, time-stamped viral genomes. We validate GInPipe against simulated outbreaks and elaborate phylodynamic analyses. Using available sequence data, we reconstruct incidence histories for Denmark, Scotland, Switzerland, and Victoria (Australia) and demonstrate how to use the method to investigate the effects of changing testing policies on case ascertainment. Specifically, we find that under-reporting was highest during summer 2020 in Europe, coinciding with more liberal testing policies at times of low testing capacities. Due to the increased use of real-time sequencing, it is envisaged that GInPipe can complement established surveillance tools to monitor the SARS-CoV-2 pandemic. In post-pandemic times, when diagnostic efforts are decreasing, GInPipe may facilitate the detection of hidden infection dynamics.
Affiliation(s)
- Maureen Rebecca Smith
- Systems Medicine of Infectious Disease (P5), Robert Koch Institute, Berlin, Germany
- Bioinformatics (MF1), Robert Koch Institute, Berlin, Germany
- Maria Trofimova
- Systems Medicine of Infectious Disease (P5), Robert Koch Institute, Berlin, Germany
- Bioinformatics (MF1), Robert Koch Institute, Berlin, Germany
- Ariane Weber
- Transmission, Infection, Diversification and Evolution Group, Max-Planck Institute for the Science of Human History, Jena, Germany
- Yannick Duport
- Systems Medicine of Infectious Disease (P5), Robert Koch Institute, Berlin, Germany
- Bioinformatics (MF1), Robert Koch Institute, Berlin, Germany
- Denise Kühnert
- Transmission, Infection, Diversification and Evolution Group, Max-Planck Institute for the Science of Human History, Jena, Germany
- German COVID Omics Initiative (deCOI), Bonn, Germany
- Max von Kleist
- Systems Medicine of Infectious Disease (P5), Robert Koch Institute, Berlin, Germany
- Bioinformatics (MF1), Robert Koch Institute, Berlin, Germany
- German COVID Omics Initiative (deCOI), Bonn, Germany
|
17
|
Nemira A, Adeniyi AE, Gasich EL, Bulda KY, Valentovich LN, Krasko AG, Glebova O, Kirpich A, Skums P. SARS-CoV-2 transmission dynamics in Belarus in 2020 revealed by genomic and incidence data analysis. Communications Medicine 2021; 1:31. [PMID: 35602211 PMCID: PMC9053244 DOI: 10.1038/s43856-021-00031-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2021] [Accepted: 08/25/2021] [Indexed: 12/15/2022] Open
Abstract
Background: Non-pharmaceutical interventions (NPIs) have been implemented worldwide to curb COVID-19 spread. Belarus is a rare case of a country with a relatively modern healthcare system, where highly limited NPIs have been enacted. Thus, investigation of Belarusian COVID-19 dynamics is essential for the local and global assessment of the impact of NPI strategies.
Methods: We integrate genomic epidemiology and surveillance methods to investigate the spread of SARS-CoV-2 in Belarus in 2020. We utilize phylodynamics, phylogeography, and probabilistic bias inference to study the virus import and export routes, the dynamics of the effective reproduction number, and the incidence of SARS-CoV-2 infection.
Results: Here we show that the estimated cumulative number of infections by June 2020 exceeds the confirmed case number by a factor of ~4 (95% confidence interval (2; 9)). Intra-country SARS-CoV-2 genomic diversity originates from at least 18 introductions from different regions, with a high proportion of regional transmissions. Phylodynamic analysis indicates a moderate reduction of the effective reproductive number after the introduction of limited NPIs, but its magnitude is lower than for developed countries with large-scale NPIs. On the other hand, the effective reproduction number estimate is comparable with that for the neighboring Ukraine, where NPIs were broader.
Conclusions: The example of Belarus demonstrates how countries with relatively low outward population mobility continue to be integral parts of the global epidemiological environment. Comparison of the effective reproduction number dynamics for Belarus and other countries reveals the effect of different NPI strategies but also emphasizes the role of regional Eastern European sociodemographic factors in the virus spread.
Belarus is one of few European countries that has enacted limited measures to contain SARS-CoV-2, the virus that causes COVID-19. We study the genetic sequences of the SARS-CoV-2 virus circulating in Belarus and other countries in 2020 to investigate how it might have been imported into the country and spread there. We show that the virus was repeatedly imported from and exported to different regions, including a large portion of regional transmissions that occurred despite stricter measures implemented by Belarus’ neighbors. There was a moderate reduction of the virus reproductive number—a measure of virus transmission speed—after April 2020, but its magnitude was lower than for developed countries with more stringent epidemiological interventions. These findings shed light on the COVID-19 spread in Eastern Europe and highlight the impact of public health policies and of regional factors on this spread.
Nemira et al. study the genomic epidemiology and phylodynamics of SARS-CoV-2 in Belarus. They identify potential introduction routes of the virus from other countries, determine that during the first wave of the pandemic the number of infections was likely several times higher than reported case numbers, and estimate the impact of early non-pharmaceutical interventions on SARS-CoV-2 transmission.
|
18
|
Louca S, McLaughlin A, MacPherson A, Joy JB, Pennell MW. Fundamental Identifiability Limits in Molecular Epidemiology. Mol Biol Evol 2021; 38:4010-4024. [PMID: 34009339 PMCID: PMC8382926 DOI: 10.1093/molbev/msab149] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022] Open
Abstract
Viral phylogenies provide crucial information on the spread of infectious diseases, and many studies fit mathematical models to phylogenetic data to estimate epidemiological parameters such as the effective reproduction ratio (Re) over time. Such phylodynamic inferences often complement or even substitute for conventional surveillance data, particularly when sampling is poor or delayed. It remains generally unknown, however, how robust phylodynamic epidemiological inferences are, especially when there is uncertainty regarding pathogen prevalence and sampling intensity. Here, we use recently developed mathematical techniques to fully characterize the information that can possibly be extracted from serially collected viral phylogenetic data, in the context of the commonly used birth-death-sampling model. We show that for any candidate epidemiological scenario, there exists a myriad of alternative, markedly different, and yet plausible "congruent" scenarios that cannot be distinguished using phylogenetic data alone, no matter how large the data set. In the absence of strong constraints or rate priors across the entire study period, neither maximum-likelihood fitting nor Bayesian inference can reliably reconstruct the true epidemiological dynamics from phylogenetic data alone; rather, estimators can only converge to the "congruence class" of the true dynamics. We propose concrete and feasible strategies for making more robust epidemiological inferences from viral phylogenetic data.
Affiliation(s)
- Stilianos Louca
- Department of Biology, University of Oregon, Eugene, OR, USA
- Institute of Ecology and Evolution, University of Oregon, Eugene, OR, USA
- Angela McLaughlin
- British Columbia Centre for Excellence in HIV/AIDS, Vancouver, BC, Canada
- Bioinformatics, University of British Columbia, Vancouver, BC, Canada
- Ailene MacPherson
- Biodiversity Research Centre, University of British Columbia, Vancouver, BC, Canada
- Department of Zoology, University of British Columbia, Vancouver, BC, Canada
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada
- Jeffrey B Joy
- British Columbia Centre for Excellence in HIV/AIDS, Vancouver, BC, Canada
- Bioinformatics, University of British Columbia, Vancouver, BC, Canada
- Department of Medicine, University of British Columbia, Vancouver, BC, Canada
- Matthew W Pennell
- Biodiversity Research Centre, University of British Columbia, Vancouver, BC, Canada
- Department of Zoology, University of British Columbia, Vancouver, BC, Canada
|
19
|
MacPherson A, Louca S, McLaughlin A, Joy JB, Pennell MW. Unifying Phylogenetic Birth-Death Models in Epidemiology and Macroevolution. Syst Biol 2021; 71:172-189. [PMID: 34165577 PMCID: PMC8972974 DOI: 10.1093/sysbio/syab049] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2021] [Revised: 06/09/2021] [Accepted: 06/21/2021] [Indexed: 11/13/2022] Open
Abstract
Birth–death stochastic processes are the foundations of many phylogenetic models and are widely used to make inferences about epidemiological and macroevolutionary dynamics. There are a large number of birth–death model variants that have been developed; these impose different assumptions about the temporal dynamics of the parameters and about the sampling process. As each of these variants was individually derived, it has been difficult to understand the relationships between them as well as their precise biological and mathematical assumptions. Without a common mathematical foundation, deriving new models is nontrivial. Here, we unify these models into a single framework, prove that many previously developed epidemiological and macroevolutionary models are all special cases of a more general model, and illustrate the connections between these variants. This unification includes both models where the process is the same for all lineages and those in which it varies across types. We also outline a straightforward procedure for deriving likelihood functions for arbitrarily complex birth–death(-sampling) models that will hopefully allow researchers to explore a wider array of scenarios than was previously possible. By rederiving existing single-type birth–death sampling models, we clarify and synthesize the range of explicit and implicit assumptions made by these models. [Birth–death processes; epidemiology; macroevolution; phylogenetics; statistical inference.]
Affiliation(s)
- Ailene MacPherson
- Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver, Canada; Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Canada
- Stilianos Louca
- Department of Biology, University of Oregon, USA; Institute of Ecology and Evolution, University of Oregon, USA
- Angela McLaughlin
- British Columbia Centre for Excellence in HIV/AIDS, Vancouver, Canada; Bioinformatics, University of British Columbia, Vancouver, Canada
- Jeffrey B Joy
- British Columbia Centre for Excellence in HIV/AIDS, Vancouver, Canada; Bioinformatics, University of British Columbia, Vancouver, Canada; Department of Medicine, University of British Columbia, Vancouver, Canada
- Matthew W Pennell
- Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver, Canada
|
20
|
Featherstone LA, Di Giallonardo F, Holmes EC, Vaughan TG, Duchêne S. Infectious disease phylodynamics with occurrence data. Methods Ecol Evol 2021. [DOI: 10.1111/2041-210x.13620] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Affiliation(s)
- Leo A. Featherstone
- Department of Microbiology and Immunology, Peter Doherty Institute for Infection and Immunity, University of Melbourne, Melbourne, Vic., Australia
- Edward C. Holmes
- Marie Bashir Institute for Infectious Diseases and Biosecurity, The University of Sydney, Sydney, NSW, Australia
- Charles Perkins Centre, School of Life and Environmental Sciences, The University of Sydney, Sydney, NSW, Australia
- School of Medical Sciences, The University of Sydney, Sydney, NSW, Australia
- Timothy G. Vaughan
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Sebastián Duchêne
- Department of Microbiology and Immunology, Peter Doherty Institute for Infection and Immunity, University of Melbourne, Melbourne, Vic., Australia
|
21
|
Nemira A, Adeniyi AE, Gasich EL, Bulda KY, Valentovich LN, Krasko AG, Glebova O, Kirpich A, Skums P. SARS-CoV-2 transmission dynamics in Belarus revealed by genomic and incidence data analysis. medRxiv 2021:2021.04.13.21255404. [PMID: 33907756 PMCID: PMC8077579 DOI: 10.1101/2021.04.13.21255404] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]
Abstract
Since the emergence of COVID-19, a series of non-pharmaceutical interventions (NPIs) has been implemented by governments and public health authorities worldwide to control and curb the ongoing pandemic spread. From that perspective, Belarus is one of a few countries with a relatively modern healthcare system, where much narrower NPIs have been put in place. Given the uniqueness of this Belarusian experience, understanding its COVID-19 epidemiological dynamics is essential not only for the local assessment, but also for a better insight into the impact of different NPI strategies globally. In this work, we integrate genomic epidemiology and surveillance methods to investigate the emergence and spread of SARS-CoV-2 in the country. The observed Belarusian SARS-CoV-2 genetic diversity originated from at least eighteen separate introductions, at least five of which resulted in ongoing domestic transmissions. The introduction sources represent a wide variety of regions, although the proportion of regional virus introductions and exports from/to geographical neighbors appears to be higher than for other European countries. Phylodynamic analysis indicates a moderate reduction in the effective reproductive number ℛe after the introduction of limited NPIs, with the reduction magnitude generally being lower than for countries with large-scale NPIs. On the other hand, the estimate of the Belarusian ℛe at the early epidemic stage is comparable with this number for the neighboring ex-USSR country of Ukraine, where much broader NPIs have been implemented. The actual number of cases by the end of May 2020 was predicted to be 2-9 times higher than the detected number of cases.
Affiliation(s)
- Alina Nemira
- Department of Computer Science, Georgia State University, Atlanta, Georgia, USA
- Elena L. Gasich
- Republican Research and Practical Center for Epidemiology and Microbiology, Minsk, Belarus
- Kirill Y. Bulda
- Republican Research and Practical Center for Epidemiology and Microbiology, Minsk, Belarus
- Anatoly G. Krasko
- Republican Research and Practical Center for Epidemiology and Microbiology, Minsk, Belarus
- Olga Glebova
- Department of Computer Science, Georgia State University, Atlanta, Georgia, USA
- Alexander Kirpich
- Department of Population Health Sciences, School of Public Health, Georgia State University, Atlanta, Georgia, USA
- Pavel Skums
- Department of Computer Science, Georgia State University, Atlanta, Georgia, USA
|
22
|
Hufsky F, Lamkiewicz K, Almeida A, Aouacheria A, Arighi C, Bateman A, Baumbach J, Beerenwinkel N, Brandt C, Cacciabue M, Chuguransky S, Drechsel O, Finn RD, Fritz A, Fuchs S, Hattab G, Hauschild AC, Heider D, Hoffmann M, Hölzer M, Hoops S, Kaderali L, Kalvari I, von Kleist M, Kmiecinski R, Kühnert D, Lasso G, Libin P, List M, Löchel HF, Martin MJ, Martin R, Matschinske J, McHardy AC, Mendes P, Mistry J, Navratil V, Nawrocki EP, O’Toole ÁN, Ontiveros-Palacios N, Petrov AI, Rangel-Pineros G, Redaschi N, Reimering S, Reinert K, Reyes A, Richardson L, Robertson DL, Sadegh S, Singer JB, Theys K, Upton C, Welzel M, Williams L, Marz M. Computational strategies to combat COVID-19: useful tools to accelerate SARS-CoV-2 and coronavirus research. Brief Bioinform 2021; 22:642-663. [PMID: 33147627 PMCID: PMC7665365 DOI: 10.1093/bib/bbaa232] [Citation(s) in RCA: 81] [Impact Index Per Article: 27.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2020] [Revised: 07/28/2020] [Accepted: 08/26/2020] [Indexed: 12/16/2022] Open
Abstract
SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) is a novel virus of the family Coronaviridae. The virus causes the infectious disease COVID-19. The biology of coronaviruses has been studied for many years. However, bioinformatics tools designed explicitly for SARS-CoV-2 have only recently been developed as a rapid reaction to the need for fast detection, understanding and treatment of COVID-19. To control the ongoing COVID-19 pandemic, it is of utmost importance to get insight into the evolution and pathogenesis of the virus. In this review, we cover bioinformatics workflows and tools for the routine detection of SARS-CoV-2 infection, the reliable analysis of sequencing data, the tracking of the COVID-19 pandemic and evaluation of containment measures, the study of coronavirus evolution, the discovery of potential drug targets and development of therapeutic strategies. For each tool, we briefly describe its use case and how it advances research specifically for SARS-CoV-2. All tools are free to use and available online, either through web applications or public code repositories. Contact: evbc@unj-jena.de.
Affiliation(s)
- Christian Brandt
- Institute of Infectious Disease and Infection Control at Jena University Hospital, Germany
- Marco Cacciabue
- Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) working on FMDV virology at the Instituto de Agrobiotecnología y Biología Molecular (IABiMo, INTA-CONICET) and at the Departamento de Ciencias Básicas, Universidad Nacional de Luján (UNLu), Argentina
- Oliver Drechsel
- bioinformatics department at the Robert Koch-Institute, Germany
- Adrian Fritz
- Computational Biology of Infection Research group of Alice C. McHardy at the Helmholtz Centre for Infection Research, Germany
- Stephan Fuchs
- bioinformatics department at the Robert Koch-Institute, Germany
- Georges Hattab
- Bioinformatics Division at Philipps-University Marburg, Germany
- Dominik Heider
- Data Science in Biomedicine at the Philipps-University of Marburg, Germany
- Stefan Hoops
- Biocomplexity Institute and Initiative at the University of Virginia, USA
- Lars Kaderali
- Bioinformatics and head of the Institute of Bioinformatics at University Medicine Greifswald, Germany
- Max von Kleist
- bioinformatics department at the Robert Koch-Institute, Germany
- René Kmiecinski
- bioinformatics department at the Robert Koch-Institute, Germany
- Gorka Lasso
- Chandran Lab, Albert Einstein College of Medicine, USA
- Alice C McHardy
- Computational Biology of Infection Research Lab at the Helmholtz Centre for Infection Research in Braunschweig, Germany
- Pedro Mendes
- Center for Quantitative Medicine of the University of Connecticut School of Medicine, USA
- Vincent Navratil
- Bioinformatics and Systems Biology at the Rhône Alpes Bioinformatics core facility, Université de Lyon, France
- Nicole Redaschi
- Development of the Swiss-Prot group at the SIB for UniProt and SIB resources that cover viral biology (ViralZone)
- Susanne Reimering
- Computational Biology of Infection Research group of Alice C. McHardy at the Helmholtz Centre for Infection Research
- Sepideh Sadegh
- Chair of Experimental Bioinformatics at Technical University of Munich, Germany
- Joshua B Singer
- MRC-University of Glasgow Centre for Virus Research, Glasgow, Scotland, UK
- Chris Upton
- Department of Biochemistry and Microbiology, University of Victoria, Canada
- Manja Marz
- Friedrich Schiller University Jena, Germany
|
23
|
Ingle DJ, Howden BP, Duchene S. Development of Phylodynamic Methods for Bacterial Pathogens. Trends Microbiol 2021; 29:788-797. [PMID: 33736902 DOI: 10.1016/j.tim.2021.02.008] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/26/2020] [Revised: 02/13/2021] [Accepted: 02/15/2021] [Indexed: 11/30/2022]
Abstract
Phylodynamic methods have been essential to understand the interplay between the evolution and epidemiology of infectious diseases. To date, the field has centered on viruses. Bacterial pathogens are seldom analyzed under such phylodynamic frameworks, due to their complex genome evolution and, until recently, a paucity of whole-genome sequence data sets with rich associated metadata. We posit that the increasing availability of bacterial genomes and epidemiological data means that the field is now ripe to lay the foundations for applying phylodynamics to bacterial pathogens. The development of new methods that integrate more complex genomic and ecological data will help to inform public health surveillance and control strategies for bacterial pathogens that represent serious threats to human health.
Affiliation(s)
- Danielle J Ingle
- Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, Victoria, Australia; National Centre for Epidemiology and Population Health, The Australian National University, Canberra, Australia; Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, Victoria, Australia
- Benjamin P Howden
- Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, Victoria, Australia; Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, Victoria, Australia; Doherty Applied Microbial Genomics, Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, Victoria, Australia
- Sebastian Duchene
- Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, Victoria, Australia
|
24
|
Manceau M, Gupta A, Vaughan T, Stadler T. The probability distribution of the ancestral population size conditioned on the reconstructed phylogenetic tree with occurrence data. J Theor Biol 2021; 509:110400. [PMID: 32739241 PMCID: PMC7733867 DOI: 10.1016/j.jtbi.2020.110400] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2019] [Revised: 05/07/2020] [Accepted: 07/03/2020] [Indexed: 01/10/2023]
Abstract
We consider a homogeneous birth-death process with three different sampling schemes. First, individuals can be sampled through time and included in a reconstructed phylogenetic tree. Second, they can be sampled through time and only recorded as a point 'occurrence' along a timeline. Third, extant individuals can be sampled and included in the reconstructed phylogenetic tree with a fixed probability. We further consider that sampled individuals can be removed or not from the process, upon sampling, with fixed probability. We derive the probability distribution of the population size at any time in the past conditional on the joint observation of a reconstructed phylogenetic tree and a record of occurrences not included in the tree. We also provide an algorithm to simulate ancestral population size trajectories given the observation of a reconstructed phylogenetic tree and occurrences. This distribution can be readily used to draw inferences about the ancestral population size in the field of epidemiology and macroevolution. In epidemiology, these results will allow data from epidemiological case count studies to be used in conjunction with molecular sequencing data (yielding reconstructed phylogenetic trees) to coherently estimate prevalence through time. In macroevolution, it will foster the joint examination of the fossil record and extant taxa to reconstruct past biodiversity.
Affiliation(s)
- Marc Manceau
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Ankit Gupta
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Timothy Vaughan
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Tanja Stadler
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
|
25
|
The probability distribution of the reconstructed phylogenetic tree with occurrence data. J Theor Biol 2020; 488:110115. [DOI: 10.1016/j.jtbi.2019.110115] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/09/2019] [Revised: 12/09/2019] [Accepted: 12/11/2019] [Indexed: 11/20/2022]
|