1
Liu Y, Sapoval N, Gallego-García P, Tomás L, Posada D, Treangen TJ, Stadler LB. Crykey: Rapid identification of SARS-CoV-2 cryptic mutations in wastewater. Nat Commun 2024; 15:4545. [PMID: 38806450 PMCID: PMC11133379 DOI: 10.1038/s41467-024-48334-w] [Received: 11/15/2023] [Accepted: 04/29/2024] [Indexed: 05/30/2024]
Abstract
Wastewater surveillance for SARS-CoV-2 provides early warnings of emerging variants of concern and can be used to screen for novel cryptic linked-read mutations, which are co-occurring single nucleotide mutations that are rare, or entirely missing, in existing SARS-CoV-2 databases. While previous approaches have focused on specific regions of the SARS-CoV-2 genome, there is a need for computational tools capable of efficiently tracking cryptic mutations across the entire genome and investigating their potential origin. We present Crykey, a tool for rapidly identifying rare linked-read mutations across the genome of SARS-CoV-2. We evaluated the utility of Crykey on over 3,000 wastewater and over 22,000 clinical samples; our findings are three-fold: i) we identify hundreds of cryptic mutations that cover the entire SARS-CoV-2 genome, ii) we track the presence of these cryptic mutations across multiple wastewater treatment plants and over three years of sampling in Houston, and iii) we find that a handful of cryptic mutations in wastewater mirror cryptic mutations in clinical samples and investigate their potential to represent real cryptic lineages. In summary, Crykey enables large-scale detection of cryptic mutations in wastewater that represent potential circulating cryptic lineages, serving as a new computational tool for wastewater surveillance of SARS-CoV-2.
Affiliation(s)
- Yunxi Liu
  - Department of Computer Science, Rice University, Houston, TX, 77005, USA
- Nicolae Sapoval
  - Department of Computer Science, Rice University, Houston, TX, 77005, USA
- Pilar Gallego-García
  - CINBIO, Universidade de Vigo, 36310, Vigo, Spain
  - Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Vigo, Spain
- Laura Tomás
  - CINBIO, Universidade de Vigo, 36310, Vigo, Spain
  - Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Vigo, Spain
- David Posada
  - CINBIO, Universidade de Vigo, 36310, Vigo, Spain
  - Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Vigo, Spain
  - Department of Biochemistry, Genetics, and Immunology, Universidade de Vigo, 36310, Vigo, Spain
- Todd J Treangen
  - Department of Computer Science, Rice University, Houston, TX, 77005, USA
- Lauren B Stadler
  - Department of Civil and Environmental Engineering, Rice University, Houston, TX, 77005, USA
2
Álvarez-Herrera M, Sevilla J, Ruiz-Rodriguez P, Vergara A, Vila J, Cano-Jiménez P, González-Candelas F, Comas I, Coscollá M. VIPERA: Viral Intra-Patient Evolution Reporting and Analysis. Virus Evol 2024; 10:veae018. [PMID: 38510921 PMCID: PMC10953798 DOI: 10.1093/ve/veae018] [Received: 11/23/2023] [Revised: 02/02/2024] [Accepted: 03/05/2024] [Indexed: 03/22/2024]
Abstract
Viral mutations within patients nurture the adaptive potential of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) during chronic infections, which are a potential source of variants of concern. However, there is no integrated framework for the evolutionary analysis of intra-patient SARS-CoV-2 serial samples. Herein, we describe Viral Intra-Patient Evolution Reporting and Analysis (VIPERA), a new software tool that integrates the evaluation of the intra-patient ancestry of SARS-CoV-2 sequences with the analysis of evolutionary trajectories of serial sequences from the same viral infection. We have validated it using positive and negative control datasets and have successfully applied it to a new case, which revealed population dynamics and evidence of adaptive evolution. VIPERA is available under a free software license at https://github.com/PathoGenOmics-Lab/VIPERA.
Affiliation(s)
- Miguel Álvarez-Herrera
  - Institute for Integrative Systems Biology (I2SysBio, University of Valencia—CSIC), FISABIO Joint Research Unit ‘Infection and Public Health’, C/Agustín Escardino, 9, Paterna 46980, Spain
- Jordi Sevilla
  - Institute for Integrative Systems Biology (I2SysBio, University of Valencia—CSIC), FISABIO Joint Research Unit ‘Infection and Public Health’, C/Agustín Escardino, 9, Paterna 46980, Spain
- Paula Ruiz-Rodriguez
  - Institute for Integrative Systems Biology (I2SysBio, University of Valencia—CSIC), FISABIO Joint Research Unit ‘Infection and Public Health’, C/Agustín Escardino, 9, Paterna 46980, Spain
- Andrea Vergara
  - Department of Clinical Microbiology, CDB, Hospital Clínic of Barcelona; University of Barcelona; ISGlobal, C. de Villarroel, 170, Barcelona 08007, Spain
  - CIBER of Infectious Diseases (CIBERINFEC), Av. Monforte de Lemos, 3-5, Madrid 28029, Spain
- Jordi Vila
  - Department of Clinical Microbiology, CDB, Hospital Clínic of Barcelona; University of Barcelona; ISGlobal, C. de Villarroel, 170, Barcelona 08007, Spain
  - CIBER of Infectious Diseases (CIBERINFEC), Av. Monforte de Lemos, 3-5, Madrid 28029, Spain
- Pablo Cano-Jiménez
  - Institute of Biomedicine of Valencia (IBV-CSIC), C/ Jaime Roig, 11, Valencia 46010, Spain
- Fernando González-Candelas
  - Institute for Integrative Systems Biology (I2SysBio, University of Valencia—CSIC), FISABIO Joint Research Unit ‘Infection and Public Health’, C/Agustín Escardino, 9, Paterna 46980, Spain
  - CIBER of Epidemiology and Public Health (CIBERESP), Av. Monforte de Lemos, 3-5, Madrid 28029, Spain
- Iñaki Comas
  - Institute of Biomedicine of Valencia (IBV-CSIC), C/ Jaime Roig, 11, Valencia 46010, Spain
  - CIBER of Epidemiology and Public Health (CIBERESP), Av. Monforte de Lemos, 3-5, Madrid 28029, Spain
- Mireia Coscollá
  - Institute for Integrative Systems Biology (I2SysBio, University of Valencia—CSIC), FISABIO Joint Research Unit ‘Infection and Public Health’, C/Agustín Escardino, 9, Paterna 46980, Spain
3
Carson J, Keeling M, Wyllie D, Ribeca P, Didelot X. Inference of Infectious Disease Transmission through a Relaxed Bottleneck Using Multiple Genomes Per Host. Mol Biol Evol 2024; 41:msad288. [PMID: 38168711 PMCID: PMC10798190 DOI: 10.1093/molbev/msad288] [Received: 07/28/2023] [Revised: 12/21/2023] [Accepted: 12/29/2023] [Indexed: 01/05/2024]
Abstract
In recent times, pathogen genome sequencing has become increasingly used to investigate infectious disease outbreaks. When genomic data is sampled densely enough amongst infected individuals, it can help resolve who infected whom. However, transmission analysis cannot rely solely on a phylogeny of the genomes but must account for the within-host evolution of the pathogen, which blurs the relationship between phylogenetic and transmission trees. When only a single genome is sampled for each host, the uncertainty about who infected whom can be quite high. Consequently, transmission analysis based on multiple genomes of the same pathogen per host has a clear potential for delivering more precise results, even though it is more laborious to achieve. Here, we present a new methodology that can use any number of genomes sampled from a set of individuals to reconstruct their transmission network. Furthermore, we remove the need for the assumption of a complete transmission bottleneck. We use simulated data to show that our method becomes more accurate as more genomes per host are provided, and that it can infer key infectious disease parameters such as the size of the transmission bottleneck, within-host growth rate, basic reproduction number, and sampling fraction. We demonstrate the usefulness of our method in applications to real datasets from an outbreak of Pseudomonas aeruginosa amongst cystic fibrosis patients and a nosocomial outbreak of Klebsiella pneumoniae.
Affiliation(s)
- Jake Carson
  - Mathematics Institute, University of Warwick, Coventry CV4 7AL, UK
  - School of Life Sciences, University of Warwick, Coventry CV4 7AL, UK
  - Zeeman Institute for Systems Biology and Infectious Disease Epidemiology Research (SBIDER), University of Warwick, Coventry CV4 7AL, UK
- Matt Keeling
  - Mathematics Institute, University of Warwick, Coventry CV4 7AL, UK
  - School of Life Sciences, University of Warwick, Coventry CV4 7AL, UK
  - Zeeman Institute for Systems Biology and Infectious Disease Epidemiology Research (SBIDER), University of Warwick, Coventry CV4 7AL, UK
- Xavier Didelot
  - School of Life Sciences, University of Warwick, Coventry CV4 7AL, UK
  - Zeeman Institute for Systems Biology and Infectious Disease Epidemiology Research (SBIDER), University of Warwick, Coventry CV4 7AL, UK
  - Department of Statistics, University of Warwick, Coventry CV4 7AL, UK
4
Mushegian AA, Long SW, Olsen RJ, Christensen PA, Subedi S, Chung M, Davis J, Musser J, Ghedin E. Within-host genetic diversity of SARS-CoV-2 in the context of large-scale hospital-associated genomic surveillance. medRxiv 2022:2022.08.17.22278898. [PMID: 36032964 PMCID: PMC9413716 DOI: 10.1101/2022.08.17.22278898] [Indexed: 11/25/2022]
Abstract
The COVID-19 pandemic has resulted in extensive surveillance of the genomic diversity of SARS-CoV-2. Sequencing data generated as part of these efforts can also capture the diversity of the SARS-CoV-2 virus populations replicating within infected individuals. To assess this within-host diversity of SARS-CoV-2 we quantified low frequency (minor) variants from deep sequence data of thousands of clinical samples collected by a large urban hospital system over the course of a year. Using a robust analytical pipeline to control for technical artefacts, we observe that at comparable viral loads, specimens from patients hospitalized due to COVID-19 had a greater number of minor variants than samples from outpatients. Since individuals with highly diverse viral populations could be disproportionate drivers of new viral lineages in the patient population, these results suggest that transmission control should pay special attention to patients with severe or protracted disease to prevent the spread of novel variants.
Affiliation(s)
- Alexandra A. Mushegian
  - Systems Genomics Section, Laboratory of Parasitic Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
- Scott W. Long
  - Laboratory of Molecular and Translational Human Infectious Diseases Research, Center for Infectious Diseases, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, Texas, 77030
- Randall J. Olsen
  - Laboratory of Molecular and Translational Human Infectious Diseases Research, Center for Infectious Diseases, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, Texas, 77030
- Paul A. Christensen
  - Laboratory of Molecular and Translational Human Infectious Diseases Research, Center for Infectious Diseases, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, Texas, 77030
- Sishir Subedi
  - Laboratory of Molecular and Translational Human Infectious Diseases Research, Center for Infectious Diseases, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, Texas, 77030
- Matthew Chung
  - Systems Genomics Section, Laboratory of Parasitic Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
- James Davis
  - Division of Data Science and Learning, Argonne National Laboratory, 9700 S. Cass Ave., Lemont, Illinois, 60439
  - University of Chicago Consortium for Advanced Science and Engineering, 5801 South Ellis Avenue, Chicago, Illinois, 60637
- James Musser
  - Laboratory of Molecular and Translational Human Infectious Diseases Research, Center for Infectious Diseases, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, Texas, 77030
- Elodie Ghedin
  - Systems Genomics Section, Laboratory of Parasitic Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA