1
Liu Y, Sapoval N, Gallego-García P, Tomás L, Posada D, Treangen TJ, Stadler LB. Crykey: Rapid Identification of SARS-CoV-2 Cryptic Mutations in Wastewater. medRxiv: the preprint server for health sciences 2023:2023.06.16.23291524. [PMID: 37986916 PMCID: PMC10659477 DOI: 10.1101/2023.06.16.23291524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]
Abstract
We present Crykey, a computational tool for rapidly identifying cryptic mutations of SARS-CoV-2. Specifically, we identify co-occurring single nucleotide mutations on the same sequencing read, called linked-read mutations, that are rare or entirely missing in existing databases, and have the potential to represent novel cryptic lineages found in wastewater. While previous approaches exist for identifying cryptic linked-read mutations from specific regions of the SARS-CoV-2 genome, there is a need for computational tools capable of efficiently tracking cryptic mutations across the entire genome and for tens of thousands of samples and with increased scrutiny, given their potential to represent either artifacts or hidden SARS-CoV-2 lineages. Crykey fills this gap by identifying rare linked-read mutations that pass stringent computational filters to limit the potential for artifacts. We evaluate the utility of Crykey on >3,000 wastewater and >22,000 clinical samples; our findings are three-fold: i) we identify hundreds of cryptic mutations that cover the entire SARS-CoV-2 genome, ii) we track the presence of these cryptic mutations across multiple wastewater treatment plants and over three years of sampling in Houston, and iii) we find a handful of cryptic mutations in wastewater mirror cryptic mutations in clinical samples and investigate their potential to represent real cryptic lineages. In summary, Crykey enables large-scale detection of cryptic mutations representing potential cryptic lineages in wastewater.
Affiliation(s)
- Yunxi Liu
  - Department of Computer Science, Rice University, Houston, TX, 77005, USA
- Nicolae Sapoval
  - Department of Computer Science, Rice University, Houston, TX, 77005, USA
- Pilar Gallego-García
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain
  - Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO
- Laura Tomás
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain
  - Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO
- David Posada
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain
  - Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO
  - Department of Biochemistry, Genetics, and Immunology, Universidade de Vigo, 36310 Vigo, Spain
- Todd J. Treangen
  - Department of Computer Science, Rice University, Houston, TX, 77005, USA
- Lauren B. Stadler
  - Department of Civil and Environmental Engineering, Rice University, Houston, TX, 77005, USA
2
Swift CL, Isanovic M, Correa Velez KE, Sellers SC, Norman RS. Wastewater surveillance of SARS-CoV-2 mutational profiles at a university and its surrounding community reveals a 20G outbreak on campus. PLoS One 2022; 17:e0266407. [PMID: 35421164 PMCID: PMC9009614 DOI: 10.1371/journal.pone.0266407] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/13/2021] [Accepted: 03/20/2022] [Indexed: 12/16/2022]
Abstract
Wastewater surveillance of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been leveraged during the Coronavirus Disease 2019 (COVID-19) pandemic as a public health tool at the community and building level. In this study, we compare the sequence diversity of SARS-CoV-2 amplified from wastewater influent to the Columbia, South Carolina, metropolitan wastewater treatment plant (WWTP) and the University of South Carolina campus during September 2020, which represents the peak of COVID-19 cases at the university during 2020. A total of 92 unique mutations were detected across all WWTP influent and campus samples, with the highest frequency mutations corresponding to the SARS-CoV-2 20C and 20G clades. Signature mutations for the 20G clade dominated SARS-CoV-2 sequences amplified from localized wastewater samples collected at the University of South Carolina, suggesting that the peak in COVID-19 cases during early September 2020 was caused by an outbreak of the 20G lineage. Thirteen mutations were shared between the university building-level wastewater samples and the WWTP influent collected in September 2020, 62% of which were nonsynonymous substitutions. Co-occurrence of mutations was used as a similarity metric to compare wastewater samples. Three pairs of mutations co-occurred in university wastewater and WWTP influent during September 2020. Thirty percent of the detected mutations, including 12 pairs of concurrent mutations, were only detected in university samples. This report affirms the close relationship between the prevalent SARS-CoV-2 genotypes of the student population at a university campus and those of the surrounding community. However, this study also suggests that wastewater surveillance at the building-level at a university offers important insight by capturing sequence diversity that was not apparent in the WWTP influent, thus offering a balance between the community-level wastewater and clinical sequencing.
Affiliation(s)
- Candice L. Swift
  - Department of Environmental Health Sciences, University of South Carolina, Columbia, SC, United States of America
- Mirza Isanovic
  - Department of Environmental Health Sciences, University of South Carolina, Columbia, SC, United States of America
- Karlen E. Correa Velez
  - Department of Environmental Health Sciences, University of South Carolina, Columbia, SC, United States of America
- Sarah C. Sellers
  - Department of Environmental Health Sciences, University of South Carolina, Columbia, SC, United States of America
- R. Sean Norman
  - Department of Environmental Health Sciences, University of South Carolina, Columbia, SC, United States of America
3
Swift CL, Isanovic M, Correa Velez KE, Norman RS. Community-level SARS-CoV-2 sequence diversity revealed by wastewater sampling. The Science of the Total Environment 2021; 801:149691. [PMID: 34438144 PMCID: PMC8372435 DOI: 10.1016/j.scitotenv.2021.149691] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 07/15/2021] [Revised: 08/11/2021] [Accepted: 08/11/2021] [Indexed: 05/20/2023]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus responsible for causing the COVID-19 pandemic, can be detected in untreated wastewater. Wastewater surveillance of SARS-CoV-2 complements clinical data by offering earlier community-level detection, removing underlying factors such as access to healthcare, sampling asymptomatic patients, and reaching a greater population. Here, we compare 24-hour composite samples from the influents of two different wastewater treatment plants (WWTPs) in South Carolina, USA: Columbia and Rock Hill. The sampling intervals span the months of July 2020 and January 2021, which cover the first and second waves of elevated SARS-CoV-2 transmission and COVID-19 clinical cases in these regions. We identify four signature mutations in the surface glycoprotein (spike) gene that are associated with the following variants of interest (VOI) or concern (VOC), listed in parentheses: S477N (B.1.526, Iota), T478K (B.1.617.2, Delta), D614G (present in all VOC as of May 2021), and H655Y (P.1, Gamma). The N501Y mutation, which is associated with three variants of concern, was identified in samples from July 2020, but not detected in January 2021 samples. Comparison of mutations identified in viral sequence databases such as NCBI Virus and GISAID indicated that wastewater sampling detected mutations that were present in South Carolina, but not reflected in the clinical data deposited into databases.
Affiliation(s)
- Candice L Swift
  - Department of Environmental Health Sciences, University of South Carolina, USA
- Mirza Isanovic
  - Department of Environmental Health Sciences, University of South Carolina, USA
- R Sean Norman
  - Department of Environmental Health Sciences, University of South Carolina, USA
4
Olesen SW, Imakaev M, Duvallet C. Making waves: Defining the lead time of wastewater-based epidemiology for COVID-19. Water Research 2021; 202:117433. [PMID: 34304074 PMCID: PMC8282235 DOI: 10.1016/j.watres.2021.117433] [Citation(s) in RCA: 83] [Impact Index Per Article: 20.8] [Received: 03/31/2021] [Revised: 06/24/2021] [Accepted: 07/09/2021] [Indexed: 05/19/2023]
Abstract
Individuals infected with SARS-CoV-2, the virus that causes COVID-19, may shed the virus in stool before developing symptoms, suggesting that measurements of SARS-CoV-2 concentrations in wastewater could be a "leading indicator" of COVID-19 prevalence. Multiple studies have corroborated the leading indicator concept by showing that the correlation between wastewater measurements and COVID-19 case counts is maximized when case counts are lagged. However, the meaning of "leading indicator" will depend on the specific application of wastewater-based epidemiology, and the correlation analysis is not relevant for all applications. In fact, the quantification of a leading indicator will depend on epidemiological, biological, and health systems factors. Thus, there is no single "lead time" for wastewater-based COVID-19 monitoring. To illustrate this complexity, we enumerate three different applications of wastewater-based epidemiology for COVID-19: a qualitative "early warning" system; an independent, quantitative estimate of disease prevalence; and a quantitative alert of bursts of disease incidence. The leading indicator concept has different definitions and utility in each application.
5
Greenwald HD, Kennedy LC, Hinkle A, Whitney ON, Fan VB, Crits-Christoph A, Harris-Lovett S, Flamholz AI, Al-Shayeb B, Liao LD, Beyers M, Brown D, Chakrabarti AR, Dow J, Frost D, Koekemoer M, Lynch C, Sarkar P, White E, Kantor R, Nelson KL. Tools for interpretation of wastewater SARS-CoV-2 temporal and spatial trends demonstrated with data collected in the San Francisco Bay Area. Water Research X 2021; 12:100111. [PMID: 34373850 DOI: 10.1101/2021.05.04.21256418] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/30/2021] [Revised: 06/30/2021] [Accepted: 07/25/2021] [Indexed: 05/26/2023]
Abstract
Wastewater surveillance for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) RNA can be integrated with COVID-19 case data to inform timely pandemic response. However, more research is needed to apply and develop systematic methods to interpret the true SARS-CoV-2 signal from noise introduced in wastewater samples (e.g., from sewer conditions, sampling and extraction methods, etc.). In this study, raw wastewater was collected weekly from five sewersheds and one residential facility. The concentrations of SARS-CoV-2 in wastewater samples were compared to geocoded COVID-19 clinical testing data. SARS-CoV-2 was reliably detected (95% positivity) in frozen wastewater samples when reported daily new COVID-19 cases were 2.4 or more per 100,000 people. To adjust for variation in sample fecal content, four normalization biomarkers were evaluated: crAssphage, pepper mild mottle virus, Bacteroides ribosomal RNA (rRNA), and human 18S rRNA. Of these, crAssphage displayed the least spatial and temporal variability. Both unnormalized SARS-CoV-2 RNA signal and signal normalized to crAssphage had positive and significant correlation with clinical testing data (Kendall's Tau-b (τ)=0.43 and 0.38, respectively), but no normalization biomarker strengthened the correlation with clinical testing data. Locational dependencies and the date associated with testing data impacted the lead time of wastewater for clinical trends, and no lead time was observed when the sample collection date (versus the result date) was used for both wastewater and clinical testing data. This study supports that trends in wastewater surveillance data reflect trends in COVID-19 disease occurrence and presents tools that could be applied to make wastewater signal more interpretable and comparable across studies.
Affiliation(s)
- Hannah D Greenwald
  - Department of Civil and Environmental Engineering, University of California, Berkeley, CA, USA
  - Berkeley Water Center, University of California, Berkeley, CA, USA
- Lauren C Kennedy
  - Department of Civil and Environmental Engineering, University of California, Berkeley, CA, USA
  - Berkeley Water Center, University of California, Berkeley, CA, USA
- Adrian Hinkle
  - Department of Civil and Environmental Engineering, University of California, Berkeley, CA, USA
  - Berkeley Water Center, University of California, Berkeley, CA, USA
- Oscar N Whitney
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, USA
- Vinson B Fan
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, USA
- Alexander Crits-Christoph
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
  - Innovative Genomics Institute, Berkeley, CA, USA
- Avi I Flamholz
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, USA
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Basem Al-Shayeb
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
  - Innovative Genomics Institute, Berkeley, CA, USA
- Lauren D Liao
  - School of Public Health, University of California, Berkeley, CA, USA
- Matt Beyers
  - Alameda County Public Health Department, San Leandro, CA, USA
- Jason Dow
  - Central Marin Sanitation Agency, San Rafael, CA, USA
- Dan Frost
  - Central Contra Costa Sanitary District, Martinez, CA, USA
- Chris Lynch
  - Contra Costa Health Services, Martinez, CA, USA
- Payal Sarkar
  - San José-Santa Clara Regional Wastewater Facility, San José, CA, USA
- Eileen White
  - East Bay Municipal Utility District, Oakland, CA, USA
- Rose Kantor
  - Department of Civil and Environmental Engineering, University of California, Berkeley, CA, USA
  - Berkeley Water Center, University of California, Berkeley, CA, USA
- Kara L Nelson
  - Department of Civil and Environmental Engineering, University of California, Berkeley, CA, USA
  - Berkeley Water Center, University of California, Berkeley, CA, USA
  - Innovative Genomics Institute, Berkeley, CA, USA
6
Greenwald HD, Kennedy LC, Hinkle A, Whitney ON, Fan VB, Crits-Christoph A, Harris-Lovett S, Flamholz AI, Al-Shayeb B, Liao LD, Beyers M, Brown D, Chakrabarti AR, Dow J, Frost D, Koekemoer M, Lynch C, Sarkar P, White E, Kantor R, Nelson KL. Tools for interpretation of wastewater SARS-CoV-2 temporal and spatial trends demonstrated with data collected in the San Francisco Bay Area. Water Research X 2021; 12:100111. [PMID: 34373850 PMCID: PMC8325558 DOI: 10.1016/j.wroa.2021.100111] [Citation(s) in RCA: 69] [Impact Index Per Article: 17.3] [Received: 04/30/2021] [Revised: 06/30/2021] [Accepted: 07/25/2021] [Indexed: 05/18/2023]