1
|
Suhail Y, Afzal J, Kshitiz. Incorporating and addressing testing bias within estimates of epidemic dynamics for SARS-CoV-2. BMC Med Res Methodol 2021; 21:11. [PMID: 33413154 PMCID: PMC7789897 DOI: 10.1186/s12874-020-01196-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 07/28/2020] [Accepted: 12/18/2020] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND The disease burden of SARS-CoV-2, as measured by tests from various localities and at different time points, presents varying estimates of infection and fatality rates. Models based on these acquired data may suffer from systematic errors and large estimation variances due to the biases associated with testing. Unbiased randomized testing to estimate the true fatality rate is still missing. METHODS Here, we characterize the effect of incidental sampling bias in the estimation of epidemic dynamics. To this end, we explicitly modeled sampling bias in an augmented compartment model to predict epidemic dynamics. We further calculate the bias from differences in disease prediction between biased and randomized sampling, proposing a strategy to obtain unbiased estimates. RESULTS Our simulations demonstrate that sampling biases in favor of patients with higher disease manifestation could significantly affect direct estimates of infection and fatality rates calculated from the numbers of confirmed cases and deaths, and that serological testing can partially mitigate these biased estimates. CONCLUSIONS The augmented compartmental model allows the explicit modeling of different testing policies and their effects on disease estimates. Our calculations of the dependence of expected confidence on randomized sample size show that relatively small sample sizes can provide statistically significant estimates for SARS-CoV-2 related death rates.
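The closing point of the abstract, that a modest randomized sample can already pin down a rate, can be illustrated with the textbook normal-approximation formula for estimating a proportion. This is a minimal sketch under stated assumptions and not the authors' calculation: the function name `sample_size_for_proportion`, the 95% z-value, and the example rates and margins are all hypothetical choices made for illustration.

```python
import math


def sample_size_for_proportion(p: float, margin: float, z: float = 1.96) -> int:
    """Smallest n whose normal-approximation confidence interval for a
    proportion near p has half-width at most `margin`.

    Assumes simple random sampling and n large enough for the normal
    approximation to hold; z = 1.96 corresponds to roughly 95% confidence.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if margin <= 0.0:
        raise ValueError("margin must be positive")
    return math.ceil(z * z * p * (1.0 - p) / (margin * margin))


# Worst case (p = 0.5) with a +/- 5 percentage point margin.
print(sample_size_for_proportion(0.5, 0.05))    # 385
# Hypothetical 1% rate estimated to within +/- 0.5 percentage points.
print(sample_size_for_proportion(0.01, 0.005))  # 1522
```

Because the required n scales with p(1 - p), rare outcomes need fewer samples for a given absolute margin, although the margin itself usually has to be tighter for the estimate to be useful.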
Affiliation(s)
- Yasir Suhail
  - Department of Biomedical Engineering, University of Connecticut Health, Farmington, CT, USA.
  - Center for Cancer Systems Biology @ Yale, West Haven, CT, USA.
- Junaid Afzal
  - Department of Medicine, University of California, San Francisco, CA, USA
- Kshitiz
  - Department of Biomedical Engineering, University of Connecticut Health, Farmington, CT, USA.
  - Center for Cancer Systems Biology @ Yale, West Haven, CT, USA.
|
2
|
Pérez-Losada M, Arenas M, Galán JC, Bracho MA, Hillung J, García-González N, González-Candelas F. High-throughput sequencing (HTS) for the analysis of viral populations. Infection, Genetics and Evolution 2020; 80:104208. [PMID: 32001386 DOI: 10.1016/j.meegid.2020.104208] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 09/30/2019] [Revised: 01/21/2020] [Accepted: 01/24/2020] [Indexed: 12/12/2022]
Abstract
The development of High-Throughput Sequencing (HTS) technologies is having a major impact on the genomic analysis of viral populations. Current HTS platforms can capture nucleic acid variation across millions of genes for both selected amplicons and full viral genomes. HTS has already facilitated the discovery of new viruses, hinted at new taxonomic classifications and provided a deeper and broader understanding of their diversity, population and genetic structure. Hence, HTS has already replaced standard Sanger sequencing in basic and applied research fields, but the next step is its implementation as a routine technology for the analysis of viruses in clinical settings. The most likely application of this implementation will be the analysis of viral genomics, because the huge population sizes, high mutation rates and very fast replacement of viral populations have demonstrated the limited information obtained with Sanger technology. In this review, we describe new technologies and provide guidelines for the high-throughput sequencing and genetic and evolutionary analyses of viral populations and metaviromes, including software applications. With the development of new HTS technologies, new and refurbished molecular and bioinformatic tools are also constantly being developed to process and integrate HTS data. These allow assembling viral genomes and inferring viral population diversity and dynamics. Finally, we also present several applications of these approaches to the analysis of viral clinical samples, including transmission clusters and outbreak characterization.
Affiliation(s)
- Marcos Pérez-Losada
  - Computational Biology Institute, Milken Institute School of Public Health, George Washington University, Washington, DC, USA; CIBIO-InBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Campus Agrário de Vairão, Vairão 4485-661, Portugal
- Miguel Arenas
  - Department of Biochemistry, Genetics and Immunology, University of Vigo, 36310 Vigo, Spain; Biomedical Research Center (CINBIO), University of Vigo, 36310 Vigo, Spain.
- Juan Carlos Galán
  - Microbiology Service, Hospital Ramón y Cajal, Madrid, Spain; CIBER in Epidemiology and Public Health, Spain.
- Mª Alma Bracho
  - CIBER in Epidemiology and Public Health, Spain; Joint Research Unit "Infection and Public Health" FISABIO-University of Valencia, Valencia, Spain.
- Julia Hillung
  - Joint Research Unit "Infection and Public Health" FISABIO-University of Valencia, Valencia, Spain; Institute for Integrative Systems Biology (I2SysBio), CSIC-University of Valencia, Valencia, Spain.
- Neris García-González
  - Joint Research Unit "Infection and Public Health" FISABIO-University of Valencia, Valencia, Spain; Institute for Integrative Systems Biology (I2SysBio), CSIC-University of Valencia, Valencia, Spain.
- Fernando González-Candelas
  - CIBER in Epidemiology and Public Health, Spain; Joint Research Unit "Infection and Public Health" FISABIO-University of Valencia, Valencia, Spain; Institute for Integrative Systems Biology (I2SysBio), CSIC-University of Valencia, Valencia, Spain.
|
3
|
Zhernakov AI, Afonin AM, Gavriliuk ND, Moiseeva OM, Zhukov VA. s-dePooler: determination of polymorphism carriers from overlapping DNA pools. BMC Bioinformatics 2019; 20:45. [PMID: 30669964 PMCID: PMC6343301 DOI: 10.1186/s12859-019-2616-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/08/2018] [Accepted: 01/09/2019] [Indexed: 11/26/2022] Open
Abstract
Background Sample pooling is a method widely used in studies to reduce costs and labour. DNA sample pooling combined with massively parallel sequencing is a powerful tool for discovering DNA variants (polymorphisms) in large populations under analysis, which is the basis of such research fields as Genome-Wide Association Studies, evolutionary and population studies, etc. Use of overlapping pools, where each sample is present in multiple pools, can enhance the accuracy of polymorphism detection and allow carriers of rare variants to be identified. Surprisingly, there is a lack of tools for result interpretation and carrier identification, i.e. for “depooling”. Results Here we present s-dePooler, an application for the analysis of pooling experiment data. s-dePooler uses the variant information (VCF file) and the pooling scheme to produce a list of candidate carriers for each polymorphism. We incorporated s-dePooler into a pipeline (dePoP) for automation of pooling analysis. The performance of the pipeline was tested on a synthetic dataset built using the 1000 Genomes Project data, resulting in the successful identification of 97% of carriers of polymorphisms present in fewer than ~ 10% of carriers. Conclusions s-dePooler along with dePoP can be used to identify carriers of polymorphisms in overlapping pools, and is compatible with any pooling scheme with equivalent molar ratios of pooled samples. s-dePooler and dePoP with usage instructions and test data are freely available at https://github.com/lab9arriam/depop. Electronic supplementary material The online version of this article (10.1186/s12859-019-2616-9) contains supplementary material, which is available to authorized users.
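The depooling idea in this abstract, combining per-pool variant calls with the pooling scheme to list candidate carriers, can be sketched as a set-inclusion check. This is a simplified illustration under stated assumptions and not the s-dePooler implementation: the function `candidate_carriers`, the 2x2 row/column scheme, and the sample and pool names are hypothetical, and variant calls are assumed to be error-free presence/absence per pool.

```python
from typing import Dict, List, Set


def candidate_carriers(scheme: Dict[str, Set[str]],
                       positive_pools: Set[str]) -> List[str]:
    """Return samples that could carry a variant.

    scheme maps each sample to the set of pools it was placed in.
    positive_pools is the set of pools in which the variant was called.
    A sample is a candidate only if every pool containing it is positive.
    """
    return sorted(
        sample
        for sample, pools in scheme.items()
        if pools and pools <= positive_pools
    )


# Hypothetical 2x2 grid: each sample sits in one row pool and one column pool.
scheme = {
    "S1": {"row1", "col1"},
    "S2": {"row1", "col2"},
    "S3": {"row2", "col1"},
    "S4": {"row2", "col2"},
}

# A variant seen only in row1 and col2 points to a single sample.
print(candidate_carriers(scheme, {"row1", "col2"}))  # ['S2']

# A variant seen in every pool cannot be resolved with this scheme.
print(candidate_carriers(scheme, {"row1", "row2", "col1", "col2"}))
# ['S1', 'S2', 'S3', 'S4']
```

The second call shows why the output is a list of candidates rather than confirmed carriers: when several samples carry the same variant, the positive pools can be consistent with more samples than actually carry it.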
Affiliation(s)
- Aleksandr Igorevich Zhernakov
  - Research Department of Non-Coronary Heart Diseases, Almazov National Medical Research Center, Ministry of Health of Russia, 2 Akkuratova St., St. Petersburg, 197341, Russia
  - All-Russia Research Institute for Agricultural Microbiology (ARRIAM), 3 Podbelsky Ch., St. Petersburg - Pushkin, 196608, Russia.
- Alexey Mikhailovich Afonin
  - All-Russia Research Institute for Agricultural Microbiology (ARRIAM), 3 Podbelsky Ch., St. Petersburg - Pushkin, 196608, Russia
- Natalia Dmitrievna Gavriliuk
  - Research Department of Non-Coronary Heart Diseases, Almazov National Medical Research Center, Ministry of Health of Russia, 2 Akkuratova St., St. Petersburg, 197341, Russia
- Olga Mikhailovna Moiseeva
  - Research Department of Non-Coronary Heart Diseases, Almazov National Medical Research Center, Ministry of Health of Russia, 2 Akkuratova St., St. Petersburg, 197341, Russia
- Vladimir Aleksandrovich Zhukov
  - Research Department of Non-Coronary Heart Diseases, Almazov National Medical Research Center, Ministry of Health of Russia, 2 Akkuratova St., St. Petersburg, 197341, Russia
  - All-Russia Research Institute for Agricultural Microbiology (ARRIAM), 3 Podbelsky Ch., St. Petersburg - Pushkin, 196608, Russia
|
4
|
Gupta N, Zahra S, Singh A, Kumar S. PVsiRNAdb: a database for plant exclusive virus-derived small interfering RNAs. Database: The Journal of Biological Databases and Curation 2018; 2018:5126495. [PMID: 30307523 PMCID: PMC6181178 DOI: 10.1093/database/bay105] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 07/16/2018] [Accepted: 09/14/2018] [Indexed: 11/13/2022]
Abstract
The ribonucleic acid (RNA) interference mechanism has been proved to be an important regulator of both transcriptional and post-transcriptional control of gene expression during biotic and abiotic stresses in plants. Virus-derived small interfering RNAs (vsiRNAs) are established components of the RNA silencing mechanism for conferring anti-viral resistance in plants. Some databases, such as siRNAdb, HIVsirDB and VIRsiRNAdb, are available online pertaining to siRNAs as well as vsiRNAs generated during viral infection in humans; however, there is currently no repository for plant-exclusive vsiRNAs. We have developed PVsiRNAdb (http://www.nipgr.res.in/PVsiRNAdb), a manually curated plant-exclusive database harboring information related to vsiRNAs found in different virus-infected plants, collected by exhaustive data mining of the literature published so far. This database contains a total of 322 214 entries and 282 549 unique sequences of vsiRNAs. In PVsiRNAdb, detailed and comprehensive information is available for each vsiRNA sequence. Apart from the core information consisting of plant, tissue, virus name and vsiRNA sequence, additional information on each vsiRNA (map position, length, coordinates, strand information and predicted structure) may be of high utility to the user. Different types of search and browse modules, with three tools, namely BLAST, Smith-Waterman Align and Mapping, are provided at PVsiRNAdb. Thus, this database, being one of its kind, is expected to be of much use to molecular biologists for exploring complex viral genetics and genomics and virus-host interactions, and can prove very advantageous in agriculture for producing virus-resistant transgenic crops.
Affiliation(s)
- Nikita Gupta
  - Bioinformatics Laboratory, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, India
- Shafaque Zahra
  - Bioinformatics Laboratory, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, India
- Ajeet Singh
  - Bioinformatics Laboratory, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, India
- Shailesh Kumar
  - Bioinformatics Laboratory, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, India
|
5
|
Glebova O, Knyazev S, Melnyk A, Artyomenko A, Khudyakov Y, Zelikovsky A, Skums P. Inference of genetic relatedness between viral quasispecies from sequencing data. BMC Genomics 2017; 18:918. [PMID: 29244009 PMCID: PMC5731608 DOI: 10.1186/s12864-017-4274-5] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND RNA viruses such as HCV and HIV mutate at extremely high rates, and as a result, they exist in infected hosts as populations of genetically related variants. Recent advances in sequencing technologies make it possible to identify such populations at great depth. In particular, these technologies provide new opportunities for inference of relatedness between viral samples and identification of transmission clusters and sources of infection, which are crucial tasks for viral outbreak investigations. RESULTS We present (i) an evolutionary simulation algorithm, Viral Outbreak InferenCE (VOICE), inferring genetic relatedness, (ii) an algorithm, MinDistB, detecting possible transmission using minimal distances between intra-host viral populations and sizes of their relative borders, and (iii) a non-parametric recursive clustering algorithm, Relatedness Depth (ReD), analyzing clusters' structure to infer possible transmissions and their directions. All proposed algorithms were validated using real sequencing data from HCV outbreaks. CONCLUSIONS All algorithms are applicable to the analysis of outbreaks of highly heterogeneous RNA viruses. Our experimental validation shows that they can successfully identify genetic relatedness between viral populations, as well as infer transmission clusters and outbreak sources.
Affiliation(s)
- Olga Glebova
  - Computer Science Department, Georgia State University, 25 Park Place NE, Atlanta, 30303, GA, USA.
- Sergey Knyazev
  - Computer Science Department, Georgia State University, 25 Park Place NE, Atlanta, 30303, GA, USA
- Andrew Melnyk
  - Computer Science Department, Georgia State University, 25 Park Place NE, Atlanta, 30303, GA, USA
- Alexander Artyomenko
  - Computer Science Department, Georgia State University, 25 Park Place NE, Atlanta, 30303, GA, USA
- Yury Khudyakov
  - Centers for Disease Control and Prevention, 1600 Clifton Rd, Atlanta, 30329, GA, USA
- Alex Zelikovsky
  - Computer Science Department, Georgia State University, 25 Park Place NE, Atlanta, 30303, GA, USA
- Pavel Skums
  - Computer Science Department, Georgia State University, 25 Park Place NE, Atlanta, 30303, GA, USA
  - Centers for Disease Control and Prevention, 1600 Clifton Rd, Atlanta, 30329, GA, USA
|
6
|
Artyomenko A, Wu NC, Mangul S, Eskin E, Sun R, Zelikovsky A. Long Single-Molecule Reads Can Resolve the Complexity of the Influenza Virus Composed of Rare, Closely Related Mutant Variants. J Comput Biol 2016; 24:558-570. [PMID: 27901586 DOI: 10.1089/cmb.2016.0146] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 12/14/2022] Open
Abstract
As a result of a high rate of mutations and recombination events, an RNA virus exists as a heterogeneous "swarm" of mutant variants. The long read length offered by single-molecule sequencing technologies allows each mutant variant to be sequenced in a single pass. However, the high error rate limits the ability to reconstruct heterogeneous viral populations composed of rare, related mutant variants. In this article, we present two single-nucleotide variants (2SNV), a method able to tolerate the high error rate of the single-molecule protocol and reconstruct mutant variants. 2SNV uses linkage between single-nucleotide variations to efficiently distinguish them from read errors. To benchmark the sensitivity of 2SNV, we performed a single-molecule sequencing experiment on a sample containing a titrated level of known viral mutant variants. Our method is able to accurately reconstruct clones with a frequency of 0.2% and to distinguish clones that differ in only two nucleotides distantly located on the genome. 2SNV outperforms existing methods for full-length viral mutant reconstruction.
Affiliation(s)
- Nicholas C Wu
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, California
- Serghei Mangul
  - Department of Computer Science, University of California, Los Angeles, Los Angeles, California
  - Institute for Quantitative and Computational Biosciences, University of California Los Angeles, Los Angeles, California
- Eleazar Eskin
  - Department of Computer Science, University of California, Los Angeles, Los Angeles, California
- Ren Sun
  - Molecular and Medical Pharmacology, University of California, Los Angeles, Los Angeles, California
- Alex Zelikovsky
  - Department of Computer Science, Georgia State University, Atlanta, Georgia
|
7
|
Skums P, Artyomenko A, Glebova O, Ramachandran S, Campo DS, Dimitrova Z, Măndoiu II, Zelikovsky A, Khudyakov Y. Pooling Strategy for Massive Viral Sequencing. Computational Methods for Next Generation Sequencing Data Analysis 2016:57-83. [DOI: 10.1002/9781119272182.ch3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/02/2025]
|
8
|
|
9
|
Long Single-Molecule Reads Can Resolve the Complexity of the Influenza Virus Composed of Rare, Closely Related Mutant Variants. Lecture Notes in Computer Science 2016. [DOI: 10.1007/978-3-319-31957-5_12] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/17/2022]
|
10
|
Zhang C, Wu Z, Li Y, Wu J. Biogenesis, Function, and Applications of Virus-Derived Small RNAs in Plants. Front Microbiol 2015; 6:1237. [PMID: 26617580 PMCID: PMC4637412 DOI: 10.3389/fmicb.2015.01237] [Citation(s) in RCA: 84] [Impact Index Per Article: 8.4] [Received: 07/30/2015] [Accepted: 10/26/2015] [Indexed: 11/13/2022] Open
Abstract
RNA silencing, an evolutionarily conserved and sequence-specific gene-inactivation system, has a pivotal role in antiviral defense in most eukaryotic organisms. In plants, a class of exogenous small RNAs (sRNAs) originating from the infecting virus called virus-derived small interfering RNAs (vsiRNAs) are predominantly responsible for RNA silencing-mediated antiviral immunity. Nowadays, the process of vsiRNA formation and the role of vsiRNAs in plant viral defense have been revealed through deep sequencing of sRNAs and diverse genetic analysis. The biogenesis of vsiRNAs is analogous to that of endogenous sRNAs, which require diverse essential components including dicer-like (DCL), argonaute (AGO), and RNA-dependent RNA polymerase (RDR) proteins. vsiRNAs trigger antiviral defense through post-transcriptional gene silencing (PTGS) or transcriptional gene silencing (TGS) of viral RNA, and they hijack the host RNA silencing system to target complementary host transcripts. Additionally, several applications that take advantage of the current knowledge of vsiRNAs research are being used, such as breeding antiviral plants through genetic engineering technology, reconstructing of viral genomes, and surveying viral ecology and populations. Here, we will provide an overview of vsiRNA pathways, with a primary focus on the advances in vsiRNA biogenesis and function, and discuss their potential applications as well as the future challenges in vsiRNAs research.
Affiliation(s)
- Chao Zhang
  - Key Laboratory of Plant Virology of Fujian Province, Institute of Plant Virology, Fujian Agriculture and Forestry University, Fuzhou, China
- Zujian Wu
  - Key Laboratory of Plant Virology of Fujian Province, Institute of Plant Virology, Fujian Agriculture and Forestry University, Fuzhou, China
- Yi Li
  - Peking-Yale Joint Center for Plant Molecular Genetics and Agrobiotechnology, The National Laboratory of Protein Engineering and Plant Genetic Engineering, College of Life Sciences, Peking University, Beijing, China
- Jianguo Wu
  - Key Laboratory of Plant Virology of Fujian Province, Institute of Plant Virology, Fujian Agriculture and Forestry University, Fuzhou, China; Peking-Yale Joint Center for Plant Molecular Genetics and Agrobiotechnology, The National Laboratory of Protein Engineering and Plant Genetic Engineering, College of Life Sciences, Peking University, Beijing, China
|
11
|
Ma J, Pallett D, Jiang H, Hou Y, Wang H. Mutational bias of Turnip Yellow Mosaic Virus in the context of host anti-viral gene silencing. Virology 2015; 486:2-6. [PMID: 26379088 DOI: 10.1016/j.virol.2015.08.024] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/04/2015] [Revised: 05/01/2015] [Accepted: 08/21/2015] [Indexed: 01/04/2023]
Abstract
Plant Dicer-like (DCL) enzymes exhibit a GC-preference during anti-viral post-transcriptional gene silencing (PTGS), delivering an evolutionary selection pressure resulting in plant viruses with GC-poor genomes. However, some viruses, e.g. Turnip Yellow Mosaic Virus (TYMV, genus Tymovirus), have GC-rich genomes, raising the question of whether DCL-derived selection pressure affects these viruses. In this study, analysis of the virus-derived small interfering RNAs from TYMV-infected leaves of Brassica juncea showed that the TYMV population accumulated a mutational bias with AU replacing GC (GC-AU), demonstrating PTGS pressure. Interestingly, at the highly polymorphic sites the GC-AU bias was no longer observed. This suggests the presence of an unknown mechanism preventing mutational drift of the viral population and maintaining viral genome stability, despite the host PTGS pressure.
Affiliation(s)
- Jinmin Ma
  - BGI-shenzhen, Beishan Road, Yantian, Shenzhen 518083, China
- Denise Pallett
  - NERC/Centre for Ecology & Hydrology, Benson Lane, Wallingford, Oxfordshire OX10 8BB, UK
- Hui Jiang
  - BGI-shenzhen, Beishan Road, Yantian, Shenzhen 518083, China
- Yong Hou
  - BGI-shenzhen, Beishan Road, Yantian, Shenzhen 518083, China
- Hui Wang
  - BGI-shenzhen, Beishan Road, Yantian, Shenzhen 518083, China; NERC/Centre for Ecology & Hydrology, Benson Lane, Wallingford, Oxfordshire OX10 8BB, UK; Department of Zoology, University of Oxford, South Parks Road, Oxford OX1 3PS, UK.
|