1
|
Tachbele E, Kyobe S, Katabazi FA, Kigozi E, Mwesigwa S, Joloba M, Messele A, Amogne W, Legesse M, Pieper R, Ameni G. Genetic Diversity and Acquired Drug Resistance Mutations Detected by Deep Sequencing in Virologic Failures among Antiretroviral Treatment Experienced Human Immunodeficiency Virus-1 Patients in a Pastoralist Region of Ethiopia. Infect Drug Resist 2021; 14:4833-4847. [PMID: 34819737 PMCID: PMC8607991 DOI: 10.2147/idr.s337485] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/02/2021] [Accepted: 11/03/2021] [Indexed: 01/15/2023] Open
Abstract
Purpose This study investigated the drug resistance mutations and genetic diversity of HIV-1 in ART-experienced patients in South Omo, Ethiopia. Patients and Methods A cross-sectional study was conducted on 253 adult patients who had attended ART clinics for ≥6 months in South Omo. Samples with a viral load (VL) ≥1000 copies/mL were considered virological failures (VF), and codons 90–234 of their reverse transcriptase gene were sequenced using Illumina MiSeq. MinVar was used to identify subtypes and drug resistance mutations. A phylogenetic tree was constructed by the neighbor-joining method using the maximum likelihood model. Results The median duration of ART was 51 months and 18.6% (47/253) of the patients exhibited VF. Of 47 viraemic patients, the viral genomes of 41 were sequenced; subtype C was dominant (87.8%), followed by recombinant subtype BC (4.9%), M-09-CPX (4.9%) and BF1 (2.4%). Of 41 genotyped subjects, 85.4% (35/41) had at least one acquired drug resistance (ADR) mutation. Eighty-one percent (33/41) of viraemic patients harbored NRTI resistance mutations, and 48.8% (20/41) were positive for NNRTI resistance mutations, with 43.9% having dual resistance mutations. Among NRTI resistance mutations, M184V (73.2%), K219Q (63.4%) and the T215 complex (56.1%) were the most common, while the most common NNRTI resistance mutations were K103N (24.4%) and K101E, P225H and V108I (7.5% each). Active tuberculosis (aOR=13, 95% CI= 3.46–29.69), immunological failure (aOR=3.61, 95% CI=1.26–10.39), opportunistic infections (aOR=8.39, 95% CI= 1.75–40.19), and poor adherence were significantly associated with virological failure, while rural residence (aOR 2.37; 95% CI: 1.62–9.10, P= 0.05), immunological failures (aOR 2.37; 95% CI: 1.62–9.10, P= 0.05) and high viral load (aOR 16; 95% CI: 5.35–51.59, P <0.001) were predictors of ADR mutation among the ART-experienced and viraemic study subjects. Conclusion The study revealed a considerable prevalence of VF and ADR mutations, together with the associated risk indicators.
Regular virological monitoring and drug resistance genotyping should be implemented to improve ART treatment outcomes nationwide.
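The headline percentages in this abstract follow from the stated counts, and the failure definition is a plain threshold on viral load. The short Python sketch below recomputes them as a consistency check; the function names are illustrative and are not taken from the study.

```python
def percent(numerator, denominator, digits=1):
    """Return numerator/denominator as a percentage rounded to `digits` places."""
    return round(100.0 * numerator / denominator, digits)


def is_virological_failure(viral_load_copies_per_ml, threshold=1000):
    """Apply the abstract's definition: VL >= 1000 copies/mL counts as failure."""
    return viral_load_copies_per_ml >= threshold


# Counts as reported in the abstract: (numerator, denominator).
reported_counts = {
    "virological failure": (47, 253),
    "at least one ADR mutation": (35, 41),
    "NRTI resistance mutations": (33, 41),
    "NNRTI resistance mutations": (20, 41),
}

for label, (n, d) in reported_counts.items():
    print(f"{label}: {n}/{d} = {percent(n, d)}%")
```

Note that 33/41 evaluates to 80.5%, which the abstract rounds to eighty-one percent.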
Affiliation(s)
- Erdaw Tachbele
  - Aklilu Lemma Institute of Pathobiology, Addis Ababa University, Addis Ababa, Ethiopia; College of Health Sciences, Addis Ababa University, Addis Ababa, Ethiopia
- Samuel Kyobe
  - College of Health Sciences, Makerere University, Kampala, Uganda
- Edgar Kigozi
  - College of Health Sciences, Makerere University, Kampala, Uganda
- Moses Joloba
  - College of Health Sciences, Makerere University, Kampala, Uganda
- Alebachew Messele
  - Aklilu Lemma Institute of Pathobiology, Addis Ababa University, Addis Ababa, Ethiopia
- Wondwossen Amogne
  - College of Health Sciences, Addis Ababa University, Addis Ababa, Ethiopia
- Mengistu Legesse
  - Aklilu Lemma Institute of Pathobiology, Addis Ababa University, Addis Ababa, Ethiopia
- Gobena Ameni
  - Aklilu Lemma Institute of Pathobiology, Addis Ababa University, Addis Ababa, Ethiopia
|
2
|
Knyazev S, Hughes L, Skums P, Zelikovsky A. Epidemiological data analysis of viral quasispecies in the next-generation sequencing era. Brief Bioinform 2021; 22:96-108. [PMID: 32568371 PMCID: PMC8485218 DOI: 10.1093/bib/bbaa101] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 12/23/2019] [Revised: 04/24/2020] [Accepted: 05/04/2020] [Indexed: 01/04/2023] Open
Abstract
The unprecedented coverage offered by next-generation sequencing (NGS) technology has facilitated the assessment of the population complexity of intra-host RNA viral populations at an unprecedented level of detail. Consequently, analysis of NGS datasets could be used to extract and infer crucial epidemiological and biomedical information on the levels of both infected individuals and susceptible populations, thus enabling the development of more effective prevention strategies and antiviral therapeutics. Such information includes drug resistance, infection stage, transmission clusters and structures of transmission networks. However, NGS data require sophisticated analysis dealing with millions of error-prone short reads per patient. Prior to the NGS era, epidemiological and phylogenetic analyses were geared toward Sanger sequencing technology; now, they must be redesigned to handle the large-scale NGS datasets and properly model the evolution of heterogeneous rapidly mutating viral populations. Additionally, dedicated epidemiological surveillance systems require big data analytics to handle millions of reads obtained from thousands of patients for rapid outbreak investigation and management. We survey bioinformatics tools analyzing NGS data for (i) characterization of intra-host viral population complexity including single nucleotide variant and haplotype calling; (ii) downstream epidemiological analysis and inference of drug-resistant mutations, age of infection and linkage between patients; and (iii) data collection and analytics in surveillance systems for fast response and control of outbreaks.
|
3
|
Billerbeck E, Wolfisberg R, Fahnøe U, Xiao JW, Quirk C, Luna JM, Cullen JM, Hartlage AS, Chiriboga L, Ghoshal K, Lipkin WI, Bukh J, Scheel TKH, Kapoor A, Rice CM. Mouse models of acute and chronic hepacivirus infection. Science 2017; 357:204-208. [PMID: 28706073 DOI: 10.1126/science.aal1962] [Citation(s) in RCA: 102] [Impact Index Per Article: 14.6] [Received: 10/11/2016] [Revised: 04/03/2017] [Accepted: 06/05/2017] [Indexed: 12/12/2022]
Abstract
An estimated 71 million people worldwide are infected with hepatitis C virus (HCV). The lack of small-animal models has impeded studies of antiviral immune mechanisms. Here we show that an HCV-related hepacivirus discovered in Norway rats can establish high-titer hepatotropic infections in laboratory mice with immunological features resembling those seen in human viral hepatitis. Whereas immune-compromised mice developed persistent infection, immune-competent mice cleared the virus within 3 to 5 weeks. Acute clearance was T cell dependent and associated with liver injury. Transient depletion of CD4+ T cells before infection resulted in chronic infection, characterized by high levels of intrahepatic regulatory T cells and expression of inhibitory molecules on intrahepatic CD8+ T cells. Natural killer cells controlled early infection but were not essential for viral clearance. This model may provide mechanistic insights into hepatic antiviral immunity, a prerequisite for the development of HCV vaccines.
Affiliation(s)
- Eva Billerbeck
  - Laboratory of Virology and Infectious Disease, The Rockefeller University, New York, NY, USA
- Raphael Wolfisberg
  - Copenhagen Hepatitis C Program (CO-HEP), Department of Infectious Diseases and Clinical Research Centre, Hvidovre Hospital and Department of Immunology and Microbiology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Ulrik Fahnøe
  - Copenhagen Hepatitis C Program (CO-HEP), Department of Infectious Diseases and Clinical Research Centre, Hvidovre Hospital and Department of Immunology and Microbiology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Jing W Xiao
  - Laboratory of Virology and Infectious Disease, The Rockefeller University, New York, NY, USA
- Corrine Quirk
  - Laboratory of Virology and Infectious Disease, The Rockefeller University, New York, NY, USA
- Joseph M Luna
  - Laboratory of Virology and Infectious Disease, The Rockefeller University, New York, NY, USA
- John M Cullen
  - College of Veterinary Medicine, North Carolina State University, Raleigh, NC, USA
- Alex S Hartlage
  - Center for Vaccines and Immunity, The Research Institute at Nationwide Children's Hospital and Department of Pediatrics, Ohio State University, Columbus, OH, USA
- Luis Chiriboga
  - Department of Pathology, New York University Medical Center, New York, NY, USA
- Kalpana Ghoshal
  - Department of Pathology, Comprehensive Cancer Center, Ohio State University, Columbus, OH, USA
- W Ian Lipkin
  - Center for Infection and Immunity, Mailman School of Public Health, Columbia University, New York, NY, USA
- Jens Bukh
  - Copenhagen Hepatitis C Program (CO-HEP), Department of Infectious Diseases and Clinical Research Centre, Hvidovre Hospital and Department of Immunology and Microbiology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Troels K H Scheel
  - Laboratory of Virology and Infectious Disease, The Rockefeller University, New York, NY, USA; Copenhagen Hepatitis C Program (CO-HEP), Department of Infectious Diseases and Clinical Research Centre, Hvidovre Hospital and Department of Immunology and Microbiology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Amit Kapoor
  - Center for Vaccines and Immunity, The Research Institute at Nationwide Children's Hospital and Department of Pediatrics, Ohio State University, Columbus, OH, USA
- Charles M Rice
  - Laboratory of Virology and Infectious Disease, The Rockefeller University, New York, NY, USA
|
4
|
Young WC, Raftery AE, Yeung KY. Model-Based Clustering With Data Correction For Removing Artifacts In Gene Expression Data. Ann Appl Stat 2017; 11:1998-2026. [PMID: 30740193 DOI: 10.1214/17-aoas1051] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Indexed: 12/31/2022]
Abstract
The NIH Library of Integrated Network-based Cellular Signatures (LINCS) contains gene expression data from over a million experiments, using Luminex Bead technology. Only 500 colors are used to measure the expression levels of the 1,000 landmark genes measured, and the data for the resulting pairs of genes are deconvolved. The raw data are sometimes inadequate for reliable deconvolution, leading to artifacts in the final processed data. These include the expression levels of paired genes being flipped or given the same value, and clusters of values that are not at the true expression level. We propose a new method called model-based clustering with data correction (MCDC) that is able to identify and correct these three kinds of artifacts simultaneously. We show that MCDC improves the resulting gene expression data in terms of agreement with external baselines, as well as improving results from subsequent analysis.
Affiliation(s)
- William Chad Young
  - Department of Statistics, University of Washington, Box 354322, Seattle, WA 98195
- Adrian E Raftery
  - Department of Statistics, University of Washington, Box 354322, Seattle, WA 98195
- Ka Yee Yeung
  - Institute of Technology, University of Washington Tacoma, Campus Box 358426, 1900 Commerce Street, Tacoma, WA 98402
|
5
|
Shen-Gunther J, Wang Y, Lai Z, Poage GM, Perez L, Huang THM. Deep sequencing of HPV E6/E7 genes reveals loss of genotypic diversity and gain of clonal dominance in high-grade intraepithelial lesions of the cervix. BMC Genomics 2017; 18:231. [PMID: 28288568 PMCID: PMC5348809 DOI: 10.1186/s12864-017-3612-y] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 12/19/2016] [Accepted: 03/07/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Human papillomavirus (HPV) is the carcinogen of almost all invasive cervical cancer and a major cause of oral and other anogenital malignancies. HPV genotyping by dideoxy (Sanger) sequencing is currently the reference method of choice for clinical diagnostics. However, for samples with multiple HPV infections, genotype identification is singular and occasionally imprecise or indeterminable due to overlapping chromatograms. Our aim was to explore and compare HPV metagenomes in abnormal cervical cytology by deep sequencing for correlation with disease states. RESULTS Low- and high-grade intraepithelial lesion (LSIL and HSIL) cytology samples were DNA extracted for PCR-amplification of the HPV E6/E7 genes. HPV+ samples were sequenced by dideoxy and deep methods. Deep sequencing revealed ~60% of all samples (n = 72) were multi-HPV infected. Among LSIL samples (n = 43), 27 different genotypes were found. The 3 dominant (most abundant) genotypes were: HPV-39, 11/43 (26%); -16, 9/43 (21%); and -35, 4/43 (9%). Among HSIL (n = 29), 17 HPV genotypes were identified; the 3 dominant genotypes were: HPV-16, 21/29 (72%); -35, 4/29 (14%); and -39, 3/29 (10%). Phylogenetically, type-specific E6/E7 genetic distances correlated with carcinogenic potential. Species diversity analysis between LSIL and HSIL revealed loss of HPV diversity and domination by HPV-16 in HSIL samples. CONCLUSIONS Deep sequencing resolves HPV genotype composition within multi-infected cervical cytology. Biodiversity analysis reveals loss of diversity and gain of dominance by carcinogenic genotypes in high-grade cytology. Metagenomic profiles may therefore serve as a biomarker of disease severity and a population surveillance tool for emerging genotypes.
Affiliation(s)
- Jane Shen-Gunther
  - Department of Clinical Investigation, Brooke Army Medical Center, Gynecologic Oncology & Clinical Investigation, 3698 Chambers Pass, Fort Sam Houston, TX 78234 USA
- Yufeng Wang
  - Department of Biology, University of Texas at San Antonio, San Antonio, TX 78249 USA
- Zhao Lai
  - Greehey Children’s Cancer Research Institute, University of Texas Health Science Center at San Antonio, San Antonio, TX 78229 USA
- Graham M. Poage
  - Department of Clinical Investigation, Brooke Army Medical Center, Gynecologic Oncology & Clinical Investigation, 3698 Chambers Pass, Fort Sam Houston, TX 78234 USA
- Luis Perez
  - Department of Clinical Investigation, Brooke Army Medical Center, Gynecologic Oncology & Clinical Investigation, 3698 Chambers Pass, Fort Sam Houston, TX 78234 USA
- Tim H. M. Huang
  - Department of Molecular Medicine, Cancer Therapy and Research Center, University of Texas Health Science Center at San Antonio, San Antonio, TX 78229 USA
|
6
|
MinVar: A rapid and versatile tool for HIV-1 drug resistance genotyping by deep sequencing. J Virol Methods 2016; 240:7-13. [PMID: 27867045 DOI: 10.1016/j.jviromet.2016.11.008] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 04/15/2016] [Revised: 10/17/2016] [Accepted: 11/11/2016] [Indexed: 02/08/2023]
Abstract
Genotypic monitoring of drug-resistance mutations (DRMs) in HIV-1 infected individuals is strongly recommended to guide selection of the initial antiretroviral therapy (ART) and changes of drug regimens. Traditionally, mutations conferring drug resistance are detected by population sequencing of the reverse transcribed viral RNA encoding the HIV-1 enzymes targeted by ART, followed by manual analysis and interpretation of Sanger sequencing traces. This process is labor intensive, relies on subjective interpretation from the operator, and offers limited sensitivity as only mutations above 20% frequency can be reliably detected. Here we present MinVar, a pipeline for the analysis of deep sequencing data, which allows reliable and automated detection of DRMs down to 5%. We evaluated MinVar with data from amplicon sequencing of defined mixtures of molecular virus clones with known DRM and plasma samples of viremic HIV-1 infected individuals and we compared it to VirVarSeq, another virus variant detection tool exclusively working on Illumina deep sequencing data. MinVar was designed to be compatible with a diverse range of sequencing platforms and allows the detection of DRMs and insertions/deletions from deep sequencing data without the need to perform additional bioinformatics analysis, a prerequisite to a widespread implementation of HIV-1 genotyping using deep sequencing in routine diagnostic settings.
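To make the 5% versus 20% detection limits concrete, the sketch below reports amino-acid variants whose frequency at a position reaches a chosen threshold. It is a generic frequency-threshold illustration in Python, not MinVar's actual algorithm or interface; the function name and the read counts are hypothetical.

```python
def call_variants(position_counts, reference, min_freq=0.05):
    """Return (mutation, frequency) pairs for non-reference amino acids
    whose frequency at a position is at least `min_freq`."""
    calls = []
    for position in sorted(position_counts):
        counts = position_counts[position]
        depth = sum(counts.values())
        if depth == 0:
            continue  # no coverage, nothing to call
        for amino_acid, n in sorted(counts.items()):
            frequency = n / depth
            if amino_acid != reference[position] and frequency >= min_freq:
                name = f"{reference[position]}{position}{amino_acid}"
                calls.append((name, round(frequency, 3)))
    return calls


# Hypothetical amino-acid counts at two reverse transcriptase positions.
reference = {103: "K", 184: "M"}
toy_counts = {
    103: {"K": 980, "N": 20},   # K103N at 2%
    184: {"M": 900, "V": 100},  # M184V at 10%
}

print("threshold 20%:", call_variants(toy_counts, reference, min_freq=0.20))
print("threshold  5%:", call_variants(toy_counts, reference, min_freq=0.05))
print("threshold  1%:", call_variants(toy_counts, reference, min_freq=0.01))
```

In this toy data a 20% cutoff, comparable to the Sanger limit quoted above, misses both variants, while a 5% cutoff recovers the 10% variant only.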
|
7
|
Posada-Cespedes S, Seifert D, Beerenwinkel N. Recent advances in inferring viral diversity from high-throughput sequencing data. Virus Res 2016; 239:17-32. [PMID: 27693290 DOI: 10.1016/j.virusres.2016.09.016] [Citation(s) in RCA: 77] [Impact Index Per Article: 8.6] [Received: 06/24/2016] [Revised: 09/23/2016] [Accepted: 09/24/2016] [Indexed: 02/05/2023]
Abstract
Rapidly evolving RNA viruses prevail within a host as a collection of closely related variants, referred to as viral quasispecies. Advances in high-throughput sequencing (HTS) technologies have facilitated the assessment of the genetic diversity of such virus populations at an unprecedented level of detail. However, analysis of HTS data from virus populations is challenging due to short, error-prone reads. In order to account for uncertainties originating from these limitations, several computational and statistical methods have been developed for studying the genetic heterogeneity of virus populations. Here, we review methods for the analysis of HTS reads, including approaches to local diversity estimation and global haplotype reconstruction. Challenges posed by aligning reads, as well as the impact of reference biases on diversity estimates, are also discussed. In addition, we address some of the experimental approaches designed to improve the biological signal-to-noise ratio. In the future, computational methods for the analysis of heterogeneous virus populations are likely to continue being complemented by technological developments.
Affiliation(s)
- Susana Posada-Cespedes
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB, Basel, Switzerland
- David Seifert
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB, Basel, Switzerland
- Niko Beerenwinkel
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB, Basel, Switzerland
|
8
|
Scrucca L, Fop M, Murphy TB, Raftery AE. mclust 5: Clustering, Classification and Density Estimation Using Gaussian Finite Mixture Models. THE R JOURNAL 2016; 8:289-317. [PMID: 27818791 PMCID: PMC5096736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
Finite mixture models are being used increasingly to model a wide variety of random phenomena for clustering, classification and density estimation. mclust is a powerful and popular package which allows modelling of data as a Gaussian finite mixture with different covariance structures and different numbers of mixture components, for a variety of purposes of analysis. Recently, version 5 of the package has been made available on CRAN. This updated version adds new covariance structures, dimension reduction capabilities for visualisation, model selection criteria, initialisation strategies for the EM algorithm, and bootstrap-based inference, making it a full-featured R package for data analysis via finite mixture modelling.
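For readers unfamiliar with Gaussian finite mixtures, the sketch below fits a two-component univariate mixture with the EM algorithm in plain Python. It illustrates the general technique only and does not reproduce mclust's R interface, covariance parameterisations, or model selection; the function names, the initialisation rule, and the simulated data are assumptions made for this example.

```python
import math
import random


def normal_pdf(x, mean, variance):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)


def fit_two_gaussians(data, iterations=100):
    """Fit a two-component 1-D Gaussian mixture by EM.
    Returns (weights, means, variances), lower-initialised component first."""
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    # Simple initialisation (an assumption): extremes as means, pooled variance.
    weights = [0.5, 0.5]
    means = [min(data), max(data)]
    variances = [overall_var, overall_var]
    for _ in range(iterations):
        # E-step: posterior probability that each point belongs to each component.
        responsibilities = []
        for x in data:
            joint = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(joint)
            responsibilities.append([j / total for j in joint])
        # M-step: re-estimate parameters from the responsibility-weighted data.
        for k in range(2):
            nk = sum(r[k] for r in responsibilities)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(responsibilities, data)) / nk
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(responsibilities, data)) / nk
            variances[k] = max(spread, 1e-6)  # guard against a collapsing component
    return weights, means, variances


# Simulated data: two well-separated groups centred at 0 and 5.
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(200)] + [random.gauss(5.0, 1.0) for _ in range(200)]
weights, means, variances = fit_two_gaussians(data)
print("weights:", [round(w, 2) for w in weights])
print("means:  ", [round(m, 2) for m in means])
```

A full-featured package adds what this sketch omits, such as multivariate covariance structures, choosing the number of components, and more careful initialisation.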
Affiliation(s)
- Luca Scrucca
  - Università degli Studi di Perugia, Via A. Pascoli 20, 06123 Perugia, Italy
- Michael Fop
  - University College Dublin, Belfield, Dublin 4, Ireland
|
9
|
Zukurov JP, do Nascimento-Brito S, Volpini AC, Oliveira GC, Janini LMR, Antoneli F. Estimation of genetic diversity in viral populations from next generation sequencing data with extremely deep coverage. Algorithms Mol Biol 2016; 11:2. [PMID: 26973707 PMCID: PMC4788855 DOI: 10.1186/s13015-016-0064-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/25/2015] [Accepted: 02/25/2016] [Indexed: 12/16/2022] Open
Abstract
Background In this paper we propose a method and discuss its computational implementation as an integrated tool for the analysis of viral genetic diversity on data generated by high-throughput sequencing. The main motivation for this work is to better understand the genetic diversity of viruses with high rates of nucleotide substitution, such as HIV-1 and influenza. Most methods for viral diversity estimation proposed so far are intended to benefit from the longer reads produced by some next-generation sequencing platforms in order to estimate a population of haplotypes which represent the diversity of the original population. The method proposed here is custom-made to take advantage of the very low error rate and extremely deep coverage per site, which are the main features of some neglected technologies that have not received much attention due to the short length of their reads, which precludes haplotype estimation. This approach allowed us to avoid some hard problems related to haplotype reconstruction (the need for long reads, preliminary error filtering and assembly). Results We propose to measure genetic diversity of a viral population through a family of multinomial probability distributions indexed by the sites of the virus genome, each one representing the distribution of nucleic bases per site. Moreover, the implementation of the method focuses on two main optimization strategies: a read mapping/alignment procedure that aims at the recovery of the maximum possible number of short reads; and the inference of the multinomial parameters in a Bayesian framework with smoothed Dirichlet estimation. The Bayesian approach provides conditional probability distributions for the multinomial parameters, allowing one to take into account the prior information of the control experiment and providing a natural way to separate signal from noise, since it automatically furnishes Bayesian confidence intervals and thus avoids the drawbacks of preliminary error filtering.
Conclusions The methods described in this paper have been implemented as an integrated tool called Tanden (Tool for Analysis of Diversity in Viral Populations) and successfully tested on samples obtained from HIV-1 strain NL4-3 (group M, subtype B) cultivations on primary human cell cultures in many distinct viral propagation conditions. Tanden is written in C# (Microsoft), runs on the Windows operating system, and can be downloaded from: http://tanden.url.ph/.
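The per-site multinomial model with Dirichlet smoothing can be illustrated with the standard conjugate update: a Dirichlet prior with pseudo-counts combined with observed base counts gives a Dirichlet posterior whose mean is the smoothed frequency estimate. The Python sketch below shows only that textbook step and is not Tanden's implementation; the function name, the uniform prior, and the counts are assumptions.

```python
BASES = ("A", "C", "G", "T")


def posterior_mean(base_counts, prior_pseudocounts):
    """Posterior mean of multinomial base frequencies at one site under a
    Dirichlet prior: (count + pseudo-count) / (depth + total pseudo-count)."""
    posterior = {b: base_counts.get(b, 0) + prior_pseudocounts[b] for b in BASES}
    total = sum(posterior.values())
    return {b: posterior[b] / total for b in BASES}


# Hypothetical site: 100 reads, two of them carrying a minority base.
site_counts = {"A": 98, "C": 2, "G": 0, "T": 0}

# Uniform prior used here for simplicity; a prior derived from a control
# experiment, as the abstract describes, would replace these pseudo-counts.
uniform_prior = {b: 1.0 for b in BASES}

estimate = posterior_mean(site_counts, uniform_prior)
for base in BASES:
    print(base, round(estimate[base], 4))
```

Because every base keeps a nonzero pseudo-count, unobserved bases receive a small positive probability instead of exactly zero.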
|
10
|
Van der Borght K, Thys K, Wetzels Y, Clement L, Verbist B, Reumers J, van Vlijmen H, Aerssens J. QQ-SNV: single nucleotide variant detection at low frequency by comparing the quality quantiles. BMC Bioinformatics 2015; 16:379. [PMID: 26554718 PMCID: PMC4641353 DOI: 10.1186/s12859-015-0812-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/22/2015] [Accepted: 10/31/2015] [Indexed: 12/03/2022] Open
Abstract
Background Next generation sequencing enables studying heterogeneous populations of viral infections. When the sequencing is done at high coverage depth (“deep sequencing”), low frequency variants can be detected. Here we present QQ-SNV (http://sourceforge.net/projects/qqsnv), a logistic regression classifier model developed for the Illumina sequencing platforms that uses the quantiles of the quality scores, to distinguish true single nucleotide variants from sequencing errors based on the estimated SNV probability. To train the model, we created a dataset of an in silico mixture of five HIV-1 plasmids. Testing of our method in comparison to the existing methods LoFreq, ShoRAH, and V-Phaser 2 was performed on two HIV and four HCV plasmid mixture datasets and one influenza H1N1 clinical dataset. Results For default application of QQ-SNV, variants were called using a SNV probability cutoff of 0.5 (QQ-SNVD). To improve the sensitivity we used a SNV probability cutoff of 0.0001 (QQ-SNVHS). To also increase specificity, SNVs called were overruled when their frequency was below the 80th percentile calculated on the distribution of error frequencies (QQ-SNVHS-P80). When comparing QQ-SNV versus the other methods on the plasmid mixture test sets, QQ-SNVD performed similarly to the existing approaches. QQ-SNVHS was more sensitive on all test sets but with more false positives. QQ-SNVHS-P80 was found to be the most accurate method over all test sets by balancing sensitivity and specificity. When applied to a paired-end HCV sequencing study, with lowest spiked-in true frequency of 0.5 %, QQ-SNVHS-P80 revealed a sensitivity of 100 % (vs. 40–60 % for the existing methods) and a specificity of 100 % (vs. 98.0–99.7 % for the existing methods). In addition, QQ-SNV required the least overall computation time to process the test sets. 
Finally, when testing on a clinical sample, four putative true variants with frequency below 0.5 % were consistently detected by QQ-SNVHS-P80 from different generations of Illumina sequencers. Conclusions We developed and successfully evaluated a novel method, called QQ-SNV, for highly efficient single nucleotide variant calling on Illumina deep sequencing virology data. Electronic supplementary material The online version of this article (doi:10.1186/s12859-015-0812-9) contains supplementary material, which is available to authorized users.
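The general idea of classifying candidate variants from quality-score quantiles with a logistic model, then applying a probability cutoff, can be sketched as follows in Python. This is not the published QQ-SNV model: the choice of quartiles as features, the coefficients, the intercept, and the Phred scores are all hypothetical; only the two cutoffs (0.5 and 0.0001) come from the abstract.

```python
import math
import statistics


def quality_quartiles(phred_scores):
    """Summarise base-quality scores by their three quartile cut points."""
    return statistics.quantiles(phred_scores, n=4)


def snv_probability(features, coefficients, intercept):
    """Logistic model: probability that a candidate is a true variant."""
    z = intercept + sum(c * f for c, f in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


def call_snv(probability, cutoff=0.5):
    """Call the variant when its probability reaches the cutoff."""
    return probability >= cutoff


# Hypothetical model parameters, chosen only to make the example readable.
COEFFICIENTS = [0.1, 0.1, 0.1]
INTERCEPT = -9.0

# Hypothetical Phred scores of the bases supporting two candidate variants.
high_quality = [38, 39, 40, 40, 37, 38]
low_quality = [12, 15, 10, 14, 11, 13]

p_high = snv_probability(quality_quartiles(high_quality), COEFFICIENTS, INTERCEPT)
p_low = snv_probability(quality_quartiles(low_quality), COEFFICIENTS, INTERCEPT)

for label, p in (("high quality", p_high), ("low quality", p_low)):
    print(label, "default cutoff:", call_snv(p, 0.5), "sensitive cutoff:", call_snv(p, 0.0001))
```

With these toy numbers the low-quality candidate is rejected at the default cutoff but accepted at the sensitive one, which mirrors the trade-off between sensitivity and false positives described in the abstract.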
Affiliation(s)
- Koen Van der Borght
  - Janssen Infectious Diseases-Diagnostics BVBA, B-2340, Beerse, Belgium; Interuniversity Institute for Biostatistics and statistical Bioinformatics, Katholieke Universiteit Leuven, B-3000, Leuven, Belgium
- Kim Thys
  - Janssen Infectious Diseases-Diagnostics BVBA, B-2340, Beerse, Belgium
- Yves Wetzels
  - Janssen Infectious Diseases-Diagnostics BVBA, B-2340, Beerse, Belgium
- Lieven Clement
  - Ghent University, Applied Mathematics, Informatics and Statistics, B-9000, Ghent, Belgium
- Bie Verbist
  - Janssen Infectious Diseases-Diagnostics BVBA, B-2340, Beerse, Belgium
- Joke Reumers
  - Janssen Infectious Diseases-Diagnostics BVBA, B-2340, Beerse, Belgium
- Jeroen Aerssens
  - Janssen Infectious Diseases-Diagnostics BVBA, B-2340, Beerse, Belgium
|
11
|
Zhang C, Wu Z, Li Y, Wu J. Biogenesis, Function, and Applications of Virus-Derived Small RNAs in Plants. Front Microbiol 2015; 6:1237. [PMID: 26617580 PMCID: PMC4637412 DOI: 10.3389/fmicb.2015.01237] [Citation(s) in RCA: 84] [Impact Index Per Article: 8.4] [Received: 07/30/2015] [Accepted: 10/26/2015] [Indexed: 11/13/2022] Open
Abstract
RNA silencing, an evolutionarily conserved and sequence-specific gene-inactivation system, has a pivotal role in antiviral defense in most eukaryotic organisms. In plants, a class of exogenous small RNAs (sRNAs) originating from the infecting virus, called virus-derived small interfering RNAs (vsiRNAs), are predominantly responsible for RNA silencing-mediated antiviral immunity. The process of vsiRNA formation and the role of vsiRNAs in plant viral defense have now been revealed through deep sequencing of sRNAs and diverse genetic analyses. The biogenesis of vsiRNAs is analogous to that of endogenous sRNAs, which require diverse essential components including dicer-like (DCL), argonaute (AGO), and RNA-dependent RNA polymerase (RDR) proteins. vsiRNAs trigger antiviral defense through post-transcriptional gene silencing (PTGS) or transcriptional gene silencing (TGS) of viral RNA, and they hijack the host RNA silencing system to target complementary host transcripts. Additionally, several applications that take advantage of current knowledge from vsiRNA research are in use, such as breeding antiviral plants through genetic engineering technology, reconstructing viral genomes, and surveying viral ecology and populations. Here, we will provide an overview of vsiRNA pathways, with a primary focus on the advances in vsiRNA biogenesis and function, and discuss their potential applications as well as the future challenges in vsiRNA research.
Affiliation(s)
- Chao Zhang
  - Key Laboratory of Plant Virology of Fujian Province, Institute of Plant Virology, Fujian Agriculture and Forestry University, Fuzhou, China
- Zujian Wu
  - Key Laboratory of Plant Virology of Fujian Province, Institute of Plant Virology, Fujian Agriculture and Forestry University, Fuzhou, China
- Yi Li
  - Peking-Yale Joint Center for Plant Molecular Genetics and Agrobiotechnology, The National Laboratory of Protein Engineering and Plant Genetic Engineering, College of Life Sciences, Peking University, Beijing, China
- Jianguo Wu
  - Key Laboratory of Plant Virology of Fujian Province, Institute of Plant Virology, Fujian Agriculture and Forestry University, Fuzhou, China; Peking-Yale Joint Center for Plant Molecular Genetics and Agrobiotechnology, The National Laboratory of Protein Engineering and Plant Genetic Engineering, College of Life Sciences, Peking University, Beijing, China
|