1
|
Ye R, Wang A, Bu B, Luo P, Deng W, Zhang X, Yin S. Viral oncogenes, viruses, and cancer: a third-generation sequencing perspective on viral integration into the human genome. Front Oncol 2023; 13:1333812. [PMID: 38188304 PMCID: PMC10768168 DOI: 10.3389/fonc.2023.1333812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Accepted: 12/06/2023] [Indexed: 01/09/2024]
Abstract
The link between viruses and cancer has intrigued scientists for decades. Certain viruses have been shown to be vital in the development of various cancers by integrating viral DNA into the host genome and activating viral oncogenes. These viruses include Human Papillomavirus (HPV), Hepatitis B and C Viruses (HBV and HCV), Epstein-Barr Virus (EBV), and Human T-Cell Leukemia Virus (HTLV-1), all of which are linked to the development of a myriad of human cancers. In recent years, third-generation sequencing technologies have revolutionized our ability to study viral integration events at unprecedented resolution. They offer long-read sequencing capabilities along with the ability to map viral integration sites, assess host gene expression, and track clonal evolution in cancer cells. Recently, researchers have been exploring the application of Oxford Nanopore Technologies (ONT) nanopore sequencing and Pacific Biosciences (PacBio) single-molecule real-time (SMRT) sequencing in cancer research. As viral integration is crucial to virus-driven cancer development, third-generation sequencing provides a novel approach to studying the relationship between viral oncogenes, viruses, and cancer. This review article explores the molecular mechanisms underlying viral oncogenesis, the role of viruses in cancer development, and the impact of third-generation sequencing on our understanding of viral integration into the human genome.
Affiliation(s)
- Ruichen Ye
- Department of Pathology, Albert Einstein College of Medicine, Bronx, NY, United States
- Einstein Pathology Single-cell & Bioinformatics Laboratory, Bronx, NY, United States
- Stony Brook University, Stony Brook, NY, United States
| | - Angelina Wang
- Tufts Friedman School of Nutrition, Boston, MA, United States
| | - Brady Bu
- Horace Mann School, Bronx, NY, United States
| | - Pengxiang Luo
- Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Wenjun Deng
- Clinical Proteomics Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States
| | - Xinyi Zhang
- Department of Respiratory Diseases, The Second Affiliated Hospital of Nanchang University, Nanchang, China
| | - Shanye Yin
- Department of Pathology, Albert Einstein College of Medicine, Bronx, NY, United States
- Einstein Pathology Single-cell & Bioinformatics Laboratory, Bronx, NY, United States
| |
|
2
|
Cheng C, Fei Z, Xiao P. Methods to improve the accuracy of next-generation sequencing. Front Bioeng Biotechnol 2023; 11:982111. [PMID: 36741756 PMCID: PMC9895957 DOI: 10.3389/fbioe.2023.982111] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 06/30/2022] [Accepted: 01/11/2023] [Indexed: 01/21/2023]
Abstract
Next-generation sequencing (NGS) is present in all fields of life science; it has greatly promoted the development of basic research and is gradually being applied in clinical diagnosis. However, the cost and throughput advantages of next-generation sequencing are offset by large tradeoffs with respect to read length and accuracy. Specifically, its high error rate makes it extremely difficult to detect SNPs or low-abundance mutations, limiting its clinical applications, such as pharmacogenomics studies primarily based on SNPs and early clinical diagnosis primarily based on low-abundance mutations. Currently, Sanger sequencing is still considered the gold standard due to its high accuracy, so the results of next-generation sequencing require verification by Sanger sequencing in clinical practice. To maintain high-quality next-generation sequencing data, a variety of improvements at the levels of template preparation, sequencing strategy and data processing have been developed. This study summarizes the general procedures of next-generation sequencing platforms, highlighting the improvements involved in eliminating errors at each step. Furthermore, the challenges and future development of next-generation sequencing in clinical application are discussed.
|
3
|
He X, Zhong D, Zou C, Pi L, Zhao L, Qin Y, Pan M, Wang S, Zeng W, Xiang Z, Chen X, Wu Y, Si Y, Cui L, Huang Y, Yan G, Yang Z. Unraveling the Complexity of Imported Malaria Infections by Amplicon Deep Sequencing. Front Cell Infect Microbiol 2021; 11:725859. [PMID: 34595134 PMCID: PMC8477663 DOI: 10.3389/fcimb.2021.725859] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/15/2021] [Accepted: 08/16/2021] [Indexed: 11/22/2022]
Abstract
Imported malaria and recurrent infections are becoming an emerging issue in many malaria non-endemic countries. This study aimed to determine the molecular patterns of imported malaria infections and their recurrence. Blood samples were collected from patients with imported malaria infections during 2016-2018 in Guangxi Zhuang Autonomous Region, China. Next-generation amplicon deep-sequencing approaches were used to assess parasite genetic diversity, multiplicity of infection, relapse, recrudescence, and antimalarial drug resistance. A total of 44 imported malaria cases were examined during the study, of which 35 (79.5%) had recurrent malaria infections within 1 year. The majority (91.4%) had one recurrent malaria episode, whereas two patients had two recurrences and one patient had three recurrences. A total of 19 recurrence patterns (the species responsible for primary and successive clinical episodes) were found in patients returning from malaria epidemic countries. Four parasite species were detected, with a higher than usual proportion (46.2%) of non-falciparum or mixed-species infections. An increasing trend of recurrent infections and reduced drug treatment efficacy were observed among the cases of imported malaria. The high recurrence rate and complex patterns of imported malaria from Africa to non-endemic countries have the potential to initiate local transmission, thereby undermining efforts to eliminate locally acquired malaria. Our findings highlight the power of amplicon deep-sequencing applications in molecular epidemiological studies of imported malaria recurrences.
Affiliation(s)
- Xi He
- Department of Pathogen Biology and Immunology, Kunming Medical University, Kunming, China
| | - Daibin Zhong
- Program in Public Health, College of Health Sciences, University of California at Irvine, Irvine, CA, United States
| | - Chunyan Zou
- Department of Electrocardiogram, Guangxi Zhuang Autonomous Region People’s Hospital, Nanning, China
| | - Liang Pi
- Department of Pathogen Biology and Immunology, Kunming Medical University, Kunming, China
| | - Luyi Zhao
- Department of Pathogen Biology and Immunology, Kunming Medical University, Kunming, China
| | - Yucheng Qin
- Department of Infectious Diseases, Shanglin County People’s Hospital, Shanglin, China
| | - Maohua Pan
- Department of Infectious Diseases, Shanglin County People’s Hospital, Shanglin, China
| | - Siqi Wang
- Department of Pathogen Biology and Immunology, Kunming Medical University, Kunming, China
| | - Weiling Zeng
- Department of Pathogen Biology and Immunology, Kunming Medical University, Kunming, China
| | - Zheng Xiang
- Department of Pathogen Biology and Immunology, Kunming Medical University, Kunming, China
| | - Xi Chen
- Department of Pathogen Biology and Immunology, Kunming Medical University, Kunming, China
| | - Yanrui Wu
- Department of Cell Biology & Genetics, Kunming Medical University, Kunming, China
| | - Yu Si
- Department of Pathogen Biology and Immunology, Kunming Medical University, Kunming, China
| | - Liwang Cui
- Department of Internal Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, United States
| | - Yaming Huang
- Department of Protozoa, Guangxi Zhuang Autonomous Region Center for Disease Prevention and Control, Nanning, China
| | - Guiyun Yan
- Program in Public Health, College of Health Sciences, University of California at Irvine, Irvine, CA, United States
| | - Zhaoqing Yang
- Department of Pathogen Biology and Immunology, Kunming Medical University, Kunming, China
| |
|
4
|
Tasakis RN, Samaras G, Jamison A, Lee M, Paulus A, Whitehouse G, Verkoczy L, Papavasiliou FN, Diaz M. SARS-CoV-2 variant evolution in the United States: High accumulation of viral mutations over time likely through serial Founder Events and mutational bursts. PLoS One 2021; 16:e0255169. [PMID: 34297786 PMCID: PMC8301627 DOI: 10.1371/journal.pone.0255169] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 03/03/2021] [Accepted: 07/11/2021] [Indexed: 12/13/2022]
Abstract
Since the first case of COVID-19 in December 2019 in Wuhan, China, SARS-CoV-2 has spread worldwide and within a year and a half has caused 3.56 million deaths globally. With dramatically increasing infection numbers, and the arrival of new variants with increased infectivity, tracking the evolution of its genome is crucial for effectively controlling the pandemic and informing vaccine platform development. Our study explores the evolution of SARS-CoV-2 in a representative cohort of sequences covering the entire genome in the United States, through all of 2020 and early 2021. Strikingly, we detected many accumulating Single Nucleotide Variations (SNVs) encoding amino acid changes in the SARS-CoV-2 genome, with a pattern indicative of RNA editing enzymes as major mutators of SARS-CoV-2 genomes. We report three major variants through October of 2020. These revealed 14 key mutations that were found in various combinations among 14 distinct predominant signatures. These signatures likely represent evolutionary lineages of SARS-CoV-2 in the U.S. and reveal clues to its evolution, such as a mutational burst in the summer of 2020 likely leading to a homegrown new variant, and a trend towards higher mutational load among viral isolates, but with occasional mutation loss. The last quarter of 2020 revealed a concerning accumulation of mostly novel low frequency replacement mutations in the Spike protein, and a hypermutable glutamine residue near the putative furin cleavage site. Finally, end of the year data and 2021 revealed the gradual increase to prevalence of known variants of concern, particularly B.1.1.7, that have acquired additional Spike mutations. Overall, our results suggest that predominant viral genomes are dynamically evolving over time, with periods of mutational bursts and unabated mutation accumulation. This high level of existing variation, even at low frequencies and especially in the Spike-encoding region, may become problematic when super-spreader events, akin to serial Founder Events in evolution, drive these rare mutations to prominence.
Affiliation(s)
- Rafail Nikolaos Tasakis
- Division of Immune Diversity, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Faculty of Biosciences, University of Heidelberg, Heidelberg, Germany
| | - Georgios Samaras
- Division of Immune Diversity, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Program of Translational Medical Research, Medical Faculty Mannheim, University of Heidelberg, Heidelberg, Germany
| | - Anna Jamison
- The Nightingale-Bamford School, New York, NY, United States of America
| | - Michelle Lee
- Cornell University, Ithaca, NY, United States of America
| | - Alexandra Paulus
- The Nightingale-Bamford School, New York, NY, United States of America
| | | | - Laurent Verkoczy
- San Diego Biomedical Research Institute (SDBRI), San Diego, CA, United States of America
| | - F. Nina Papavasiliou
- Division of Immune Diversity, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Marilyn Diaz
- San Diego Biomedical Research Institute (SDBRI), San Diego, CA, United States of America
| |
|
5
|
Danilenko AV, Kolosova NP, Shvalov AN, Ilyicheva TN, Svyatchenko SV, Durymanov AG, Bulanovich JA, Goncharova NI, Susloparov IM, Marchenko VY, Tregubchak TV, Gavrilova EV, Maksyutov RA, Ryzhikov AB. Evaluation of HA-D222G/N polymorphism using targeted NGS analysis in A(H1N1)pdm09 influenza virus in Russia in 2018-2019. PLoS One 2021; 16:e0251019. [PMID: 33914831 PMCID: PMC8084186 DOI: 10.1371/journal.pone.0251019] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/30/2020] [Accepted: 04/19/2021] [Indexed: 02/07/2023]
Abstract
Outbreaks of influenza, a contagious respiratory disease, occur throughout the world annually, affecting millions of people with many fatal cases. The D222G/N mutations in the hemagglutinin (HA) gene of A(H1N1)pdm09 are associated with severe and fatal human influenza cases. These mutations lead to increased virus replication in the lower respiratory tract (LRT) and may result in life-threatening pneumonia. Targeted NGS analysis revealed the presence of mutations in major and minor variants in 57% of fatal cases, with the proportion of viral variants with mutations varying from 1% to 98% in each individual sample in the epidemic season 2018-2019 in Russia. Co-occurrence of the mutations D222G and D222N was detected in a substantial number of the studied fatal cases (41%). The D222G/N mutations were detected at a low frequency (less than 1%) in the rest of the studied samples from fatal and nonfatal cases of influenza. The presence of HA D222Y/V/A mutations was detected in a few fatal cases. The high rate of occurrence of HA D222G/N mutations in A(H1N1)pdm09 viruses, their increased ability to replicate in the LRT and their association with fatal outcomes point to the importance of monitoring the mutations in circulating A(H1N1)pdm09 viruses for the evaluation of their epidemiological significance and for the consideration of disease prevention and treatment options.
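To make the "proportion of viral variants" figure concrete, the following is a minimal sketch of how per-sample residue frequencies at a single codon could be tallied from targeted reads. It assumes the codon covering HA position 222 has already been extracted from each read; the function name `residue_frequencies`, the partial codon table, and the toy read counts are illustrative assumptions, not the authors' pipeline or data.

```python
from collections import Counter

# Partial standard genetic code: only the residues named in the abstract
# (D, G, N, Y, V, A). Any other codon is reported as "X" (other/unknown).
CODON_TO_AA = {
    "GAT": "D", "GAC": "D",
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "AAT": "N", "AAC": "N",
    "TAT": "Y", "TAC": "Y",
    "GTT": "V", "GTC": "V", "GTA": "V", "GTG": "V",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
}


def residue_frequencies(codons):
    """Return the fraction of reads encoding each residue at one codon.

    `codons` is an iterable of 3-letter strings, one per read, already
    cut out at the codon of interest (hypothetical upstream step).
    """
    counts = Counter(CODON_TO_AA.get(c.upper(), "X") for c in codons)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {aa: n / total for aa, n in counts.items()}


# Toy sample: 70 reads with wild-type D, 20 with G, 10 with N.
reads = ["GAT"] * 70 + ["GGT"] * 20 + ["AAT"] * 10
freqs = residue_frequencies(reads)
print(freqs)
```

In this toy sample both D222G and D222N co-occur as minor variants; a real analysis would also need a depth cutoff and an error-rate threshold before calling variants near the 1% level.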
Affiliation(s)
- Alexey V. Danilenko
- Vector State Research Center of Virology and Biotechnology, Koltsovo, Novosibirsk Region, Russia
| | - Natalia P. Kolosova
- Vector State Research Center of Virology and Biotechnology, Koltsovo, Novosibirsk Region, Russia
| | - Alexander N. Shvalov
- Vector State Research Center of Virology and Biotechnology, Koltsovo, Novosibirsk Region, Russia
| | - Tatyana N. Ilyicheva
- Vector State Research Center of Virology and Biotechnology, Koltsovo, Novosibirsk Region, Russia
- Novosibirsk State University, Novosibirsk, Russia
| | - Svetlana V. Svyatchenko
- Vector State Research Center of Virology and Biotechnology, Koltsovo, Novosibirsk Region, Russia
| | - Alexander G. Durymanov
- Vector State Research Center of Virology and Biotechnology, Koltsovo, Novosibirsk Region, Russia
| | - Julia A. Bulanovich
- Vector State Research Center of Virology and Biotechnology, Koltsovo, Novosibirsk Region, Russia
| | - Natalia I. Goncharova
- Vector State Research Center of Virology and Biotechnology, Koltsovo, Novosibirsk Region, Russia
| | - Ivan M. Susloparov
- Vector State Research Center of Virology and Biotechnology, Koltsovo, Novosibirsk Region, Russia
| | - Vasiliy Y. Marchenko
- Vector State Research Center of Virology and Biotechnology, Koltsovo, Novosibirsk Region, Russia
| | - Tatyana V. Tregubchak
- Vector State Research Center of Virology and Biotechnology, Koltsovo, Novosibirsk Region, Russia
| | - Elena V. Gavrilova
- Vector State Research Center of Virology and Biotechnology, Koltsovo, Novosibirsk Region, Russia
| | - Rinat A. Maksyutov
- Vector State Research Center of Virology and Biotechnology, Koltsovo, Novosibirsk Region, Russia
| | - Alexander B. Ryzhikov
- Vector State Research Center of Virology and Biotechnology, Koltsovo, Novosibirsk Region, Russia
| |
|
6
|
Lau BT, Pavlichin D, Hooker AC, Almeda A, Shin G, Chen J, Sahoo MK, Huang CH, Pinsky BA, Lee HJ, Ji HP. Profiling SARS-CoV-2 mutation fingerprints that range from the viral pangenome to individual infection quasispecies. Genome Med 2021; 13:62. [PMID: 33875001 PMCID: PMC8054698 DOI: 10.1186/s13073-021-00882-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 11/11/2020] [Accepted: 03/31/2021] [Indexed: 12/14/2022]
Abstract
Background: The genome of SARS-CoV-2 is susceptible to mutations during viral replication due to the errors generated by RNA-dependent RNA polymerases. These mutations enable SARS-CoV-2 to evolve into new strains. Viral quasispecies emerge from de novo mutations that occur in individual patients. In combination, these sets of viral mutations provide distinct genetic fingerprints that reveal the patterns of transmission and have utility in contact tracing.
Methods: Leveraging thousands of sequenced SARS-CoV-2 genomes, we performed a viral pangenome analysis to identify conserved genomic sequences. We used a rapid and highly efficient computational approach that relies on k-mers, short tracts of sequence, instead of conventional sequence alignment. Using this method, we annotated viral mutation signatures that were associated with specific strains. Based on these highly conserved viral sequences, we developed a rapid and highly scalable targeted sequencing assay to identify mutations, detect quasispecies variants, and identify mutation signatures from patients. These results were compared to the pangenome genetic fingerprints.
Results: We built a k-mer index for thousands of SARS-CoV-2 genomes and identified conserved genomic regions and the landscape of mutations across thousands of virus genomes. We delineated mutation profiles spanning common genetic fingerprints (the combination of mutations in a viral assembly) and a combination of mutations that appear in only a small number of patients. We developed a targeted sequencing assay by selecting primers from the conserved viral genome regions to flank frequent mutations. Using a cohort of 100 SARS-CoV-2 clinical samples, we identified genetic fingerprints consisting of strain-specific mutations seen across populations and de novo quasispecies mutations localized to individual infections. We compared the mutation profiles of viral samples undergoing analysis with the features of the pangenome.
Conclusions: We conducted an analysis of viral mutation profiles that provide the basis of genetic fingerprints. Our study linked pangenome analysis with targeted deep sequencing of SARS-CoV-2 clinical samples. We identified quasispecies mutations occurring within individual patients and determined their general prevalence when compared to over 70,000 other strains. Analysis of these genetic fingerprints may provide a way of conducting molecular contact tracing.
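The alignment-free idea in the Methods can be illustrated with a toy sketch: index every k-mer by the genomes that contain it, then keep the k-mers present in all genomes as candidate conserved sequence. The function names, the tiny made-up sequences, and `k=4` are assumptions chosen for readability; the abstract does not specify the authors' data structures or parameters.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every overlapping substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_kmer_index(genomes, k):
    """Map each k-mer to the set of genome names in which it occurs."""
    index = defaultdict(set)
    for name, seq in genomes.items():
        for kmer in kmers(seq, k):
            index[kmer].add(name)
    return index


def conserved_kmers(index, n_genomes):
    """Return the k-mers found in every one of the n_genomes genomes."""
    return {kmer for kmer, names in index.items() if len(names) == n_genomes}


# Toy "pangenome": g2 differs from g1 and g3 by one substitution (T -> A).
genomes = {
    "g1": "ACGTACGTTA",
    "g2": "ACGTACGATA",
    "g3": "ACGTACGTTA",
}
index = build_kmer_index(genomes, k=4)
conserved = conserved_kmers(index, len(genomes))
print(sorted(conserved))
```

K-mers that overlap the substitution drop out of the conserved set, which is why conserved stretches are natural places to put primers and why the missing k-mers point at the mutation. Real viral genomes would call for a much larger k and a tolerance for genomes with ambiguous bases.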
Affiliation(s)
- Billy T Lau
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, 269 Campus Drive, CCSR 1120, Stanford, CA, 94305-5151, USA
- Stanford Genome Technology Center West, Stanford University, Palo Alto, CA, 94304, USA
| | - Dmitri Pavlichin
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, 269 Campus Drive, CCSR 1120, Stanford, CA, 94305-5151, USA
| | - Anna C Hooker
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, 269 Campus Drive, CCSR 1120, Stanford, CA, 94305-5151, USA
| | - Alison Almeda
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, 269 Campus Drive, CCSR 1120, Stanford, CA, 94305-5151, USA
| | - Giwon Shin
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, 269 Campus Drive, CCSR 1120, Stanford, CA, 94305-5151, USA
| | - Jiamin Chen
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, 269 Campus Drive, CCSR 1120, Stanford, CA, 94305-5151, USA
| | - Malaya K Sahoo
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, 94305, USA
| | - Chun Hong Huang
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, 94305, USA
| | - Benjamin A Pinsky
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Department of Medicine, Division of Infectious Diseases and Geographic Medicine, Stanford University School of Medicine, Stanford, CA, 94305, USA
| | - Ho Joon Lee
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, 269 Campus Drive, CCSR 1120, Stanford, CA, 94305-5151, USA
| | - Hanlee P Ji
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, 269 Campus Drive, CCSR 1120, Stanford, CA, 94305-5151, USA
- Stanford Genome Technology Center West, Stanford University, Palo Alto, CA, 94304, USA
| |
|
7
|
Mitchell RM, Zhou Z, Sheth M, Sergent S, Frace M, Nayak V, Hu B, Gimnig J, Ter Kuile F, Lindblade K, Slutsker L, Hamel MJ, Desai M, Otieno K, Kariuki S, Vigfusson Y, Shi YP. Development of a new barcode-based, multiplex-PCR, next-generation-sequencing assay and data processing and analytical pipeline for multiplicity of infection detection of Plasmodium falciparum. Malar J 2021; 20:92. [PMID: 33593329 PMCID: PMC7885407 DOI: 10.1186/s12936-021-03624-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/10/2020] [Accepted: 02/04/2021] [Indexed: 12/12/2022]
Abstract
BACKGROUND: Simultaneous infection with multiple malaria parasite strains is common in high transmission areas. Quantifying the number of strains per host, or the multiplicity of infection (MOI), provides additional parasite indices for assessing transmission levels, but it is challenging to measure accurately with current tools. This paper presents new laboratory and analytical methods for estimating the MOI of Plasmodium falciparum.
METHODS: Based on 24 single nucleotide polymorphisms (SNPs) previously identified as stable, unlinked targets across 12 of the 14 chromosomes within the P. falciparum genome, three multiplex PCRs of short target regions and subsequent next generation sequencing (NGS) of the amplicons were developed. A bioinformatics pipeline including the B4Screening pathway removed spurious amplicons to ensure consistent frequency calls at each SNP location, compiled amplicons by SNP site diversity, and performed algorithmic haplotype and strain reconstruction. The pipeline was validated with 108 samples generated from cultured-laboratory strain mixtures in different proportions and concentrations, with and without pre-amplification, and using whole blood and dried blood spots (DBS). The pipeline was applied to 273 smear-positive samples from surveys conducted in western Kenya, and its results were then passed to StrainRecon Thresholding for Infection Multiplicity (STIM), a novel MOI estimator.
RESULTS: The 24 barcode SNPs were successfully identified uniformly across the 12 chromosomes of P. falciparum in a sample using the pipeline. Pre-amplification and parasite concentration, while non-linearly associated with SNP read depth, did not influence the SNP frequency calls. Based on consistent SNP frequency calls at targeted locations, the algorithmic strain reconstruction for each laboratory-mixed sample had 98.5% accuracy in dominant strains. STIM detected up to 5 strains in field samples from western Kenya and showed declining MOI over time (q < 0.02), from 4.32 strains per infected person in 1996 to 4.01, 3.56 and 3.35 in 2001, 2007 and 2012, and a reduction in the proportion of samples with 5 strains from 57% in 1996 to 18% in 2012.
CONCLUSION: The combined approach of new multiplex PCRs and NGS, the unique bioinformatics pipeline and STIM could identify 24 barcode SNPs of P. falciparum correctly and consistently. The methodology could be applied to field samples to reliably measure temporal changes in MOI.
Affiliation(s)
- Rebecca M Mitchell
- Division of Parasitic Diseases, Center for Global Health, Centers for Disease Control and Prevention (CDC), Atlanta, USA
- Department of Computer Science, Emory University, Atlanta, USA
- School of Nursing, Emory University, Atlanta, USA
| | - Zhiyong Zhou
- Division of Parasitic Diseases, Center for Global Health, Centers for Disease Control and Prevention (CDC), Atlanta, USA
| | - Mili Sheth
- Biotechnology Core Facility Branch, Division of Scientific Resources, CDC, Atlanta, USA
| | - Sheila Sergent
- Division of Parasitic Diseases, Center for Global Health, Centers for Disease Control and Prevention (CDC), Atlanta, USA
| | - Michael Frace
- Biotechnology Core Facility Branch, Division of Scientific Resources, CDC, Atlanta, USA
| | - Vishal Nayak
- Office of Infectious Diseases, National Center for Emerging and Zoonotic Infectious Diseases, CDC, Atlanta, USA
| | - Bin Hu
- Office of Infectious Diseases, National Center for Emerging and Zoonotic Infectious Diseases, CDC, Atlanta, USA
| | - John Gimnig
- Division of Parasitic Diseases, Center for Global Health, Centers for Disease Control and Prevention (CDC), Atlanta, USA
| | | | - Kim Lindblade
- Division of Parasitic Diseases, Center for Global Health, Centers for Disease Control and Prevention (CDC), Atlanta, USA
| | - Laurence Slutsker
- Division of Parasitic Diseases, Center for Global Health, Centers for Disease Control and Prevention (CDC), Atlanta, USA
| | - Mary J Hamel
- Division of Parasitic Diseases, Center for Global Health, Centers for Disease Control and Prevention (CDC), Atlanta, USA
| | - Meghna Desai
- Division of Parasitic Diseases, Center for Global Health, Centers for Disease Control and Prevention (CDC), Atlanta, USA
| | - Kephas Otieno
- Kenya Medical Research Institute, Centre for Global Health Research, Kisumu, Kenya
| | - Simon Kariuki
- Kenya Medical Research Institute, Centre for Global Health Research, Kisumu, Kenya
| | - Ymir Vigfusson
- Department of Computer Science, Emory University, Atlanta, USA
| | - Ya Ping Shi
- Division of Parasitic Diseases, Center for Global Health, Centers for Disease Control and Prevention (CDC), Atlanta, USA
| |
|
8
|
Lau BT, Pavlichin D, Hooker AC, Almeda A, Shin G, Chen J, Sahoo MK, Huang C, Pinsky BA, Lee H, Ji HP. Profiling SARS-CoV-2 mutation fingerprints that range from the viral pangenome to individual infection quasispecies. medRxiv 2020:2020.11.02.20224816. [PMID: 33173909 PMCID: PMC7654905 DOI: 10.1101/2020.11.02.20224816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2022]
Abstract
Background: The genome of SARS-CoV-2 is susceptible to mutations during viral replication due to the errors generated by RNA-dependent RNA polymerases. These mutations enable SARS-CoV-2 to evolve into new strains. Viral quasispecies emerge from de novo mutations that occur in individual patients. In combination, these sets of viral mutations provide distinct genetic fingerprints that reveal the patterns of transmission and have utility in contact tracing.
Methods: Leveraging thousands of sequenced SARS-CoV-2 genomes, we performed a viral pangenome analysis to identify conserved genomic sequences. We used a rapid and highly efficient computational approach that relies on k-mers, short tracts of sequence, instead of conventional sequence alignment. Using this method, we annotated viral mutation signatures that were associated with specific strains. Based on these highly conserved viral sequences, we developed a rapid and highly scalable targeted sequencing assay to identify mutations, detect quasispecies and identify mutation signatures from patients. These results were compared to the pangenome genetic fingerprints.
Results: We built a k-mer index for thousands of SARS-CoV-2 genomes and identified conserved genomic regions and the landscape of mutations across thousands of virus genomes. We delineated mutation profiles spanning common genetic fingerprints (the combination of mutations in a viral assembly) and rare ones that occur in only a small fraction of patients. We developed a targeted sequencing assay by selecting primers from the conserved viral genome regions to flank frequent mutations. Using a cohort of SARS-CoV-2 clinical samples, we identified genetic fingerprints consisting of strain-specific mutations seen across populations and de novo quasispecies mutations localized to individual infections. We compared the mutation profiles of viral samples undergoing analysis with the features of the pangenome.
Conclusions: We conducted an analysis of viral mutation profiles that provide the basis of genetic fingerprints. Our study linked pangenome analysis with targeted deep sequencing of SARS-CoV-2 clinical samples. We identified quasispecies mutations occurring within individual patients, mutations demarcating dominant species and the prevalence of mutation signatures, of which a significant number were relatively unique. Analysis of these genetic fingerprints may provide a way of conducting molecular contact tracing.
Affiliation(s)
- Billy T. Lau
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, United States
- Stanford Genome Technology Center West, Stanford University, Palo Alto, CA, 94304, United States
| | - Dmitri Pavlichin
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, United States
| | - Anna C. Hooker
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, United States
| | - Alison Almeda
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, United States
| | - Giwon Shin
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, United States
| | - Jiamin Chen
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, United States
| | - Malaya K. Sahoo
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, 94305, United States
| | - ChunHong Huang
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, 94305, United States
| | - Benjamin A. Pinsky
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, 94305, United States
- Department of Medicine, Division of Infectious Diseases and Geographic Medicine, Stanford University School of Medicine, Stanford, CA, 94305, United States
| | - HoJoon Lee
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, United States
| | - Hanlee P. Ji
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, United States
- Stanford Genome Technology Center West, Stanford University, Palo Alto, CA, 94304, United States
| |
|
9
|
Cacciabue M, Currá A, Carrillo E, König G, Gismondi MI. A beginner's guide for FMDV quasispecies analysis: sub-consensus variant detection and haplotype reconstruction using next-generation sequencing. Brief Bioinform 2020; 21:1766-1775. [PMID: 31697321 PMCID: PMC7110011 DOI: 10.1093/bib/bbz086] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 01/18/2019] [Revised: 06/18/2019] [Accepted: 06/19/2019] [Indexed: 12/18/2022] Open
Abstract
Deep sequencing of viral genomes is a powerful tool to study RNA virus complexity. However, the analysis of next-generation sequencing data might be challenging for researchers who have never approached the study of viral quasispecies by this methodology. In this work we present a suitable and affordable guide to explore the sub-consensus variability and to reconstruct viral quasispecies from Illumina sequencing data. The guide includes a complete analysis pipeline along with user-friendly descriptions of software and file formats. In addition, we assessed the feasibility of the workflow proposed by analyzing a set of foot-and-mouth disease viruses (FMDV) with different degrees of variability. This guide introduces the analysis of quasispecies of FMDV and other viruses through this kind of approach.
Affiliation(s)
- Marco Cacciabue
- Instituto de Agrobiotecnología y Biología Molecular (IABiMo, INTA-CONICET), Hurlingham, Argentina
- Departamento de Ciencias Básicas, Universidad Nacional de Luján, Luján, Argentina
| | - Anabella Currá
- Instituto de Agrobiotecnología y Biología Molecular (IABiMo, INTA-CONICET), Hurlingham, Argentina
- Departamento de Ciencias Básicas, Universidad Nacional de Luján, Luján, Argentina
| | - Elisa Carrillo
- Instituto de Agrobiotecnología y Biología Molecular (IABiMo, INTA-CONICET), Hurlingham, Argentina
| | - Guido König
- Instituto de Agrobiotecnología y Biología Molecular (IABiMo, INTA-CONICET), Hurlingham, Argentina
| | - María Inés Gismondi
- Instituto de Agrobiotecnología y Biología Molecular (IABiMo, INTA-CONICET), Hurlingham, Argentina
- Departamento de Ciencias Básicas, Universidad Nacional de Luján, Luján, Argentina
| |
|
10
|
Limited Practical Utility of Liquid Biopsy in the Treated Patients with Advanced Breast Cancer. Diagnostics (Basel) 2020; 10:diagnostics10080523. [PMID: 32731384 PMCID: PMC7460238 DOI: 10.3390/diagnostics10080523] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/15/2020] [Accepted: 07/24/2020] [Indexed: 12/16/2022] Open
Abstract
Recently, liquid biopsy has emerged as a tool to monitor oncologic disease progression and the effects of treatment. In this study we aimed to determine the clinical utility of liquid biopsy relative to conventional oncological post-treatment surveillance. Plasma cell-free (cf) DNA was collected from six healthy women and 37 patients with breast cancer (BC; 18 and 19 with stage III and IV tumors, respectively). cfDNA was assessed using the Oncomine Pan-Cancer Cell-Free Assay. In cfDNA samples from patients with BC, 1112 variants were identified, with only a few recurrent or hotspot mutations within specific regions of cancer genes. Of 65 potentially pathogenic variants detected in tumors, only 19 were also discovered in at least one blood sample. The variant allele frequencies (VAFs) of detected variants were <1% in cfDNA from all controls and patients with stage III BC, and 24/85 (28.2%) variants had VAFs > 1% in only 8 of 25 (32%) patients with stage IV BC. Copy number variations (CNVs) spanning CDK4, MET, FGFR1, FGFR2, ERBB2, MYC, and CCND3 were found in 1 of 12 (8%) and 8 of 25 (32%) patients with stage III and IV tumors, respectively. In healthy controls and patients without BC progression after treatment, VAFs were <1%, while in patients with metastatic disease and/or more advanced genomic alterations, VAFs > 1% and/or CNVs were detected in approximately 30%. Therefore, most patients with stage IV BC could not be distinguished from those with stage III disease following therapy, based on liquid biopsy results.
|
11
|
Gaye A, Sy M, Ndiaye T, Siddle KJ, Park DJ, Deme AB, Mbaye A, Dieye B, Ndiaye YD, Neafsey DE, Early A, Farrell T, Yade MS, Diallo MA, Diongue K, Bei A, Ndiaye IM, Volkman SK, Badiane AS, Ndiaye D. Amplicon deep sequencing of kelch13 in Plasmodium falciparum isolates from Senegal. Malar J 2020; 19:134. [PMID: 32228566 PMCID: PMC7106636 DOI: 10.1186/s12936-020-03193-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/20/2019] [Accepted: 03/20/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND In 2006, the Senegalese National Malaria Control Programme recommended artemisinin-based combination therapy (ACT) with artemether-lumefantrine as the first-line treatment for uncomplicated Plasmodium falciparum malaria. To date, multiple mutations associated with artemisinin delayed parasite clearance have been described in Southeast Asia in the Pfk13 gene, such as Y493H, R539T, I543T and C580Y. Even though ACT remains clinically and parasitologically efficacious in Senegal, the spread of resistance is possible, as shown by the earlier emergence of resistance to chloroquine in Southeast Asia that subsequently spread to Africa. Therefore, surveillance of artemisinin resistance in malaria-endemic regions is crucial and requires the implementation of sensitive tools, such as next-generation sequencing (NGS), which can detect novel mutations at low frequency. METHODS Here, an amplicon sequencing approach was used to identify mutations in the Pfk13 gene in eighty-one P. falciparum isolates collected from three different regions of Senegal. RESULTS In total, 10 SNPs around the propeller domain (one synonymous and nine non-synonymous) and two insertions were identified. Three of these SNPs (T478T, A578S and V637I) were located in the propeller domain. A578S is the most frequent mutation observed in Africa but has not previously been reported in Senegal. A previous study has suggested that A578S could disrupt the function of the Pfk13 propeller region. CONCLUSION As the genetic basis of possible artemisinin resistance may be distinct in Africa and Southeast Asia, further studies are necessary to assess the new SNPs reported in this study.
Affiliation(s)
- Amy Gaye
- Laboratory of Parasitology and Mycology, Aristide le Dantec Hospital, Cheikh Anta Diop University, Dakar, Senegal.
| | - Mouhamad Sy
- Laboratory of Parasitology and Mycology, Aristide le Dantec Hospital, Cheikh Anta Diop University, Dakar, Senegal
| | - Tolla Ndiaye
- Laboratory of Parasitology and Mycology, Aristide le Dantec Hospital, Cheikh Anta Diop University, Dakar, Senegal
| | | | - Daniel J Park
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Awa B Deme
- Laboratory of Parasitology and Mycology, Aristide le Dantec Hospital, Cheikh Anta Diop University, Dakar, Senegal
| | - Aminata Mbaye
- Laboratory of Parasitology and Mycology, Aristide le Dantec Hospital, Cheikh Anta Diop University, Dakar, Senegal
| | - Baba Dieye
- Laboratory of Parasitology and Mycology, Aristide le Dantec Hospital, Cheikh Anta Diop University, Dakar, Senegal
| | - Yaye Die Ndiaye
- Laboratory of Parasitology and Mycology, Aristide le Dantec Hospital, Cheikh Anta Diop University, Dakar, Senegal
| | - Daniel E Neafsey
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Angela Early
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Mamadou Samb Yade
- Laboratory of Parasitology and Mycology, Aristide le Dantec Hospital, Cheikh Anta Diop University, Dakar, Senegal
| | - Mamadou Alpha Diallo
- Laboratory of Parasitology and Mycology, Aristide le Dantec Hospital, Cheikh Anta Diop University, Dakar, Senegal
| | - Khadim Diongue
- Laboratory of Parasitology and Mycology, Aristide le Dantec Hospital, Cheikh Anta Diop University, Dakar, Senegal
| | - Amy Bei
- Laboratory of Parasitology and Mycology, Aristide le Dantec Hospital, Cheikh Anta Diop University, Dakar, Senegal
- Yale School of Public Health, Laboratory of Epidemiology and Public Health, 60 College Street, New Haven, CT, 06510, USA
| | - Ibrahima Mbaye Ndiaye
- Laboratory of Parasitology and Mycology, Aristide le Dantec Hospital, Cheikh Anta Diop University, Dakar, Senegal
| | - Sarah K Volkman
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Aida Sadikh Badiane
- Laboratory of Parasitology and Mycology, Aristide le Dantec Hospital, Cheikh Anta Diop University, Dakar, Senegal
| | - Daouda Ndiaye
- Laboratory of Parasitology and Mycology, Aristide le Dantec Hospital, Cheikh Anta Diop University, Dakar, Senegal
| |
|
12
|
Idowu AO, Oyibo WA, Bhattacharyya S, Khubbar M, Mendie UE, Bumah VV, Black C, Igietseme J, Azenabor AA. Rare mutations in Pfmdr1 gene of Plasmodium falciparum detected in clinical isolates from patients treated with anti-malarial drug in Nigeria. Malar J 2019; 18:319. [PMID: 31533729 PMCID: PMC6751857 DOI: 10.1186/s12936-019-2947-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 02/22/2019] [Accepted: 09/06/2019] [Indexed: 01/18/2023] Open
Abstract
Background Plasmodium falciparum, the deadliest causative agent of malaria, has high prevalence in Nigeria. Drug resistance causing failure of previously effective drugs has compromised anti-malarial treatment. On this basis, there is a need for proactive surveillance of resistance markers to the currently recommended artemisinin-based combination therapy (ACT), for early detection of resistance before it becomes widespread. Methods This study assessed polymorphisms in anti-malarial resistance genes in patients with uncomplicated P. falciparum malaria in Lagos, Nigeria. Sanger and Next Generation Sequencing (NGS) methods were used to screen for mutations in thirty-seven malaria-positive blood samples targeting the P. falciparum chloroquine-resistance transporter (Pfcrt), P. falciparum multidrug-resistance 1 (Pfmdr1), and P. falciparum kelch 13 (Pfk13) genes, which have been previously associated with anti-malarial resistance. Results As expected, the NGS method was more proficient, detecting six Pfmdr1, seven Pfcrt and three Pfk13 mutations in the studied clinical isolates from Nigeria, a malaria-endemic area. These mutations included rare Pfmdr1 mutations, N504K, N649D, F938Y and S967N, which were previously unreported. In addition, there was moderate prevalence of the K76T mutation (34.6%) associated with chloroquine and amodiaquine resistance, and high prevalence of the N86 wild-type allele (92.3%) associated with lumefantrine resistance. Conclusion Widespread circulation of mutations associated with resistance to current anti-malarial drugs could potentially limit effective malaria therapy in endemic populations.
Affiliation(s)
- Abel O Idowu
- Department of Biomedical Sciences, College of Health Sciences, University of Wisconsin, 2400 E. Hartford Avenue, Milwaukee, WI, 53211, USA
- Department of Pharmaceutics and Pharmaceutical Technology, Faculty of Pharmacy, University of Lagos, Lagos, Nigeria
| | - Wellington A Oyibo
- ANDI Centre of Excellence in Malaria Diagnosis, College of Medicine, University of Lagos, Lagos, Nigeria
| | | | - Manjeet Khubbar
- City of Milwaukee Health Department Laboratory, Milwaukee, USA
| | - Udoma E Mendie
- Department of Pharmaceutics and Pharmaceutical Technology, Faculty of Pharmacy, University of Lagos, Lagos, Nigeria
| | - Violet V Bumah
- Department of Biology, North Life Science 317, San Diego State University, San Diego, CA, 92182, USA
| | - Carolyn Black
- Molecular Pathogenesis Laboratory, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA, USA
| | - Joseph Igietseme
- Molecular Pathogenesis Laboratory, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA, USA
| | - Anthony A Azenabor
- Department of Biomedical Sciences, College of Health Sciences, University of Wisconsin, 2400 E. Hartford Avenue, Milwaukee, WI, 53211, USA
- Department of Pharmaceutics and Pharmaceutical Technology, Faculty of Pharmacy, University of Lagos, Lagos, Nigeria
| |
|
13
|
FERMI: A Novel Method for Sensitive Detection of Rare Mutations in Somatic Tissue. G3-GENES GENOMES GENETICS 2019; 9:2977-2987. [PMID: 31352405 PMCID: PMC6723130 DOI: 10.1534/g3.119.400438] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/20/2022]
Abstract
With growing interest in monitoring mutational processes in normal tissues, tumor heterogeneity, and cancer evolution under therapy, the ability to accurately and economically detect ultra-rare mutations is becoming increasingly important. However, this capability has often been compromised by significant sequencing, PCR and DNA preparation error rates. Here, we describe FERMI (Fast Extremely Rare Mutation Identification), a novel method designed to eliminate the majority of these sequencing and library-preparation errors in order to significantly improve rare somatic mutation detection. This method leverages barcoded targeting probes to capture and sequence DNA of interest with single-copy resolution. The variant calls from the barcoded sequencing data are then further filtered in a position-dependent fashion against an adaptive, context-aware null model in order to distinguish true variants. As a proof of principle, we employ FERMI to probe bone marrow biopsies from leukemia patients, and show that rare mutations and clonal evolution can be tracked throughout cancer treatment, including during historically intractable periods like minimal residual disease. Importantly, FERMI is able to accurately detect nascent clonal expansions within leukemias in a manner that may facilitate the early detection and characterization of cancer relapse.
|
14
|
Ghorbani A, Ngunjiri JM, Lee CW. Influenza A Virus Subpopulations and Their Implication in Pathogenesis and Vaccine Development. Annu Rev Anim Biosci 2019; 8:247-267. [PMID: 31479617 DOI: 10.1146/annurev-animal-021419-083756] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
The concept of influenza A virus (IAV) subpopulations emerged approximately 75 years ago, when Preben von Magnus described "incomplete" virus particles that interfere with the replication of infectious virus. It is now widely accepted that infectious particles constitute only a minor portion of biologically active IAV subpopulations. The IAV quasispecies is an extremely diverse swarm of biologically and genetically heterogeneous particle subpopulations that collectively influence the evolutionary fitness of the virus. This review summarizes the current knowledge of IAV subpopulations, focusing on their biologic and genomic diversity. It also discusses the potential roles IAV subpopulations play in virus pathogenesis and live attenuated influenza vaccine development.
Affiliation(s)
- Amir Ghorbani
- Food Animal Health Research Program, Ohio Agricultural Research and Development Center, The Ohio State University, Wooster, Ohio 44691, USA
- Department of Veterinary Preventive Medicine, College of Veterinary Medicine, The Ohio State University, Columbus, Ohio 43210, USA
| | - John M Ngunjiri
- Food Animal Health Research Program, Ohio Agricultural Research and Development Center, The Ohio State University, Wooster, Ohio 44691, USA
| | - Chang-Won Lee
- Food Animal Health Research Program, Ohio Agricultural Research and Development Center, The Ohio State University, Wooster, Ohio 44691, USA
- Department of Veterinary Preventive Medicine, College of Veterinary Medicine, The Ohio State University, Columbus, Ohio 43210, USA
| |
|
15
|
Sakhtemani R, Senevirathne V, Stewart J, Perera MLW, Pique-Regi R, Lawrence MS, Bhagwat AS. Genome-wide mapping of regions preferentially targeted by the human DNA-cytosine deaminase APOBEC3A using uracil-DNA pulldown and sequencing. J Biol Chem 2019; 294:15037-15051. [PMID: 31431505 DOI: 10.1074/jbc.ra119.008053] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 02/15/2019] [Revised: 08/13/2019] [Indexed: 12/16/2022] Open
Abstract
Activation-induced deaminase (AID) and apolipoprotein B mRNA-editing enzyme catalytic subunit (APOBEC) enzymes convert cytosines to uracils, creating signature mutations that have been used to predict sites targeted by these enzymes. Mutation-based targeting maps are distorted by the error-prone or error-free repair of these uracils and by selection pressures. To directly map uracils created by AID/APOBEC enzymes, here we used uracil-DNA glycosylase and an alkoxyamine to covalently tag and sequence uracil-containing DNA fragments (UPD-Seq). We applied this technique to the genome of repair-defective, APOBEC3A-expressing bacterial cells and created a uracilation genome map, i.e. uracilome. The peak uracilated regions were in the 5'-ends of genes and operons mainly containing tRNA genes and a few protein-coding genes. We validated these findings through deep sequencing of pulldown regions and whole-genome sequencing of independent clones. The peaks were not correlated with high transcription rates or stable RNA:DNA hybrid formation. We defined the uracilation index (UI) as the frequency of occurrence of TT in UPD-Seq reads at different original TC dinucleotides. Genome-wide UI calculation confirmed that APOBEC3A modifies cytosines in the lagging-strand template during replication and in short hairpin loops. APOBEC3A's preference for tRNA genes was observed previously in yeast, and an analysis of human tumor sequences revealed that in tumors with a high percentage of APOBEC3 signature mutations, the frequency of tRNA gene mutations was much higher than in the rest of the genome. These results identify multiple causes underlying selection of cytosines by APOBEC3A for deamination, and demonstrate the utility of UPD-Seq.
Affiliation(s)
- Ramin Sakhtemani
- Department of Chemistry, Wayne State University, Detroit, Michigan 48202
| | | | - Jessica Stewart
- Department of Chemistry, Wayne State University, Detroit, Michigan 48202
| | - Madusha L W Perera
- Department of Chemistry, Wayne State University, Detroit, Michigan 48202
| | - Roger Pique-Regi
- Center for Molecular Medicine and Genetics, Wayne State University, Wayne State University School of Medicine, Detroit, Michigan 48201
| | - Michael S Lawrence
- Department of Pathology and Cancer Center, Massachusetts General Hospital, Boston, Massachusetts 02114
| | - Ashok S Bhagwat
- Department of Chemistry, Wayne State University, Detroit, Michigan 48202
- Department of Biochemistry, Microbiology and Immunology, Wayne State University School of Medicine, Detroit, Michigan 48201
| |
|
16
|
Zhang C, Wang Y, Hu X, Qin L, Yin T, Fu W, Fan G, Zhang H, Liu G, Jiang Z, Zhang X, Li X. An Improved NGS Library Construction Approach Using DNA Isolated from Human Cancer Formalin-Fixed Paraffin-Embedded Samples. Anat Rec (Hoboken) 2019; 302:941-946. [PMID: 30365237 DOI: 10.1002/ar.24002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/24/2017] [Revised: 05/31/2018] [Accepted: 06/08/2018] [Indexed: 11/09/2022]
Abstract
Identification of genomic alterations from formalin-fixed paraffin-embedded (FFPE) samples using next-generation sequencing (NGS) is very important for cancer-targeted therapy today. To achieve a higher efficiency and shorter turn-around time for NGS library preparation, we compared NGS library preparation processes and outcomes across three commercial library construction methods and two hybridization capture methods, and thus developed an improved NGS library construction approach. This improved approach took advantage of both methods and resulted in a higher output from the same input DNA, including a higher library construction success rate, higher probe capture rate, and shorter turn-around time. Using this approach, targeted region libraries could be constructed within only 1 day for FFPE samples; therefore, this approach has potential for applying NGS in routine clinical tests. Anat Rec, 302:941-946, 2019. © 2018 Wiley Periodicals, Inc.
Affiliation(s)
- Chunze Zhang
- Department of Colorectal Surgery, Tianjin Union Medical Center, Tianjin, 300121, China
| | - Yijia Wang
- Department of Colorectal Surgery, Tianjin Union Medical Center, Tianjin, 300121, China
| | - Xia Hu
- Department of Agriculture Insect, Tianjin Institute of Plant Protection, Tianjin, 300381, China
| | - Litao Qin
- Medical Genetic Institute of Henan Province, Henan Provincial People's Hospital, People's Hospital of Zhengzhou University, Zhengzhou, 450003, Henan, China
| | - Tingting Yin
- Department of Colorectal Surgery, Tianjin Union Medical Center, Tianjin, 300121, China
| | - Wenzheng Fu
- Department of Colorectal Surgery, Tianjin Union Medical Center, Tianjin, 300121, China
| | - Guanwei Fan
- Institute of Traditional Chinese Medicine Research, Tianjin State Key Laboratory of Modern Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China
| | - Heng Zhang
- Department of Colorectal Surgery, Tianjin Union Medical Center, Tianjin, 300121, China
| | - Guang Liu
- Department of Colorectal Surgery, Tianjin Union Medical Center, Tianjin, 300121, China
| | - Zhi Jiang
- Department of Next-Generation Sequencing, Novogene Bioinformatics Institute, Beijing, 100083, China
| | - Xipeng Zhang
- Department of Colorectal Surgery, Tianjin Union Medical Center, Tianjin, 300121, China
| | - Xichuan Li
- Key Laboratory of Molecular and Cellular Systems Biology, Tianjin Normal University, Tianjin, 300387, China
| |
|
17
|
Aeschlimann SH, Graf C, Mayilo D, Lindecker H, Urda L, Kappes N, Burr AL, Simonis M, Splinter E, Min M, Laux H. Enhanced CHO Clone Screening: Application of Targeted Locus Amplification and Next‐Generation Sequencing Technologies for Cell Line Development. Biotechnol J 2019; 14:e1800371. [DOI: 10.1002/biot.201800371] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 06/29/2018] [Revised: 12/20/2018] [Indexed: 12/20/2022]
Affiliation(s)
- Samuel H. Aeschlimann
- Novartis Institutes for BioMedical Research, Integrated Biologics Profiling Unit, CH‐4002 Basel, Switzerland
| | - Christian Graf
- Novartis Technical R&D, Technical Development Biosimilars, Hexal AG, Keltenring 1+3, 82041 Oberhaching, Germany
| | - Dmytro Mayilo
- Novartis Institutes for BioMedical Research, Integrated Biologics Profiling Unit, CH‐4002 Basel, Switzerland
| | - Hélène Lindecker
- Novartis Institutes for BioMedical Research, Integrated Biologics Profiling Unit, CH‐4002 Basel, Switzerland
| | - Lorena Urda
- Novartis Institutes for BioMedical Research, Integrated Biologics Profiling Unit, CH‐4002 Basel, Switzerland
| | - Nora Kappes
- Novartis Institutes for BioMedical Research, Integrated Biologics Profiling Unit, CH‐4002 Basel, Switzerland
| | - Alicia Leone Burr
- Novartis Institutes for BioMedical Research, Integrated Biologics Profiling Unit, CH‐4002 Basel, Switzerland
| | | | - Erik Splinter
- Cergentis B.V., Yalelaan 62, 3584 CM Utrecht, The Netherlands
| | - Max Min
- Cergentis B.V., Yalelaan 62, 3584 CM Utrecht, The Netherlands
| | - Holger Laux
- Novartis Institutes for BioMedical Research, Integrated Biologics Profiling Unit, CH‐4002 Basel, Switzerland
| |
|
18
|
Filges S, Yamada E, Ståhlberg A, Godfrey TE. Impact of Polymerase Fidelity on Background Error Rates in Next-Generation Sequencing with Unique Molecular Identifiers/Barcodes. Sci Rep 2019; 9:3503. [PMID: 30837525 PMCID: PMC6401092 DOI: 10.1038/s41598-019-39762-6] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 11/29/2018] [Accepted: 01/31/2019] [Indexed: 12/27/2022] Open
Abstract
Liquid biopsy and detection of tumor-associated mutations in cell-free circulating DNA often require the ability to identify single nucleotide variants at allele frequencies below 0.1%. Standard sequencing protocols cannot achieve this level of sensitivity due to background noise from DNA damage and polymerase-induced errors. Addition of unique molecular identifiers allows identification and removal of errors responsible for this background noise. Theoretically, high-fidelity enzymes will also reduce error rates in barcoded NGS, but this has not been thoroughly explored. We evaluated the impact of polymerase fidelity on the magnitude of error reduction at different steps of barcoded NGS library construction. We find that barcoding itself has the largest impact on error reduction, even with low-fidelity polymerases. Use of high-fidelity polymerases in the barcoding step of library construction further suppresses error in barcoded NGS, and allows detection of variant alleles below 0.1% allele frequency. However, the improvement in error correction is modest and is not directly proportional to polymerase fidelity. Depending on the specific application, other polymerase characteristics such as multiplexing capacity, PCR efficiency, buffer requirements and ability to amplify targets with high GC content may outweigh the relatively small additional decrease in error afforded by ultra-high-fidelity polymerases.
Affiliation(s)
- Stefan Filges
- Department of Pathology and Genetics, Sahlgrenska Cancer Center, Institute of Biomedicine, Sahlgrenska Academy at University of Gothenburg, Medicinaregatan 1F, 405 30, Gothenburg, Sweden
| | - Emiko Yamada
- Department of Surgery, Boston University School of Medicine, 700 Albany Street, Boston, MA, 02118, USA
| | - Anders Ståhlberg
- Department of Pathology and Genetics, Sahlgrenska Cancer Center, Institute of Biomedicine, Sahlgrenska Academy at University of Gothenburg, Medicinaregatan 1F, 405 30, Gothenburg, Sweden
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
- Department of Clinical Pathology and Genetics, Sahlgrenska University Hospital, 413 45, Gothenburg, Sweden
| | - Tony E Godfrey
- Department of Surgery, Boston University School of Medicine, 700 Albany Street, Boston, MA, 02118, USA.
| |
|
19
|
The cornerstone of integrating circulating tumor DNA into cancer management. Biochim Biophys Acta Rev Cancer 2018; 1871:1-11. [PMID: 30419316 DOI: 10.1016/j.bbcan.2018.11.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 06/25/2018] [Revised: 08/23/2018] [Accepted: 11/07/2018] [Indexed: 12/26/2022]
Abstract
Recent circulating tumor DNA (ctDNA) research has demonstrated its potential as a non-invasive biomarker for cancer. However, the deployment of ctDNA assays in routine clinical practice remains challenging owing to variability in analytical approaches and the assessment of clinical significance. A well-developed, analytically valid ctDNA assay is a prerequisite for integrating ctDNA into cancer management, and an appropriate analytical technology is crucial for the development of a ctDNA assay. Other determinants including pre-analytical procedures, test validation, internal quality control (IQC), and continual proficiency testing (PT) are also important for the accuracy of ctDNA assays. In the present review, we will focus on the most widely used ctDNA detection technologies and the key quality management measures used to assure the accuracy of ctDNA assays. The aim of this review is to provide useful information for technology selection during ctDNA assay development and assure a reliable test result in clinical practice.
|
20
|
Abstract
Genetic mosaicism arises when a zygote harbors two or more distinct genotypes, typically due to de novo, somatic mutation during embryogenesis. The clinical manifestations largely depend on the differentiation status of the mutated cell; earlier mutations target pluripotent cells and generate more widespread disease affecting multiple organ systems. If gonadal tissue is spared, as in somatic genomic mosaicism, the mutation and its effects are limited to the proband, whereas mosaicism also affecting the gametes, such as germline or gonosomal mosaicism, is transmissible. Mosaicism is easily appreciated in cutaneous disorders, as phenotypically distinct mutant cells often give rise to lesions in patterns determined by the affected cell type. Genetic investigation of cutaneous mosaic disorders has identified pathways central to disease pathogenesis, revealing novel therapeutic targets. In this review, we discuss examples of cutaneous mosaicism, approaches to gene discovery in these disorders, and insights into molecular pathobiology that have potential for clinical translation.
Affiliation(s)
- Young H Lim
- Department of Dermatology, Yale University School of Medicine, New Haven, Connecticut 06520, USA
- Departments of Pathology and Genetics, Yale University School of Medicine, New Haven, Connecticut 06520, USA
| | - Zoe Moscato
- Department of Dermatology, Yale University School of Medicine, New Haven, Connecticut 06520, USA
| | - Keith A Choate
- Department of Dermatology, Yale University School of Medicine, New Haven, Connecticut 06520, USA
- Departments of Pathology and Genetics, Yale University School of Medicine, New Haven, Connecticut 06520, USA
| |
|
21
|
Sloan DB, Broz AK, Sharbrough J, Wu Z. Detecting Rare Mutations and DNA Damage with Sequencing-Based Methods. Trends Biotechnol 2018; 36:729-740. [PMID: 29550161 PMCID: PMC6004327 DOI: 10.1016/j.tibtech.2018.02.009] [Citation(s) in RCA: 56] [Impact Index Per Article: 8.0] [Received: 01/24/2018] [Revised: 02/16/2018] [Accepted: 02/20/2018] [Indexed: 12/18/2022]
Abstract
There is a great need in biomedical and genetic research to detect DNA damage and de novo mutations, but doing so is inherently challenging because of the rarity of these events. The enormous capacity of current DNA sequencing technologies has opened the door for quantifying sequence variants present at low frequencies in vivo, such as within cancerous tissues. However, these sequencing technologies are error prone, resulting in high noise thresholds. Most DNA sequencing methods are also generally incapable of identifying chemically modified bases arising from DNA damage. In recent years, numerous specialized modifications to sequencing methods have been developed to address these shortcomings. Here, we review this landscape of emerging techniques, highlighting their respective strengths, weaknesses, and target applications.
Affiliation(s)
- Daniel B Sloan
- Department of Biology, Colorado State University, Fort Collins, CO, USA.
| | - Amanda K Broz
- Department of Biology, Colorado State University, Fort Collins, CO, USA
| | - Joel Sharbrough
- Department of Biology, Colorado State University, Fort Collins, CO, USA
| | - Zhiqiang Wu
- Department of Biology, Colorado State University, Fort Collins, CO, USA
|
22
|
Borges V, Pinheiro M, Pechirra P, Guiomar R, Gomes JP. INSaFLU: an automated open web-based bioinformatics suite "from-reads" for influenza whole-genome-sequencing-based surveillance. Genome Med 2018; 10:46. [PMID: 29954441 PMCID: PMC6027769 DOI: 10.1186/s13073-018-0555-0] [Citation(s) in RCA: 57] [Impact Index Per Article: 8.1] [Received: 02/20/2018] [Accepted: 06/07/2018] [Indexed: 12/20/2022]
Abstract
BACKGROUND A new era of flu surveillance has already started based on the genetic characterization and exploration of influenza virus evolution at whole-genome scale. Although this has been prioritized by national and international health authorities, the demanded technological transition to whole-genome sequencing (WGS)-based flu surveillance has been particularly delayed by the lack of bioinformatics infrastructures and/or expertise to deal with primary next-generation sequencing (NGS) data. RESULTS We developed and implemented INSaFLU ("INSide the FLU"), which is the first influenza-oriented bioinformatics free web-based suite that deals with primary NGS data (reads) towards the automatic generation of the output data that are actually the core first-line "genetic requests" for effective and timely influenza laboratory surveillance (e.g., type and sub-type, gene and whole-genome consensus sequences, variants' annotation, alignments and phylogenetic trees). By handling NGS data collected from any amplicon-based schema, the implemented pipeline enables any laboratory to perform multi-step software intensive analyses in a user-friendly manner without previous advanced training in bioinformatics. INSaFLU gives access to user-restricted sample databases and projects management, being a transparent and flexible tool specifically designed to automatically update project outputs as more samples are uploaded. Data integration is thus cumulative and scalable, fitting the need for a continuous epidemiological surveillance during the flu epidemics. Multiple outputs are provided in nomenclature-stable and standardized formats that can be explored in situ or through multiple compatible downstream applications for fine-tuned data analysis. 
This platform additionally flags samples as "putative mixed infections" if the population admixture enrolls influenza viruses with clearly distinct genetic backgrounds, and enriches the traditional "consensus-based" influenza genetic characterization with relevant data on influenza sub-population diversification through a depth analysis of intra-patient minor variants. This dual approach is expected to strengthen our ability not only to detect the emergence of antigenic and drug resistance variants but also to decode alternative pathways of influenza evolution and to unveil intricate routes of transmission. CONCLUSIONS In summary, INSaFLU supplies public health laboratories and influenza researchers with an open "one size fits all" framework, potentiating the operationalization of a harmonized multi-country WGS-based surveillance for influenza virus. INSaFLU can be accessed through https://insaflu.insa.pt.
Affiliation(s)
- Vítor Borges
- Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health, Av. Padre Cruz, 1649-016 Lisbon, Portugal
| | - Miguel Pinheiro
- Institute of Biomedicine—iBiMED, Department of Medical Sciences, University of Aveiro, 3810-193 Aveiro, Portugal
| | - Pedro Pechirra
- National Reference Laboratory for Influenza and other Respiratory Viruses, Department of Infectious Diseases, National Institute of Health, 1649-016 Lisbon, Portugal
| | - Raquel Guiomar
- National Reference Laboratory for Influenza and other Respiratory Viruses, Department of Infectious Diseases, National Institute of Health, 1649-016 Lisbon, Portugal
| | - João Paulo Gomes
- Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health, Av. Padre Cruz, 1649-016 Lisbon, Portugal
|
23
|
Aggeli D, Karas VO, Sinnott-Armstrong NA, Varghese V, Shafer RW, Greenleaf WJ, Sherlock G. Diff-seq: A high throughput sequencing-based mismatch detection assay for DNA variant enrichment and discovery. Nucleic Acids Res 2018; 46:e42. [PMID: 29361139 PMCID: PMC5909455 DOI: 10.1093/nar/gky022] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 10/09/2017] [Revised: 12/15/2017] [Accepted: 01/16/2018] [Indexed: 01/15/2023]
Abstract
Much of the within-species genetic variation is in the form of single nucleotide polymorphisms (SNPs), typically detected by whole genome sequencing (WGS) or microarray-based technologies. However, WGS produces mostly uninformative reads that perfectly match the reference, while microarrays require genome-specific reagents. We have developed Diff-seq, a sequencing-based mismatch detection assay for SNP discovery without the requirement for specialized nucleic-acid reagents. Diff-seq leverages the Surveyor endonuclease to cleave mismatched DNA molecules that are generated after cross-annealing of a complex pool of DNA fragments. Sequencing libraries enriched for Surveyor-cleaved molecules result in increased coverage at the variant sites. Diff-seq detected all mismatches present in an initial test substrate, with specific enrichment dependent on the identity and context of the variation. Application to viral sequences resulted in increased observation of variant alleles in a biologically relevant context. Diff-seq has the potential to increase the sensitivity and efficiency of high-throughput sequencing in the detection of variation.
Affiliation(s)
- Dimitra Aggeli
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Vlad O Karas
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | | | - Vici Varghese
- Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Robert W Shafer
- Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - William J Greenleaf
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Department of Applied Physics, Stanford University, Stanford, CA 94305, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - Gavin Sherlock
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
|
24
|
Kamps-Hughes N, McUsic A, Kurihara L, Harkins TT, Pal P, Ray C, Ionescu-Zanetti C. ERASE-Seq: Leveraging replicate measurements to enhance ultralow frequency variant detection in NGS data. PLoS One 2018; 13:e0195272. [PMID: 29630678 PMCID: PMC5890993 DOI: 10.1371/journal.pone.0195272] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 09/07/2017] [Accepted: 03/19/2018] [Indexed: 12/30/2022]
Abstract
The accurate detection of ultralow allele frequency variants in DNA samples is of interest in both research and medical settings, particularly in liquid biopsies where cancer mutational status is monitored from circulating DNA. Next-generation sequencing (NGS) technologies employing molecular barcoding have shown promise but significant sensitivity and specificity improvements are still needed to detect mutations in a majority of patients before the metastatic stage. To address this we present analytical validation data for ERASE-Seq (Elimination of Recurrent Artifacts and Stochastic Errors), a method for accurate and sensitive detection of ultralow frequency DNA variants in NGS data. ERASE-Seq differs from previous methods by creating a robust statistical framework to utilize technical replicates in conjunction with background error modeling, providing a 10 to 100-fold reduction in false positive rates compared to published molecular barcoding methods. ERASE-Seq was tested using spiked human DNA mixtures with clinically realistic DNA input quantities to detect SNVs and indels between 0.05% and 1% allele frequency, the range commonly found in liquid biopsy samples. Variants were detected with greater than 90% sensitivity and a false positive rate below 0.1 calls per 10,000 possible variants. The approach represents a significant performance improvement compared to molecular barcoding methods and does not require changing molecular reagents.
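The general idea in the abstract above, combining technical replicates with a background error model, can be illustrated with a small sketch. This is not the published ERASE-Seq statistical framework: the Poisson error model, the function names `poisson_tail` and `call_variant`, the `background_rate` parameter, and the `alpha` threshold are all assumptions chosen for illustration.

```python
import math


def poisson_tail(k: int, lam: float) -> float:
    """Return P(X >= k) for X ~ Poisson(lam); intended for small expected counts."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i  # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def call_variant(replicates, background_rate, alpha=1e-3):
    """Call a variant only if every replicate exceeds the background error model.

    replicates: list of (alt_reads, depth) tuples, one per technical replicate.
    background_rate: assumed per-base error rate at this site (hypothetical input).
    """
    if not replicates:
        return False
    for alt_reads, depth in replicates:
        expected_errors = depth * background_rate
        # Fail the call if this replicate is explainable by background error.
        if poisson_tail(alt_reads, expected_errors) >= alpha:
            return False
    return True
```

Requiring agreement across replicates is what suppresses stochastic errors in this sketch: an artifact that reaches a high read count in one replicate is unlikely to do so independently in the others.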
Affiliation(s)
- Nick Kamps-Hughes
- Fluxion Biosciences Inc., South San Francisco, California, United States of America
| | - Andrew McUsic
- Swift Biosciences Inc., Ann Arbor, Michigan, United States of America
| | - Laurie Kurihara
- Swift Biosciences Inc., Ann Arbor, Michigan, United States of America
| | - Timothy T Harkins
- Swift Biosciences Inc., Ann Arbor, Michigan, United States of America
| | - Prithwish Pal
- Illumina Inc., San Diego, California, United States of America
| | - Claire Ray
- Illumina Inc., San Diego, California, United States of America
|
25
|
Cartwright JF, Anderson K, Longworth J, Lobb P, James DC. Highly sensitive detection of mutations in CHO cell recombinant DNA using multi-parallel single molecule real-time DNA sequencing. Biotechnol Bioeng 2018; 115:1485-1498. [DOI: 10.1002/bit.26561] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 08/03/2017] [Revised: 12/01/2017] [Accepted: 02/04/2018] [Indexed: 12/13/2022]
Affiliation(s)
- Joseph F. Cartwright
- Department of Chemical and Biological Engineering; University of Sheffield; Sheffield UK
| | - Karin Anderson
- Cell Line Development; BioTherapeutic Pharmaceutical Sciences; Pfizer Inc; Andover Massachusetts
| | - Joseph Longworth
- Department of Chemical and Biological Engineering; University of Sheffield; Sheffield UK
| | | | - David C. James
- Department of Chemical and Biological Engineering; University of Sheffield; Sheffield UK
|
26
|
Germini D, Tsfasman T, Zakharova VV, Sjakste N, Lipinski M, Vassetzky Y. A Comparison of Techniques to Evaluate the Effectiveness of Genome Editing. Trends Biotechnol 2018; 36:147-159. [PMID: 29157536 DOI: 10.1016/j.tibtech.2017.10.008] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 06/14/2017] [Revised: 10/17/2017] [Accepted: 10/18/2017] [Indexed: 12/21/2022]
Abstract
Genome editing using engineered nucleases (meganucleases, zinc finger nucleases, transcription activator-like effector nucleases) has created many recent breakthroughs. Prescreening for efficiency and specificity is a critical step prior to using any newly designed genome editing tool for experimental purposes. The current standard screening methods of evaluation are based on DNA sequencing or use mismatch-sensitive endonucleases. They can be time-consuming and costly or lack reproducibility. Here, we review and critically compare standard techniques with those more recently developed in terms of reliability, time, cost, and ease of use.
Affiliation(s)
- Diego Germini
- UMR 8126, Université Paris Sud - Paris Saclay, CNRS, Institut Gustave Roussy, 94805 Villejuif, France; LIA 1066, French-Russian Joint Cancer Research Laboratory, 94805 Villejuif, France; The first two authors contributed equally to this work
| | - Tatiana Tsfasman
- UMR 8126, Université Paris Sud - Paris Saclay, CNRS, Institut Gustave Roussy, 94805 Villejuif, France; LIA 1066, French-Russian Joint Cancer Research Laboratory, 94805 Villejuif, France; The first two authors contributed equally to this work
| | - Vlada V Zakharova
- UMR 8126, Université Paris Sud - Paris Saclay, CNRS, Institut Gustave Roussy, 94805 Villejuif, France; LIA 1066, French-Russian Joint Cancer Research Laboratory, 94805 Villejuif, France
| | | | - Marc Lipinski
- UMR 8126, Université Paris Sud - Paris Saclay, CNRS, Institut Gustave Roussy, 94805 Villejuif, France; LIA 1066, French-Russian Joint Cancer Research Laboratory, 94805 Villejuif, France
| | - Yegor Vassetzky
- UMR 8126, Université Paris Sud - Paris Saclay, CNRS, Institut Gustave Roussy, 94805 Villejuif, France; LIA 1066, French-Russian Joint Cancer Research Laboratory, 94805 Villejuif, France; Koltzov Institute of Developmental Biology, Russian Academy of Sciences, Moscow, Russia.
|
27
|
Cui M, Xiao X, Zhao M, Zheng B. Detection of single nucleotide polymorphism by measuring extension kinetics with T7 exonuclease mediated isothermal amplification. Analyst 2018; 143:116-122. [DOI: 10.1039/c7an00875a] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 12/31/2022]
Abstract
Kinetics based detection of single nucleotide polymorphism at room temperature with high sensitivity and specificity.
Affiliation(s)
- Miao Cui
- Department of Chemistry
- Centre of Novel Biomaterials
- The Chinese University of Hong Kong
- Shatin
- China
| | - Xianjin Xiao
- Family Planning Research Institute/Center of Reproductive Medicine
- Tongji Medical College
- Huazhong University of Science and Technology
- Wuhan
- China
| | - Meiping Zhao
- Beijing National Laboratory for Molecular Sciences
- MOE Key Laboratory of Bioorganic Chemistry and Molecular Engineering
- College of Chemistry and Molecular Engineering
- Peking University
- Beijing 100871
| | - Bo Zheng
- Department of Chemistry
- Centre of Novel Biomaterials
- The Chinese University of Hong Kong
- Shatin
- China
|
28
|
Song L, Huang W, Kang J, Huang Y, Ren H, Ding K. Comparison of error correction algorithms for Ion Torrent PGM data: application to hepatitis B virus. Sci Rep 2017; 7:8106. [PMID: 28808243 PMCID: PMC5556038 DOI: 10.1038/s41598-017-08139-y] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 01/05/2017] [Accepted: 07/05/2017] [Indexed: 01/26/2023]
Abstract
Ion Torrent Personal Genome Machine (PGM) technology is a mid-length read, low-cost and high-speed next-generation sequencing platform with a relatively high insertion and deletion (indel) error rate. A full systematic assessment of the effectiveness of various error correction algorithms in PGM viral datasets (e.g., hepatitis B virus (HBV)) has not been performed. We examined 19 quality-trimmed PGM datasets for the HBV reverse transcriptase (RT) region and found a total error rate of 0.48% ± 0.12%. Deletion errors were clearly present at the ends of homopolymer runs. Tests using both real and simulated data showed that the algorithms differed in their abilities to detect and correct errors and that the error rate and sequencing depth significantly affected the performance. Of the algorithms tested, Pollux showed a better overall performance but tended to over-correct 'genuine' substitution variants, whereas Fiona proved to be better at distinguishing these variants from sequencing errors. We found that the combined use of Pollux and Fiona gave the best results when error-correcting Ion Torrent PGM viral data.
Affiliation(s)
- Liting Song
- Key Laboratory of Molecular Biology for Infectious Diseases (Ministry of Education), Institute for Viral Hepatitis, Department of Infectious Diseases, The Second Affiliated Hospital, Chongqing Medical University, Chongqing, 400010, P.R. China
| | - Wenxun Huang
- Key Laboratory of Molecular Biology for Infectious Diseases (Ministry of Education), Institute for Viral Hepatitis, Department of Infectious Diseases, The Second Affiliated Hospital, Chongqing Medical University, Chongqing, 400010, P.R. China
| | - Juan Kang
- Key Laboratory of Molecular Biology for Infectious Diseases (Ministry of Education), Institute for Viral Hepatitis, Department of Infectious Diseases, The Second Affiliated Hospital, Chongqing Medical University, Chongqing, 400010, P.R. China
| | - Yuan Huang
- Center for Hepatobillary and Pancreatic Diseases, Beijing Tsinghua Changgung Hospital, Medical Center, Tsinghua University, Beijing, 100044, P.R. China
| | - Hong Ren
- Key Laboratory of Molecular Biology for Infectious Diseases (Ministry of Education), Institute for Viral Hepatitis, Department of Infectious Diseases, The Second Affiliated Hospital, Chongqing Medical University, Chongqing, 400010, P.R. China
| | - Keyue Ding
- Key Laboratory of Molecular Biology for Infectious Diseases (Ministry of Education), Institute for Viral Hepatitis, Department of Infectious Diseases, The Second Affiliated Hospital, Chongqing Medical University, Chongqing, 400010, P.R. China.
|
29
|
The Number of Target Molecules of the Amplification Step Limits Accuracy and Sensitivity in Ultradeep-Sequencing Viral Population Studies. J Virol 2017; 91:JVI.00561-17. [PMID: 28566384 DOI: 10.1128/jvi.00561-17] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 04/04/2017] [Accepted: 05/25/2017] [Indexed: 11/20/2022]
Abstract
The invention of next-generation sequencing (NGS) techniques marked the coming of a new era in the detection of the genetic diversity of intrahost viral populations. A good understanding of the genetic structure of these populations requires, first, the ability to identify the different isolates or variants and, second, the ability to accurately quantify them. However, the initial amplification step of NGS studies can impose potential quantitative biases, modifying the variant relative frequencies. In particular, the number of target molecules (NTM) used during the amplification step is vastly overlooked although of primary importance, as it sets the limit of the accuracy and sensitivity of the sequencing procedure. In the present article, we investigated quantitative biases in an NGS study of populations of a multipartite single-stranded DNA (ssDNA) virus at different steps of the procedure. We studied 20 independent populations of the ssDNA virus faba bean necrotic stunt virus (FBNSV) in two host plants, Vicia faba and Medicago truncatula. FBNSV is a multipartite virus composed of eight genomic segments, whose specific and host-dependent relative frequencies are defined as the "genome formula." Our results show a significant distortion of the FBNSV genome formula after the amplification and sequencing steps. We also quantified the genetic bottleneck occurring at the amplification step by documenting the NTM of two genomic segments of FBNSV. We argue that the NTM must be documented and carefully considered when determining the sensitivity and accuracy of data from NGS studies. IMPORTANCE The advent of next-generation sequencing (NGS) techniques now enables study of the genetic diversity of viral populations. A good understanding of the genetic structure of these populations first requires the ability to identify the different isolates or variants and second requires the ability to accurately quantify them.
Prior to sequencing, viral genomes need to be amplified, a step that potentially imposes quantitative biases and modifies the viral population structure. In particular, the number of target molecules (NTM) used during the amplification step is of primary importance, as it sets the limit of the accuracy and sensitivity of the sequencing procedure. In this work, we used 20 replicated populations of the multipartite faba bean necrotic stunt virus (FBNSV) to estimate the various limitations of ultradeep-sequencing studies performed on intrahost viral populations. We report quantitative biases during rolling-circle amplification and the NTM of two genomic segments of FBNSV.
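The claim that the NTM sets the limit of sensitivity can be made concrete with a simple sampling calculation. The sketch below is not taken from the article; it assumes each target molecule is drawn independently from a large population, so a variant at frequency f is absent from all N molecules with probability (1 - f)^N. The function names are hypothetical.

```python
import math


def detection_probability(variant_frequency: float, n_target_molecules: int) -> float:
    """Probability that at least one of N sampled molecules carries the variant."""
    return 1.0 - (1.0 - variant_frequency) ** n_target_molecules


def molecules_needed(variant_frequency: float, confidence: float = 0.95) -> int:
    """Smallest N for which the variant is sampled with the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - variant_frequency))
```

Under these assumptions, sequencing depth cannot compensate for a small NTM: a variant that was never sampled into the amplification reaction cannot be recovered, however many reads are generated afterwards.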
|
30
|
Ng S, Gisonni-Lex L, Azizi A. New approaches for characterization of the genetic stability of vaccine cell lines. Hum Vaccin Immunother 2017; 13:1669-1672. [PMID: 28333573 PMCID: PMC5512780 DOI: 10.1080/21645515.2017.1295191] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 12/15/2016] [Revised: 01/26/2017] [Accepted: 02/10/2017] [Indexed: 10/20/2022]
Abstract
The genetic stability of cell lines is a critical analytical attribute required to demonstrate the quality of cells over time. During cell passage, mutations can arise in the genomic DNA, potentially leading to changes in the final vaccine product. The identity and integrity of master cell banks, extended cell banks, complementing cell lines or recombinant cell lines expressing transgenes has to be tested throughout the production process by the vaccine manufacturer. Over the past few years, the traditional methods for evaluation of genetic stability have been replaced with molecular approaches including quantitative PCR, digital PCR and high throughput sequencing. However, these molecular-based approaches are used in research laboratories and not within a GMP-compliant environment. In this article, we briefly discuss some opportunities and challenges in characterization of the genetic stability of vaccine cell lines with these molecular-based approaches.
Affiliation(s)
- Siemon Ng
- Microbiology & Virology Platform, Department of Analytical Research & Development North America, Sanofi Pasteur, Toronto, Ontario, Canada
| | - Lucy Gisonni-Lex
- Microbiology & Virology Platform, Department of Analytical Research & Development North America, Sanofi Pasteur, Toronto, Ontario, Canada
| | - Ali Azizi
- Microbiology & Virology Platform, Department of Analytical Research & Development North America, Sanofi Pasteur, Toronto, Ontario, Canada
|
31
|
Wang K, Lai S, Yang X, Zhu T, Lu X, Wu CI, Ruan J. Ultrasensitive and high-efficiency screen of de novo low-frequency mutations by o2n-seq. Nat Commun 2017; 8:15335. [PMID: 28530222 PMCID: PMC5458117 DOI: 10.1038/ncomms15335] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 09/14/2016] [Accepted: 03/21/2017] [Indexed: 12/15/2022]
Abstract
Detection of de novo, low-frequency mutations is essential for characterizing cancer genomes and heterogeneous cell populations. However, the screening capacity of current ultrasensitive NGS methods is inadequate owing to either low-efficiency read utilization or severe amplification bias. Here, we present o2n-seq, an ultrasensitive and high-efficiency NGS library preparation method for discovering de novo, low-frequency mutations. O2n-seq reduces the error rate of NGS to between 10⁻⁵ and 10⁻⁸. The efficiency of its data usage is about 10-30 times higher than that of barcode-based strategies. For detecting mutations with an allele frequency (AF) of 1% in a 4.6 Mb-sized genome, the sensitivity and specificity of o2n-seq reach 99% and 98.64%, respectively. For mutations with AF around 0.07% in phiX174, o2n-seq detects all the mutations with 100% specificity. Moreover, we successfully apply o2n-seq to screen de novo, low-frequency mutations in human tumours. O2n-seq will help characterize the landscape of somatic mutations in research and clinical settings.
Affiliation(s)
- Kaile Wang
- Agricultural Genomics Institute, Chinese Academy of Agricultural Sciences, Pengfei Road No. 7, Dapeng New District, Shenzhen, Guangdong 518120, China
- Key Laboratory of Genomics and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Chaoyang, Beijing 100101, China
- University of Chinese Academy of Sciences, Shijingshan, Beijing 100049, China
| | - Shujuan Lai
- Key Laboratory of Genomics and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Chaoyang, Beijing 100101, China
| | - Xiaoxu Yang
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Haidian, Beijing 100871, China
| | - Tianqi Zhu
- Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Haidian, Beijing 100190, China
- Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
| | - Xuemei Lu
- Key Laboratory of Genomics and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Chaoyang, Beijing 100101, China
| | - Chung-I Wu
- Key Laboratory of Genomics and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Chaoyang, Beijing 100101, China
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong 510275, China
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637, USA
| | - Jue Ruan
- Agricultural Genomics Institute, Chinese Academy of Agricultural Sciences, Pengfei Road No. 7, Dapeng New District, Shenzhen, Guangdong 518120, China
|
32
|
Detection of Emerging Vaccine-Related Polioviruses by Deep Sequencing. J Clin Microbiol 2017; 55:2162-2171. [PMID: 28468861 PMCID: PMC5483918 DOI: 10.1128/jcm.00144-17] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 01/26/2017] [Accepted: 04/19/2017] [Indexed: 12/13/2022]
Abstract
Oral poliovirus vaccine can mutate to regain neurovirulence. To date, evaluation of these mutations has been performed primarily on culture-enriched isolates by using conventional Sanger sequencing. We therefore developed a culture-independent, deep-sequencing method targeting the 5′ untranslated region (UTR) and P1 genomic region to characterize vaccine-related poliovirus variants. Error analysis of the deep-sequencing method demonstrated reliable detection of poliovirus mutations at levels of <1%, depending on read depth. Sequencing of viral nucleic acids from the stool of vaccinated, asymptomatic children and their close contacts collected during a prospective cohort study in Veracruz, Mexico, revealed no vaccine-derived polioviruses. This was expected given that the longest duration between sequenced sample collection and the end of the most recent national immunization week was 66 days. However, we identified many low-level variants (<5%) distributed across the 5′ UTR and P1 genomic region in all three Sabin serotypes, as well as vaccine-related viruses with multiple canonical mutations associated with phenotypic reversion present at high levels (>90%). These results suggest that monitoring emerging vaccine-related poliovirus variants by deep sequencing may aid in the poliovirus endgame and efforts to ensure global polio eradication.
|
33
|
Simple multiplexed PCR-based barcoding of DNA for ultrasensitive mutation detection by next-generation sequencing. Nat Protoc 2017; 12:664-682. [PMID: 28253235 DOI: 10.1038/nprot.2017.006] [Citation(s) in RCA: 99] [Impact Index Per Article: 12.4] [Indexed: 12/18/2022]
Abstract
Detection of extremely rare variant alleles within a complex mixture of DNA molecules is becoming increasingly relevant in many areas of clinical and basic research, such as the detection of circulating tumor DNA in the plasma of cancer patients. Barcoding of DNA template molecules early in next-generation sequencing (NGS) library construction provides a way to identify and bioinformatically remove polymerase errors that otherwise make detection of these rare variants very difficult. Several barcoding strategies have been reported, but all require long and complex library preparation protocols. Simple, multiplexed, PCR-based barcoding of DNA for sensitive mutation detection using sequencing (SiMSen-seq) was developed to generate targeted barcoded libraries with minimal DNA input, flexible target selection and a very simple, short (∼4 h) library construction protocol. The protocol comprises a three-cycle barcoding PCR step followed directly by adaptor PCR to generate the library and then bead purification before sequencing. Thus, SiMSen-seq allows detection of variant alleles at <0.1% frequency with easy customization of library content (from 1 to 40+ PCR amplicons) and a protocol that can be implemented in any molecular biology laboratory. Here, we provide a detailed protocol for assay development and describe software to process the barcoded sequence reads.
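The way barcoding lets polymerase errors be removed bioinformatically can be sketched as follows. This is a generic illustration of barcode-family consensus calling, not the software described in the protocol; the function name `barcode_consensus`, the thresholds `min_family_size` and `min_agreement`, and the input format are assumptions.

```python
from collections import Counter, defaultdict


def barcode_consensus(reads, min_family_size=3, min_agreement=0.9):
    """Collapse reads sharing a barcode into one consensus sequence per family.

    reads: iterable of (barcode, sequence) pairs.
    Families smaller than min_family_size are discarded. Positions where the
    most common base falls below min_agreement are reported as 'N'.
    """
    families = defaultdict(list)
    for barcode, sequence in reads:
        families[barcode].append(sequence)

    consensus = {}
    for barcode, sequences in families.items():
        if len(sequences) < min_family_size:
            continue  # too few reads to separate errors from true bases
        length = min(len(s) for s in sequences)
        bases = []
        for pos in range(length):
            base, count = Counter(s[pos] for s in sequences).most_common(1)[0]
            bases.append(base if count / len(sequences) >= min_agreement else "N")
        consensus[barcode] = "".join(bases)
    return consensus
```

Because all reads in a family derive from one template molecule, a base seen in only a minority of them is more likely a polymerase or sequencing error than a true variant, which is why it is masked or outvoted here.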
|
34
|
Molecular Epidemiology of Plasmodium falciparum kelch13 Mutations in Senegal Determined by Using Targeted Amplicon Deep Sequencing. Antimicrob Agents Chemother 2017; 61:AAC.02116-16. [PMID: 28069653 DOI: 10.1128/aac.02116-16] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 09/30/2016] [Accepted: 12/27/2016] [Indexed: 12/19/2022]
Abstract
The emergence of Plasmodium falciparum resistance to artemisinin in Southeast Asia threatens malaria control and elimination activities worldwide. Multiple polymorphisms in the P. falciparum kelch gene found in chromosome 13 (Pfk13) have been associated with artemisinin resistance. Surveillance of potential drug resistance loci within a population that may emerge under increasing drug pressure is an important public health activity. In this context, P. falciparum infections from an observational surveillance study in Senegal were genotyped using targeted amplicon deep sequencing (TADS) for Pfk13 polymorphisms. The results were compared to previously reported Pfk13 polymorphisms from around the world. A total of 22 Pfk13 propeller domain polymorphisms were identified in this study, of which 12 have previously not been reported. Interestingly, of the 10 polymorphisms identified in the present study that were also previously reported, all had a different amino acid substitution at these codon positions. Most of the polymorphisms were present at low frequencies and were confined to single isolates, suggesting they are likely transient polymorphisms that are part of naturally evolving parasite populations. The results of this study underscore the need to identify potential drug resistance loci existing within a population, which may emerge under increasing drug pressure.
|
35
|
Kugelman JR, Wiley MR, Nagle ER, Reyes D, Pfeffer BP, Kuhn JH, Sanchez-Lockhart M, Palacios GF. Error baseline rates of five sample preparation methods used to characterize RNA virus populations. PLoS One 2017; 12:e0171333. [PMID: 28182717 PMCID: PMC5300104 DOI: 10.1371/journal.pone.0171333] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 08/05/2016] [Accepted: 01/18/2017] [Indexed: 11/19/2022]
Abstract
Individual RNA viruses typically occur as populations of genomes that differ slightly from each other due to mutations introduced by the error-prone viral polymerase. Understanding the variability of RNA virus genome populations is critical for understanding virus evolution because individual mutant genomes may gain evolutionary selective advantages and give rise to dominant subpopulations, possibly even leading to the emergence of viruses resistant to medical countermeasures. Reverse transcription of virus genome populations followed by next-generation sequencing is the only available method to characterize variation for RNA viruses. However, both steps may lead to the introduction of artificial mutations, thereby skewing the data. To better understand how such errors are introduced during sample preparation, we determined and compared error baseline rates of five different sample preparation methods by analyzing in vitro transcribed Ebola virus RNA from an artificial plasmid-based system. These methods included: shotgun sequencing from plasmid DNA or in vitro transcribed RNA as a basic “no amplification” method, amplicon sequencing from the plasmid DNA or in vitro transcribed RNA as a “targeted” amplification method, sequence-independent single-primer amplification (SISPA) as a “random” amplification method, rolling circle reverse transcription sequencing (CirSeq) as an advanced “no amplification” method, and Illumina TruSeq RNA Access as a “targeted” enrichment method. The measured error frequencies indicate that RNA Access offers the best tradeoff between sensitivity and sample preparation error (1.4 × 10⁻⁵) of all compared methods.
Affiliation(s)
- Jeffrey R. Kugelman
- Center for Genome Sciences, United States Army Medical Research Institute of Infectious Diseases (USAMRIID), Fort Detrick, Frederick, Maryland, United States of America
| | - Michael R. Wiley
- Center for Genome Sciences, United States Army Medical Research Institute of Infectious Diseases (USAMRIID), Fort Detrick, Frederick, Maryland, United States of America
| | - Elyse R. Nagle
- Center for Genome Sciences, United States Army Medical Research Institute of Infectious Diseases (USAMRIID), Fort Detrick, Frederick, Maryland, United States of America
| | - Daniel Reyes
- Center for Genome Sciences, United States Army Medical Research Institute of Infectious Diseases (USAMRIID), Fort Detrick, Frederick, Maryland, United States of America
| | - Brad P. Pfeffer
- Center for Genome Sciences, United States Army Medical Research Institute of Infectious Diseases (USAMRIID), Fort Detrick, Frederick, Maryland, United States of America
| | - Jens H. Kuhn
- Integrated Research Facility at Fort Detrick (IRF-Frederick), National Institute of Allergy and Infectious Diseases, National Institutes of Health, Fort Detrick, Frederick, Maryland, United States of America
| | - Mariano Sanchez-Lockhart
- Center for Genome Sciences, United States Army Medical Research Institute of Infectious Diseases (USAMRIID), Fort Detrick, Frederick, Maryland, United States of America
| | - Gustavo F. Palacios
- Center for Genome Sciences, United States Army Medical Research Institute of Infectious Diseases (USAMRIID), Fort Detrick, Frederick, Maryland, United States of America
- * E-mail:
|
36
|
Variational inference for rare variant detection in deep, heterogeneous next-generation sequencing data. BMC Bioinformatics 2017; 18:45. [PMID: 28103803 PMCID: PMC5244592 DOI: 10.1186/s12859-016-1451-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/17/2016] [Accepted: 12/22/2016] [Indexed: 01/09/2023] Open
Abstract
Background The detection of rare single nucleotide variants (SNVs) is important for understanding genetic heterogeneity using next-generation sequencing (NGS) data. Various computational algorithms have been proposed to detect variants at the single nucleotide level in mixed samples. Yet, the noise inherent in the biological processes involved in NGS technology necessitates the development of statistically accurate methods to identify true rare variants. Results We propose a Bayesian statistical model and a variational expectation maximization (EM) algorithm to estimate non-reference allele frequency (NRAF) and identify SNVs in heterogeneous cell populations. We demonstrate that our variational EM algorithm has sensitivity and specificity comparable to those of a Markov Chain Monte Carlo (MCMC) sampling inference algorithm, and is more computationally efficient on tests of relatively low coverage (27× and 298×) data. Furthermore, we show that our model with a variational EM inference algorithm has higher specificity than many state-of-the-art algorithms. In an analysis of a directed evolution longitudinal yeast data set, we are able to identify a time-series trend in non-reference allele frequency and detect novel variants that have not yet been reported. Our model also detects the emergence of a beneficial variant earlier than was previously shown, and a pair of concomitant variants. Conclusions We developed a variational EM algorithm for a hierarchical Bayesian model to identify rare variants in heterogeneous next-generation sequencing data. Our algorithm is able to identify variants in a broad range of read depths and non-reference allele frequencies with high sensitivity and specificity.
|
37
|
Hsu CW, Sowers ML, Hsu W, Eyzaguirre E, Qiu S, Chao C, Mouton CP, Fofanov Y, Singh P, Sowers LC. How does inflammation drive mutagenesis in colorectal cancer? TRENDS IN CANCER RESEARCH 2017; 12:111-132. [PMID: 30147278 PMCID: PMC6107301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Colorectal cancer (CRC) is a major health challenge worldwide. Factors thought to be important in CRC etiology include diet, microbiome, exercise, obesity, a history of colon inflammation and family history. Interventions, including the use of non-steroidal anti-inflammatory drugs (NSAIDs) and anti-inflammatory agents, have been shown to decrease incidence in some settings. However, our current understanding of the mechanistic details that drive CRC is insufficient to sort out the complex and interacting factors responsible for cancer-initiating events. It has been known for some time that the development of CRC involves mutations in key genes such as p53 and APC, and the sequence in which these mutations occur can determine tumor presentation. Observed recurrent mutations are dominated by C to T transitions at CpG sites, implicating the deamination of 5-methylcytosine (5mC) as a key initiating event in cancer-driving mutations. While it has been widely assumed that inflammation-mediated oxidation drives mutations in CRC, oxidative damage to DNA induces primarily G to T transversions, not C to T transitions. In this review, we discuss this unresolved conundrum, and specifically, we elucidate how the known nucleotide excision repair (NER) and base excision repair (BER) pathways, which are partially redundant and potentially competing, might provide a critical link between oxidative DNA damage and C to T mutations. Studies using recently developed next-generation DNA sequencing technologies have revealed the genetic heterogeneity in human tissues including tumors, as well as the presence of DNA damage. The capacity to follow DNA damage, repair and mutagenesis in human tissues using these emerging technologies could provide a mechanistic basis for understanding the role of oxidative damage in CRC tumor initiation. The application of these technologies could identify mechanism-based biomarkers useful in earlier diagnosis and aid in the development of cancer prevention strategies.
Affiliation(s)
- Chia Wei Hsu
- MD/PhD program, University of Texas Medical Branch, Galveston, Texas
| | - Mark L Sowers
- MD/PhD program, University of Texas Medical Branch, Galveston, Texas
| | - Willie Hsu
- Department of Pharmacology and Toxicology, University of Texas Medical Branch, Galveston, Texas
| | - Eduardo Eyzaguirre
- Department of Pathology, University of Texas Medical Branch, Galveston, Texas
| | - Suimin Qiu
- Department of Pathology, University of Texas Medical Branch, Galveston, Texas
| | - Celia Chao
- Department of Surgery, University of Texas Medical Branch, Galveston, Texas
| | - Charles P Mouton
- Department of Family Medicine, University of Texas Medical Branch, Galveston, Texas
| | - Yuri Fofanov
- Department of Pharmacology and Toxicology, University of Texas Medical Branch, Galveston, Texas
- Sealy Center for Structural Biology, University of Texas Medical Branch, Galveston, Texas
| | - Pomila Singh
- Department of Neuroscience and Cell Biology, University of Texas Medical Branch, Galveston, Texas
| | - Lawrence C Sowers
- Department of Pharmacology and Toxicology, University of Texas Medical Branch, Galveston, Texas
- Sealy Center for Structural Biology, University of Texas Medical Branch, Galveston, Texas
- Department of Internal Medicine, University of Texas Medical Branch, Galveston, Texas, USA
|
38
|
Artyomenko A, Wu NC, Mangul S, Eskin E, Sun R, Zelikovsky A. Long Single-Molecule Reads Can Resolve the Complexity of the Influenza Virus Composed of Rare, Closely Related Mutant Variants. J Comput Biol 2016; 24:558-570. [PMID: 27901586 DOI: 10.1089/cmb.2016.0146] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 12/14/2022] Open
Abstract
As a result of a high rate of mutations and recombination events, an RNA virus exists as a heterogeneous "swarm" of mutant variants. The long read length offered by single-molecule sequencing technologies allows each mutant variant to be sequenced in a single pass. However, the high error rate limits the ability to reconstruct a heterogeneous viral population composed of rare, related mutant variants. In this article, we present two single-nucleotide variants (2SNV), a method able to tolerate the high error rate of the single-molecule protocol and reconstruct mutant variants. 2SNV uses linkage between single-nucleotide variations to efficiently distinguish them from read errors. To benchmark the sensitivity of 2SNV, we performed a single-molecule sequencing experiment on a sample containing a titrated level of known viral mutant variants. Our method is able to accurately reconstruct a clone with a frequency of 0.2% and distinguish clones that differ in only two nucleotides distantly located on the genome. 2SNV outperforms existing methods for full-length viral mutant reconstruction.
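The linkage idea in this abstract can be illustrated with a small sketch. It is not the 2SNV algorithm itself, only a simplified illustration under stated assumptions: reads are already aligned and of equal length, and a pair of non-consensus bases is called "linked" when it co-occurs in far more reads than independent sequencing errors would predict. The function names, the thresholds `min_support` and `min_ratio`, and the toy reads are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def consensus(reads):
    """Majority base at each column of equal-length, pre-aligned reads."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*reads))


def linked_variant_pairs(reads, min_support=3, min_ratio=5.0):
    """Return pairs of non-consensus bases that co-occur more often than chance.

    Hypothetical thresholds: a pair needs at least `min_support` reads and must
    exceed `min_ratio` times the count expected if the two bases arose
    independently (as random read errors would).
    """
    ref = consensus(reads)
    n = len(reads)
    # For every read, list its (position, base) differences from the consensus.
    per_read = [
        sorted((pos, base) for pos, (base, ref_base) in enumerate(zip(read, ref))
               if base != ref_base)
        for read in reads
    ]
    single = Counter(v for variants in per_read for v in variants)
    paired = Counter(p for variants in per_read for p in combinations(variants, 2))
    hits = []
    for (a, b), observed in paired.items():
        expected = single[a] * single[b] / n  # co-occurrence under independence
        if observed >= min_support and observed > min_ratio * expected:
            hits.append((a, b, observed))
    return sorted(hits)


# Toy data (invented): 94 consensus reads, 4 reads from a minor clone carrying
# two linked changes, and 2 reads with unrelated single errors.
example_reads = ["ACGTACGTAC"] * 94 + ["ATGTACGTGC"] * 4 + ["ACGTTCGTAC", "ACGTACATAC"]
linked = linked_variant_pairs(example_reads)
```

In the toy data the two isolated errors never pair up, whereas the two changes carried by the minor clone always appear together, which is the signal a linkage-based method exploits.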
Affiliation(s)
| | - Nicholas C Wu
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, California
| | - Serghei Mangul
- Department of Computer Science, University of California, Los Angeles, Los Angeles, California; Institute for Quantitative and Computational Biosciences, University of California Los Angeles, Los Angeles, California
| | - Eleazar Eskin
- Department of Computer Science, University of California, Los Angeles, Los Angeles, California
| | - Ren Sun
- Molecular and Medical Pharmacology, University of California, Los Angeles, Los Angeles, California
| | - Alex Zelikovsky
- Department of Computer Science, Georgia State University, Atlanta, Georgia
|
39
|
Parker J, Chen J. Application of next generation sequencing for the detection of human viral pathogens in clinical specimens. J Clin Virol 2016; 86:20-26. [PMID: 27902961 DOI: 10.1016/j.jcv.2016.11.010] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 08/09/2016] [Revised: 11/02/2016] [Accepted: 11/21/2016] [Indexed: 02/07/2023]
Abstract
BACKGROUND Next generation sequencing (NGS) is a new technology that can be used for broad detection of infectious pathogens and is rapidly becoming an essential platform in clinical laboratories. It is not known how NGS will displace or enhance gold standard methodologies in infectious disease diagnosis. OBJECTIVES To investigate the feasibility and application of NGS technology in public health laboratories and compare NGS technology with conventional methods. STUDY DESIGN Illumina MiSeq system was used to detect viral pathogens alongside other conventional virology methods using typical clinical specimen matrices. Sixteen clinical specimens and two CDC proficiency panels containing seventeen specimens were analyzed. RESULTS Known pathogenic viral nucleic acid was positively identified in all clinical specimens, correlating and building upon results obtained by more conventional laboratory methods. Sequencing depths ranged from 0.008X to 319X and genome coverage ranged from 0.6% to 99.9%. To substantiate the described methods used to analyze data derived from clinical specimens, the results of a clinical proficiency panel are also presented. DISCUSSION Our results reveal the true scarcity of known pathogenic viral nucleic acids in clinical specimens. NGS outperforms more conventional detection methods in this study in turnaround time as well as in the depth of knowledge gained with regard to serotyping and drug resistance.
Affiliation(s)
- Jayme Parker
- Department of Biology and Wildlife, Institute of Arctic Biology, University of Alaska Fairbanks, Fairbanks, AK 99775, USA; Alaska State Public Health Virology Laboratory, Fairbanks, AK 99775, USA
| | - Jack Chen
- Department of Biology and Wildlife, Institute of Arctic Biology, University of Alaska Fairbanks, Fairbanks, AK 99775, USA; Alaska State Public Health Virology Laboratory, Fairbanks, AK 99775, USA.
|
40
|
Ali R, Blackburn RM, Kozlakidis Z. Next-Generation Sequencing and Influenza Virus: A Short Review of the Published Implementation Attempts. HAYATI JOURNAL OF BIOSCIENCES 2016. [DOI: 10.1016/j.hjb.2016.12.007] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/09/2022] Open
|
41
|
Posada-Cespedes S, Seifert D, Beerenwinkel N. Recent advances in inferring viral diversity from high-throughput sequencing data. Virus Res 2016; 239:17-32. [PMID: 27693290 DOI: 10.1016/j.virusres.2016.09.016] [Citation(s) in RCA: 77] [Impact Index Per Article: 8.6] [Received: 06/24/2016] [Revised: 09/23/2016] [Accepted: 09/24/2016] [Indexed: 02/05/2023]
Abstract
Rapidly evolving RNA viruses prevail within a host as a collection of closely related variants, referred to as viral quasispecies. Advances in high-throughput sequencing (HTS) technologies have facilitated the assessment of the genetic diversity of such virus populations at an unprecedented level of detail. However, analysis of HTS data from virus populations is challenging due to short, error-prone reads. In order to account for uncertainties originating from these limitations, several computational and statistical methods have been developed for studying the genetic heterogeneity of virus populations. Here, we review methods for the analysis of HTS reads, including approaches to local diversity estimation and global haplotype reconstruction. Challenges posed by aligning reads, as well as the impact of reference biases on diversity estimates, are also discussed. In addition, we address some of the experimental approaches designed to improve the biological signal-to-noise ratio. In the future, computational methods for the analysis of heterogeneous virus populations are likely to continue being complemented by technological developments.
Affiliation(s)
- Susana Posada-Cespedes
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB, Basel, Switzerland
| | - David Seifert
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB, Basel, Switzerland
| | - Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB, Basel, Switzerland.
|
42
|
Zhao J, Liu J, Vemula SV, Lin C, Tan J, Ragupathy V, Wang X, Mbondji-wonje C, Ye Z, Landry ML, Hewlett I. Sensitive Detection and Simultaneous Discrimination of Influenza A and B Viruses in Nasopharyngeal Swabs in a Single Assay Using Next-Generation Sequencing-Based Diagnostics. PLoS One 2016; 11:e0163175. [PMID: 27658193 PMCID: PMC5033603 DOI: 10.1371/journal.pone.0163175] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 06/14/2016] [Accepted: 09/02/2016] [Indexed: 11/18/2022] Open
Abstract
Reassortment of the 2009 (H1N1) pandemic influenza virus (pdH1N1) with other strains may produce more virulent and pathogenic forms, so their detection and rapid characterization are critical. In this study, we report a “one-size-fits-all” approach using a next-generation sequencing (NGS) detection platform to extensively identify influenza viral genomes for diagnosis and determination of novel virulence and drug resistance markers. A de novo module and other bioinformatics tools were used to generate contiguous sequence and identify influenza types/subtypes. Of 162 archived influenza-positive patient specimens, 161 (99.4%) were positive for either influenza A or B viruses as determined using the NGS assay. Among these, 135 (83.3%) were A(H3N2), 14 (8.6%) were A(pdH1N1), 2 (1.2%) were A(H3N2) and A(pdH1N1) virus co-infections, and 10 (6.2%) were influenza B viruses. Of the influenza A viruses, 66.7% of A(H3N2) viruses tested had an E627K mutation in the PB2 protein, and 87.8% of the influenza A viruses contained the S31N mutation in the M2 protein. Further studies demonstrated that the NGS assay could achieve a high level of sensitivity and reveal adequate genetic information for final laboratory confirmation. The current diagnostic platform allows for simultaneous identification of a broad range of influenza viruses and monitoring of emerging influenza strains with pandemic potential, facilitating diagnostics and antiviral treatment in the clinical setting and protection of public health.
Affiliation(s)
- Jiangqin Zhao
- DETTD/OBRR/CBER, Food and Drug Administration, Silver Spring, MD, 20993, United States of America
- * E-mail: (JZ); (IH)
| | - Jikun Liu
- DETTD/OBRR/CBER, Food and Drug Administration, Silver Spring, MD, 20993, United States of America
| | - Sai Vikram Vemula
- DETTD/OBRR/CBER, Food and Drug Administration, Silver Spring, MD, 20993, United States of America
| | - Corinna Lin
- DETTD/OBRR/CBER, Food and Drug Administration, Silver Spring, MD, 20993, United States of America
| | - Jiying Tan
- DETTD/OBRR/CBER, Food and Drug Administration, Silver Spring, MD, 20993, United States of America
| | - Viswanath Ragupathy
- DETTD/OBRR/CBER, Food and Drug Administration, Silver Spring, MD, 20993, United States of America
| | - Xue Wang
- DETTD/OBRR/CBER, Food and Drug Administration, Silver Spring, MD, 20993, United States of America
| | - Christelle Mbondji-wonje
- DETTD/OBRR/CBER, Food and Drug Administration, Silver Spring, MD, 20993, United States of America
| | - Zhiping Ye
- DVP/OVRR/CBER, Food and Drug Administration, Silver Spring, MD, 20993, United States of America
| | - Marie L. Landry
- Department of Laboratory Medicine, Yale University School of Medicine, New Haven, CT, 06520, United States of America
| | - Indira Hewlett
- DETTD/OBRR/CBER, Food and Drug Administration, Silver Spring, MD, 20993, United States of America
- * E-mail: (JZ); (IH)
|
43
|
Jakaitiene A, Avino M, Guarracino MR. Beta-Binomial Model for the Detection of Rare Mutations in Pooled Next-Generation Sequencing Experiments. J Comput Biol 2016; 24:357-367. [PMID: 27632638 DOI: 10.1089/cmb.2016.0106] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 12/30/2022] Open
Abstract
Despite diminishing costs, next-generation sequencing (NGS) remains expensive for studies with a large number of individuals. As a cost-saving measure, pools containing multiple samples can be sequenced together. Currently, many software tools are available for the detection of single-nucleotide polymorphisms (SNPs). Sensitivity and specificity depend on the model used and the data analyzed, indicating that all of these tools have room for improvement. We use a beta-binomial model to detect rare mutations in untagged pooled NGS experiments. We propose a multireference framework for pooled data with the ability to be specific up to two patients affected by neuromuscular disorders (NMD). We assessed the results by comparing them with The Genome Analysis Toolkit (GATK), CRISP, SNVer, and FreeBayes. Our results show that the multireference approach applying the beta-binomial model is accurate in predicting rare mutations at a 0.01 fraction. Finally, we explored the concordance of mutations between the model and the software, checking their involvement in any NMD-related gene. We detected seven novel SNPs, for which the functional analysis produced enriched terms related to locomotion and musculature.
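As a rough illustration of why a beta-binomial model suits this kind of problem, the sketch below tests whether the number of alternate-allele reads at a site is larger than overdispersed sequencing error would explain. It is not the authors' multireference framework: the function names, the error-model parameters `alpha` and `beta` (here giving a mean error rate of 0.001), the significance threshold, and the example counts are all assumptions made for illustration.

```python
from math import exp, lgamma


def _log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def _log_choose(n, k):
    """Logarithm of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def betabinom_pmf(k, n, alpha, beta):
    """P(X = k) when X ~ BetaBinomial(n, alpha, beta)."""
    return exp(_log_choose(n, k) + _log_beta(k + alpha, n - k + beta) - _log_beta(alpha, beta))


def betabinom_upper_tail(k, n, alpha, beta):
    """P(X >= k): chance of seeing k or more alternate reads from error alone."""
    return min(1.0, sum(betabinom_pmf(i, n, alpha, beta) for i in range(k, n + 1)))


def looks_like_variant(alt_reads, depth, alpha=10.0, beta=9990.0, threshold=1e-6):
    """Flag a site when error alone is an implausible explanation.

    alpha, beta and threshold are hypothetical values, not taken from the paper.
    """
    return betabinom_upper_tail(alt_reads, depth, alpha, beta) < threshold


# Invented example: 10,000x pooled depth, variant present at a 0.01 fraction.
rare_variant_flagged = looks_like_variant(alt_reads=100, depth=10_000)
background_flagged = looks_like_variant(alt_reads=12, depth=10_000)
```

Compared with a plain binomial, the beta-binomial lets the per-site error rate vary, so sites with somewhat noisier error are less likely to be miscalled as variants.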
Affiliation(s)
- Audrone Jakaitiene
- Bioinformatics and Biostatistics Center, Department of Human and Medical Genetics, Faculty of Medicine, Vilnius University, Vilnius, Lithuania
| | - Mariano Avino
- High Performance Computing and Networking Institute, National Research Council, Naples, Italy
| | - Mario Rosario Guarracino
- High Performance Computing and Networking Institute, National Research Council, Naples, Italy
|
44
|
Wong LH, Sinha S, Bergeron JR, Mellor JC, Giaever G, Flaherty P, Nislow C. Reverse Chemical Genetics: Comprehensive Fitness Profiling Reveals the Spectrum of Drug Target Interactions. PLoS Genet 2016; 12:e1006275. [PMID: 27588687 PMCID: PMC5010250 DOI: 10.1371/journal.pgen.1006275] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 05/12/2016] [Accepted: 08/03/2016] [Indexed: 01/22/2023] Open
Abstract
The emergence and prevalence of drug resistance demands streamlined strategies to identify drug resistant variants in a fast, systematic and cost-effective way. Methods commonly used to understand and predict drug resistance rely on limited clinical studies from patients who are refractory to drugs or on laborious evolution experiments with poor coverage of the gene variants. Here, we report an integrative functional variomics methodology combining deep sequencing and a Bayesian statistical model to provide a comprehensive list of drug resistance alleles from complex variant populations. Dihydrofolate reductase, the target of the chemotherapy drug methotrexate, was used as a model to identify functional mutant alleles correlated with methotrexate resistance. This systematic approach identified previously reported resistance mutations, as well as novel point mutations that were validated in vivo. Use of this systematic strategy as a routine diagnostics tool widens the scope of successful drug research and development.
One of the most profound outcomes of fast, reliable genome sequencing is the ability to tailor drug therapy to an individual’s genotype. This ‘personalized’ or ‘precision medicine’ is the realization of a decades-long effort to maximize drug effect and limit unwanted side effects. An undesirable consequence of such targeted therapies, however, is the emergence of drug resistance. This outcome is the result of an evolutionary process where mutations in the drug target render the drug perturbation ineffective and allow such mutant cells to proliferate. Because of the unbiased and stochastic nature of the emergence of drug resistance, it is impossible to predict. We developed a test where hundreds of thousands of mutant cells are exposed to a drug simultaneously and those cells that modulate resistance survive. This method is innovative because it partners a high-throughput experimental protocol with a tailored statistical model to identify all mutations that modulate resistance. Finally, we used synthetic biology to re-create these mutations and demonstrate that they were, in fact, bona fide drug-resistant variants. These mutations were further extended and confirmed to also be resistant in the human orthologue. This combined biological-computational approach allows one to identify the degree of resistance to a drug, to guide both treatments and future drug discovery.
Affiliation(s)
- Lai H. Wong
- Department of Pharmaceutical Sciences, University of British Columbia, Vancouver, Canada
| | - Sunita Sinha
- Department of Pharmaceutical Sciences, University of British Columbia, Vancouver, Canada
| | - Julien R. Bergeron
- Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
| | | | - Guri Giaever
- Department of Pharmaceutical Sciences, University of British Columbia, Vancouver, Canada
| | - Patrick Flaherty
- Department of Mathematics and Statistics, University of Massachusetts, Amherst, Massachusetts, United States of America
- * E-mail: (PF); (CN)
| | - Corey Nislow
- Department of Pharmaceutical Sciences, University of British Columbia, Vancouver, Canada
- * E-mail: (PF); (CN)
|
45
|
Measurements of Intrahost Viral Diversity Are Extremely Sensitive to Systematic Errors in Variant Calling. J Virol 2016; 90:6884-95. [PMID: 27194763 DOI: 10.1128/jvi.00667-16] [Citation(s) in RCA: 87] [Impact Index Per Article: 9.7] [Received: 04/08/2016] [Accepted: 05/11/2016] [Indexed: 12/22/2022] Open
Abstract
With next-generation sequencing technologies, it is now feasible to efficiently sequence patient-derived virus populations at a depth of coverage sufficient to detect rare variants. However, each sequencing platform has characteristic error profiles, and sample collection, target amplification, and library preparation are additional processes whereby errors are introduced and propagated. Many studies account for these errors by using ad hoc quality thresholds and/or previously published statistical algorithms. Despite common usage, the majority of these approaches have not been validated under conditions that characterize many studies of intrahost diversity. Here, we use defined populations of influenza virus to mimic the diversity and titer typically found in patient-derived samples. We identified single-nucleotide variants using two commonly employed variant callers, DeepSNV and LoFreq. We found that the accuracy of these variant callers was lower than expected and exquisitely sensitive to the input titer. Small reductions in specificity had a significant impact on the number of minority variants identified and subsequent measures of diversity. We were able to increase the specificity of DeepSNV to >99.95% by applying an empirically validated set of quality thresholds. When applied to a set of influenza virus samples from a household-based cohort study, these changes resulted in a 10-fold reduction in measurements of viral diversity. We have made our sequence data and analysis code available so that others may improve on our work and use our data set to benchmark their own bioinformatics pipelines. Our work demonstrates that inadequate quality control and validation can lead to significant overestimation of intrahost diversity.
IMPORTANCE Advances in sequencing technology have made it feasible to sequence patient-derived viral samples at a level sufficient for detection of rare mutations. These high-throughput, cost-effective methods are revolutionizing the study of within-host viral diversity. However, the techniques are error prone, and the methods commonly used to control for these errors have not been validated under the conditions that characterize patient-derived samples. Here, we show that these conditions affect measurements of viral diversity. We found that the accuracy of previously benchmarked analysis pipelines was greatly reduced under patient-derived conditions. By carefully validating our sequencing analysis using known control samples, we were able to identify biases in our method and to improve our accuracy to acceptable levels. Application of our modified pipeline to a set of influenza virus samples from a cohort study provided a realistic picture of intrahost diversity and suggested the need for rigorous quality control in such studies.
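The claim that small reductions in specificity have a large effect on the number of minority variants called follows from simple arithmetic, sketched below. The numbers are illustrative assumptions rather than values from the study: a genome of about 13,500 nucleotides with three possible alternate bases per position, nearly all of which are true negatives. The function name is hypothetical.

```python
def expected_false_positives(specificity, genome_length, alt_alleles_per_site=3):
    """Approximate false-positive calls per sample.

    Assumes almost every candidate (position, alternate base) is a true
    negative, so false positives are roughly (1 - specificity) * candidates.
    """
    candidates = genome_length * alt_alleles_per_site
    return (1.0 - specificity) * candidates


# Assumed genome length of 13,500 nt, used only to show the scale of the effect.
for spec in (0.9995, 0.999, 0.99):
    print(f"specificity {spec:.4f}: ~{expected_false_positives(spec, 13_500):.1f} false calls")
```

Under these assumptions, a specificity of 99.95% still yields about 20 spurious calls per sample and 99% yields about 405, which can easily exceed the number of genuine minority variants and inflate any diversity estimate built on them.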
|
46
|
Advanced Molecular Detection of Malarone Resistance. Antimicrob Agents Chemother 2016; 60:3821-3. [PMID: 27001821 DOI: 10.1128/aac.00171-16] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 02/05/2016] [Accepted: 03/14/2016] [Indexed: 11/20/2022] Open
Abstract
The rapid emergence of drug-resistant malaria parasites during the course of an infection remains a major challenge for providing accurate treatment guidelines. This is particularly important in cases of malaria treatment failure. Using a previously well-characterized case of malaria treatment failure, we show the utility of using next-generation sequencing for early detection of the rise and selection of a previously reported atovaquone-proguanil (malarone) drug resistance-associated mutation.
|
47
|
Wang K, Ma X, Zhang X, Wu D, Sun C, Sun Y, Lu X, Wu CI, Guo C, Ruan J. Using ultra-sensitive next generation sequencing to dissect DNA damage-induced mutagenesis. Sci Rep 2016; 6:25310. [PMID: 27122023 PMCID: PMC4848531 DOI: 10.1038/srep25310] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 12/04/2015] [Accepted: 04/13/2016] [Indexed: 12/17/2022] Open
Abstract
Next generation sequencing (NGS) technologies have dramatically improved studies in biology and biomedical science. However, no optimal NGS approach is available to conveniently analyze low frequency mutations caused by DNA damage treatments. Here, by developing an exquisite ultra-sensitive NGS (USNGS) platform “EasyMF” and incorporating it with a widely used supF shuttle vector-based mutagenesis system, we can conveniently dissect roles of lesion bypass polymerases in damage-induced mutagenesis. In this improved mutagenesis analysis pipeline, the initial steps are the same as in the supF mutation assay, involving damaging the pSP189 plasmid followed by its transfection into human 293T cells to allow replication to occur. Then “EasyMF” is employed to replace downstream MBM7070 bacterial transformation and other steps for analyzing damage-induced mutation frequencies and spectra. This pipeline was validated by using UV damaged plasmid after its replication in lesion bypass polymerase-deficient 293T cells. The increased throughput and reduced cost of this system will allow us to conveniently screen regulators of translesion DNA synthesis pathway and monitor environmental genotoxic substances, which can ultimately provide insight into the mechanisms of genome stability and mutagenesis.
Affiliation(s)
- Kaile Wang: Key Laboratory of Genomics and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing, China
- Xiaolu Ma: Key Laboratory of Genomics and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing, China
- Xue Zhang: Key Laboratory of Genomics and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing, China
- Dafei Wu: Key Laboratory of Genomics and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Chenyi Sun: Key Laboratory of Genomics and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing, China
- Yazhou Sun: Key Laboratory of Genomics and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing, China
- Xuemei Lu: Key Laboratory of Genomics and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Chung-I Wu: Key Laboratory of Genomics and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, China; Department of Ecology and Evolution, University of Chicago, USA
- Caixia Guo: Key Laboratory of Genomics and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Jue Ruan: Key Laboratory of Genomics and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
|
48
|
Wang K, Ma Q, Jiang L, Lai S, Lu X, Hou Y, Wu CI, Ruan J. Ultra-precise detection of mutations by droplet-based amplification of circularized DNA. BMC Genomics 2016; 17:214. [PMID: 26960407 PMCID: PMC4784281 DOI: 10.1186/s12864-016-2480-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 06/18/2015] [Accepted: 02/16/2016] [Indexed: 01/16/2023]
Abstract
Background: Next generation sequencing (NGS) has been widely used in studies of biological processes, ranging from microbial evolution to cancer genomics. However, the error rate of NGS (0.1%–1%) remains a major obstacle to comprehensively investigating low-frequency variations, and current solutions suffer from severe amplification bias or low efficiency. Results: We developed Droplet-CirSeq for relatively efficient, low-bias, and ultra-sensitive identification of variations by combining millions of uniform-sized picoliter droplets with Cir-seq. Droplet-CirSeq has a very low error rate of 3–5 × 10⁻⁶. To systematically evaluate the amplification uniformity and mutation-identification capability of Droplet-CirSeq, we used mixtures of two E. coli strains to simulate mutations at different frequencies. Compared with Cir-seq, the coefficient of variation of read depth for Droplet-CirSeq was 10 times lower (p = 2.6 × 10⁻³), and the identified allele frequencies clustered more closely around the true frequencies of the mixtures (p = 4.8 × 10⁻³), indicating a significant reduction in amplification bias and improved accuracy in allele frequency determination. Additionally, with a 3 pg DNA input, Droplet-CirSeq detected 2.5 times as many genuine SNPs (p < 0.001) and achieved a 2.8 times lower false positive rate (p < 0.05) and a 1.5 times lower false negative rate (p < 0.001). Intriguingly, the false positive sites were predominantly of two base-substitution types (G>A and C>T). Our findings indicated that 30 pg of input DNA distributed across 5–10 million droplets resulted in maximal detection of authentic mutations compared with 3 pg (p = 1.2 × 10⁻⁸) and 300 pg input (p = 2.2 × 10⁻³).
Conclusions: We developed a method, Droplet-CirSeq, that significantly reduces amplification bias and shows clear advantages over currently prevalent methods for detecting ultra-low-frequency mutations. Droplet-CirSeq is promising for identifying low-frequency mutations from extremely low DNA input, such as DNA from uncultured microorganisms, captured DNA from target regions, and circulating DNA in plasma, and its concept of rolling circle amplification in droplets could also be applied in other low-input DNA amplification settings. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-2480-1) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Kaile Wang: Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Qin Ma: Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Lan Jiang: Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Shujuan Lai: Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Xuemei Lu: Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Yali Hou: Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Chung-I Wu: Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China; State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, China; Department of Ecology and Evolution, University of Chicago, Illinois, USA
- Jue Ruan: Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China; Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
|
49
|
Long Single-Molecule Reads Can Resolve the Complexity of the Influenza Virus Composed of Rare, Closely Related Mutant Variants. Lecture Notes in Computer Science 2016. [DOI: 10.1007/978-3-319-31957-5_12] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/17/2022]
|
50
|
Ode H, Matsuda M, Matsuoka K, Hachiya A, Hattori J, Kito Y, Yokomaku Y, Iwatani Y, Sugiura W. Quasispecies Analyses of the HIV-1 Near-full-length Genome With Illumina MiSeq. Front Microbiol 2015; 6:1258. [PMID: 26617593 PMCID: PMC4641896 DOI: 10.3389/fmicb.2015.01258] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Received: 09/12/2015] [Accepted: 10/29/2015] [Indexed: 12/29/2022]
Abstract
Human immunodeficiency virus type-1 (HIV-1) exhibits high between-host genetic diversity and within-host heterogeneity, recognized as quasispecies. Because HIV-1 quasispecies fluctuate in response to multiple factors, such as antiretroviral exposure and host immunity, analyzing the HIV-1 genome is critical for selecting effective antiretroviral therapy and understanding within-host viral coevolution mechanisms. Here, to obtain HIV-1 genome sequence information that includes minority variants, we sought to develop a method for evaluating quasispecies throughout the HIV-1 near-full-length genome using the Illumina MiSeq benchtop deep sequencer. To ensure the reliability of minority mutation detection, we applied an analysis method in which sequence reads are mapped onto a consensus sequence derived from de novo assembly, followed by iterative mapping and subsequent unique error correction. Deep sequencing analyses of an HIV-1 clone showed that the analysis method reduced erroneous base prevalence below 1% at each sequence position and discarded only < 1% of all collected nucleotides, maximizing the usage of the collected genome sequences. Further, we designed primer sets to amplify the HIV-1 near-full-length genome from clinical plasma samples. Deep sequencing of 92 samples in combination with the primer sets and our analysis method provided sufficient coverage to identify >1%-frequency sequences throughout the genome. When we evaluated sequences of pol genes from 18 treatment-naïve patients' samples, the deep sequencing results were in agreement with Sanger sequencing and identified numerous additional minority mutations. The results suggest that our deep sequencing method would be suitable for identifying within-host viral population dynamics throughout the genome.
Affiliation(s)
- Hirotaka Ode: Department of Infectious Diseases and Immunology, Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
- Masakazu Matsuda: Department of Infectious Diseases and Immunology, Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
- Kazuhiro Matsuoka: Department of Infectious Diseases and Immunology, Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
- Atsuko Hachiya: Department of Infectious Diseases and Immunology, Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
- Junko Hattori: Department of Infectious Diseases and Immunology, Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
- Yumiko Kito: Department of Infectious Diseases and Immunology, Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
- Yoshiyuki Yokomaku: Department of Infectious Diseases and Immunology, Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
- Yasumasa Iwatani: Department of Infectious Diseases and Immunology, Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan; Department of AIDS Research, Graduate School of Medicine, Nagoya University, Nagoya, Japan
- Wataru Sugiura: Department of Infectious Diseases and Immunology, Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan; Department of AIDS Research, Graduate School of Medicine, Nagoya University, Nagoya, Japan
|