1
Komatsu H, Inui A, Hoshino H, Umetsu S, Fujisawa T. Integration of Viral Genome to Human Genomic DNA in Nails of Patients with Chronic Hepatitis B Virus Infection. JMA J 2023; 6:426-436. [PMID: 37941707 PMCID: PMC10628332 DOI: 10.31662/jmaj.2023-0082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Accepted: 07/11/2023] [Indexed: 11/10/2023]
Abstract
Introduction: Hepatitis B virus (HBV) DNA and cytomegalovirus (CMV) DNA can be detected in patient genomes. However, it remains unknown whether viral DNA can be integrated into host genomic DNA and detected in fingernails.
Methods: Nails from patients with chronic HBV infection were investigated. A total of 60 patients (male/female = 20/40, age range from 2 years to 59 years, median 15 years) were included in this study. The viral DNA levels of herpes simplex virus 1 (HSV-1), herpes simplex virus 2 (HSV-2), varicella-zoster virus (VZV), Epstein-Barr virus (EBV), cytomegalovirus (CMV), human herpes virus 6 (HHV-6), human herpes virus 7 (HHV-7), and HBV in nails were measured with real-time PCR. Viral DNA integration into host genomic DNA was analyzed by capture-based next-generation sequencing (NGS). Moreover, virus/host chimeric sequences, which were detected by capture-based NGS, were confirmed by Sanger sequencing.
Results: Of the 60 patients, 37 (62%) were positive for nail HBV DNA. All 60 patients were negative for nail HSV-1, HSV-2, VZV, CMV, EBV, or HHV-6 DNA. However, three patients were positive for nail HHV-7 DNA. All three nail HHV-7-positive patients were also positive for nail HBV DNA. The three nail samples that were positive for both HBV and HHV-7 DNA were used for viral integration analysis by capture-based NGS. One of the three nail samples showed HBV/host chimeric sequences. In addition, all three nail samples showed HHV-7/host chimeric sequences. However, these viral integration breakpoints were not confirmed by Sanger sequencing.
Conclusions: Viral integrations were detected in nails by capture-based NGS. However, Sanger sequencing did not confirm any virus/host chimeric sequences. This study could not show reliable evidence of viral integration in nails.
Affiliation(s)
- Haruki Komatsu
  - Department of Pediatrics, Toho University, Sakura Medical Center, Chiba, Japan
  - Komatsu Children's Clinic, Chiba, Japan
- Ayano Inui
  - Department of Pediatric Hepatology and Gastroenterology, Eastern Yokohama Hospital, Kanagawa, Japan
- Hiroki Hoshino
  - Department of Pediatrics, Toho University, Sakura Medical Center, Chiba, Japan
- Shuichiro Umetsu
  - Department of Pediatric Hepatology and Gastroenterology, Eastern Yokohama Hospital, Kanagawa, Japan
- Tomoo Fujisawa
  - Department of Pediatric Hepatology and Gastroenterology, Eastern Yokohama Hospital, Kanagawa, Japan
2
Karimzadeh M, Arlidge C, Rostami A, Lupien M, Bratman SV, Hoffman MM. Human papillomavirus integration transforms chromatin to drive oncogenesis. Genome Biol 2023; 24:142. [PMID: 37365652 DOI: 10.1186/s13059-023-02926-9] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 04/12/2022] [Accepted: 04/07/2023] [Indexed: 06/28/2023]
Abstract
Background: Human papillomavirus (HPV) drives almost all cervical cancers and up to 70% of head and neck cancers. Frequent integration into the host genome occurs predominantly in tumorigenic types of HPV. We hypothesize that changes in chromatin state at the location of integration can result in changes in gene expression that contribute to the tumorigenicity of HPV.
Results: We find that viral integration events often occur along with changes in chromatin state and expression of genes near the integration site. We investigate whether introduction of new transcription factor binding sites due to HPV integration could invoke these changes. Some regions within the HPV genome, particularly the position of a conserved CTCF binding site, show enriched chromatin accessibility signal. ChIP-seq reveals that the conserved CTCF binding site within the HPV genome binds CTCF in 4 HPV+ cancer cell lines. Significant changes in CTCF binding pattern and increases in chromatin accessibility occur exclusively within 100 kbp of HPV integration sites. The chromatin changes co-occur with out-sized changes in transcription and alternative splicing of local genes. Analysis of The Cancer Genome Atlas (TCGA) HPV+ tumors indicates that HPV integration upregulates genes which have significantly higher essentiality scores compared to randomly selected upregulated genes from the same tumors.
Conclusions: Our results suggest that introduction of a new CTCF binding site due to HPV integration reorganizes chromatin state and upregulates genes essential for tumor viability in some HPV+ tumors. These findings emphasize a newly recognized role of HPV integration in oncogenesis.
Affiliation(s)
- Mehran Karimzadeh
  - Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
  - Vector Institute for Artificial Intelligence, Toronto, ON, Canada
- Christopher Arlidge
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Ariana Rostami
  - Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Mathieu Lupien
  - Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Scott V Bratman
  - Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Michael M Hoffman
  - Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
  - Vector Institute for Artificial Intelligence, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
3
Chen Y, Wang Y, Zhou P, Huang H, Li R, Zeng Z, Cui Z, Tian R, Jin Z, Liu J, Huang Z, Li L, Huang Z, Tian X, Yu M, Hu Z. VIS Atlas: A Database of Virus Integration Sites in Human Genome from NGS Data to Explore Integration Patterns. Genomics, Proteomics & Bioinformatics 2023; 21:300-310. [PMID: 36804047 PMCID: PMC10626058 DOI: 10.1016/j.gpb.2023.02.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Revised: 01/08/2023] [Accepted: 02/10/2023] [Indexed: 02/17/2023]
Abstract
Integration of oncogenic DNA viruses into the human genome is a key step in most virus-induced carcinogenesis. Here, we constructed a virus integration site (VIS) Atlas database, an extensive collection of integration breakpoints for the three most prevalent oncoviruses (human papillomavirus, hepatitis B virus, and Epstein-Barr virus) based on next-generation sequencing (NGS) data, literature, and experimental data. There are 63,179 breakpoints and 47,411 junctional sequences with full annotations deposited in the VIS Atlas database, comprising 47 virus genotypes and 17 disease types. The VIS Atlas database provides (1) a genome browser for NGS breakpoint quality check, visualization of VISs, and the local genomic context; (2) a novel platform to discover integration patterns; and (3) a statistics interface for a comprehensive investigation of genotype-specific integration features. Data collected in the VIS Atlas help provide insights into virus pathogenic mechanisms and aid the development of novel antitumor drugs. The VIS Atlas database is available at https://www.vis-atlas.tech/.
Affiliation(s)
- Ye Chen
  - Department of Obstetrics and Gynecology, the First Affiliated Hospital, Sun Yat-sen University, Guangzhou 510000, China
- Yuyan Wang
  - Department of Obstetrics and Gynecology, the First Affiliated Hospital, Sun Yat-sen University, Guangzhou 510000, China
- Ping Zhou
  - Department of Obstetrics and Gynecology, Dongguan Maternal and Child Health Care Hospital, Dongguan 523000, China
- Hao Huang
  - Office of Scientific Research & Development, Sun Yat-sen University, Guangzhou 510000, China
- Rui Li
  - Department of Obstetrics and Gynecology, Academician Expert Workstation, The Central Hospital of Wuhan, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430000, China
- Zhen Zeng
  - Department of Obstetrics and Gynecology, Academician Expert Workstation, The Central Hospital of Wuhan, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430000, China
- Zifeng Cui
  - Department of Obstetrics and Gynecology, the First Affiliated Hospital, Sun Yat-sen University, Guangzhou 510000, China
- Rui Tian
  - Center for Translational Medicine, the First Affiliated Hospital, Sun Yat-sen University, Guangzhou 510000, China
- Zhuang Jin
  - Department of Obstetrics and Gynecology, the First Affiliated Hospital, Sun Yat-sen University, Guangzhou 510000, China
- Jiashuo Liu
  - Department of Obstetrics and Gynecology, the First Affiliated Hospital, Sun Yat-sen University, Guangzhou 510000, China
- Zhaoyue Huang
  - Department of Obstetrics and Gynecology, the First Affiliated Hospital, Sun Yat-sen University, Guangzhou 510000, China
- Lifang Li
  - Department of Obstetrics and Gynecology, the First Affiliated Hospital, Sun Yat-sen University, Guangzhou 510000, China
- Zheying Huang
  - Department of Obstetrics and Gynecology, the First Affiliated Hospital, Sun Yat-sen University, Guangzhou 510000, China
- Xun Tian
  - Department of Obstetrics and Gynecology, Academician Expert Workstation, The Central Hospital of Wuhan, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430000, China
- Meiying Yu
  - Department of Pathology, the Central Hospital of Enshi Tujia and Miao Autonomous Prefecture, Enshi 445000, China
- Zheng Hu
  - Department of Obstetrics and Gynecology, Zhongnan Hospital of Wuhan University, Wuhan 430062, China
  - Department of Obstetrics and Gynecology, the First Affiliated Hospital, Sun Yat-sen University, Guangzhou 510000, China
4
Sotcheff S, Zhou Y, Yeung J, Sun Y, Johnson JE, Torbett BE, Routh AL. ViReMa: a virus recombination mapper of next-generation sequencing data characterizes diverse recombinant viral nucleic acids. Gigascience 2023; 12:giad009. [PMID: 36939008 PMCID: PMC10025937 DOI: 10.1093/gigascience/giad009] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/20/2022] [Revised: 11/30/2022] [Accepted: 02/03/2023] [Indexed: 03/21/2023]
Abstract
Background: Genetic recombination is a tremendous source of intrahost diversity in viruses and is critical for their ability to rapidly adapt to new environments or fitness challenges. While viruses are routinely characterized using high-throughput sequencing techniques, characterizing the genetic products of recombination in next-generation sequencing data remains a challenge. Viral recombination events can be highly diverse and variable in nature, including simple duplications and deletions, or more complex events such as copy/snap-back recombination, intervirus or intersegment recombination, and insertions of host nucleic acids. Due to the variable mechanisms driving virus recombination and the different selection pressures acting on the progeny, recombination junctions rarely adhere to simple canonical sites or sequences. Furthermore, numerous different events may be present simultaneously in a viral population, yielding a complex mutational landscape.
Findings: We have previously developed an algorithm called ViReMa (Virus Recombination Mapper) that bootstraps the bowtie short-read aligner to capture and annotate a wide range of recombinant species found within virus populations. Here, we have updated ViReMa to provide an "error density" function designed to accurately detect recombination events in the longer reads now routinely generated by the Illumina platforms and provide output reports for multiple types of recombinant species using standardized formats. We demonstrate the utility and flexibility of ViReMa in different settings to report deletion events in simulated data from Flock House virus, copy-back RNA species in Sendai viruses, short duplication events in HIV, and virus-to-host recombination in an archaeal DNA virus.
Affiliation(s)
- Stephanea Sotcheff
  - Department of Biochemistry and Molecular Biology, The University of Texas Medical Branch, Galveston, TX 77555, USA
- Yiyang Zhou
  - Department of Biochemistry and Molecular Biology, The University of Texas Medical Branch, Galveston, TX 77555, USA
- Jason Yeung
  - John Sealy School of Medicine, The University of Texas Medical Branch, Galveston, TX 77555, USA
- Yan Sun
  - Department of Microbiology and Immunology, The University of Rochester Medical Center, Rochester, NY 14642, USA
- John E Johnson
  - Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, CA 92037, USA
- Bruce E Torbett
  - Department of Pediatrics, School of Medicine, University of Washington, Seattle, WA 98105, USA
  - Center for Immunity and Immunotherapies, Seattle Children's Research Institute, Seattle, WA 98105, USA
  - Institute for Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA 98195, USA
- Andrew L Routh
  - Department of Biochemistry and Molecular Biology, The University of Texas Medical Branch, Galveston, TX 77555, USA
  - Sealy Center for Structural Biology and Molecular Biophysics, The University of Texas Medical Branch, Galveston, TX 77555, USA
  - Institute for Human Infections and Immunity, University of Texas Medical Branch, Galveston, TX 77555, USA
5
hgtseq: A Standard Pipeline to Study Horizontal Gene Transfer. Int J Mol Sci 2022; 23:ijms232314512. [PMID: 36498841 PMCID: PMC9738810 DOI: 10.3390/ijms232314512] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/24/2022] [Revised: 11/14/2022] [Accepted: 11/18/2022] [Indexed: 11/23/2022]
Abstract
Horizontal gene transfer (HGT) is well described in prokaryotes: it plays a crucial role in evolution, and has functional consequences in insects and plants. However, less is known about HGT in humans. Studies have reported bacterial integrations in cancer patients, and microbial sequences have been detected in data from well-known human sequencing projects. Few of the existing tools for investigating HGT are highly automated. Thanks to the adoption of Nextflow for life sciences workflows, and to the standards and best practices curated by communities such as nf-core, fully automated, portable, and scalable pipelines can now be developed. Here we present nf-core/hgtseq to facilitate the analysis of HGT from sequencing data in different organisms. We showcase its performance by analysing six exome datasets from five mammals. Hgtseq can be run seamlessly in any computing environment and accepts data generated by existing exome and whole-genome sequencing projects; this will enable researchers to expand their analyses into this area. Fundamental questions are still open about the mechanisms and the extent or role of horizontal gene transfer: by releasing hgtseq we provide a standardised tool which will enable a systematic investigation of this phenomenon, thus paving the way for a better understanding of HGT.
6
Javadzadeh S, Rajkumar U, Nguyen N, Sarmashghi S, Luebeck J, Shang J, Bafna V. FastViFi: Fast and accurate detection of (Hybrid) Viral DNA and RNA. NAR Genom Bioinform 2022; 4:lqac032. [PMID: 35493723 PMCID: PMC9041341 DOI: 10.1093/nargab/lqac032] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/10/2021] [Revised: 03/04/2022] [Accepted: 03/06/2022] [Indexed: 11/13/2022]
Abstract
DNA viruses are important infectious agents known to mediate a large number of human diseases, including cancer. Viral integration into the host genome and the formation of hybrid transcripts are also associated with increased pathogenicity. The high variability of viral genomes, however, requires the use of sensitive ensemble hidden Markov models that add to the computational complexity, often requiring > 40 CPU-hours per sample. Here, we describe FastViFi, a fast 2-stage filtering method that reduces the computational burden. On simulated and cancer genomic data, FastViFi improved the running time by 2 orders of magnitude with comparable accuracy on challenging data sets. Recently published methods have focused on identifying the location of viral integration into the human host genome using local assembly, but do not extend to RNA. To identify human-viral hybrid transcripts, we additionally developed ensemble hidden Markov models for the Epstein-Barr virus (EBV) to add to the models for the hepatitis B virus (HBV), hepatitis C virus (HCV), and human papillomavirus (HPV), and used FastViFi to query RNA-seq data from gastric cancer (EBV) and liver cancer (HBV/HCV). FastViFi ran in <10 minutes per sample and identified multiple hybrids that fuse viral and human genes, suggesting new mechanisms for oncoviral pathogenicity. FastViFi is available at https://github.com/sara-javadzadeh/FastViFi.
Affiliation(s)
- Sara Javadzadeh
  - Department of Computer Science & Engineering, UC San Diego, La Jolla, California, USA
- Utkrisht Rajkumar
  - Department of Computer Science & Engineering, UC San Diego, La Jolla, California, USA
- Nam Nguyen
  - Boundless Bio, Inc. 11099 N Torrey Pines Rd, La Jolla, CA, USA
- Shahab Sarmashghi
  - Department of Electrical and Computer Engineering, UC San Diego, La Jolla, California, USA
- Jens Luebeck
  - Bioinformatics & Systems Biology Graduate Program, UC San Diego, La Jolla, California, USA
- Jingbo Shang
  - Department of Computer Science & Engineering, UC San Diego, La Jolla, California, USA
- Vineet Bafna
  - Department of Computer Science & Engineering, UC San Diego, La Jolla, California, USA
  - Boundless Bio, Inc. 11099 N Torrey Pines Rd, La Jolla, CA, USA
  - Moores Cancer Center, UC San Diego, La Jolla, California, USA
7
Jurasz H, Pawłowski T, Perlejewski K. Contamination Issue in Viral Metagenomics: Problems, Solutions, and Clinical Perspectives. Front Microbiol 2021; 12:745076. [PMID: 34745046 PMCID: PMC8564396 DOI: 10.3389/fmicb.2021.745076] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 07/22/2021] [Accepted: 09/17/2021] [Indexed: 12/16/2022]
Abstract
We describe the most common internal and external sources and types of contamination encountered in viral metagenomic studies and discuss their negative impact on sequencing results, particularly for low-biomass samples and clinical applications. We also propose some basic recommendations for reducing the background noise in viral shotgun metagenomic (SM) studies, which would limit the bias introduced by various classes of contaminants. Regardless of the specific viral SM protocol, contamination cannot be totally avoided; in particular, the issue of reagent contamination should always be addressed with high priority. There is an urgent need for the development and validation of standards for viral metagenomic studies especially if viral SM protocols will be more widely applied in diagnostics.
Affiliation(s)
- Henryk Jurasz
  - Department of Immunopathology of Infectious and Parasitic Diseases, Medical University of Warsaw, Warsaw, Poland
- Tomasz Pawłowski
  - Division of Psychotherapy and Psychosomatic Medicine, Department of Psychiatry, Wrocław Medical University, Wrocław, Poland
- Karol Perlejewski
  - Department of Immunopathology of Infectious and Parasitic Diseases, Medical University of Warsaw, Warsaw, Poland
8
Causes and Consequences of HPV Integration in Head and Neck Squamous Cell Carcinomas: State of the Art. Cancers (Basel) 2021; 13:cancers13164089. [PMID: 34439243 PMCID: PMC8394665 DOI: 10.3390/cancers13164089] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 07/19/2021] [Revised: 08/10/2021] [Accepted: 08/11/2021] [Indexed: 12/29/2022]
Abstract
The incidence of head and neck squamous cell carcinomas (HNSCCs) driven by high-risk human papillomaviruses (HPVs), especially those of oropharyngeal origin, is steadily increasing. During persistent infections, viral DNA integration into the host genome may occur. Studies are examining whether the physical status of the virus (episomal vs. integrated) affects carcinogenesis and ultimately has further-reaching consequences for disease progression and outcome. Here, we review the literature of the most recent five years, focusing on the impact of HPV integration in HNSCCs and covering the detection techniques used (from PCR to NGS approaches), the integration loci identified, and associations with genomic and clinical data. The consequences of HPV integration in the human genome, including the methylation status and deregulation of genes involved in cell signaling pathways, immune evasion, and response to therapy, are also summarized.
9
Rajaby R, Zhou Y, Meng Y, Zeng X, Li G, Wu P, Sung WK. SurVirus: a repeat-aware virus integration caller. Nucleic Acids Res 2021; 49:e33. [PMID: 33444454 PMCID: PMC8034624 DOI: 10.1093/nar/gkaa1237] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 09/06/2020] [Revised: 12/01/2020] [Accepted: 01/12/2021] [Indexed: 01/01/2023]
Abstract
A significant portion of human cancers are due to viruses integrating into human genomes. Therefore, accurately predicting virus integrations can help uncover the mechanisms that lead to many devastating diseases. Virus integrations can be called by analysing second generation high-throughput sequencing datasets. Unfortunately, existing methods fail to report a significant portion of integrations, while predicting a large number of false positives. We observe that the inaccuracy is caused by incorrect alignment of reads in repetitive regions. False alignments create false positives, while missing alignments create false negatives. This paper proposes SurVirus, an improved virus integration caller that corrects the alignment of reads which are crucial for the discovery of integrations. We use publicly available datasets to show that existing methods predict hundreds of thousands of false positives; SurVirus, on the other hand, is significantly more precise while it also detects many novel integrations previously missed by other tools, most of which are in repetitive regions. We validate a subset of these novel integrations, and find that the majority are correct. Using SurVirus, we find that HPV and HBV integrations are enriched in LINE and Satellite regions which had been overlooked, as well as discover recurrent HBV and HPV breakpoints in human genome-virus fusion transcripts.
Affiliation(s)
- Ramesh Rajaby
  - School of Computing, National University of Singapore, 13 Computing Drive, 117417, Singapore
  - NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, 28 Medical Drive, 117456, Singapore
- Yi Zhou
  - Agricultural Bioinformatics Key Laboratory of Hubei Province, Hubei Engineering Technology Research Center of Agricultural Big Data, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Yifan Meng
  - Department of Gynecologic Oncology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
  - Cancer Biology Research Center (Key laboratory of the ministry of education), Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Xi Zeng
  - Agricultural Bioinformatics Key Laboratory of Hubei Province, Hubei Engineering Technology Research Center of Agricultural Big Data, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Guoliang Li
  - Agricultural Bioinformatics Key Laboratory of Hubei Province, Hubei Engineering Technology Research Center of Agricultural Big Data, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Peng Wu
  - Department of Gynecologic Oncology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
  - Cancer Biology Research Center (Key laboratory of the ministry of education), Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Wing-Kin Sung
  - School of Computing, National University of Singapore, 13 Computing Drive, 117417, Singapore
  - Agricultural Bioinformatics Key Laboratory of Hubei Province, Hubei Engineering Technology Research Center of Agricultural Big Data, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
  - Genome Institute of Singapore, 60 Biopolis Street, Genome 138672, Singapore
10
Cameron DL, Jacobs N, Roepman P, Priestley P, Cuppen E, Papenfuss AT. VIRUSBreakend: Viral Integration Recognition Using Single Breakends. Bioinformatics 2021; 37:3115-3119. [PMID: 33973999 PMCID: PMC8504616 DOI: 10.1093/bioinformatics/btab343] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/28/2020] [Revised: 03/25/2021] [Accepted: 05/03/2021] [Indexed: 12/17/2022]
Abstract
Motivation: Integration of viruses into infected host cell DNA can cause DNA damage and disrupt genes. Recent cost reductions and growth of whole genome sequencing has produced a wealth of data in which viral presence and integration detection is possible. While key research and clinically relevant insights can be uncovered, existing software has not achieved widespread adoption, limited in part due to high computational costs, the inability to detect a wide range of viruses, as well as precision and sensitivity.
Results: Here, we describe VIRUSBreakend, a high-speed tool that identifies viral DNA presence and genomic integration. It utilizes single breakends, breakpoints in which only one side can be unambiguously placed, in a novel virus-centric variant calling and assembly approach to identify viral integrations with high sensitivity and a near-zero false discovery rate. VIRUSBreakend detects viral integrations anywhere in the host genome including regions such as centromeres and telomeres unable to be called by existing tools. Applying VIRUSBreakend to a large metastatic cancer cohort, we demonstrate that it can reliably detect clinically relevant viral presence and integration including HPV, HBV, MCPyV, EBV and HHV-8.
Availability and implementation: VIRUSBreakend is part of the Genomic Rearrangement IDentification Software Suite (GRIDSS). It is available under a GPLv3 license from https://github.com/PapenfussLab/VIRUSBreakend.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Daniel L Cameron
  - Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, Australia
  - Department of Medical Biology, University of Melbourne, Australia
  - Hartwig Medical Foundation Australia, Sydney, Australia
- Nina Jacobs
  - Hartwig Medical Foundation, Amsterdam, The Netherlands
- Paul Roepman
  - Hartwig Medical Foundation, Amsterdam, The Netherlands
- Edwin Cuppen
  - Hartwig Medical Foundation, Amsterdam, The Netherlands
  - Center for Molecular Medicine and Oncode Institute, University Medical Center Utrecht, Utrecht, The Netherlands
- Anthony T Papenfuss
  - Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, Australia
  - Department of Medical Biology, University of Melbourne, Australia
  - Peter MacCallum Cancer Centre, Melbourne, Australia
  - Sir Peter MacCallum Department of Oncology, University of Melbourne, Australia
11
Chen X, Kost J, Li D. Comprehensive comparative analysis of methods and software for identifying viral integrations. Brief Bioinform 2020; 20:2088-2097. [PMID: 30102374 DOI: 10.1093/bib/bby070] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 05/18/2018] [Revised: 07/02/2018] [Accepted: 07/12/2018] [Indexed: 12/13/2022]
Abstract
Many viruses are capable of integrating in the human genome, particularly viruses involved in tumorigenesis. Viral integrations can be considered genetic markers for discovering virus-caused cancers and inferring cancer cell development. Next-generation sequencing (NGS) technologies have been widely used to screen for viral integrations in cancer genomes, and a number of bioinformatics tools have been developed to detect viral integrations using NGS data. However, there has been no systematic comparison of the methods or software. In this study, we performed a comprehensive comparative analysis of the designs, performance, functionality and limitations among the existing methods and software for detecting viral integrations. We further compared the sensitivity, precision and runtime of integration detection of four representative tools. Our analyses showed that each of the existing software had its own merits; however, none of them were sufficient for parallel or accurate virome-wide detection. After carefully evaluating the limitations shared by the existing methods, we proposed strategies and directions for developing virome-wide integration detection.
Affiliation(s)
- Xun Chen
  - Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
- Jason Kost
  - Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
- Dawei Li
  - Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
  - Department of Computer Science, University of Vermont, Burlington, Vermont 05405, USA
  - Neuroscience, Behavior, and Health Initiative, University of Vermont, Burlington, Vermont 05405, USA
  - Cancer Center, University of Vermont, Burlington, Vermont 05405, USA
12
Giannuzzi D, Aresu L. A First NGS Investigation Suggests No Association Between Viruses and Canine Cancers. Front Vet Sci 2020; 7:365. [PMID: 32766289 PMCID: PMC7380080 DOI: 10.3389/fvets.2020.00365] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/12/2020] [Accepted: 05/26/2020] [Indexed: 12/16/2022]
Abstract
Approximately 10–15% of worldwide human cancers are attributable to viral infection. When operating as carcinogenic elements, viruses may act through various mechanisms, but the most important is viral integration into the host genome, which causes chromosome instability, genomic mutations, and aberrations. In dogs, few reports have described an association between viral integration and cancer, and more comprehensive studies are needed. Advances in next-generation sequencing and falling costs have led to a progressive increase in sequencing data in veterinary oncology, offering an opportunity to study the virome in canine cancers. In this study, we performed viral detection and integration analyses using the VirusFinder2 software tool on available whole-genome and whole-exome sequencing data from different canine cancers. Several viral sequences were detected in lymphomas, hemangiosarcomas, melanomas, and osteosarcomas, but no reliable integration sites were identified. Despite some limitations, such as the depth and type of sequencing, the restricted number of software tools available for nonhuman genomes, and limited knowledge of endogenous retroviruses in the canine genome, the results are compelling. However, further experiments are needed, and, as in the feline species, dedicated analysis tools for identifying viral integration sites in canine cancers are required.
Affiliation(s)
- Diana Giannuzzi
- Department of Comparative Biomedicine and Food Science, University of Padua, Legnaro, Italy
- Luca Aresu
- Department of Veterinary Science, University of Turin, Grugliasco, Italy
13
|
Chakravorty S, Yan B, Wang C, Wang L, Quaid JT, Lin CF, Briggs SD, Majumder J, Canaria DA, Chauss D, Chopra G, Olson MR, Zhao B, Afzali B, Kazemian M. Integrated Pan-Cancer Map of EBV-Associated Neoplasms Reveals Functional Host-Virus Interactions. Cancer Res 2019; 79:6010-6023. [PMID: 31481499 DOI: 10.1158/0008-5472.can-19-0615] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 02/20/2019] [Revised: 06/24/2019] [Accepted: 08/27/2019] [Indexed: 12/20/2022]
Abstract
Epstein-Barr virus (EBV) is a complex oncogenic symbiont. The molecular mechanisms governing EBV carcinogenesis remain elusive and the functional interactions between virus and host cells are incompletely defined. Here we present a comprehensive map of the host cell-pathogen interactome in EBV-associated cancers. We systematically analyzed RNA sequencing from >1,000 patients with 15 different cancer types, comparing virus and host factors of EBV+ to EBV- tissues. EBV preferentially integrated at highly accessible regions of the cancer genome, with significant enrichment in super-enhancer architecture. Twelve EBV transcripts, including LMP1 and LMP2, correlated inversely with EBV reactivation signature. Overexpression of these genes significantly suppressed viral reactivation, consistent with a "virostatic" function. In cancer samples, hundreds of novel frequent missense and nonsense variations in virostatic genes were identified, and variant genes failed to regulate their viral and cellular targets in cancer. For example, one-third of patients with EBV+ NK/T-cell lymphoma carried two novel nonsense variants (Q322X, G342X) of LMP1 and both variant proteins failed to restrict viral reactivation, confirming loss of virostatic function. Host cell transcriptional changes in response to EBV infection classified tumors into two molecular subtypes based on patterns of IFN signature genes and immune checkpoint markers, such as PD-L1 and IDO1. Overall, these findings uncover novel points of interaction between a common oncovirus and the human genome and identify novel regulatory nodes and druggable targets for individualized EBV and cancer-specific therapies. SIGNIFICANCE: This study provides a comprehensive map of the host cell-pathogen interactome in EBV+ malignancies. See related commentary by Mbulaiteye and Prokunina-Olsson, p. 5917.
Affiliation(s)
- Bingyu Yan
- Department of Biochemistry, Purdue University, West Lafayette, Indiana
- Chong Wang
- Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
- Luopin Wang
- Department of Computer Science, Purdue University, West Lafayette, Indiana
- Chin Fang Lin
- Department of Agricultural and Biological Engineering, Purdue University, West Lafayette, Indiana
- Scott D Briggs
- Department of Biochemistry, Purdue University, West Lafayette, Indiana
- Joydeb Majumder
- Department of Chemistry, Purdue University, West Lafayette, Indiana
- D Alejandro Canaria
- Department of Biological Science, Purdue University, West Lafayette, Indiana
- Daniel Chauss
- Immunoregulation Section, Kidney Diseases Branch, National Institute of Diabetes and Digestive and Kidney Diseases, NIH, Bethesda, Maryland
- Gaurav Chopra
- Department of Chemistry, Purdue University, West Lafayette, Indiana
- Matthew R Olson
- Department of Biological Science, Purdue University, West Lafayette, Indiana
- Bo Zhao
- Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
- Behdad Afzali
- Immunoregulation Section, Kidney Diseases Branch, National Institute of Diabetes and Digestive and Kidney Diseases, NIH, Bethesda, Maryland
- Majid Kazemian
- Department of Biochemistry, Purdue University, West Lafayette, Indiana; Department of Computer Science, Purdue University, West Lafayette, Indiana
14
|
Zhu C, Wu L, Lv Y, Guan J, Bai X, Lin J, Liu T, Yang X, Robson SC, Sang X, Xue C, Zhao H. The fusion landscape of hepatocellular carcinoma. Mol Oncol 2019; 13:1214-1225. [PMID: 30903738 PMCID: PMC6487730 DOI: 10.1002/1878-0261.12479] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 08/04/2018] [Revised: 02/26/2019] [Accepted: 02/27/2019] [Indexed: 12/30/2022]
Abstract
Most cases of hepatocellular carcinoma (HCC) are already advanced at the time of diagnosis, which limits treatment options. Challenges in early‐stage diagnosis may be due to the genetic complexity of HCC. Gene fusion plays a critical function in tumorigenesis and cancer progression in multiple cancers, yet the identities of fusion genes as potential diagnostic markers in HCC have not been investigated. Here, we employed STAR‐Fusion and identified 43 recurrent fusion events in our own and four public RNA‐seq datasets. We identified 2354 different gene fusions in two hepatitis B virus (HBV)‐HCC patients. Validation analysis against the four RNA‐seq datasets revealed that only 1.8% (43/2354) were recurrent fusions. Comparison with the four fusion databases demonstrated that 19 recurrent fusions were not previously annotated to diseases and three were annotated as disease‐related fusion events. Finally, we validated six of the novel fusion events, including RP11‐476K15.1‐CTD‐2015H3.2, by RT‐PCR and Sanger sequencing of 14 pairs of HBV‐related HCC samples. In summary, our study provides new insights into gene fusions in HCC and may contribute to the development of anti‐HCC therapy.
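The recurrence rate quoted in this abstract can be checked with one line of arithmetic. The helper name `recurrent_fraction` is hypothetical; only the two counts (43 recurrent fusions among 2354 detected fusions) come from the abstract.

```python
def recurrent_fraction(recurrent: int, total: int) -> float:
    """Fraction of detected fusions that recur across datasets."""
    if total <= 0:
        raise ValueError("total must be positive")
    return recurrent / total


# Counts reported in the abstract: 43 recurrent fusions out of 2354 detected.
pct = round(100 * recurrent_fraction(43, 2354), 1)
```

Rounded to one decimal place, this reproduces the 1.8% figure given in the abstract.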
Affiliation(s)
- Chengpei Zhu
- Department of Liver Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Liangcai Wu
- Department of Liver Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yanling Lv
- Department of Liver Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China; My Health Gene Technology Co., Ltd., Service Centre of Tianjin Chentang Science and Technology Commercial District, China
- Jinxia Guan
- Department of Liver Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China; My Health Gene Technology Co., Ltd., Service Centre of Tianjin Chentang Science and Technology Commercial District, China
- Xue Bai
- Department of Liver Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Jianzhen Lin
- Department of Liver Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Tingting Liu
- My Health Gene Technology Co., Ltd., Service Centre of Tianjin Chentang Science and Technology Commercial District, China
- Xiaobo Yang
- Department of Liver Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Simon C Robson
- Liver Center and The Transplant Institute, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
- Xinting Sang
- Department of Liver Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Chenghai Xue
- Department of Liver Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China; My Health Gene Technology Co., Ltd., Service Centre of Tianjin Chentang Science and Technology Commercial District, China; Joint Laboratory of Large-scale Medical Data Pattern Mining and Application, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Haitao Zhao
- Department of Liver Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
15
|
Chen X, Kost J, Sulovari A, Wong N, Liang WS, Cao J, Li D. A virome-wide clonal integration analysis platform for discovering cancer viral etiology. Genome Res 2019; 29:819-830. [PMID: 30872350 PMCID: PMC6499315 DOI: 10.1101/gr.242529.118] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 07/31/2018] [Accepted: 03/11/2019] [Indexed: 12/31/2022]
Abstract
Oncoviral infection is responsible for 12%–15% of cancer in humans. Convergent evidence from epidemiology, pathology, and oncology suggests that new viral etiologies for cancers remain to be discovered. Oncoviral profiles can be obtained from cancer genome sequencing data; however, widespread viral sequence contamination and noncausal viruses complicate the process of identifying genuine oncoviruses. Here, we propose a novel strategy to address these challenges by performing virome-wide screening of early-stage clonal viral integrations. To implement this strategy, we developed VIcaller, a novel platform for identifying viral integrations that are derived from any characterized viruses and shared by a large proportion of tumor cells using whole-genome sequencing (WGS) data. The sensitivity and precision were confirmed with simulated and benchmark cancer data sets. By applying this platform to cancer WGS data sets with proven or speculated viral etiology, we newly identified or confirmed clonal integrations of hepatitis B virus (HBV), human papillomavirus (HPV), Epstein-Barr virus (EBV), and BK Virus (BKV), suggesting the involvement of these viruses in early stages of tumorigenesis in affected tumors, such as HBV in TERT and KMT2B (also known as MLL4) gene loci in liver cancer, HPV and BKV in bladder cancer, and EBV in non-Hodgkin's lymphoma. We also showed the capacity of VIcaller to identify integrations from some uncharacterized viruses. This is the first study to systematically investigate the strategy and method of virome-wide screening of clonal integrations to identify oncoviruses. Searching clonal viral integrations with our platform has the capacity to identify virus-caused cancers and discover cancer viral etiologies.
Affiliation(s)
- Xun Chen
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
- Jason Kost
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
- Arvis Sulovari
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
- Nathalie Wong
- Department of Anatomical and Cellular Pathology, Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, NT, Hong Kong 999077, P.R. China
- Winnie S Liang
- Translational Genomics Research Institute, Phoenix, Arizona 85004, USA
- Jian Cao
- Division of Medical Oncology, Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08903, USA; Department of Medicine, Rutgers Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08903, USA
- Dawei Li
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA; Neuroscience, Behavior, and Health Initiative, University of Vermont, Burlington, Vermont 05405, USA; Department of Computer Science, University of Vermont, Burlington, Vermont 05405, USA
16
|
Jiang Y, Sun A, Zhao Y, Ying W, Sun H, Yang X, Xing B, Sun W, Ren L, Hu B, Li C, Zhang L, Qin G, Zhang M, Chen N, Zhang M, Huang Y, Zhou J, Zhao Y, Liu M, Zhu X, Qiu Y, Sun Y, Huang C, Yan M, Wang M, Liu W, Tian F, Xu H, Zhou J, Wu Z, Shi T, Zhu W, Qin J, Xie L, Fan J, Qian X, He F. Proteomics identifies new therapeutic targets of early-stage hepatocellular carcinoma. Nature 2019; 567:257-261. [PMID: 30814741 DOI: 10.1038/s41586-019-0987-8] [Citation(s) in RCA: 611] [Impact Index Per Article: 101.8] [Received: 04/14/2017] [Accepted: 02/01/2019] [Indexed: 12/28/2022]
Abstract
Hepatocellular carcinoma is the third leading cause of deaths from cancer worldwide. Infection with the hepatitis B virus is one of the leading risk factors for developing hepatocellular carcinoma, particularly in East Asia [1]. Although surgical treatment may be effective in the early stages, the five-year overall rate of survival after developing this cancer is only 50-70% [2]. Here, using proteomic and phospho-proteomic profiling, we characterize 110 paired tumour and non-tumour tissues of clinical early-stage hepatocellular carcinoma related to hepatitis B virus infection. Our quantitative proteomic data highlight heterogeneity in early-stage hepatocellular carcinoma: we used this to stratify the cohort into the subtypes S-I, S-II and S-III, each of which has a different clinical outcome. S-III, which is characterized by disrupted cholesterol homeostasis, is associated with the lowest overall rate of survival and the greatest risk of a poor prognosis after first-line surgery. The knockdown of sterol O-acyltransferase 1 (SOAT1), high expression of which is a signature specific to the S-III subtype, alters the distribution of cellular cholesterol, and effectively suppresses the proliferation and migration of hepatocellular carcinoma. Finally, on the basis of a patient-derived tumour xenograft mouse model of hepatocellular carcinoma, we found that treatment with avasimibe, an inhibitor of SOAT1, markedly reduced the size of tumours that had high levels of SOAT1 expression. The proteomic stratification of early-stage hepatocellular carcinoma presented in this study provides insight into the tumour biology of this cancer, and suggests opportunities for personalized therapies that target it.
Affiliation(s)
- Ying Jiang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Aihua Sun
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Yang Zhao
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- College of Life Science and Bioengineering, Beijing University of Technology, Beijing, China
- Wantao Ying
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Huichuan Sun
- Department of Liver Surgery & Transplantation, Liver Cancer Institute and Zhongshan Hospital, Fudan University, Shanghai, China
- Xinrong Yang
- Department of Liver Surgery & Transplantation, Liver Cancer Institute and Zhongshan Hospital, Fudan University, Shanghai, China
- Baocai Xing
- Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Peking University Cancer Hospital & Institute, Beijing, China
- Wei Sun
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Liangliang Ren
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Bo Hu
- Department of Liver Surgery & Transplantation, Liver Cancer Institute and Zhongshan Hospital, Fudan University, Shanghai, China
- Chaoying Li
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Li Zhang
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, China
- Guangrong Qin
- Shanghai Center for Bioinformation Technology, Shanghai Academy of Science and Technology, Shanghai, China
- Menghuan Zhang
- Shanghai Center for Bioinformation Technology, Shanghai Academy of Science and Technology, Shanghai, China
- Ning Chen
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Manli Zhang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Yin Huang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Jinan Zhou
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Yan Zhao
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Mingwei Liu
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Xiaodong Zhu
- Department of Liver Surgery & Transplantation, Liver Cancer Institute and Zhongshan Hospital, Fudan University, Shanghai, China
- Yang Qiu
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Yanjun Sun
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Cheng Huang
- Department of Liver Surgery & Transplantation, Liver Cancer Institute and Zhongshan Hospital, Fudan University, Shanghai, China
- Meng Yan
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Mingchao Wang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Wei Liu
- Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Peking University Cancer Hospital & Institute, Beijing, China
- Fang Tian
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Huali Xu
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Jian Zhou
- Department of Liver Surgery & Transplantation, Liver Cancer Institute and Zhongshan Hospital, Fudan University, Shanghai, China
- Zhenyu Wu
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Tieliu Shi
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, China
- Weimin Zhu
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Jun Qin
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- Lu Xie
- Shanghai Center for Bioinformation Technology, Shanghai Academy of Science and Technology, Shanghai, China
- Jia Fan
- Department of Liver Surgery & Transplantation, Liver Cancer Institute and Zhongshan Hospital, Fudan University, Shanghai, China
- Xiaohong Qian
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
- College of Life Science and Bioengineering, Beijing University of Technology, Beijing, China
- Fuchu He
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
17
|
Xia Y, Liu Y, Deng M, Xi R. Detecting virus integration sites based on multiple related sequencing data by VirTect. BMC Med Genomics 2019; 12:19. [PMID: 30704462 PMCID: PMC6357354 DOI: 10.1186/s12920-018-0461-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 12/14/2022]
Abstract
Background Because tumors often have a high level of intra-tumor heterogeneity, multiple tumor samples from the same patient at different locations or different time points are often sequenced to study intra-tumor heterogeneity or tumor evolution. In virus-related tumors such as human papillomavirus- and hepatitis B virus-related tumors, virus genome integrations can be critical driving events. It is thus important to investigate the integration sites of the virus genomes. Currently, a few algorithms for detecting virus integration sites based on high-throughput sequencing have been developed, but their insufficient performance in terms of sensitivity, specificity and computational complexity hinders their application to sequencing data from multiple related tumor samples. Results We develop VirTect for detecting virus integration sites simultaneously from multiple related-sample data. This algorithm is mainly based on the joint analysis of short reads spanning breakpoints of integration sites from multiple samples. To achieve high specificity and breakpoint accuracy, a local precise sandwich alignment algorithm is used. Simulation and real data analyses show that, compared with other algorithms, VirTect is significantly more sensitive and has a similar or lower false discovery rate. Conclusions VirTect can provide more accurate breakpoint positions and is computationally much more efficient in terms of both memory requirement and computational time.
Affiliation(s)
- Yuchao Xia
- School of Mathematical Sciences, Peking University, Beijing, 100871, China
- Yun Liu
- School of Mathematical Sciences, Peking University, Beijing, 100871, China
- Minghua Deng
- School of Mathematical Sciences, Peking University, Beijing, 100871, China; Center for Quantitative Biology, Peking University, Beijing, 100871, China
- Ruibin Xi
- School of Mathematical Sciences, Peking University, Beijing, 100871, China; Center for Statistical Science, Peking University, Beijing, 100871, China; Center for Data Science, Peking University, Beijing, 100871, China
18
|
Bioinformatics Applications in Advancing Animal Virus Research. Recent Advances in Animal Virology 2019. [PMCID: PMC7121192 DOI: 10.1007/978-981-13-9073-9_23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Abstract
Viruses serve as infectious agents for all living entities. Various research groups have focused on understanding viruses in terms of their host-viral relationships, pathogenesis and immune evasion. With current advances in science, the research field has now widened to the ‘omics’ level, and the generation of viral sequence data has been increasing. Numerous bioinformatics tools are available that not only aid in analysing such sequence data but also help deduce useful information that can be exploited in developing preventive and therapeutic measures. This chapter elaborates on bioinformatics tools that are specifically designed for animal viruses as well as other generic tools that can be exploited to study animal viruses. The chapter further provides information on tools that can be used to study viral epidemiology, phylogenetic analysis, structural modelling of proteins, epitope recognition and open reading frame (ORF) recognition, and on tools for analysing host-viral interactions, predicting genes in the viral genome, etc. Various databases that organize information on animal and human viruses are also described. The chapter gives an overview of the current advances, online and downloadable tools, and databases in the field of bioinformatics that will enable researchers to study animal viruses at the gene level.
19
|
Hu Z, Ma D. The precision prevention and therapy of HPV-related cervical cancer: new concepts and clinical implications. Cancer Med 2018; 7:5217-5236. [PMID: 30589505 PMCID: PMC6198240 DOI: 10.1002/cam4.1501] [Citation(s) in RCA: 196] [Impact Index Per Article: 28.0] [Received: 10/26/2017] [Revised: 02/14/2018] [Accepted: 03/21/2018] [Indexed: 12/14/2022]
Abstract
Cervical cancer is the third most common cancer in women worldwide, with concepts and knowledge about its prevention and treatment evolving rapidly. Human papillomavirus (HPV) has been identified as a major factor that leads to cervical cancer, although HPV infection alone cannot cause the disease. In fact, HPV-driven cancer is a low-probability event because most infections are transient and can be cleared spontaneously by the host immune system. With persistent HPV infection, decades are required for progression to cervical cancer. This long time window therefore provides a golden opportunity for clinical intervention, and the foundation for such intervention is to elucidate the carcinogenic pattern and applicable targets during HPV-host interaction. In this review, we discuss the key factors that contribute to the persistence of HPV and cervical carcinogenesis, emerging new concepts and technologies for cancer interventions, and, more urgently, how these concepts and technologies might lead to clinical precision medicine that could provide prediction, prevention, and early treatment for patients.
Affiliation(s)
- Zheng Hu
- Department of Gynecological Oncology, The First Affiliated Hospital of Sun Yat‐sen University, Zhongshan 2nd Road, Yuexiu, Guangzhou, Guangdong, China
- Department of Obstetrics and Gynecology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430030, China
- Ding Ma
- Department of Obstetrics and Gynecology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430030, China
20
|
Baheti S, Tang X, O'Brien DR, Chia N, Roberts LR, Nelson H, Boughey JC, Wang L, Goetz MP, Kocher JPA, Kalari KR. HGT-ID: an efficient and sensitive workflow to detect human-viral insertion sites using next-generation sequencing data. BMC Bioinformatics 2018; 19:271. [PMID: 30016933 PMCID: PMC6050683 DOI: 10.1186/s12859-018-2260-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 01/25/2018] [Accepted: 06/25/2018] [Indexed: 12/11/2022]
Abstract
Background Transfer of genetic material from microbes or viruses into the host genome is known as horizontal gene transfer (HGT). The integration of viruses into the human genome is associated with multiple cancers, and these can now be detected using next-generation sequencing methods such as whole genome sequencing and RNA-sequencing. Results We designed a novel computational workflow, HGT-ID, to identify the integration of viruses into the human genome using the sequencing data. The HGT-ID workflow primarily follows a four-step procedure: i) pre-processing of unaligned reads, ii) virus detection using subtraction approach, iii) identification of virus integration site using discordant and soft-clipped reads and iv) HGT candidates prioritization through a scoring function. Annotation and visualization of the events, as well as primer design for experimental validation, are also provided in the final report. We evaluated the tool performance with the well-understood cervical cancer samples. The HGT-ID workflow accurately detected known human papillomavirus (HPV) integration sites with high sensitivity and specificity compared to previous HGT methods. We applied HGT-ID to The Cancer Genome Atlas (TCGA) whole-genome sequencing data (WGS) from liver tumor-normal pairs. Multiple hepatitis B virus (HBV) integration sites were identified in TCGA liver samples and confirmed by HGT-ID using the RNA-Seq data from the matched liver pairs. This shows the applicability of the method in both the data types and cross-validation of the HGT events in liver samples. We also processed 220 breast tumor WGS data through the workflow; however, there were no HGT events detected in those samples. Conclusions HGT-ID is a novel computational workflow to detect the integration of viruses in the human genome using the sequencing data. It is fast and accurate with functions such as prioritization, annotation, visualization and primer design for future validation of HGTs. 
The HGT-ID workflow is released under the MIT License and available at http://kalarikrlab.org/Software/HGT-ID.html.
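Step iii of the workflow described in this abstract relies on soft-clipped and discordant reads. The sketch below is illustrative only and is not taken from HGT-ID: it assumes reads are available as SAM-style fields (CIGAR, RNAME, RNEXT), and the function names, the `min_clip` threshold of 20 bases, and the reference names used in any call are hypothetical choices made for this example.

```python
import re

# One CIGAR operation: a length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar):
    """Return (leading, trailing) soft-clipped base counts for a CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) may flank soft clips, so drop them before inspecting the ends.
    core = [(n, op) for n, op in ops if op != "H"]
    if not core:
        return (0, 0)
    lead = core[0][0] if core[0][1] == "S" else 0
    trail = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return (lead, trail)


def is_junction_candidate(cigar, min_clip=20):
    """Flag a read whose clipped end is long enough to realign elsewhere.

    min_clip is an assumed, illustrative threshold.
    """
    return max(soft_clips(cigar)) >= min_clip


def is_discordant_pair(rname, rnext, viral_refs):
    """True when one mate maps to a viral reference and the other does not.

    rname and rnext follow SAM conventions: "=" means same reference as
    rname, "*" means unavailable or unmapped.
    """
    if rname == "*" or rnext == "*":
        return False
    mate = rname if rnext == "=" else rnext
    return (rname in viral_refs) != (mate in viral_refs)
```

A read with CIGAR `60M40S` would be flagged as a junction candidate because 40 of its bases did not align to the reference it was mapped to; a fuller pipeline would then realign the clipped portion against the other genome to locate the breakpoint.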
Affiliation(s)
- Saurabh Baheti
  - Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Xiaojia Tang
  - Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Daniel R O'Brien
  - Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Nicholas Chia
  - Department of Surgery, Mayo Clinic, Rochester, MN, USA
- Lewis R Roberts
  - Division of Gastroenterology and Hepatology, Department of Internal Medicine, Mayo Clinic, Rochester, MN, USA
- Heidi Nelson
  - Department of Surgery, Mayo Clinic, Rochester, MN, USA
- Liewei Wang
  - Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN, USA
- Matthew P Goetz
  - Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN, USA
  - Department of Medical Oncology, Mayo Clinic, Rochester, MN, USA
- Jean-Pierre A Kocher
  - Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Krishna R Kalari
  - Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
21
Sekiba K, Otsuka M, Ohno M, Yamagami M, Kishikawa T, Suzuki T, Ishibashi R, Seimiya T, Tanaka E, Koike K. Hepatitis B virus pathogenesis: Fresh insights into hepatitis B virus RNA. World J Gastroenterol 2018; 24:2261-2268. [PMID: 29881235 PMCID: PMC5989240 DOI: 10.3748/wjg.v24.i21.2261] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 03/27/2018] [Revised: 04/24/2018] [Accepted: 04/26/2018] [Indexed: 02/06/2023]
Abstract
Hepatitis B virus (HBV) is still a worldwide health concern. While divergent factors are involved in its pathogenesis, it is now clear that HBV RNAs, principally templates for viral proteins and viral DNAs, have diverse biological functions involved in HBV pathogenesis. These functions include viral replication, hepatic fibrosis and hepatocarcinogenesis. Depending on the sequence similarities, HBV RNAs may act as sponges for host miRNAs and may deregulate miRNA functions, possibly leading to pathological consequences. Some parts of the HBV RNA molecule may function as viral-derived miRNA, which regulates viral replication. HBV DNA can integrate into the host genomic DNA and produce novel viral-host fusion RNA, which may have pathological functions. To date, elimination of HBV-derived covalently closed circular DNA has not been achieved. However, RNA transcription silencing may be an alternative practical approach to treat HBV-induced pathogenesis. A full understanding of HBV RNA transcription and the biological functions of HBV RNA may open a new avenue for the development of novel HBV therapeutics.
Affiliation(s)
- Kazuma Sekiba
  - Department of Gastroenterology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-8655, Japan
- Motoyuki Otsuka
  - Department of Gastroenterology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-8655, Japan
- Motoko Ohno
  - Department of Gastroenterology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-8655, Japan
- Mari Yamagami
  - Department of Gastroenterology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-8655, Japan
- Takahiro Kishikawa
  - Department of Gastroenterology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-8655, Japan
- Tatsunori Suzuki
  - Department of Gastroenterology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-8655, Japan
- Rei Ishibashi
  - Department of Gastroenterology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-8655, Japan
- Takahiro Seimiya
  - Department of Gastroenterology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-8655, Japan
- Eri Tanaka
  - Department of Gastroenterology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-8655, Japan
- Kazuhiko Koike
  - Department of Gastroenterology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-8655, Japan
22
Bhuvaneshwar K, Song L, Madhavan S, Gusev Y. viGEN: An Open Source Pipeline for the Detection and Quantification of Viral RNA in Human Tumors. Front Microbiol 2018; 9:1172. [PMID: 29922260 PMCID: PMC5996193 DOI: 10.3389/fmicb.2018.01172] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 01/26/2018] [Accepted: 05/15/2018] [Indexed: 01/05/2023]
Abstract
An estimated 17% of cancers worldwide are associated with infectious causes. The extent and biological significance of viral presence/infection in actual tumor samples is generally unknown but could be measured using human transcriptome (RNA-seq) data from tumor samples. We present an open source bioinformatics pipeline, viGEN, which allows not only the detection and quantification of viral RNA, but also the identification of variants in the viral transcripts. The pipeline includes four major modules. The first module aligns and filters out human RNA sequences; the second module maps and counts the remaining unaligned reads against reference genomes of all known and sequenced human viruses; the third module quantifies read counts at the individual viral-gene level, thus allowing for downstream differential expression analysis of viral genes between case and control groups; and the fourth module calls variants in these viruses. To the best of our knowledge, there are no publicly available pipelines or packages that provide this type of complete analysis in one open source package. In this paper, we applied the viGEN pipeline to two case studies. We first demonstrate the working of our pipeline on a large public dataset, the TCGA cervical cancer cohort. In the second case study, we performed an in-depth analysis on a small focused study of TCGA liver cancer patients. In the latter cohort, we performed viral-gene quantification, viral-variant extraction and survival analysis. This allowed us to find differentially expressed viral transcripts and viral variants between the groups of patients, and connect them to clinical outcome. From our analyses, we show that we were able to successfully detect the human papilloma virus among the TCGA cervical cancer patients. We compared the viGEN pipeline with two metagenomics tools and demonstrate similar sensitivity/specificity. We were also able to quantify viral transcripts and extract viral variants using the liver cancer dataset.
The results presented corresponded with published literature in terms of rate of detection and the impact of several known variants of the HBV genome. This pipeline is generalizable and can be used to provide novel biological insights into microbial infections in complex diseases and tumorigenesis. Our viral pipeline could be used in conjunction with additional types of immuno-oncology analysis based on RNA-seq data of host RNA for cancer immunology applications. The source code, with example data and a tutorial, is available at: https://github.com/ICBI/viGEN/.
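The third module described in this abstract quantifies reads at the level of individual viral genes. As a rough illustration of that idea, and not a reproduction of viGEN's code, the sketch below counts aligned reads that overlap each gene interval. The function name, the 0-based half-open coordinate convention, and the gene names and coordinates are all hypothetical and do not correspond to any real viral annotation.

```python
def count_reads_per_gene(reads, genes):
    """Count reads overlapping each gene.

    reads: iterable of (start, end) alignment intervals, 0-based half-open.
    genes: dict mapping gene name to a (start, end) interval, same convention.
    A read overlapping several genes is counted once for each of them.
    """
    counts = {name: 0 for name in genes}
    for r_start, r_end in reads:
        for name, (g_start, g_end) in genes.items():
            # Two half-open intervals overlap when each starts before the other ends.
            if r_start < g_end and g_start < r_end:
                counts[name] += 1
    return counts


# Hypothetical toy data: two overlapping genes and four aligned reads.
example_genes = {"geneA": (0, 100), "geneB": (80, 200)}
example_reads = [(10, 60), (90, 140), (150, 210), (300, 350)]
example_counts = count_reads_per_gene(example_reads, example_genes)
```

With the toy data above, the read spanning positions 90 to 140 falls in the region shared by both genes and so contributes to both counts. How to treat such ambiguous reads (count for all genes, split fractionally, or discard) is a design choice that a real quantification tool has to make explicitly.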
Affiliation(s)
- Krithika Bhuvaneshwar
  - Innovation Center for Biomedical Informatics, Georgetown University, Washington, DC, United States
- Lei Song
  - Innovation Center for Biomedical Informatics, Georgetown University, Washington, DC, United States
- Subha Madhavan
  - Innovation Center for Biomedical Informatics, Georgetown University, Washington, DC, United States
- Yuriy Gusev
  - Innovation Center for Biomedical Informatics, Georgetown University, Washington, DC, United States
23
Nooij S, Schmitz D, Vennema H, Kroneman A, Koopmans MPG. Overview of Virus Metagenomic Classification Methods and Their Biological Applications. Front Microbiol 2018; 9:749. [PMID: 29740407 PMCID: PMC5924777 DOI: 10.3389/fmicb.2018.00749] [Citation(s) in RCA: 84] [Impact Index Per Article: 12.0] [Received: 12/08/2017] [Accepted: 04/03/2018] [Indexed: 12/20/2022]
Abstract
Metagenomics poses opportunities for clinical and public health virology applications by offering a way to assess complete taxonomic composition of a clinical sample in an unbiased way. However, the techniques required are complicated and analysis standards have yet to develop. This, together with the wealth of different tools and workflows that have been proposed, poses a barrier for new users. We evaluated 49 published computational classification workflows for virus metagenomics in a literature review. To this end, we described the methods of existing workflows by breaking them up into five general steps and assessed their ease-of-use and validation experiments. Performance scores of previous benchmarks were summarized and correlations between methods and performance were investigated. We indicate the potential suitability of the different workflows for (1) time-constrained diagnostics, (2) surveillance and outbreak source tracing, (3) detection of remote homologies (discovery), and (4) biodiversity studies. We provide two decision trees for virologists to help select a workflow for medical or biodiversity studies, as well as directions for future developments in clinical viral metagenomics.
Affiliation(s)
- Sam Nooij
  - Emerging and Endemic Viruses, Centre for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, Netherlands
  - Viroscience Laboratory, Erasmus University Medical Centre, Rotterdam, Netherlands
- Dennis Schmitz
  - Emerging and Endemic Viruses, Centre for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, Netherlands
  - Viroscience Laboratory, Erasmus University Medical Centre, Rotterdam, Netherlands
- Harry Vennema
  - Emerging and Endemic Viruses, Centre for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, Netherlands
- Annelies Kroneman
  - Emerging and Endemic Viruses, Centre for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, Netherlands
- Marion P G Koopmans
  - Emerging and Endemic Viruses, Centre for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, Netherlands
  - Viroscience Laboratory, Erasmus University Medical Centre, Rotterdam, Netherlands
24
Nguyen NPD, Deshpande V, Luebeck J, Mischel PS, Bafna V. ViFi: accurate detection of viral integration and mRNA fusion reveals indiscriminate and unregulated transcription in proximal genomic regions in cervical cancer. Nucleic Acids Res 2018; 46:3309-3325. [PMID: 29579309 PMCID: PMC6283451 DOI: 10.1093/nar/gky180] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 10/13/2017] [Revised: 02/12/2018] [Accepted: 03/05/2018] [Indexed: 12/20/2022]
Abstract
The integration of viral sequences into the host genome is an important driver of tumorigenesis in many viral mediated cancers, notably cervical cancer and hepatocellular carcinoma. We present ViFi, a computational method that combines phylogenetic methods with reference-based read mapping to detect viral integrations. In contrast with read-based reference mapping approaches, ViFi is faster, and shows high precision and sensitivity on both simulated and biological data, even when the integrated virus is a novel strain or highly mutated. We applied ViFi to matched genomic and mRNA data from 68 cervical cancer samples from TCGA and found high concordance between the two. Surprisingly, viral integration resulted in a dramatic transcriptional upregulation in all proximal elements, including LINEs and LTRs that are not normally transcribed. This upregulation is highly correlated with the presence of a viral gene fused with a downstream human element. Moreover, genomic rearrangements suggest the formation of apparent circular extrachromosomal (ecDNA) human-viral structures. Our results suggest the presence of apparent small circular fusion viral/human ecDNA, which correlates with indiscriminate and unregulated expression of proximal genomic elements, potentially contributing to the pathogenesis of HPV-associated cervical cancers. ViFi is available at https://github.com/namphuon/ViFi.
Affiliation(s)
- Nam-phuong D Nguyen
  - Computer Science and Engineering, University of California San Diego, 9500 Gilman Dr, La Jolla, CA 92093, USA
- Viraj Deshpande
  - Computer Science and Engineering, University of California San Diego, 9500 Gilman Dr, La Jolla, CA 92093, USA
- Jens Luebeck
  - Bioinformatics and Systems Biology Program, University of California San Diego, 9500 Gilman Dr, La Jolla, CA 92093, USA
- Paul S Mischel
  - Ludwig Institute for Cancer Research, University of California, San Diego, 9500 Gilman Dr, La Jolla, CA 92093, USA
  - Department of Pathology, University of California, San Diego, 9500 Gilman Dr, La Jolla, CA 92093, USA
  - Moores Cancer Center, University of California San Diego, 9500 Gilman Dr, La Jolla, CA 92093, USA
- Vineet Bafna
  - Computer Science and Engineering, University of California San Diego, 9500 Gilman Dr, La Jolla, CA 92093, USA
25
Cornwell M, Vangala M, Taing L, Herbert Z, Köster J, Li B, Sun H, Li T, Zhang J, Qiu X, Pun M, Jeselsohn R, Brown M, Liu XS, Long HW. VIPER: Visualization Pipeline for RNA-seq, a Snakemake workflow for efficient and complete RNA-seq analysis. BMC Bioinformatics 2018; 19:135. [PMID: 29649993 PMCID: PMC5897949 DOI: 10.1186/s12859-018-2139-9] [Citation(s) in RCA: 142] [Impact Index Per Article: 20.3] [Received: 08/17/2017] [Accepted: 03/26/2018] [Indexed: 02/05/2023]
Abstract
Background: RNA sequencing has become a ubiquitous technology used throughout life sciences as an effective method of measuring RNA abundance quantitatively in tissues and cells. The increase in use of RNA-seq technology has led to the continuous development of new tools for every step of analysis from alignment to downstream pathway analysis. However, effectively using these analysis tools in a scalable and reproducible way can be challenging, especially for non-experts. Results: Using the workflow management system Snakemake, we have developed a user-friendly, fast, efficient, and comprehensive pipeline for RNA-seq analysis. VIPER (Visualization Pipeline for RNA-seq analysis) is an analysis workflow that combines some of the most popular tools to take RNA-seq analysis from raw sequencing data, through alignment and quality control, into downstream differential expression and pathway analysis. VIPER has been created in a modular fashion to allow for the rapid incorporation of new tools to expand its capabilities. This capacity has already been exploited to include very recently developed tools that explore immune infiltrate and T-cell CDR (Complementarity-Determining Regions) reconstruction abilities. The pipeline has been conveniently packaged such that minimal computational skills are required to download and install the dozens of software packages that VIPER uses. Conclusions: VIPER is a comprehensive solution that performs most standard RNA-seq analyses quickly and effectively with a built-in capacity for customization and expansion.
Affiliation(s)
- MacIntosh Cornwell
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Mahesh Vangala
  - University of Massachusetts Medical School, Worcester, MA, 01655, USA
- Len Taing
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
  - Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Zachary Herbert
  - Molecular Biology Core Facilities, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Johannes Köster
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
  - Institute of Human Genetics, University of Duisburg-Essen, Essen, Germany
- Bo Li
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Harvard School of Public Health, Boston, MA, 02215, USA
- Hanfei Sun
  - Department of Bioinformatics, School of Life Sciences, Tongji University, Shanghai, 200092, China
- Taiwen Li
  - State Key Laboratory of Oral Diseases, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Jian Zhang
  - Beijing Institute of Basic Medical Sciences, Beijing, China
- Xintao Qiu
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
  - Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Matthew Pun
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Rinath Jeselsohn
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
  - Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Myles Brown
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
  - Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- X Shirley Liu
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
  - Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Harvard School of Public Health, Boston, MA, 02215, USA
- Henry W Long
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
  - Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
26
Gannon OM, Antonsson A, Bennett IC, Saunders NA. Viral infections and breast cancer - A current perspective. Cancer Lett 2018; 420:182-189. [PMID: 29410005 DOI: 10.1016/j.canlet.2018.01.076] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 10/02/2017] [Revised: 01/08/2018] [Accepted: 01/31/2018] [Indexed: 01/25/2023]
Abstract
Sporadic human breast cancer is the most common cancer to afflict women. Since the discovery, decades ago, of the oncogenic mouse mammary tumour virus, there has been significant interest in the potential aetiologic role of infectious agents in sporadic human breast cancer. To address this, many studies have examined the presence of viruses (e.g. papillomaviruses, herpes viruses and retroviruses), endogenous retroviruses and, more recently, microbes, as a means of implicating them in the aetiology of human breast cancer. Such studies have generated conflicting experimental and clinical reports of the role of infection in breast cancer. This review evaluates the current evidence for a productive oncogenic viral infection in human breast cancer, with a focus on the integration of sensitive and specific next generation sequencing technologies with pathogen discovery. Collectively, the majority of the recent literature using the more powerful next generation sequencing technologies fails to support an oncogenic viral infection being involved in disease causality in breast cancer. On balance, the weight of the current experimental evidence supports the conclusion that viral infection is unlikely to play a significant role in the aetiology of breast cancer.
Affiliation(s)
- O M Gannon
  - University of Queensland Diamantina Institute, The Faculty of Medicine, The University of Queensland, Brisbane, Australia
- A Antonsson
  - Department of Population Health, QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston, Queensland 4006, Australia
  - School of Medicine, The University of Queensland, Herston Road, Herston, Queensland 4006, Australia
- I C Bennett
  - School of Medicine, The University of Queensland, Herston Road, Herston, Queensland 4006, Australia
  - Private Practice, The Wesley and St Andrews Hospital, Auchenflower 4066, Australia
- N A Saunders
  - University of Queensland Diamantina Institute, The Faculty of Medicine, The University of Queensland, Brisbane, Australia
27
Cieślik M, Chinnaiyan AM. Cancer transcriptome profiling at the juncture of clinical translation. Nat Rev Genet 2017; 19:93-109. [PMID: 29279605 DOI: 10.1038/nrg.2017.96] [Citation(s) in RCA: 173] [Impact Index Per Article: 21.6] [Indexed: 12/13/2022]
Abstract
Methodological breakthroughs over the past four decades have repeatedly revolutionized transcriptome profiling. Using RNA sequencing (RNA-seq), it has now become possible to sequence and quantify the transcriptional outputs of individual cells or thousands of samples. These transcriptomes provide a link between cellular phenotypes and their molecular underpinnings, such as mutations. In the context of cancer, this link represents an opportunity to dissect the complexity and heterogeneity of tumours and to discover new biomarkers or therapeutic strategies. Here, we review the rationale, methodology and translational impact of transcriptome profiling in cancer.
Affiliation(s)
- Marcin Cieślik
  - Michigan Center for Translational Pathology, University of Michigan
  - Department of Pathology, University of Michigan
- Arul M Chinnaiyan
  - Michigan Center for Translational Pathology, University of Michigan
  - Department of Pathology, University of Michigan
  - Comprehensive Cancer Center, University of Michigan
  - Department of Urology, University of Michigan
  - Howard Hughes Medical Institute, University of Michigan, Ann Arbor, Michigan 48109, USA
28
Yoo S, Wang W, Wang Q, Fiel MI, Lee E, Hiotis SP, Zhu J. A pilot systematic genomic comparison of recurrence risks of hepatitis B virus-associated hepatocellular carcinoma with low- and high-degree liver fibrosis. BMC Med 2017; 15:214. [PMID: 29212479 PMCID: PMC5719570 DOI: 10.1186/s12916-017-0973-7] [Citation(s) in RCA: 57] [Impact Index Per Article: 7.1] [Received: 07/12/2017] [Accepted: 11/08/2017] [Indexed: 02/08/2023]
Abstract
BACKGROUND Chronic hepatitis B virus (HBV) infection leads to liver fibrosis, which is a major risk factor in hepatocellular carcinoma (HCC) and an independent risk factor of recurrence after HCC tumor resection. The HBV genome can be inserted into the human genome, and chronic inflammation may trigger somatic mutations. However, how HBV integration and other genomic changes contribute to the risk of tumor recurrence with regard to the different degrees of liver fibrosis is not clearly understood. METHODS We sequenced mRNAs of 21 pairs of tumor and distant non-neoplastic liver tissues of HBV-HCC patients and performed comprehensive genomic analyses of our RNA-seq data and publicly available HBV-HCC sequencing data. RESULTS We developed a robust pipeline for sensitively identifying HBV integration sites based on sequencing data. Simulations showed that our method outperformed existing methods. Applying it to our data, we identified 374 and 106 HBV host genes in non-neoplastic liver and tumor tissues, respectively. When applying it to other RNA sequencing datasets, we consistently identified more HBV integrations in non-neoplastic liver than in tumor tissues. HBV host genes identified in non-neoplastic liver samples significantly overlapped with known tumor suppressor genes. More significant enrichment of tumor suppressor genes was observed among HBV host genes identified from patients with tumor recurrence, indicating the potential risk of tumor recurrence driven by HBV integration in non-neoplastic liver tissues. We also compared SNPs of each sample with SNPs in a cancer census database and inferred samples' pathogenic SNP loads. Pathogenic SNP loads in non-neoplastic liver tissues were consistently higher than those in normal liver tissues.
Additionally, HBV host genes identified in non-neoplastic liver tissues significantly overlapped with pathogenic somatic mutations, suggesting that HBV integration and somatic mutations targeting the same set of genes are important to tumorigenesis. HBV integrations and pathogenic mutations showed distinct patterns between low and high liver fibrosis patients with regard to tumor recurrence. CONCLUSIONS The results suggest that HBV integrations and pathogenic SNPs in non-neoplastic tissues are important for tumorigenesis and that different recurrence risk models are needed for patients with low and high degrees of liver fibrosis.
Affiliation(s)
- Seungyeul Yoo
  - Department of Genetics and Genomic Sciences, Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Wenhui Wang
  - Department of Genetics and Genomic Sciences, Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Qin Wang
  - Department of Surgery, Division of Surgical Oncology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- M Isabel Fiel
  - Department of Pathology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Eunjee Lee
  - Department of Genetics and Genomic Sciences, Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Sema4, a Mount Sinai venture, Stamford, CT, USA
- Spiros P Hiotis
  - Department of Surgery, Division of Surgical Oncology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Jun Zhu
  - Department of Genetics and Genomic Sciences, Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Sema4, a Mount Sinai venture, Stamford, CT, USA
29
Shieh FS, Jongeneel P, Steffen JD, Lin S, Jain S, Song W, Su YH. ChimericSeq: An open-source, user-friendly interface for analyzing NGS data to identify and characterize viral-host chimeric sequences. PLoS One 2017; 12:e0182843. [PMID: 28829778 PMCID: PMC5567911 DOI: 10.1371/journal.pone.0182843] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 02/07/2017] [Accepted: 07/25/2017] [Indexed: 11/18/2022]
Abstract
Identification of viral integration sites has been important in understanding the pathogenesis and progression of diseases associated with particular viral infections. The advent of next-generation sequencing (NGS) has enabled researchers to understand the impact that viral integration has on the host, such as tumorigenesis. Current computational methods to analyze NGS data of virus-host junction sites have been limited in terms of their accessibility to a broad user base. In this study, we developed a software application, ChimericSeq, that is the first program of its kind to offer a graphical user interface, to be compatible with both Windows and Mac operating systems, and to be optimized for effectively identifying and annotating virus-host chimeric reads within NGS data. In addition, ChimericSeq's pipeline implements custom filtering to remove artifacts and detect reads with quantitative analytical reporting to provide functional significance to discovered integration sites. The improved accessibility of ChimericSeq through a GUI on both Windows and Mac has the potential to expand NGS analytical support to a broader spectrum of the scientific community.
Affiliation(s)
- Fwu-Shan Shieh
  - JBS Science, Inc., Doylestown, Pennsylvania, United States of America
  - U-Screen Dx Inc., Doylestown, Pennsylvania, United States of America
- Patrick Jongeneel
  - JBS Science, Inc., Doylestown, Pennsylvania, United States of America
- Jamin D. Steffen
  - JBS Science, Inc., Doylestown, Pennsylvania, United States of America
- Selena Lin
  - U-Screen Dx Inc., Doylestown, Pennsylvania, United States of America
  - Drexel University College of Medicine, Philadelphia, Pennsylvania, United States of America
- Surbhi Jain
  - JBS Science, Inc., Doylestown, Pennsylvania, United States of America
- Wei Song
  - JBS Science, Inc., Doylestown, Pennsylvania, United States of America
  - U-Screen Dx Inc., Doylestown, Pennsylvania, United States of America
  - * E-mail: (Y.H.S.); (W.S.)
- Ying-Hsiu Su
  - Drexel University College of Medicine, Philadelphia, Pennsylvania, United States of America
  - The Baruch S. Blumberg Institute, Doylestown, Pennsylvania, United States of America
  - * E-mail: (Y.H.S.); (W.S.)
30
Santander CG, Gambron P, Marchi E, Karamitros T, Katzourakis A, Magiorkinis G. STEAK: A specific tool for transposable elements and retrovirus detection in high-throughput sequencing data. Virus Evol 2017; 3:vex023. [PMID: 28948042 PMCID: PMC5597868 DOI: 10.1093/ve/vex023] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 12/11/2022]
Abstract
Advances in high-throughput genomics have unveiled much about the human genome, highlighting the importance of variations between individuals and their contribution to disease. Even though numerous software tools have been developed to make sense of large genomics datasets, a major shortcoming of these has been the inability to cope with repetitive regions, specifically to validate structural variants and accordingly assess their role in disease. Here we describe our program STEAK, a massively parallel software tool designed to detect chimeric reads in high-throughput sequencing data for a broad range of applications, such as identifying the presence or absence of, as well as discovering, transposable elements (TEs) and retroviral integrations. We highlight the capabilities of STEAK by comparing its efficacy in locating HERV-K HML-2 in clinical whole genome projects, target enrichment sequences, and the 1000 Genomes CEU Trio to the performance of other TE and virus detecting tools. We show that STEAK outperforms other software in terms of computational efficiency, sensitivity, and specificity. We demonstrate that STEAK is a robust tool, which allows analysts to flexibly detect and evaluate TE and retroviral integrations in a diverse range of sequencing projects for both research and clinical purposes.
Affiliation(s)
- Philippe Gambron
- Science and Technology Facilities Council, Rutherford Appleton Laboratory, Harwell Science and Innovation Campus, Didcot, Oxfordshire, UK
- Emanuele Marchi
- Nuffield Department of Medicine, University of Oxford, Oxfordshire, UK
- Gkikas Magiorkinis
- Department of Zoology, University of Oxford, Oxfordshire, UK
- Department of Hygiene, Epidemiology and Medical Statistics, Medical School, National and Kapodistrian University of Athens, Athens, Greece
31
Rosewick N, Durkin K, Artesi M, Marçais A, Hahaut V, Griebel P, Arsic N, Avettand-Fenoel V, Burny A, Charlier C, Hermine O, Georges M, Van den Broeke A. Cis-perturbation of cancer drivers by the HTLV-1/BLV proviruses is an early determinant of leukemogenesis. Nat Commun 2017; 8:15264. [PMID: 28534499 PMCID: PMC5457497 DOI: 10.1038/ncomms15264] [Citation(s) in RCA: 68] [Impact Index Per Article: 8.5] [Received: 12/04/2016] [Accepted: 03/14/2017] [Indexed: 12/12/2022] Open
Abstract
Human T-cell leukaemia virus type-1 (HTLV-1) and bovine leukaemia virus (BLV) infect T- and B-lymphocytes, respectively, provoking a polyclonal expansion that will evolve into an aggressive monoclonal leukaemia in ∼5% of individuals following a protracted latency period. It is generally assumed that early oncogenic changes are largely dependent on virus-encoded products, especially TAX and HBZ, while progression to acute leukaemia/lymphoma involves somatic mutations, yet that both are independent of proviral integration site that has been found to be very variable between tumours. Here, we show that HTLV-1/BLV proviruses are integrated near cancer drivers which they affect either by provirus-dependent transcription termination or as a result of viral antisense RNA-dependent cis-perturbation. The same pattern is observed at polyclonal non-malignant stages, indicating that provirus-dependent host gene perturbation contributes to the initial selection of the multiple clones characterizing the asymptomatic stage, requiring additional alterations in the clone that will evolve into full-blown leukaemia/lymphoma. Human T-cell leukaemia virus type-1 and bovine leukaemia virus infect T and B lymphocytes and lead to aggressive leukaemia. Here, the authors show these proviruses integrate near cancer drivers perturbing transcription termination or antisense RNA-dependent interaction, suggesting post-transcriptional mechanisms in some cases.
Affiliation(s)
- Nicolas Rosewick
- Unit of Animal Genomics, GIGA-R, Université de Liège (ULg), Avenue de l'Hôpital 11, B34, Liège 4000, Belgium
- Keith Durkin
- Unit of Animal Genomics, GIGA-R, Université de Liège (ULg), Avenue de l'Hôpital 11, B34, Liège 4000, Belgium
- Maria Artesi
- Unit of Animal Genomics, GIGA-R, Université de Liège (ULg), Avenue de l'Hôpital 11, B34, Liège 4000, Belgium
- Ambroise Marçais
- Service d'hématologie, Hôpital Universitaire Necker, Université René Descartes, Assistance publique hôpitaux de Paris, 149-161 rue de Sèvres, Paris 75010, France
- Vincent Hahaut
- Unit of Animal Genomics, GIGA-R, Université de Liège (ULg), Avenue de l'Hôpital 11, B34, Liège 4000, Belgium
- Philip Griebel
- Vaccine and Infectious Disease Organization, VIDO-Intervac, University of Saskatchewan, 120 Veterinary Road, Saskatoon, Canada S7N 5E3
- Natasa Arsic
- Vaccine and Infectious Disease Organization, VIDO-Intervac, University of Saskatchewan, 120 Veterinary Road, Saskatoon, Canada S7N 5E3
- Véronique Avettand-Fenoel
- Laboratoire de Virologie, AP-HP, Hôpital Necker-Enfants Malades, Université Paris Descartes, Sorbonne Paris Cité, EA7327, 149 rue de Sèvres, Paris 75010, France
- Arsène Burny
- Laboratory of Experimental Hematology, Institut Jules Bordet, Université Libre de Bruxelles (ULB), Boulevard de Waterloo 121, Brussels 1000, Belgium
- Carole Charlier
- Unit of Animal Genomics, GIGA-R, Université de Liège (ULg), Avenue de l'Hôpital 11, B34, Liège 4000, Belgium
- Olivier Hermine
- Service d'hématologie, Hôpital Universitaire Necker, Université René Descartes, Assistance publique hôpitaux de Paris, 149-161 rue de Sèvres, Paris 75010, France
- INSERM U1163-ERL8254, Institut Imagine, 24 B Boulevard du Montparnasse, Paris 75010, France
- Michel Georges
- Unit of Animal Genomics, GIGA-R, Université de Liège (ULg), Avenue de l'Hôpital 11, B34, Liège 4000, Belgium
- Anne Van den Broeke
- Unit of Animal Genomics, GIGA-R, Université de Liège (ULg), Avenue de l'Hôpital 11, B34, Liège 4000, Belgium
- Laboratory of Experimental Hematology, Institut Jules Bordet, Université Libre de Bruxelles (ULB), Boulevard de Waterloo 121, Brussels 1000, Belgium
32
Abstract
Background The study of virus integrations in the human genome is important, since virus integrations have been shown to be associated with diseases. In the literature, few methods have been proposed that predict virus integrations using next-generation sequencing datasets. Although they work, they are slow and not very sensitive. Results and discussion This paper introduces a new method, BatVI, to predict viral integrations. Our method uses a fast screening step to extract chimeric reads containing possible viral integrations. Next, sensitive alignments of these candidate chimeric reads are computed with BLAST. Chimeric reads that are co-localized in the human genome are clustered. Finally, by assembling the chimeric reads in each cluster, high-confidence virus integration sites are extracted. Conclusion We compared the performance of BatVI with the existing methods VirusFinder and VirusSeq using both simulated and real-life datasets of liver cancer patients. BatVI ran an order of magnitude faster and was able to predict almost twice the number of true positives compared to other methods while maintaining a false positive rate of less than 1%. For the liver cancer datasets, BatVI uncovered novel integrations into two important genes, TERT and MLL4, which were missed by previous studies. Through gene expression data, we verified the correctness of these additional integrations. BatVI can be downloaded from http://biogpu.ddns.comp.nus.edu.sg/~ksung/batvi/index.html.
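The clustering step described in this abstract (grouping chimeric reads that are co-localized in the human genome) can be illustrated with a minimal Python sketch. This is not BatVI's code: the function name `cluster_breakpoints`, the `(read_id, chrom, pos)` input format, and the 500 bp window are all assumptions made for illustration.

```python
from collections import defaultdict


def cluster_breakpoints(hits, window=500):
    """Group candidate chimeric reads whose human-side breakpoints co-localize.

    hits:   iterable of (read_id, chrom, pos) tuples (hypothetical input format).
    window: largest gap in bp allowed between neighbouring breakpoints within
            one cluster (an assumed value, not taken from the paper).
    Returns a list of (chrom, start, end, [read_ids]) tuples.
    """
    by_chrom = defaultdict(list)
    for read_id, chrom, pos in hits:
        by_chrom[chrom].append((pos, read_id))

    clusters = []
    for chrom in sorted(by_chrom):
        current = []
        # Walk breakpoints in coordinate order and start a new cluster
        # whenever the gap to the previous breakpoint exceeds the window.
        for pos, read_id in sorted(by_chrom[chrom]):
            if current and pos - current[-1][0] > window:
                clusters.append(_summarise(chrom, current))
                current = []
            current.append((pos, read_id))
        if current:
            clusters.append(_summarise(chrom, current))
    return clusters


def _summarise(chrom, members):
    """Collapse one cluster into (chrom, start, end, read_ids)."""
    positions = [pos for pos, _ in members]
    return (chrom, min(positions), max(positions), [rid for _, rid in members])
```

In a pipeline of the kind the abstract outlines, each returned cluster would then be passed to a local assembly step. Single-read clusters could be dropped beforehand as low-confidence candidates, although that filter is also an assumption here.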
Affiliation(s)
- Chandana Tennakoon
- Department of Computational and Systems Biology, Genome Institute of Singapore, Singapore, 138672, Singapore
- UAE University, PO Box, 17551, Al Ain, United Arab Emirates
- Wing Kin Sung
- Department of Computational and Systems Biology, Genome Institute of Singapore, Singapore, 138672, Singapore
- Department of Computer Science, National University of Singapore, Singapore, 117417, Singapore
33
Afzal S, Wilkening S, von Kalle C, Schmidt M, Fronza R. GENE-IS: Time-Efficient and Accurate Analysis of Viral Integration Events in Large-Scale Gene Therapy Data. Mol Ther Nucleic Acids 2016; 6:133-139. [PMID: 28325279 PMCID: PMC5363413 DOI: 10.1016/j.omtn.2016.12.001] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 08/29/2016] [Revised: 11/24/2016] [Accepted: 12/01/2016] [Indexed: 12/22/2022]
Abstract
Integration site profiling and clonality analysis of viral vector distribution in gene therapy is a key factor to monitor the fate of gene-corrected cells, assess the risk of malignant transformation, and establish vector biosafety. We developed the Genome Integration Site Analysis Pipeline (GENE-IS) for highly time-efficient and accurate detection of next-generation sequencing (NGS)-based viral vector integration sites (ISs) in gene therapy data. It is the first available tool with dual analysis mode that allows IS analysis both in data generated by PCR-based methods, such as linear amplification method PCR (LAM-PCR), and by rapidly evolving targeted sequencing (e.g., Agilent SureSelect) technologies. GENE-IS makes use of trimming strategies, customized reference genome, and soft-clipped information with sequential filtering steps to provide annotated IS with clonality information. It is a scalable, robust, precise, and reliable tool for large-scale pre-clinical and clinical data analysis that provides users complete flexibility and control over analysis with a broad range of configurable parameters. GENE-IS is available at https://github.com/G100DKFZ/gene-is.
Affiliation(s)
- Saira Afzal
- Department of Translational Oncology, National Center for Tumor Diseases (NCT) and German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Stefan Wilkening
- Department of Translational Oncology, National Center for Tumor Diseases (NCT) and German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Christof von Kalle
- Department of Translational Oncology, National Center for Tumor Diseases (NCT) and German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Manfred Schmidt
- Department of Translational Oncology, National Center for Tumor Diseases (NCT) and German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Raffaele Fronza
- Department of Translational Oncology, National Center for Tumor Diseases (NCT) and German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
34
Jackson R, Rosa BA, Lameiras S, Cuninghame S, Bernard J, Floriano WB, Lambert PF, Nicolas A, Zehbe I. Functional variants of human papillomavirus type 16 demonstrate host genome integration and transcriptional alterations corresponding to their unique cancer epidemiology. BMC Genomics 2016; 17:851. [PMID: 27806689 PMCID: PMC5094076 DOI: 10.1186/s12864-016-3203-3] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 05/19/2016] [Accepted: 10/25/2016] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND Human papillomaviruses (HPVs) are a worldwide burden as they are a widespread group of tumour viruses in humans. Having a tropism for mucosal tissues, high-risk HPVs are detected in nearly all cervical cancers. HPV16 is the most common high-risk type but not all women infected with high-risk HPV develop a malignant tumour. Likely relevant, HPV genomes are polymorphic and some HPV16 single nucleotide polymorphisms (SNPs) are under evolutionary constraint instigating variable oncogenicity and immunogenicity in the infected host. RESULTS To investigate the tumourigenicity of two common HPV16 variants, we used our recently developed, three-dimensional organotypic model reminiscent of the natural HPV infectious cycle and conducted various "omics" and bioinformatics approaches. Based on epidemiological studies we chose to examine the HPV16 Asian-American (AA) and HPV16 European Prototype (EP) variants. They differ by three non-synonymous SNPs in the transforming and virus-encoded E6 oncogene where AAE6 is classified as a high- and EPE6 as a low-risk variant. Remarkably, the high-risk AAE6 variant genome integrated into the host DNA, while the low-risk EPE6 variant genome remained episomal as evidenced by highly sensitive Capt-HPV sequencing. RNA-seq experiments showed that the truncated form of AAE6, integrated in chromosome 5q32, produced a local gene over-expression and a large variety of viral-human fusion transcripts, including long distance spliced transcripts. In addition, differential enrichment of host cell pathways was observed between both HPV16 E6 variant-containing epithelia. Finally, in the high-risk variant, we detected a molecular signature of host chromosomal instability, a common property of cancer cells. CONCLUSIONS We show how naturally occurring SNPs in the HPV16 E6 oncogene cause significant changes in the outcome of HPV infections and subsequent viral and host transcriptome alterations prone to drive carcinogenesis. Host genome instability is closely linked to viral integration into the host genome of HPV-infected cells, which is a key phenomenon for malignant cellular transformation and the reason for uncontrolled E6 oncogene expression. In particular, the finding of variant-specific integration potential represents a new paradigm in HPV variant biology.
Affiliation(s)
- Robert Jackson
- Probe Development and Biomarker Exploration, Thunder Bay Regional Research Institute, Thunder Bay, Ontario, Canada
- Biotechnology Program, Lakehead University, Thunder Bay, Ontario, Canada
- Bruce A Rosa
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
- Sonia Lameiras
- NGS platform, Institut Curie, PSL Research University, 26 rue d'Ulm, 75248, Paris, Cedex, France
- Sean Cuninghame
- Probe Development and Biomarker Exploration, Thunder Bay Regional Research Institute, Thunder Bay, Ontario, Canada
- Northern Ontario School of Medicine, Lakehead University, Thunder Bay, Ontario, Canada
- Josee Bernard
- Probe Development and Biomarker Exploration, Thunder Bay Regional Research Institute, Thunder Bay, Ontario, Canada
- Department of Biology, Lakehead University, Thunder Bay, Ontario, Canada
- Wely B Floriano
- Department of Chemistry, Lakehead University, Thunder Bay, Ontario, Canada
- Paul F Lambert
- McArdle Laboratory for Cancer Research, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA
- Alain Nicolas
- Institut Curie, PSL Research University, Centre National de la Recherche Scientifique UMR3244, Sorbonne Universités, Paris, France
- Ingeborg Zehbe
- Probe Development and Biomarker Exploration, Thunder Bay Regional Research Institute, Thunder Bay, Ontario, Canada
- Northern Ontario School of Medicine, Lakehead University, Thunder Bay, Ontario, Canada
- Department of Biology, Lakehead University, Thunder Bay, Ontario, Canada
35
Feber A, Worth DC, Chakravarthy A, de Winter P, Shah K, Arya M, Saqib M, Nigam R, Malone PR, Tan WS, Rodney S, Freeman A, Jameson C, Wilson GA, Powles T, Beck S, Fenton T, Sharp TV, Muneer A, Kelly JD. CSN1 Somatic Mutations in Penile Squamous Cell Carcinoma. Cancer Res 2016; 76:4720-4727. [PMID: 27325650 PMCID: PMC5302160 DOI: 10.1158/0008-5472.can-15-3134] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.1] [Received: 11/12/2015] [Accepted: 05/09/2016] [Indexed: 12/20/2022]
Abstract
Other than an association with HPV infection, little is known about the genetic alterations determining the development of penile cancer. Although penile cancer is rare in the developed world, it presents a significant burden in developing countries. Here, we report the findings of whole-exome sequencing (WES) to determine the somatic mutational landscape of penile cancer. WES was performed on penile cancer and matched germline DNA from 27 patients undergoing surgical resection. Targeted resequencing of candidate genes was performed in an independent 70 patient cohort. Mutation data were also integrated with DNA methylation and copy-number information from the same patients. We identified an HPV-associated APOBEC mutation signature and an NpCpG signature in HPV-negative disease. We also identified recurrent mutations in the novel penile cancer tumor suppressor genes CSN1 (GPS1) and FAT1. Expression of CSN1 mutants in cells resulted in colocalization with AGO2 in cytoplasmic P-bodies, ultimately leading to the loss of miRNA-mediated gene silencing, which may contribute to disease etiology. Our findings represent the first comprehensive analysis of somatic alterations in penile cancer, highlighting the complex landscape of alterations in this malignancy. Cancer Res; 76(16); 4720-7. ©2016 AACR.
Affiliation(s)
- Andrew Feber
- UCL Cancer Institute, University College London, London, United Kingdom
- Daniel C. Worth
- Centre for Molecular Oncology, Barts Cancer Institute, Queen Mary University of London, London, United Kingdom
- Patricia de Winter
- Division of Surgery and Interventional Science, UCL Medical School, University College London, London, United Kingdom
- Kunal Shah
- Centre for Molecular Oncology, Barts Cancer Institute, Queen Mary University of London, London, United Kingdom
- Manit Arya
- Centre for Molecular Oncology, Barts Cancer Institute, Queen Mary University of London, London, United Kingdom
- Department of Urology, University College Hospital, London, United Kingdom
- Muhammad Saqib
- Department of Urology, University College Hospital, London, United Kingdom
- Raj Nigam
- Department of Urology, The Royal Surrey County Hospital, Surrey, United Kingdom
- Peter R. Malone
- Department of Urology, The Royal Berkshire NHS Foundation Trust, Reading, United Kingdom
- Wei Shen Tan
- Division of Surgery and Interventional Science, UCL Medical School, University College London, London, United Kingdom
- Simon Rodney
- Division of Surgery and Interventional Science, UCL Medical School, University College London, London, United Kingdom
- Alex Freeman
- Department of Histopathology, University College London Hospital, London, United Kingdom
- Charles Jameson
- Department of Histopathology, University College London Hospital, London, United Kingdom
- Gareth A. Wilson
- UCL Cancer Institute, University College London, London, United Kingdom
- Tom Powles
- Experimental Cancer Medicine Centre, Barts Cancer Institute, Barts Health and the Royal Free NHS Trust, Queen Mary University of London, London, United Kingdom
- Stephan Beck
- UCL Cancer Institute, University College London, London, United Kingdom
- Tim Fenton
- UCL Cancer Institute, University College London, London, United Kingdom
- Tyson V. Sharp
- Centre for Molecular Oncology, Barts Cancer Institute, Queen Mary University of London, London, United Kingdom
- Asif Muneer
- Department of Urology, University College Hospital, London, United Kingdom
- NIHR Biomedical Research Centre, University College London Hospitals, London, United Kingdom
- John D. Kelly
- Division of Surgery and Interventional Science, UCL Medical School, University College London, London, United Kingdom
36
Divergent viral presentation among human tumors and adjacent normal tissues. Sci Rep 2016; 6:28294. [PMID: 27339696 PMCID: PMC4919655 DOI: 10.1038/srep28294] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Received: 03/05/2016] [Accepted: 05/26/2016] [Indexed: 12/13/2022] Open
Abstract
We applied a newly developed bioinformatics system called VirusScan to investigate the viral basis of 6,813 human tumors and 559 adjacent normal samples across 23 cancer types and identified 505 virus positive samples with distinctive, organ system- and cancer type-specific distributions. We found that herpes viruses (e.g., subtypes HHV4, HHV5, and HHV6) that are highly prevalent across cancers of the digestive tract showed significantly higher abundances in tumor versus adjacent normal samples, supporting their association with these cancers. We also found three HPV16-positive samples in brain lower grade glioma (LGG). Further, recurrent HBV integration at the KMT2B locus is present in three liver tumors, but absent in their matched adjacent normal samples, indicating that viral integration induced host driver genetic alterations are required on top of viral oncogene expression for initiation and progression of liver hepatocellular carcinoma. Notably, viral integrations were found in many genes, including novel recurrent HPV integrations at PTPN13 in cervical cancer. Finally, we observed a set of HHV4 and HBV variants strongly associated with ethnic groups, likely due to viral sequence evolution under environmental influences. These findings provide important new insights into viral roles of tumor initiation and progression and potential new therapeutic targets.
37
Virus-Clip: a fast and memory-efficient viral integration site detection tool at single-base resolution with annotation capability. Oncotarget 2016; 6:20959-63. [PMID: 26087185 PMCID: PMC4673242 DOI: 10.18632/oncotarget.4187] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 01/20/2015] [Accepted: 05/12/2015] [Indexed: 01/30/2023] Open
Abstract
Viral integration into the human genome upon infection is an important risk factor for various human malignancies. We developed a viral integration site detection tool called Virus-Clip, which makes use of information extracted from soft-clipped sequencing reads to identify exact positions of human and virus breakpoints of integration events. With initial read alignment to the virus reference genome and streamlined procedures, Virus-Clip delivers a simple, fast and memory-efficient solution to viral integration site detection. Moreover, it can also automatically annotate the integration events with the corresponding affected human genes. Virus-Clip has been verified using whole-transcriptome sequencing data and its detection was validated to have satisfactory sensitivity and specificity. Marked advancement in performance was detected, compared to existing tools. It is applicable to various types of data including whole-genome sequencing, whole-transcriptome sequencing, and targeted sequencing. Virus-Clip is available at http://web.hku.hk/~dwhho/Virus-Clip.zip.
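The soft-clip idea in this abstract can be sketched in a few lines of Python: when a read is aligned to the virus reference, a soft-clipped end marks a candidate breakpoint at the edge of the aligned segment. The sketch below is not Virus-Clip's implementation. It relies only on the SAM conventions for `POS` (1-based leftmost aligned position) and CIGAR operations, and the function name `soft_clip_breakpoints` and the 20-base `min_clip` threshold are assumptions for illustration.

```python
import re

# One CIGAR element: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference sequence.
REF_CONSUMING = set("MDN=X")


def soft_clip_breakpoints(pos, cigar, min_clip=20):
    """Return candidate breakpoints implied by soft clips in one alignment.

    pos:      1-based leftmost aligned reference position (SAM POS field).
    cigar:    SAM CIGAR string, e.g. "60M40S".
    min_clip: minimum soft-clip length to report (assumed threshold).
    Returns a list of (side, reference_position, clip_length) tuples.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips carry no sequence, so ignore them when looking at read ends.
    core = [(n, op) for n, op in ops if op != "H"]
    if not core:
        return []

    ref_len = sum(n for n, op in core if op in REF_CONSUMING)
    found = []
    # Clip at the read start: breakpoint is the first aligned reference base.
    if core[0][1] == "S" and core[0][0] >= min_clip:
        found.append(("left", pos, core[0][0]))
    # Clip at the read end: breakpoint is the last aligned reference base.
    if len(core) > 1 and core[-1][1] == "S" and core[-1][0] >= min_clip:
        found.append(("right", pos + ref_len - 1, core[-1][0]))
    return found
```

In a workflow like the one the abstract describes, the clipped bases themselves would next be aligned to the human genome to locate the human side of the junction. That step is omitted here.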
38
Flippot R, Malouf GG, Su X, Khayat D, Spano JP. Oncogenic viruses: Lessons learned using next-generation sequencing technologies. Eur J Cancer 2016; 61:61-8. [PMID: 27156225 DOI: 10.1016/j.ejca.2016.03.086] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 11/29/2015] [Revised: 03/25/2016] [Accepted: 03/30/2016] [Indexed: 01/04/2023]
Abstract
Fifteen percent of cancers are driven by oncogenic human viruses. Four of those viruses, hepatitis B virus, human papillomavirus, Merkel cell polyomavirus, and human T-cell lymphotropic virus, integrate the host genome. Viral oncogenesis is the result of epigenetic and genetic alterations that happen during viral integration. So far, little data have been available regarding integration mechanisms and modifications in the host genome. However, the emergence of high-throughput sequencing and bioinformatic tools enables researchers to establish the landscape of genomic alterations and predict the events that follow viral integration. Cooperative working groups are currently investigating these factors in large data sets. Herein, we provide novel insights into the initiating events of cancer onset during infection with integrative viruses. Although much remains to be discovered, many improvements are expected from the clinical point of view, from better prognosis classifications to better therapeutic strategies.
Affiliation(s)
- Ronan Flippot
- University Hospital Pitié Salpêtrière, Department of Medical Oncology, University Pierre and Marie Curie, Paris, France
- Gabriel G Malouf
- University Hospital Pitié Salpêtrière, Department of Medical Oncology, University Pierre and Marie Curie, Paris, France
- Xiaoping Su
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, U.S.A
- David Khayat
- University Hospital Pitié Salpêtrière, Department of Medical Oncology, University Pierre and Marie Curie, Paris, France
- Jean-Philippe Spano
- University Hospital Pitié Salpêtrière, Department of Medical Oncology, University Pierre and Marie Curie, Paris, France
39
Next-generation sequencing of elite berry germplasm and data analysis using a bioinformatics pipeline for virus detection and discovery. Methods Mol Biol 2016; 1302:301-13. [PMID: 25981263 DOI: 10.1007/978-1-4939-2620-6_22] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 02/09/2023]
Abstract
Berry crops (members of the genera Fragaria, Ribes, Rubus, Sambucus, and Vaccinium) are known hosts for more than 70 viruses and new ones are identified continually. In modern berry cultivars, viruses tend to be asymptomatic in single infections and symptoms only develop after plants accumulate multiple viruses. Most certification programs are based on visual observations. Infected, asymptomatic material may be propagated in the nursery system and shipped to farms where plants acquire additional viruses and develop symptoms. This practice may result in disease epidemics with great impact to producers and the natural ecosystem alike. In this chapter we present work that allows for the detection of known and discovery of new viruses in elite germplasm, having the potential to greatly reduce virus dispersal associated with movement of propagation material.
40
Chakraborty C, George Priya Doss C, Zhu H, Agoramoorthy G. Rising Strengths Hong Kong SAR in Bioinformatics. Interdiscip Sci 2016; 9:224-236. [PMID: 26961385 PMCID: PMC7091071 DOI: 10.1007/s12539-016-0147-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/13/2015] [Revised: 12/07/2015] [Accepted: 01/08/2016] [Indexed: 12/18/2022]
Abstract
Hong Kong's bioinformatics sector is attaining new heights in combination with its economic boom and the predominance of the working-age group in its population. Factors such as a knowledge-based and free-market economy have contributed towards a prominent position on the world map of bioinformatics. In this review, we have considered the educational measures, landmark research activities and the achievements of bioinformatics companies and the role of the Hong Kong government in the establishment of bioinformatics as strength. However, several hurdles remain. New government policies will assist computational biologists to overcome these hurdles and further raise the profile of the field. There is a high expectation that bioinformatics in Hong Kong will be a promising area for the next generation.
Affiliation(s)
- Chiranjib Chakraborty
- Department of Bio-informatics, School of Computer and Information Sciences, Galgotias University, Greater Noida, UP, 201306, India
- Department of Computer Sciences, Hong Kong Baptist University, Kowloon Tong, Hong Kong
- C George Priya Doss
- Medical Biotechnology Division, School of BioSciences and Technology, VIT University, Vellore, TN, 632014, India
- Hailong Zhu
- Department of Computer Sciences, Hong Kong Baptist University, Kowloon Tong, Hong Kong
41
Liang HW, Wang N, Wang Y, Wang F, Fu Z, Yan X, Zhu H, Diao W, Ding Y, Chen X, Zhang CY, Zen K. Hepatitis B virus-human chimeric transcript HBx-LINE1 promotes hepatic injury via sequestering cellular microRNA-122. J Hepatol 2016; 64:278-291. [PMID: 26409216 DOI: 10.1016/j.jhep.2015.09.013] [Citation(s) in RCA: 99] [Impact Index Per Article: 11.0] [Received: 05/13/2015] [Revised: 09/04/2015] [Accepted: 09/13/2015] [Indexed: 02/06/2023]
Abstract
BACKGROUND & AIMS Chronic hepatitis B virus (HBV) carriers have a high risk to develop hepatocellular carcinoma (HCC) but the underlying mechanism remains unclear. Recent studies suggest that viral-human hybrid RNA transcripts, which play a critical role in promoting HCC progression, may be the molecules responsible for the development of HCC in HBV infected patients. Here we determine whether HBx-LINE1, a hybrid RNA transcript of the human LINE1 and the HBV-encoded X gene generated in tumor cells of HBV-positive HCC, can serve as a molecular sponge for sequestering miR-122 and promoting liver cell abnormal mitosis and mouse hepatic injury. METHODS Paired tumor and distal normal liver tissue specimens, as well as HBx-LINE1 overexpressing hepatic cells, were used to test the relationship between HBx-LINE1 and miR-122. Levels of HBx-LINE1 and miR-122 were assayed by qRT-PCR and Northern blot. HBx-LINE1-miR-122 binding was analyzed by luciferase reporter assay. Mouse hepatic injury was monitored by tissue staining and serum aspartate transaminase, alanine aminotransferase and total bilirubin measurement. RESULTS HBx-LINE1 in HBV-positive HCC tissues was inversely correlated with miR-122. Each HBx-LINE1 consists of six miR-122-binding sites, and forced expression of HBx-LINE1 effectively depleted cellular miR-122, promoting hepatic cell epithelial-mesenchymal transition (EMT)-like changes, including β-catenin signaling activation, E-cadherin reduction and cell migration enhancement. Mice administered with HBx-LINE1 display a significant mouse liver cell abnormal mitosis and hepatic injury. However, all these effects of HBx-LINE1 are completely abolished by miR-122. CONCLUSIONS Our finding illustrates a previously uncharacterized miR-122-sequestering mechanism by which HBx-LINE1 promotes hepatic cell EMT-like changes and mouse liver injury.
Affiliation(s)
- Hong-Wei Liang
- State Key Laboratory of Pharmaceutical Biotechnology, Nanjing University Advanced Institute of Life Sciences, Jiangsu Engineering Research Center for MicroRNA Biology and Biotechnology, Nanjing University, Nanjing, Jiangsu 210093, China
- Nan Wang
- State Key Laboratory of Pharmaceutical Biotechnology, Nanjing University Advanced Institute of Life Sciences, Jiangsu Engineering Research Center for MicroRNA Biology and Biotechnology, Nanjing University, Nanjing, Jiangsu 210093, China
- Yanbo Wang
- State Key Laboratory of Pharmaceutical Biotechnology, Nanjing University Advanced Institute of Life Sciences, Jiangsu Engineering Research Center for MicroRNA Biology and Biotechnology, Nanjing University, Nanjing, Jiangsu 210093, China
- Feng Wang
- Department of General Surgery, the Affiliated Gulou Hospital of Nanjing University, Nanjing, Jiangsu 210093, China
- Zheng Fu
- State Key Laboratory of Pharmaceutical Biotechnology, Nanjing University Advanced Institute of Life Sciences, Jiangsu Engineering Research Center for MicroRNA Biology and Biotechnology, Nanjing University, Nanjing, Jiangsu 210093, China
- Xin Yan
- Comprehensive Cancer Center, the Affiliated Gulou Hospital of Nanjing University, Nanjing, Jiangsu 210093, China
- Hao Zhu
- Department of Gastroenterology, the Affiliated Gulou Hospital of Nanjing University, Nanjing, Jiangsu 210093, China
- Wenli Diao
- State Key Laboratory of Pharmaceutical Biotechnology, Nanjing University Advanced Institute of Life Sciences, Jiangsu Engineering Research Center for MicroRNA Biology and Biotechnology, Nanjing University, Nanjing, Jiangsu 210093, China
- Yitao Ding
- Department of Hepatobiliary Surgery, the Affiliated Gulou Hospital of Nanjing University, Nanjing, Jiangsu 210093, China
- Xi Chen
- State Key Laboratory of Pharmaceutical Biotechnology, Nanjing University Advanced Institute of Life Sciences, Jiangsu Engineering Research Center for MicroRNA Biology and Biotechnology, Nanjing University, Nanjing, Jiangsu 210093, China
- Chen-Yu Zhang
- State Key Laboratory of Pharmaceutical Biotechnology, Nanjing University Advanced Institute of Life Sciences, Jiangsu Engineering Research Center for MicroRNA Biology and Biotechnology, Nanjing University, Nanjing, Jiangsu 210093, China
- Ke Zen
- State Key Laboratory of Pharmaceutical Biotechnology, Nanjing University Advanced Institute of Life Sciences, Jiangsu Engineering Research Center for MicroRNA Biology and Biotechnology, Nanjing University, Nanjing, Jiangsu 210093, China
42
Arsenijevic V, Davis-Dusenbery BN. Reproducible, Scalable Fusion Gene Detection from RNA-Seq. Methods Mol Biol 2016; 1381:223-37. [PMID: 26667464 DOI: 10.1007/978-1-4939-3204-7_13] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 01/07/2023]
Abstract
Chromosomal rearrangements resulting in the creation of novel gene products, termed fusion genes, have been identified as driving events in the development of multiple types of cancer. As these gene products typically do not exist in normal cells, they represent valuable prognostic and therapeutic targets. Advances in next-generation sequencing and computational approaches have greatly improved our ability to detect and identify fusion genes. Nevertheless, these approaches require significant computational resources. Here we describe an approach which leverages cloud computing technologies to perform fusion gene detection from RNA sequencing data at any scale. We additionally highlight methods to enhance reproducibility of bioinformatics analyses which may be applied to any next-generation sequencing experiment.
Affiliation(s)
- Vladan Arsenijevic
- Department of Bioinformatics, Seven Bridges Genomics, One Broadway, 14th Floor, Cambridge, MA, 02142, USA
- Brandi N Davis-Dusenbery
- Department of Bioinformatics, Seven Bridges Genomics, One Broadway, 14th Floor, Cambridge, MA, 02142, USA.
|
43
|
Abstract
Chimeric transcripts have been reported in many cancer cells and are regarded as potential biomarkers and therapeutic targets. Modern high-throughput sequencing technologies offer a way to investigate individual chimeric transcripts and the systematic information of associated gene expression about underlying genomic structural variations and genomic interactions. Methods for detecting chimeric transcripts in massive amounts of short-read sequence data are discussed here. Both assembly-based and alignment-based methods are used for the investigation of chimeric transcripts.
|
44
|
Vy-PER: eliminating false positive detection of virus integration events in next generation sequencing data. Sci Rep 2015; 5:11534. [PMID: 26166306 PMCID: PMC4499804 DOI: 10.1038/srep11534] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 11/10/2014] [Accepted: 05/07/2015] [Indexed: 11/10/2022]
Abstract
Several pathogenic viruses such as hepatitis B and human immunodeficiency viruses may integrate into the host genome. These virus/host integrations are detectable using paired-end next generation sequencing. However, the low number of expected true virus integrations may be difficult to distinguish from the noise of many false positive candidates. Here, we propose a novel filtering approach that increases specificity without compromising sensitivity for virus/host chimera detection. Our detection pipeline termed Vy-PER (Virus integration detection bY Paired End Reads) outperforms existing similar tools in speed and accuracy. We analysed whole genome data from childhood acute lymphoblastic leukemia (ALL), which is characterised by genomic rearrangements and usually associated with radiation exposure. This analysis was motivated by the recently reported virus integrations at genomic rearrangement sites and association with chromosomal instability in liver cancer. However, as expected, our analysis of 20 tumour and matched germline genomes from ALL patients finds no significant evidence for integrations by known viruses. Nevertheless, our method eliminates 12,800 false positives per genome (80× coverage) and only our method detects singleton human-phiX174-chimeras caused by optical errors of the Illumina HiSeq platform. This high accuracy is useful for detecting low virus integration levels as well as non-integrated viruses.
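The paired-end idea in this abstract can be illustrated with a minimal Python sketch. It is not the Vy-PER pipeline: the `Mate` record, the mapping-quality cutoff, and the single-base low-complexity filter are all assumptions chosen for illustration. A read pair is kept as a virus/host chimera candidate only when one mate aligns to a host sequence, the other to a viral sequence, and both pass the filters.

```python
from collections import Counter
from typing import NamedTuple


class Mate(NamedTuple):
    ref: str   # name of the reference sequence the mate aligned to
    pos: int   # 1-based alignment start
    mapq: int  # mapping quality reported by the aligner
    seq: str   # read sequence


def is_low_complexity(seq: str, max_fraction: float = 0.8) -> bool:
    """Return True when one base makes up more than max_fraction of the read."""
    if not seq:
        return True
    most_common_count = Counter(seq).most_common(1)[0][1]
    return most_common_count / len(seq) > max_fraction


def chimeric_candidates(pairs, viral_refs, min_mapq=20):
    """Yield (host_mate, virus_mate) for read pairs spanning a host/virus junction."""
    for m1, m2 in pairs:
        viral = [m.ref in viral_refs for m in (m1, m2)]
        if viral[0] == viral[1]:
            continue  # both mates on the host, or both on the virus
        if min(m1.mapq, m2.mapq) < min_mapq:
            continue  # ambiguous placement of at least one mate
        if is_low_complexity(m1.seq) or is_low_complexity(m2.seq):
            continue  # homopolymer-like reads are a common source of false positives
        host, virus = (m2, m1) if viral[0] else (m1, m2)
        yield host, virus
```

Each filter removes one class of spurious pairs before any breakpoint is called, which is the general strategy the abstract describes for separating rare true integrations from noise.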
|
45
|
Parfenov M, Seidman JG. Finding Pathogenic Nucleic Acid Sequences in Next Generation Sequencing Data. Curr Protoc Hum Genet 2015; 86:18.9.1-18.9.10. [PMID: 26132004 DOI: 10.1002/0471142905.hg1809s86] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/11/2022]
Abstract
Viruses and bacteria are established as one of the main causes of human diseases from hepatitis to cancer. Recently, the presence of such pathogens has been extensively studied using human whole genome and transcriptome sequencing data. However, detecting and studying pathogens via next generation sequencing data is a challenging task in terms of time and computational resources. In this protocol we give instructions for a simple and quick method to find pathogenic DNA or RNA and detect possible integration of the pathogen genome into the host genome.
Affiliation(s)
- Michael Parfenov
- Department of Genetics, Harvard Medical School, Boston, Massachusetts
- J G Seidman
- Department of Genetics, Harvard Medical School, Boston, Massachusetts
|
46
|
Chandrani P, Kulkarni V, Iyer P, Upadhyay P, Chaubal R, Das P, Mulherkar R, Singh R, Dutt A. NGS-based approach to determine the presence of HPV and their sites of integration in human cancer genome. Br J Cancer 2015; 112:1958-1965. [PMID: 25973533 PMCID: PMC4580395 DOI: 10.1038/bjc.2015.121] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Received: 07/31/2014] [Revised: 03/03/2015] [Accepted: 03/07/2015] [Indexed: 02/07/2023]
Abstract
BACKGROUND Human papilloma virus (HPV) accounts for the most common cause of all virus-associated human cancers. Here, we describe the first graphic user interface (GUI)-based automated tool 'HPVDetector', for non-computational biologists, exclusively for detection and annotation of the HPV genome based on next-generation sequencing data sets. METHODS We developed a custom-made reference genome that comprises human chromosomes along with the annotated genomes of 143 HPV types as pseudochromosomes. The tool runs in a dual mode as defined by the user: a 'quick mode' to identify the presence of HPV types and an 'integration mode' to determine the genomic location of the site of integration. The input data can be a paired-end whole-exome, whole-genome or whole-transcriptome data set. The HPVDetector is available in the public domain for download: http://www.actrec.gov.in/pi-webpages/AmitDutt/HPVdetector/HPVDetector.html. RESULTS On the basis of our evaluation of 116 whole-exome, 23 whole-transcriptome and 2 whole-genome data sets, we were able to identify the presence of HPV in 20 exomes and 4 transcriptomes of cervical and head and neck cancer tumour samples. Using the inbuilt annotation module of HPVDetector, we found predominant integration of viral gene E7, a known oncogene, at known 17q21, 3q27, 7q35, Xq28 and novel sites of integration in the human genome. Furthermore, co-infection with high-risk HPVs such as 16 and 31 was found to be mutually exclusive compared with low-risk HPV71. CONCLUSIONS HPVDetector is a simple yet precise and robust tool for detecting HPV from tumour samples using a variety of next-generation sequencing platforms including whole genome, whole exome and transcriptome. Two different modes (quick detection and integration mode) along with a GUI widen the usability of HPVDetector for biologists and clinicians with minimal computational knowledge.
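The composite-reference idea described in the METHODS can be sketched in a few lines of Python. This is an illustration under stated assumptions, not HPVDetector's code: the function names, the `HPV_` name prefix, and the `min_reads` threshold are hypothetical. Viral genomes are appended to the host reference as extra named sequences (pseudochromosomes), and a "quick mode" style check then counts how many reads the aligner placed on each viral pseudochromosome.

```python
from collections import Counter


def build_composite_reference(host, viral, prefix="HPV_"):
    """Merge host chromosomes and viral genomes into one name -> sequence mapping."""
    composite = dict(host)
    for name, seq in viral.items():
        key = prefix + name
        if key in composite:
            raise ValueError(f"duplicate reference name: {key}")
        composite[key] = seq  # viral genome becomes a pseudochromosome
    return composite


def quick_mode(aligned_refs, prefix="HPV_", min_reads=2):
    """Report viral types supported by at least min_reads aligned reads."""
    counts = Counter(ref for ref in aligned_refs if ref.startswith(prefix))
    return {ref[len(prefix):]: n for ref, n in counts.items() if n >= min_reads}
```

Aligning against a single combined reference lets reads compete between host and viral targets, so a read is assigned to a viral type only when that is its best placement.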
Affiliation(s)
- P Chandrani
- Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, Maharashtra 410210, India
- V Kulkarni
- Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, Maharashtra 410210, India
- P Iyer
- Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, Maharashtra 410210, India
- P Upadhyay
- Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, Maharashtra 410210, India
- R Chaubal
- Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, Maharashtra 410210, India
- P Das
- Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, Maharashtra 410210, India
- R Mulherkar
- Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, Maharashtra 410210, India
- R Singh
- Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, Maharashtra 410210, India
- A Dutt
- Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, Maharashtra 410210, India
|
47
|
Ringelhan M, O'Connor T, Protzer U, Heikenwalder M. The direct and indirect roles of HBV in liver cancer: prospective markers for HCC screening and potential therapeutic targets. J Pathol 2015; 235:355-67. [PMID: 25196558 DOI: 10.1002/path.4434] [Citation(s) in RCA: 95] [Impact Index Per Article: 9.5] [Received: 08/22/2014] [Revised: 09/01/2014] [Accepted: 09/02/2014] [Indexed: 02/06/2023]
Abstract
Chronic hepatitis B virus (HBV) infection remains the number one risk factor for hepatocellular carcinoma (HCC), accounting for more than 600 000 deaths/year. Despite highly effective antiviral treatment options, chronic hepatitis B (CHB), subsequent end-stage liver disease and HCC development remain a major challenge worldwide. In CHB, liver damage is mainly caused by the influx of immune cells and destruction of infected hepatocytes, causing necro-inflammation. Treatment with nucleoside/nucleotide analogues can effectively suppress HBV replication in patients with CHB and thus decrease the risk for HCC development. Nevertheless, the risk of HCC in treated patients showing sufficient suppression of HBV DNA replication is significantly higher than in patients with inactive CHB, regardless of the presence of baseline liver cirrhosis, suggesting direct, long-lasting, predisposing effects of HBV. Direct oncogenic effects of HBV include integration in the host genome, leading to deletions, cis/trans-activation, translocations, the production of fusion transcripts and generalized genomic instability, as well as pleiotropic effects of viral transcripts (HBsAg and HBx). Analysis of these viral factors in active surveillance may allow early identification of high-risk patients, and their integration into a molecular classification of HCC subtypes might help in the development of novel therapeutic approaches.
Affiliation(s)
- Marc Ringelhan
- Institute of Virology, Technische Universität München/Helmholtz Zentrum München, Munich, Germany; Second Medical Department, Klinikum Rechts der Isar, Technische Universität München, Munich, Germany; German Centre for Infection research (DZIF), Munich Partner Site, Germany
|
48
|
Wang Q, Jia P, Zhao Z. VERSE: a novel approach to detect virus integration in host genomes through reference genome customization. Genome Med 2015; 7:2. [PMID: 25699093 PMCID: PMC4333248 DOI: 10.1186/s13073-015-0126-6] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Received: 12/11/2014] [Accepted: 01/05/2015] [Indexed: 12/28/2022]
Abstract
Fueled by widespread applications of high-throughput next generation sequencing (NGS) technologies and urgent need to counter threats of pathogenic viruses, large-scale studies were conducted recently to investigate virus integration in host genomes (for example, human tumor genomes) that may cause carcinogenesis or other diseases. A limiting factor in these studies, however, is rapid virus evolution and resulting polymorphisms, which prevent reads from aligning readily to commonly used virus reference genomes, and, accordingly, make virus integration sites difficult to detect. Another confounding factor is host genomic instability as a result of virus insertions. To tackle these challenges and improve our capability to identify cryptic virus-host fusions, we present a new approach that detects Virus intEgration sites through iterative Reference SEquence customization (VERSE). To the best of our knowledge, VERSE is the first approach to improve detection through customizing reference genomes. Using 19 human tumors and cancer cell lines as test data, we demonstrated that VERSE substantially enhanced the sensitivity of virus integration site detection. VERSE is implemented in the open source package VirusFinder 2 that is available at http://bioinfo.mc.vanderbilt.edu/VirusFinder/.
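Why customizing the reference can recover reads that a stock viral reference misses is easiest to see in a toy example. The Python sketch below is not VERSE itself: it assumes ungapped alignment by Hamming distance, a majority-vote consensus, and hypothetical function names. Reads that align within the mismatch limit are used to rewrite the reference, and the rewritten reference is then used for the next alignment round until it stops changing.

```python
from collections import Counter


def best_offset(read, ref, max_mismatch):
    """Return the ungapped offset with the fewest mismatches, or None if none qualifies."""
    best = None
    for off in range(len(ref) - len(read) + 1):
        mm = sum(a != b for a, b in zip(read, ref[off:off + len(read)]))
        if mm <= max_mismatch and (best is None or mm < best[1]):
            best = (off, mm)
    return None if best is None else best[0]


def customize_reference(ref, reads, max_mismatch=1, max_rounds=5):
    """Iteratively replace the reference with the consensus of the reads aligned to it."""
    for _ in range(max_rounds):
        piles = [Counter() for _ in ref]  # per-position base counts
        for read in reads:
            off = best_offset(read, ref, max_mismatch)
            if off is None:
                continue  # read too divergent from the current reference
            for i, base in enumerate(read):
                piles[off + i][base] += 1
        new_ref = "".join(
            pile.most_common(1)[0][0] if pile else old
            for pile, old in zip(piles, ref)
        )
        if new_ref == ref:
            break  # converged: consensus matches the current reference
        ref = new_ref
    return ref
```

In this toy setting, a read carrying two variants fails to align to the original reference under a one-mismatch limit but aligns perfectly once the reference has been updated from reads that each carry a single variant.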
Affiliation(s)
- Qingguo Wang
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN 37203 USA
- Peilin Jia
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN 37203 USA; Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, TN 37232 USA
- Zhongming Zhao
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN 37203 USA; Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, TN 37232 USA; Department of Psychiatry, Vanderbilt University School of Medicine, Nashville, TN 37232 USA; Department of Cancer Biology, Vanderbilt University School of Medicine, Nashville, TN 37232 USA
|
49
|
Guo Y, Zhao S, Bjoring M, Han L. Advanced Datamining Using RNAseq Data. BIG DATA ANALYTICS IN BIOINFORMATICS AND HEALTHCARE 2015. [DOI: 10.4018/978-1-4666-6611-5.ch001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/30/2022]
Abstract
In recent years, RNA sequencing (RNAseq) technology has experienced a rapid rise in popularity. Often seen as a competitor of and the ultimate successor to microarray technology given its more accurate and quantitative gene expression measurement, RNAseq also offers a wealth of additional information that is often overlooked, and given the massive accumulation of RNAseq data available in public data repositories over the past few years, these data are ripe for discovery. Abundant opportunities exist for researchers to conduct in-depth, non-traditional analyses that take advantage of these secondary uses and for bioinformaticians to develop tools to make these data more accessible. This is discussed in this chapter.
|
50
|
Unraveling the web of viroinformatics: computational tools and databases in virus research. J Virol 2014; 89:1489-501. [PMID: 25428870 DOI: 10.1128/jvi.02027-14] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.8] [Indexed: 02/08/2023]
Abstract
The beginning of the second century of research in the field of virology (the first virus was discovered in 1898) was marked by its amalgamation with bioinformatics, resulting in the birth of a new domain--viroinformatics. The availability of more than 100 Web servers and databases embracing all or specific viruses (for example, dengue virus, influenza virus, hepatitis virus, human immunodeficiency virus [HIV], hemorrhagic fever virus [HFV], human papillomavirus [HPV], West Nile virus, etc.) as well as distinct applications (comparative/diversity analysis, viral recombination, small interfering RNA [siRNA]/short hairpin RNA [shRNA]/microRNA [miRNA] studies, RNA folding, protein-protein interaction, structural analysis, and phylotyping and genotyping) will definitely aid the development of effective drugs and vaccines. However, information about their access and utility is not available at any single source or on any single platform. Therefore, a compendium of various computational tools and resources dedicated specifically to virology is presented in this article.
|