1
Yan A, Baricordi C, Nguyen Q, Barbarossa L, Loperfido M, Biasco L. IS-Seq: a bioinformatics pipeline for integration sites analysis with comprehensive abundance quantification methods. BMC Bioinformatics 2023; 24:286. [PMID: 37464281 DOI: 10.1186/s12859-023-05390-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Accepted: 06/16/2023] [Indexed: 07/20/2023]
Abstract
BACKGROUND: Integration site (IS) analysis is a fundamental analytical platform for evaluating the safety and efficacy of viral vector-based preclinical and clinical gene therapy (GT). A handful of groups have developed standardized bioinformatics pipelines to process IS sequencing data, to generate reports, and/or to perform comparative studies across different GT trials. Keeping up with the technological advances in the field of IS analysis, different computational pipelines have been published over the past decade. These pipelines focus on identifying IS from single-read or paired-end sequencing data using either read-based or sonication fragment-based methods, but no bioinformatics tool automatically includes unique molecular identifiers (UMIs) for IS abundance estimation and allows multiple quantification methods to be compared in one integrated pipeline.
RESULTS: Here we present IS-Seq, a bioinformatics pipeline that can process data from paired-end sequencing of both older restriction site-based IS collection methods and newer sonication-based IS retrieval systems while allowing the selection of different abundance estimation methods, including read-based, fragment-based and UMI-based systems.
CONCLUSIONS: We validated the performance of IS-Seq by testing it against the most popular analytical workflow available in the literature (INSPIIRED) in different scenarios. Lastly, by performing extensive simulation studies and a comprehensive wet-lab assessment of our IS-Seq pipeline, we could show that in clinically relevant scenarios, UMI quantification provides better accuracy than the currently most widely used sonication fragment counts as a method for IS abundance estimation.
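The three abundance estimates named in this abstract (read-based, fragment-based and UMI-based) can be illustrated with a minimal Python sketch. This is not the IS-Seq implementation: the record layout `(site, breakpoint, umi)`, the function name `quantify` and the toy reads are all assumptions made for illustration, and real pipelines add alignment, filtering and UMI error correction that are omitted here.

```python
from collections import defaultdict

# Hypothetical input: one tuple per aligned read pair,
# (integration_site, sonication_breakpoint, umi). Not the IS-Seq file format.
reads = [
    ("chr1:1000:+", 1250, "AACG"),
    ("chr1:1000:+", 1250, "AACG"),  # PCR duplicate of the read above
    ("chr1:1000:+", 1250, "TTGC"),  # same breakpoint, different UMI
    ("chr1:1000:+", 1310, "GGAT"),
    ("chr7:5000:-", 4800, "CCTA"),
    ("chr7:5000:-", 4800, "CCTA"),
    ("chr7:5000:-", 4800, "CCTA"),
]


def quantify(records):
    """Return read, fragment and UMI counts for each integration site."""
    read_counts = defaultdict(int)
    breakpoints = defaultdict(set)
    umis = defaultdict(set)
    for site, breakpoint, umi in records:
        read_counts[site] += 1            # read-based: every read counts
        breakpoints[site].add(breakpoint)  # fragment-based: distinct breakpoints
        umis[site].add(umi)                # UMI-based: distinct UMIs
    return {
        site: {
            "reads": read_counts[site],
            "fragments": len(breakpoints[site]),
            "umis": len(umis[site]),
        }
        for site in read_counts
    }


if __name__ == "__main__":
    for site, counts in quantify(reads).items():
        print(site, counts)
```

In this toy data the second site has more than one read but a single fragment and a single UMI, which shows how PCR duplication inflates read counts while leaving the other two estimates unchanged.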
Affiliation(s)
- Luca Biasco
- AVROBIO, Inc., Cambridge, MA, USA.
- Infection, Immunity and Inflammation Department, Great Ormond Street Institute of Child Health, University College London, London, UK.
2
Causes and Consequences of HPV Integration in Head and Neck Squamous Cell Carcinomas: State of the Art. Cancers (Basel) 2021; 13:4089. [PMID: 34439243 PMCID: PMC8394665 DOI: 10.3390/cancers13164089] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 07/19/2021] [Revised: 08/10/2021] [Accepted: 08/11/2021] [Indexed: 12/29/2022]
Abstract
A steadily increasing incidence of head and neck squamous cell carcinomas (HNSCCs) driven by high-risk human papillomaviruses (HPVs), especially of oropharyngeal origin, is being observed. During persistent infections, viral DNA integration into the host genome may occur. Studies are examining whether the physical status of the virus (episomal vs. integrated) affects carcinogenesis and eventually has further-reaching consequences for disease progression and outcome. Here, we review the literature of the past five years on the impact of HPV integration in HNSCCs, covering the detection techniques used (from PCR to NGS approaches), the integration loci identified, and associations with genomic and clinical data. The consequences of HPV integration in the human genome, including the methylation status and deregulation of genes involved in cell signaling pathways, immune evasion, and response to therapy, are also summarized.
3
Rajaby R, Zhou Y, Meng Y, Zeng X, Li G, Wu P, Sung WK. SurVirus: a repeat-aware virus integration caller. Nucleic Acids Res 2021; 49:e33. [PMID: 33444454 PMCID: PMC8034624 DOI: 10.1093/nar/gkaa1237] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 09/06/2020] [Revised: 12/01/2020] [Accepted: 01/12/2021] [Indexed: 01/01/2023]
Abstract
A significant portion of human cancers are due to viruses integrating into human genomes. Therefore, accurately predicting virus integrations can help uncover the mechanisms that lead to many devastating diseases. Virus integrations can be called by analysing second generation high-throughput sequencing datasets. Unfortunately, existing methods fail to report a significant portion of integrations, while predicting a large number of false positives. We observe that the inaccuracy is caused by incorrect alignment of reads in repetitive regions. False alignments create false positives, while missing alignments create false negatives. This paper proposes SurVirus, an improved virus integration caller that corrects the alignment of reads which are crucial for the discovery of integrations. We use publicly available datasets to show that existing methods predict hundreds of thousands of false positives; SurVirus, on the other hand, is significantly more precise while it also detects many novel integrations previously missed by other tools, most of which are in repetitive regions. We validate a subset of these novel integrations, and find that the majority are correct. Using SurVirus, we find that HPV and HBV integrations are enriched in LINE and Satellite regions which had been overlooked, as well as discover recurrent HBV and HPV breakpoints in human genome-virus fusion transcripts.
Affiliation(s)
- Ramesh Rajaby
- School of Computing, National University of Singapore, 13 Computing Drive, 117417, Singapore; NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, 28 Medical Drive, 117456, Singapore
- Yi Zhou
- Agricultural Bioinformatics Key Laboratory of Hubei Province, Hubei Engineering Technology Research Center of Agricultural Big Data, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Yifan Meng
- Department of Gynecologic Oncology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China; Cancer Biology Research Center (Key Laboratory of the Ministry of Education), Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Xi Zeng
- Agricultural Bioinformatics Key Laboratory of Hubei Province, Hubei Engineering Technology Research Center of Agricultural Big Data, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Guoliang Li
- Agricultural Bioinformatics Key Laboratory of Hubei Province, Hubei Engineering Technology Research Center of Agricultural Big Data, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Peng Wu
- Department of Gynecologic Oncology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China; Cancer Biology Research Center (Key Laboratory of the Ministry of Education), Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Wing-Kin Sung
- School of Computing, National University of Singapore, 13 Computing Drive, 117417, Singapore; Agricultural Bioinformatics Key Laboratory of Hubei Province, Hubei Engineering Technology Research Center of Agricultural Big Data, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China; Genome Institute of Singapore, 60 Biopolis Street, Genome 138672, Singapore
4
Espinoza DA, Mortlock RD, Koelle SJ, Wu C, Dunbar CE. Interrogation of clonal tracking data using barcodetrackR. Nature Computational Science 2021; 1:280-289. [PMID: 37621673 PMCID: PMC10449013 DOI: 10.1038/s43588-021-00057-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/18/2020] [Accepted: 03/17/2021] [Indexed: 08/26/2023]
Abstract
Clonal tracking methods provide quantitative insights into the cellular output of genetically labelled progenitor cells across time and cellular compartments. In the context of gene and cell therapies, clonal tracking methods have enabled the tracking of progenitor cell output both in humans receiving therapies and in corresponding animal models, providing valuable insight into lineage reconstitution, clonal dynamics, and vector genotoxicity. However, the absence of a toolbox for analysis of clonal tracking data has precluded the development of standardized analytical frameworks within the field. Thus, we developed barcodetrackR, an R package and accompanying Shiny app containing diverse tools for the analysis and visualization of clonal tracking data. We demonstrate the utility of barcodetrackR in exploring longitudinal clonal patterns and lineage relationships in a number of clonal tracking studies of hematopoietic stem and progenitor cells (HSPCs) in humans receiving HSPC gene therapy and in animals receiving lentivirally transduced HSPC transplants or tumor cells.
Affiliation(s)
- Diego A. Espinoza
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Translational Stem Cell Biology Branch, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Ryland D. Mortlock
- Translational Stem Cell Biology Branch, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Samson J. Koelle
- Translational Stem Cell Biology Branch, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Department of Statistics, University of Washington, Seattle, WA, USA
- Chuanfeng Wu
- Translational Stem Cell Biology Branch, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Cynthia E. Dunbar
- Translational Stem Cell Biology Branch, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, MD, USA
5
Huaying C, Xing J, Luya J, Linhui N, Di S, Xianjun D. A Signature of Five Long Non-Coding RNAs for Predicting the Prognosis of Alzheimer's Disease Based on Competing Endogenous RNA Networks. Front Aging Neurosci 2021; 12:598606. [PMID: 33584243 PMCID: PMC7876075 DOI: 10.3389/fnagi.2020.598606] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 08/25/2020] [Accepted: 12/23/2020] [Indexed: 12/14/2022]
Abstract
Long non-coding RNAs (lncRNAs) play important roles in the pathogenesis of Alzheimer's disease (AD). However, the functions and regulatory mechanisms of lncRNAs are largely unclear. Herein, we obtained 3,158 lncRNAs by microarray re-annotation. A global network of competing endogenous RNAs (ceRNAs) was developed for AD and normal samples based on gene expression profiles. A total of 255 AD-deficient messenger RNA (mRNA)-lncRNAs were identified by expression correlation analysis. Genes in the dysregulated ceRNAs were found to be mainly enriched in transcription factors and microRNAs (miRNAs). Analysis of the disordered miRNAs in the lncRNA-mRNA network revealed that 40 pairs of lncRNAs shared more than one disordered miRNA. Among them, nine lncRNAs were closely associated with AD, Parkinson's disease, and other neurodegenerative diseases. Of note, five lncRNAs were found to be potential biomarkers for AD. A real-time quantitative reverse transcription PCR (qRT-PCR) assay revealed that PART1 was downregulated, while SNHG14 was upregulated, in AD serum samples compared to normal samples. This study elucidates the role of lncRNAs in the pathogenesis of AD and presents new lncRNAs that can be exploited to design diagnostic and therapeutic agents for AD.
Affiliation(s)
- Cai Huaying
- Department of Neurology, Neuroscience Center, School of Medicine, Sir Run Run Shaw Hospital, Zhejiang University, Hangzhou, China
- Jin Xing
- Department of Neurology, Neuroscience Center, School of Medicine, Sir Run Run Shaw Hospital, Zhejiang University, Hangzhou, China
- Jin Luya
- Department of Neurology, Neuroscience Center, School of Medicine, Sir Run Run Shaw Hospital, Zhejiang University, Hangzhou, China
- Ni Linhui
- Department of Neurology, Neuroscience Center, School of Medicine, Sir Run Run Shaw Hospital, Zhejiang University, Hangzhou, China
- Sun Di
- Department of Neurology, Neuroscience Center, School of Medicine, Sir Run Run Shaw Hospital, Zhejiang University, Hangzhou, China
- Ding Xianjun
- Department of Orthopedic Surgery, School of Medicine, Sir Run Run Shaw Hospital, Zhejiang University, Hangzhou, China
6
Chen X, Kost J, Li D. Comprehensive comparative analysis of methods and software for identifying viral integrations. Brief Bioinform 2020; 20:2088-2097. [PMID: 30102374 DOI: 10.1093/bib/bby070] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 05/18/2018] [Revised: 07/02/2018] [Accepted: 07/12/2018] [Indexed: 12/13/2022]
Abstract
Many viruses are capable of integrating in the human genome, particularly viruses involved in tumorigenesis. Viral integrations can be considered genetic markers for discovering virus-caused cancers and inferring cancer cell development. Next-generation sequencing (NGS) technologies have been widely used to screen for viral integrations in cancer genomes, and a number of bioinformatics tools have been developed to detect viral integrations using NGS data. However, there has been no systematic comparison of the methods or software. In this study, we performed a comprehensive comparative analysis of the designs, performance, functionality and limitations among the existing methods and software for detecting viral integrations. We further compared the sensitivity, precision and runtime of integration detection of four representative tools. Our analyses showed that each of the existing software had its own merits; however, none of them were sufficient for parallel or accurate virome-wide detection. After carefully evaluating the limitations shared by the existing methods, we proposed strategies and directions for developing virome-wide integration detection.
Affiliation(s)
- Xun Chen
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
- Jason Kost
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
- Dawei Li
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA; Department of Computer Science, University of Vermont, Burlington, Vermont 05405, USA; Neuroscience, Behavior, and Health Initiative, University of Vermont, Burlington, Vermont 05405, USA; Cancer Center, University of Vermont, Burlington, Vermont 05405, USA
7
Tang J, Cui Q, Zhang D, Kong D, Liao X, Ren J, Gong Y, Wu G. A prognostic eight-lncRNA expression signature in predicting recurrence of ER-positive breast cancer receiving endocrine therapy. J Cell Physiol 2019; 235:4746-4755. [PMID: 31663114 DOI: 10.1002/jcp.29352] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/27/2019] [Accepted: 09/30/2019] [Indexed: 12/26/2022]
Abstract
Long noncoding RNAs (lncRNAs) play a key role in the tumorigenesis of breast cancer. In the present study, lncRNA expression profiles were collected from the Gene Expression Omnibus database to identify a lncRNA expression signature. An eight-lncRNA signature was established to predict the survival of patients with estrogen receptor (ER)-positive breast cancer receiving endocrine therapy. Patients were separated into a low-risk group and a high-risk group based on this signature. Patients in the high-risk group had worse survival than those in the low-risk group according to Kaplan-Meier curve analysis with the log-rank test. Receiver operating characteristic analysis suggested good diagnostic efficiency of the eight-lncRNA signature. After adjusting for clinical features, including age, grade, lymph node status, and tumor size, this signature was independently associated with relapse-free survival. The prognostic value of the lncRNA model was then confirmed in validation sets. When validated in a cohort of patients treated with neoadjuvant chemotherapy and endocrine therapy, this signature demonstrated good performance as well. In addition, we built a nomogram that integrates the conventional clinicopathological features and the eight-lncRNA-based signature. To sum up, our results indicate that the eight-lncRNA prognostic model is a reliable tool for grouping patients at high and low risk of disease relapse. This signature may have implications for prognostic evaluation of patients with ER-positive breast cancer receiving endocrine therapy.
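How an expression signature can separate patients into high-risk and low-risk groups is sketched below in Python. The abstract does not give the scoring formula or cutoff, so this assumes a common convention: a linear risk score (coefficient times expression, summed over the signature genes) with a median split. The gene names, coefficients and expression values are hypothetical placeholders, not values from the paper.

```python
from statistics import median

# Hypothetical coefficients (for example from a Cox regression); NOT from the paper.
coefficients = {"lncA": 0.8, "lncB": -0.5, "lncC": 0.3}

# Hypothetical normalized expression values per patient.
patients = {
    "P1": {"lncA": 2.0, "lncB": 1.0, "lncC": 0.5},
    "P2": {"lncA": 0.5, "lncB": 2.0, "lncC": 1.0},
    "P3": {"lncA": 1.5, "lncB": 0.5, "lncC": 2.0},
    "P4": {"lncA": 0.2, "lncB": 1.5, "lncC": 0.1},
}


def risk_score(expression, coefs):
    """Linear risk score: sum of coefficient * expression over signature genes."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def stratify(cohort, coefs):
    """Score every patient and split the cohort at the median score (assumed cutoff)."""
    scores = {pid: risk_score(expr, coefs) for pid, expr in cohort.items()}
    cutoff = median(scores.values())
    groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
    return scores, groups


if __name__ == "__main__":
    scores, groups = stratify(patients, coefficients)
    for pid in patients:
        print(pid, round(scores[pid], 3), groups[pid])
```

The resulting groups would then be compared with a survival analysis such as Kaplan-Meier curves and a log-rank test, which is outside the scope of this sketch.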
Affiliation(s)
- Jianing Tang
- Department of Thyroid and Breast Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
- Qiuxia Cui
- Department of Thyroid and Breast Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
- Dan Zhang
- Department of Thyroid and Breast Surgery, Tongji Hospital, Huazhong University of Science and Technology, Wuhan, China
- Deguang Kong
- Department of General Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
- Xing Liao
- Department of Thyroid and Breast Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
- Jiangbo Ren
- Department of Biological Repositories, Zhongnan Hospital of Wuhan University, Wuhan, China
- Yan Gong
- Department of Biological Repositories, Zhongnan Hospital of Wuhan University, Wuhan, China
- Gaosong Wu
- Department of Thyroid and Breast Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
8
Chen X, Kost J, Sulovari A, Wong N, Liang WS, Cao J, Li D. A virome-wide clonal integration analysis platform for discovering cancer viral etiology. Genome Res 2019; 29:819-830. [PMID: 30872350 PMCID: PMC6499315 DOI: 10.1101/gr.242529.118] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 07/31/2018] [Accepted: 03/11/2019] [Indexed: 12/31/2022]
Abstract
Oncoviral infection is responsible for 12%–15% of cancer in humans. Convergent evidence from epidemiology, pathology, and oncology suggests that new viral etiologies for cancers remain to be discovered. Oncoviral profiles can be obtained from cancer genome sequencing data; however, widespread viral sequence contamination and noncausal viruses complicate the process of identifying genuine oncoviruses. Here, we propose a novel strategy to address these challenges by performing virome-wide screening of early-stage clonal viral integrations. To implement this strategy, we developed VIcaller, a novel platform for identifying viral integrations that are derived from any characterized viruses and shared by a large proportion of tumor cells using whole-genome sequencing (WGS) data. The sensitivity and precision were confirmed with simulated and benchmark cancer data sets. By applying this platform to cancer WGS data sets with proven or speculated viral etiology, we newly identified or confirmed clonal integrations of hepatitis B virus (HBV), human papillomavirus (HPV), Epstein-Barr virus (EBV), and BK Virus (BKV), suggesting the involvement of these viruses in early stages of tumorigenesis in affected tumors, such as HBV in TERT and KMT2B (also known as MLL4) gene loci in liver cancer, HPV and BKV in bladder cancer, and EBV in non-Hodgkin's lymphoma. We also showed the capacity of VIcaller to identify integrations from some uncharacterized viruses. This is the first study to systematically investigate the strategy and method of virome-wide screening of clonal integrations to identify oncoviruses. Searching clonal viral integrations with our platform has the capacity to identify virus-caused cancers and discover cancer viral etiologies.
Affiliation(s)
- Xun Chen
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
- Jason Kost
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
- Arvis Sulovari
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
- Nathalie Wong
- Department of Anatomical and Cellular Pathology, Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, NT, Hong Kong 999077, P.R. China
- Winnie S Liang
- Translational Genomics Research Institute, Phoenix, Arizona 85004, USA
- Jian Cao
- Division of Medical Oncology, Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08903, USA; Department of Medicine, Rutgers Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08903, USA
- Dawei Li
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA; Neuroscience, Behavior, and Health Initiative, University of Vermont, Burlington, Vermont 05405, USA; Department of Computer Science, University of Vermont, Burlington, Vermont 05405, USA
9
Bioinformatics Applications in Advancing Animal Virus Research. Recent Advances in Animal Virology 2019. [PMCID: PMC7121192 DOI: 10.1007/978-981-13-9073-9_23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Abstract
Viruses serve as infectious agents for all living entities. Various research groups focus on understanding viruses in terms of their host-viral relationships, pathogenesis and immune evasion. With current scientific advances, the research field has widened to the ‘omics’ level, and the generation of viral sequence data has been increasing. Numerous bioinformatics tools are available that not only aid in analysing such sequence data but also help deduce useful information that can be exploited in developing preventive and therapeutic measures. This chapter elaborates on bioinformatics tools that are specifically designed for animal viruses as well as other generic tools that can be exploited to study animal viruses. The chapter further provides information on tools for studying viral epidemiology, phylogenetic analysis, structural modelling of proteins, epitope recognition and open reading frame (ORF) recognition, and on tools for analysing host-viral interactions, gene prediction in the viral genome, etc. Various databases that organize information on animal and human viruses are also described. The chapter gives an overview of the current advances, online and downloadable tools and databases in the field of bioinformatics that enable researchers to study animal viruses at the gene level.
10
Liu R, Hu R, Zhang W, Zhou HH. Long noncoding RNA signature in predicting metastasis following tamoxifen treatment for ER-positive breast cancer. Pharmacogenomics 2018; 19:825-835. [PMID: 29983093 DOI: 10.2217/pgs-2018-0032] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/22/2022]
Abstract
AIM: We aimed to develop a long noncoding RNA (lncRNA) expression signature that can predict response to tamoxifen.
MATERIALS & METHODS: LncRNA expression profiling was mined in two cohorts from Gene Expression Omnibus (GSE6532, GSE9195, n = 412).
RESULTS: A set of lncRNAs (LINC01191, RP4-639F20.1 and CTC-429P9.3) associated with distant metastasis-free survival was established. Estrogen receptor-positive breast cancer patients in the training series could be classified into high- and low-risk groups with significantly different distant metastasis-free survival values based on this signature (hazard ratio [HR]: 5.11; p = 7.28 × 10⁻⁸). The prognostic ability of this signature was confirmed in validation sets 1 (HR: 2.58; p = 1.54 × 10⁻²) and 2 (HR: 10.06; p = 6.85 × 10⁻³).
CONCLUSION: The lncRNA signature may have possible clinical implications in the selection of high-risk patients for tamoxifen therapy.
Affiliation(s)
- Rong Liu
- Department of Clinical Pharmacology, Xiangya Hospital, Central South University, Changsha 410008, PR China
- Hunan Key Laboratory of Pharmacogenetics, Institute of Clinical Pharmacology, Central South University, Changsha 410078, PR China
- Rong Hu
- Department of Obstetrics & Gynecology, Xiangya Hospital, Central South University, Changsha 410008, PR China
- Wei Zhang
- Department of Clinical Pharmacology, Xiangya Hospital, Central South University, Changsha 410008, PR China
- Hunan Key Laboratory of Pharmacogenetics, Institute of Clinical Pharmacology, Central South University, Changsha 410078, PR China
- Hong-Hao Zhou
- Department of Clinical Pharmacology, Xiangya Hospital, Central South University, Changsha 410008, PR China
- Hunan Key Laboratory of Pharmacogenetics, Institute of Clinical Pharmacology, Central South University, Changsha 410078, PR China
11
Wang W, Bartholomae CC, Gabriel R, Deichmann A, Schmidt M. The LAM-PCR Method to Sequence LV Integration Sites. Methods Mol Biol 2018; 1448:107-20. [PMID: 27317177 DOI: 10.1007/978-1-4939-3753-0_9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 03/17/2023]
Abstract
Integrating viral gene transfer vectors are commonly used gene delivery tools in clinical gene therapy trials providing stable integration and continuous gene expression of the transgene in the treated host cell. However, integration of the reverse-transcribed vector DNA into the host genome is a potentially mutagenic event that may directly contribute to unwanted side effects. A comprehensive and accurate analysis of the integration site (IS) repertoire is indispensable to study clonality in transduced cells obtained from patients undergoing gene therapy and to identify potential in vivo selection of affected cell clones. To date, next-generation sequencing (NGS) of vector-genome junctions allows sophisticated studies on the integration repertoire in vitro and in vivo. We have explored the use of the Illumina MiSeq Personal Sequencer platform to sequence vector ISs amplified by non-restrictive linear amplification-mediated PCR (nrLAM-PCR) and LAM-PCR. MiSeq-based high-quality IS sequence retrieval is accomplished by the introduction of a double-barcode strategy that substantially minimizes the frequency of IS sequence collisions compared to the conventionally used single-barcode protocol. Here, we present an updated protocol of (nr)LAM-PCR for the analysis of lentiviral IS using a double-barcode system and followed by deep sequencing using the MiSeq device.
Affiliation(s)
- Wei Wang
- Department of Translational Oncology, National Center for Tumor Diseases and German Cancer Research Center, Im Neuenheimer Feld 581, 69120, Heidelberg, Germany
- Cynthia C Bartholomae
- Department of Translational Oncology, National Center for Tumor Diseases and German Cancer Research Center, Im Neuenheimer Feld 581, 69120, Heidelberg, Germany
- Richard Gabriel
- Department of Translational Oncology, National Center for Tumor Diseases and German Cancer Research Center, Im Neuenheimer Feld 581, 69120, Heidelberg, Germany
- Annette Deichmann
- Department of Translational Oncology, National Center for Tumor Diseases and German Cancer Research Center, Im Neuenheimer Feld 581, 69120, Heidelberg, Germany
- Manfred Schmidt
- Department of Translational Oncology, National Center for Tumor Diseases and German Cancer Research Center, Im Neuenheimer Feld 581, 69120, Heidelberg, Germany.
12
VISPA2: a scalable pipeline for high-throughput identification and annotation of vector integration sites. BMC Bioinformatics 2017; 18:520. [PMID: 29178837 PMCID: PMC5702242 DOI: 10.1186/s12859-017-1937-9] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 08/09/2017] [Accepted: 11/14/2017] [Indexed: 01/09/2023]
Abstract
Background: Bioinformatics tools designed to identify lentiviral or retroviral vector insertion sites in the genome of host cells are used to address the safety and long-term efficacy of hematopoietic stem cell gene therapy applications and to study the clonal dynamics of hematopoietic reconstitution. The increasing number of gene therapy clinical trials, combined with the increasing amount of Next Generation Sequencing data aimed at identifying integration sites, requires highly accurate and efficient computational software able to correctly process “big data” in a reasonable computational time.
Results: Here we present VISPA2 (Vector Integration Site Parallel Analysis, version 2), the latest optimized computational pipeline for integration site identification and analysis with the following features: (1) the sequence analysis for integration site processing is fully compliant with paired-end reads and includes a sequence quality filter before and after alignment to the target genome; (2) a heuristic algorithm that reduces false positive integration sites at the nucleotide level, limiting the impact of polymerase chain reaction or trimming/alignment artifacts; (3) a classification and annotation module for integration sites; (4) a user-friendly web interface as a researcher front-end for performing integration site analyses without computational skills; (5) a speedup of all steps through parallelization (Hadoop free).
Conclusions: We tested VISPA2 performance using simulated and real datasets of lentiviral vector integration sites, previously obtained from patients enrolled in a hematopoietic stem cell gene therapy clinical trial, and compared the results with other preexisting tools for integration site analysis. On the computational side, VISPA2 showed a > 6-fold speedup and improved precision and recall metrics (1 and 0.97 respectively) compared to previously developed computational pipelines. These results indicate that VISPA2 is a fast, reliable and user-friendly tool for integration site analysis, which allows gene therapy integration data to be handled in a cost- and time-effective fashion. Moreover, the web access of VISPA2 (http://openserver.itb.cnr.it/vispa/) gives researchers easy access to a complex analytical tool. We released the source code of VISPA2 in a public repository (https://bitbucket.org/andreacalabria/vispa2).
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-017-1937-9) contains supplementary material, which is available to authorized users.
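The precision and recall figures quoted in this abstract can be read against the standard definitions, sketched here in Python. This is a generic illustration, not VISPA2 code: the function name, the `(chromosome, position, strand)` representation and the exact-match criterion are assumptions, and real benchmarks often allow a small positional tolerance when matching called sites to the truth set.

```python
def precision_recall(called, truth):
    """Compare called integration sites with a known truth set by exact match.

    precision = true positives / all called sites
    recall    = true positives / all true sites
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical simulated truth set and pipeline output.
truth_sites = {
    ("chr1", 1000, "+"),
    ("chr2", 2000, "-"),
    ("chr3", 3000, "+"),
    ("chr4", 4000, "-"),
}
called_sites = {
    ("chr1", 1000, "+"),
    ("chr2", 2000, "-"),
    ("chr3", 3000, "+"),
}

if __name__ == "__main__":
    precision, recall = precision_recall(called_sites, truth_sites)
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case every called site is correct, so precision is 1, while one true site is missed, so recall is 0.75.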
|
13
|
Sherman E, Nobles C, Berry CC, Six E, Wu Y, Dryga A, Malani N, Male F, Reddy S, Bailey A, Bittinger K, Everett JK, Caccavelli L, Drake MJ, Bates P, Hacein-Bey-Abina S, Cavazzana M, Bushman FD. INSPIIRED: A Pipeline for Quantitative Analysis of Sites of New DNA Integration in Cellular Genomes. Molecular Therapy-Methods & Clinical Development 2016; 4:39-49. [PMID: 28344990 PMCID: PMC5363316 DOI: 10.1016/j.omtm.2016.11.002] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 08/18/2016] [Accepted: 11/15/2016] [Indexed: 01/24/2023]
Abstract
Integration of new DNA into cellular genomes mediates replication of retroviruses and transposons; integration reactions have also been adapted for use in human gene therapy. Tracking the distributions of integration sites is important to characterize populations of transduced cells and to monitor potential outgrowth of pathogenic cell clones. Here, we describe a pipeline for quantitative analysis of integration site distributions named INSPIIRED (integration site pipeline for paired-end reads). We describe optimized biochemical steps for site isolation using Illumina paired-end sequencing, including new technology for suppressing recovery of unwanted contaminants, then software for alignment, quality control, and management of integration site sequences. During library preparation, DNAs are broken by sonication, so that after ligation-mediated PCR the number of ligation junction sites can be used to infer abundance of gene-modified cells. We generated integration sites of known positions in silico, and we describe optimization of sample processing parameters refined by comparison to truth. We also present a novel graph-theory-based method for quantifying integration sites in repeated sequences, and we characterize the consequences using synthetic and experimental data. In an accompanying paper, we describe an additional set of statistical tools for data analysis and visualization. Software is available at https://github.com/BushmanLab/INSPIIRED.
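The idea that the number of distinct ligation junctions reflects clone abundance can be sketched as follows. This is a naive count of unique sonication breakpoints per integration site, not the INSPIIRED implementation; the function name `fragment_abundance` and the read tuple layout are assumptions made for illustration.

```python
from collections import defaultdict


def fragment_abundance(reads):
    """Count distinct sonication breakpoints per integration site.

    reads: iterable of (chrom, strand, site_pos, break_pos) tuples, where
    site_pos is the vector-genome junction and break_pos is the position
    of the linker ligation junction (hypothetical layout for this sketch).
    """
    breakpoints = defaultdict(set)
    for chrom, strand, site_pos, break_pos in reads:
        # PCR duplicates share the same breakpoint and collapse in the set.
        breakpoints[(chrom, strand, site_pos)].add(break_pos)
    return {site: len(bps) for site, bps in breakpoints.items()}


reads = [
    ("chr1", "+", 1000, 1250),
    ("chr1", "+", 1000, 1250),  # PCR duplicate of the read above
    ("chr1", "+", 1000, 1310),  # same site, different sonication break
    ("chr2", "-", 5000, 4800),
]
abundance = fragment_abundance(reads)
print(abundance)
```

Because two cells can by chance yield fragments with the same breakpoint, a raw count of this kind tends to underestimate abundant clones; a statistical correction would be needed for quantitative work.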
Affiliation(s)
- Eric Sherman
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Christopher Nobles
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Charles C Berry
- Department of Family Medicine and Public Health, University of California, San Diego, La Jolla, CA 92093, USA
| | - Emmanuelle Six
- Imagine Institute, Paris Descartes-Sorbonne Paris Cité University, 75014 Paris, France; Laboratory of Human Lymphohematopoiesis, INSERM 24, 75014 Paris, France
| | - Yinghua Wu
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Anatoly Dryga
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Nirav Malani
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Frances Male
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Shantan Reddy
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Aubrey Bailey
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Kyle Bittinger
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - John K Everett
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Laure Caccavelli
- Biotherapy Department, Necker Children's Hospital, Assistance Publique-Hôpitaux de Paris, 75014 Paris, France; Biotherapy Clinical Investigation Center, Groupe Hospitalier Universitaire Ouest, Assistance Publique-Hôpitaux de Paris, INSERM, 75014 Paris, France
| | - Mary J Drake
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Paul Bates
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Salima Hacein-Bey-Abina
- Biotherapy Department, Necker Children's Hospital, Assistance Publique-Hôpitaux de Paris, 75014 Paris, France; Biotherapy Clinical Investigation Center, Groupe Hospitalier Universitaire Ouest, Assistance Publique-Hôpitaux de Paris, INSERM, 75014 Paris, France
| | - Marina Cavazzana
- Biotherapy Department, Necker Children's Hospital, Assistance Publique-Hôpitaux de Paris, 75014 Paris, France; Biotherapy Clinical Investigation Center, Groupe Hospitalier Universitaire Ouest, Assistance Publique-Hôpitaux de Paris, INSERM, 75014 Paris, France
| | - Frederic D Bushman
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
|
14
|
Berry CC, Nobles C, Six E, Wu Y, Malani N, Sherman E, Dryga A, Everett JK, Male F, Bailey A, Bittinger K, Drake MJ, Caccavelli L, Bates P, Hacein-Bey-Abina S, Cavazzana M, Bushman FD. INSPIIRED: Quantification and Visualization Tools for Analyzing Integration Site Distributions. Molecular Therapy-Methods & Clinical Development 2016; 4:17-26. [PMID: 28344988 PMCID: PMC5363318 DOI: 10.1016/j.omtm.2016.11.003] [Citation(s) in RCA: 53] [Impact Index Per Article: 6.6] [Received: 08/20/2016] [Accepted: 11/15/2016] [Indexed: 01/08/2023]
Abstract
Analysis of sites of newly integrated DNA in cellular genomes is important to several fields, but methods for analyzing and visualizing these datasets are still under development. Here, we describe tools for data analysis and visualization that take as input integration site data from our INSPIIRED pipeline. Paired-end sequencing allows inference of the numbers of transduced cells as well as the distributions of integration sites in target genomes. We present interactive heatmaps that allow comparison of distributions of integration sites to genomic features and that support numerous user-defined statistical tests. To summarize integration site data from human gene therapy samples, we developed a reproducible report format that catalogs sample population structure, longitudinal dynamics, and integration frequency near cancer-associated genes. We also introduce a novel summary statistic, the UC50 (unique cell progenitors contributing the most expanded 50% of progeny cell clones), which provides a single number summarizing possible clonal expansion. Using these tools, we present the ongoing longitudinal characterization of a patient from the first trial to treat severe combined immunodeficiency-X1 (SCID-X1), showing successful reconstitution for 15 years accompanied by persistence of a cell clone with an integration site near the cancer-associated gene CCND2. Software is available at https://github.com/BushmanLab/INSPIIRED.
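The UC50 statistic is defined in the abstract only in words. A minimal sketch, assuming it is the smallest number of clones whose summed abundance reaches at least half of the total, could look like this; the function name `uc50` and the handling of the exact 50% boundary are assumptions, not taken from the INSPIIRED code.

```python
def uc50(abundances):
    """Smallest number of clones that together account for >= 50% of cells.

    abundances: per-clone abundance estimates (for example, inferred cell
    counts per integration site). Returns 0 for empty or all-zero input.
    """
    total = sum(abundances)
    if total <= 0:
        return 0
    running = 0
    # Walk the clones from most to least abundant.
    for n, value in enumerate(sorted(abundances, reverse=True), start=1):
        running += value
        if 2 * running >= total:  # reached half of the total
            return n
    return len(abundances)


print(uc50([50, 20, 10, 10, 10]))  # one dominant clone
print(uc50([10] * 10))             # evenly distributed clones
```

A low UC50 relative to the number of detected clones points to a sample dominated by a few expanded clones, while a high value indicates a polyclonal population.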
Affiliation(s)
- Charles C Berry
- Department of Family Medicine and Public Health, UC San Diego, La Jolla, CA 92093, USA
| | - Christopher Nobles
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Emmanuelle Six
- Paris Descartes-Sorbonne Paris Cité University, Imagine Institute, 75015 Paris, France; INSERM 24, Laboratory of Human Lymphohematopoiesis, 75015 Paris, France
| | - Yinghua Wu
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Nirav Malani
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Eric Sherman
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Anatoly Dryga
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - John K Everett
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Frances Male
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Aubrey Bailey
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Kyle Bittinger
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Mary J Drake
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Laure Caccavelli
- Biotherapy Department, Necker Children's Hospital, Assistance Publique-Hôpitaux de Paris, 75014 Paris, France; Biotherapy Clinical Investigation Center, Groupe Hospitalier Universitaire Ouest, Assistance Publique-Hôpitaux de Paris, INSERM, 75014 Paris, France
| | - Paul Bates
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
| | - Salima Hacein-Bey-Abina
- Biotherapy Department, Necker Children's Hospital, Assistance Publique-Hôpitaux de Paris, 75014 Paris, France; Biotherapy Clinical Investigation Center, Groupe Hospitalier Universitaire Ouest, Assistance Publique-Hôpitaux de Paris, INSERM, 75014 Paris, France
| | - Marina Cavazzana
- Biotherapy Department, Necker Children's Hospital, Assistance Publique-Hôpitaux de Paris, 75014 Paris, France; Biotherapy Clinical Investigation Center, Groupe Hospitalier Universitaire Ouest, Assistance Publique-Hôpitaux de Paris, INSERM, 75014 Paris, France
| | - Frederic D Bushman
- Department of Microbiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104-6076, USA
|
15
|
Jackson R, Rosa BA, Lameiras S, Cuninghame S, Bernard J, Floriano WB, Lambert PF, Nicolas A, Zehbe I. Functional variants of human papillomavirus type 16 demonstrate host genome integration and transcriptional alterations corresponding to their unique cancer epidemiology. BMC Genomics 2016; 17:851. [PMID: 27806689 PMCID: PMC5094076 DOI: 10.1186/s12864-016-3203-3] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 05/19/2016] [Accepted: 10/25/2016] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND Human papillomaviruses (HPVs) are a worldwide burden as they are a widespread group of tumour viruses in humans. Having a tropism for mucosal tissues, high-risk HPVs are detected in nearly all cervical cancers. HPV16 is the most common high-risk type but not all women infected with high-risk HPV develop a malignant tumour. Likely relevant, HPV genomes are polymorphic and some HPV16 single nucleotide polymorphisms (SNPs) are under evolutionary constraint instigating variable oncogenicity and immunogenicity in the infected host. RESULTS To investigate the tumourigenicity of two common HPV16 variants, we used our recently developed, three-dimensional organotypic model reminiscent of the natural HPV infectious cycle and conducted various "omics" and bioinformatics approaches. Based on epidemiological studies we chose to examine the HPV16 Asian-American (AA) and HPV16 European Prototype (EP) variants. They differ by three non-synonymous SNPs in the transforming and virus-encoded E6 oncogene where AAE6 is classified as a high- and EPE6 as a low-risk variant. Remarkably, the high-risk AAE6 variant genome integrated into the host DNA, while the low-risk EPE6 variant genome remained episomal as evidenced by highly sensitive Capt-HPV sequencing. RNA-seq experiments showed that the truncated form of AAE6, integrated in chromosome 5q32, produced a local gene over-expression and a large variety of viral-human fusion transcripts, including long distance spliced transcripts. In addition, differential enrichment of host cell pathways was observed between both HPV16 E6 variant-containing epithelia. Finally, in the high-risk variant, we detected a molecular signature of host chromosomal instability, a common property of cancer cells. CONCLUSIONS We show how naturally occurring SNPs in the HPV16 E6 oncogene cause significant changes in the outcome of HPV infections and subsequent viral and host transcriptome alterations prone to drive carcinogenesis. 
Host genome instability is closely linked to viral integration into the host genome of HPV-infected cells, which is a key phenomenon for malignant cellular transformation and the reason for uncontrolled E6 oncogene expression. In particular, the finding of variant-specific integration potential represents a new paradigm in HPV variant biology.
Affiliation(s)
- Robert Jackson
- Probe Development and Biomarker Exploration, Thunder Bay Regional Research Institute, Thunder Bay, Ontario, Canada; Biotechnology Program, Lakehead University, Thunder Bay, Ontario, Canada
| | - Bruce A Rosa
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
| | - Sonia Lameiras
- NGS platform, Institut Curie, PSL Research University, 26 rue d'Ulm, 75248, Paris, Cedex, France
| | - Sean Cuninghame
- Probe Development and Biomarker Exploration, Thunder Bay Regional Research Institute, Thunder Bay, Ontario, Canada; Northern Ontario School of Medicine, Lakehead University, Thunder Bay, Ontario, Canada
| | - Josee Bernard
- Probe Development and Biomarker Exploration, Thunder Bay Regional Research Institute, Thunder Bay, Ontario, Canada; Department of Biology, Lakehead University, Thunder Bay, Ontario, Canada
| | - Wely B Floriano
- Department of Chemistry, Lakehead University, Thunder Bay, Ontario, Canada
| | - Paul F Lambert
- McArdle Laboratory for Cancer Research, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA
| | - Alain Nicolas
- Institut Curie, PSL Research University, Centre National de la Recherche Scientifique UMR3244, Sorbonne Universités, Paris, France
| | - Ingeborg Zehbe
- Probe Development and Biomarker Exploration, Thunder Bay Regional Research Institute, Thunder Bay, Ontario, Canada; Northern Ontario School of Medicine, Lakehead University, Thunder Bay, Ontario, Canada; Department of Biology, Lakehead University, Thunder Bay, Ontario, Canada
|
16
|
Shaw AM, Joseph GL, Jasti AC, Sastry-Dent L, Witting S, Cornetta K. Differences in vector-genome processing and illegitimate integration of non-integrating lentiviral vectors. Gene Ther 2016; 24:12-20. [PMID: 27682478 PMCID: PMC5269419 DOI: 10.1038/gt.2016.69] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 05/25/2016] [Revised: 09/07/2016] [Accepted: 09/12/2016] [Indexed: 12/13/2022]
Abstract
A variety of mutations in lentiviral vector expression systems have been shown to generate a non-integrating phenotype. We studied a novel 12 base-pair U3-long terminal repeats (LTR) integrase (IN) attachment site deletion (U3-LTR att site) mutant and found similar physical titers to the previously reported IN catalytic core mutant IN/D116N. Both mutations led to a greater than two log reduction in vector integration; with IN/D116N providing lower illegitimate integration frequency, whereas the U3-LTR att site mutant provided a higher level of transgene expression. The improved expression of the U3-LTR att site mutant could not be explained solely based on an observed modest increase in integration frequency. In evaluating processing, we noted significant differences in unintegrated vector forms, with the U3-LTR att site mutant leading to a predominance of 1-LTR circles. The mutations also differed in the manner of illegitimate integration. The U3-LTR att site mutant vector demonstrated IN-mediated integration at the intact U5-LTR att site and non-IN-mediated integration at the mutated U3-LTR att site. Finally, we combined a variety of mutations and modifications and assessed transgene expression and integration frequency to show that combining modifications can improve the potential clinical utility of non-integrating lentiviral vectors.
Affiliation(s)
- A M Shaw
- Departments of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
| | - G L Joseph
- Departments of Microbiology, Indiana University School of Medicine, Indianapolis, IN, USA
| | - A C Jasti
- Departments of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
| | - L Sastry-Dent
- Departments of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
| | - S Witting
- Department of Experimental Hematology and Cancer Biology, Cincinnati Children's Hospital, Cincinnati, OH, USA
| | - K Cornetta
- Departments of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA; Departments of Microbiology, Indiana University School of Medicine, Indianapolis, IN, USA; Departments of Medicine, Indiana University School of Medicine, Indianapolis, IN, USA
|
17
|
Hocum JD, Battrell LR, Maynard R, Adair JE, Beard BC, Rawlings DJ, Kiem HP, Miller DG, Trobridge GD. VISA--Vector Integration Site Analysis server: a web-based server to rapidly identify retroviral integration sites from next-generation sequencing. BMC Bioinformatics 2015; 16:212. [PMID: 26150117 PMCID: PMC4493804 DOI: 10.1186/s12859-015-0653-6] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 11/13/2014] [Accepted: 06/29/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Analyzing the integration profile of retroviral vectors is a vital step in determining their potential genotoxic effects and developing safer vectors for therapeutic use. Identifying retroviral vector integration sites is also important for retroviral mutagenesis screens. RESULTS We developed VISA, a vector integration site analysis server, to analyze next-generation sequencing data for retroviral vector integration sites. Sequence reads that contain a provirus are mapped to the human genome, sequence reads that cannot be localized to a unique location in the genome are filtered out, and then unique retroviral vector integration sites are determined based on the alignment scores of the remaining sequence reads. CONCLUSIONS VISA offers a simple web interface to upload sequence files and results are returned in a concise tabular format to allow rapid analysis of retroviral vector integration sites.
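The filtering logic summarized in the abstract above (discard reads that cannot be placed at a unique genomic location, then derive unique integration sites from the remaining alignments) can be sketched as follows. This is not the VISA server code: the function name `unique_sites`, the input layout, and the rule that a read is ambiguous when its best alignment score is tied are all assumptions made for illustration.

```python
from collections import Counter


def unique_sites(read_alignments):
    """Collapse uniquely placed reads into integration sites with read counts.

    read_alignments: dict mapping read id -> list of candidate alignments,
    each a (chrom, pos, strand, score) tuple (hypothetical layout).
    """
    counts = Counter()
    for hits in read_alignments.values():
        if not hits:
            continue  # read did not align anywhere
        best = max(hit[3] for hit in hits)
        top = [hit for hit in hits if hit[3] == best]
        if len(top) != 1:
            continue  # best score is tied: location is ambiguous, drop read
        chrom, pos, strand, _ = top[0]
        counts[(chrom, pos, strand)] += 1
    return dict(counts)


read_alignments = {
    "r1": [("chr1", 100, "+", 60)],
    "r2": [("chr1", 100, "+", 55), ("chr9", 7, "-", 40)],
    "r3": [("chr2", 500, "-", 50), ("chrX", 20, "+", 50)],  # tie, dropped
    "r4": [("chr5", 900, "+", 58)],
}
sites = unique_sites(read_alignments)
print(sites)
```

The result is a small table of site coordinates and supporting read counts, in the spirit of the concise tabular output the abstract describes.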
Affiliation(s)
- Jonah D Hocum
- Department of Pharmaceutical Sciences, Washington State University, Spokane, WA, 99210, USA.
| | - Logan R Battrell
- Department of Pharmaceutical Sciences, Washington State University, Spokane, WA, 99210, USA.
| | - Ryan Maynard
- Department of Pharmaceutical Sciences, Washington State University, Spokane, WA, 99210, USA.
| | - Jennifer E Adair
- Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA.
| | - Brian C Beard
- Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA.
| | - David J Rawlings
- Department of Pediatrics, University of Washington, Seattle, WA, 98195, USA.
| | - Hans-Peter Kiem
- Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA.
| | - Daniel G Miller
- Department of Pediatrics, University of Washington, Seattle, WA, 98195, USA.
| | - Grant D Trobridge
- Department of Pharmaceutical Sciences, Washington State University, Spokane, WA, 99210, USA; School of Molecular Biosciences, Washington State University, Pullman, WA, 99164, USA.
|
18
|
Unraveling the web of viroinformatics: computational tools and databases in virus research. J Virol 2014; 89:1489-501. [PMID: 25428870 DOI: 10.1128/jvi.02027-14] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.0] [Indexed: 02/08/2023] Open
Abstract
The beginning of the second century of research in the field of virology (the first virus was discovered in 1898) was marked by its amalgamation with bioinformatics, resulting in the birth of a new domain--viroinformatics. The availability of more than 100 Web servers and databases embracing all or specific viruses (for example, dengue virus, influenza virus, hepatitis virus, human immunodeficiency virus [HIV], hemorrhagic fever virus [HFV], human papillomavirus [HPV], West Nile virus, etc.) as well as distinct applications (comparative/diversity analysis, viral recombination, small interfering RNA [siRNA]/short hairpin RNA [shRNA]/microRNA [miRNA] studies, RNA folding, protein-protein interaction, structural analysis, and phylotyping and genotyping) will definitely aid the development of effective drugs and vaccines. However, information about their access and utility is not available at any single source or on any single platform. Therefore, a compendium of various computational tools and resources dedicated specifically to virology is presented in this article.
|
19
|
Calabria A, Leo S, Benedicenti F, Cesana D, Spinozzi G, Orsini M, Merella S, Stupka E, Zanetti G, Montini E. VISPA: a computational pipeline for the identification and analysis of genomic vector integration sites. Genome Med 2014; 6:67. [PMID: 25342980 PMCID: PMC4169225 DOI: 10.1186/s13073-014-0067-5] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 05/10/2014] [Accepted: 08/22/2014] [Indexed: 11/10/2022] Open
Abstract
The analysis of the genomic distribution of viral vector integration sites is a key step in hematopoietic stem cell-based gene therapy applications, making it possible to assess both the safety and the efficacy of the treatment and to study the basic aspects of hematopoiesis and stem cell biology. Identifying vector integration sites requires ad-hoc bioinformatics tools with stringent requirements in terms of computational efficiency, flexibility, and usability. We developed VISPA (Vector Integration Site Parallel Analysis), a pipeline for automated integration site identification and annotation based on a distributed environment with a simple Galaxy web interface. VISPA was successfully used for the bioinformatics analysis of the follow-up of two lentiviral vector-based hematopoietic stem-cell gene therapy clinical trials. Our pipeline provides a reliable and efficient tool to assess the safety and efficacy of integrating vectors in clinical settings.
Affiliation(s)
- Andrea Calabria
- San Raffaele Telethon Institute for Gene Therapy (TIGET), San Raffaele Scientific Institute, 20132 Milano, Italy
| | - Simone Leo
- Center for Advanced Studies, Research and Development in Sardinia (CRS4), 09010 Pula, CA Italy; Università degli Studi di Cagliari, 09124 Cagliari, Italy
| | - Fabrizio Benedicenti
- San Raffaele Telethon Institute for Gene Therapy (TIGET), San Raffaele Scientific Institute, 20132 Milano, Italy
| | - Daniela Cesana
- San Raffaele Telethon Institute for Gene Therapy (TIGET), San Raffaele Scientific Institute, 20132 Milano, Italy
| | - Giulio Spinozzi
- San Raffaele Telethon Institute for Gene Therapy (TIGET), San Raffaele Scientific Institute, 20132 Milano, Italy; Department of Informatics, Systems and Communication (DISCo) - University of Milano-Bicocca, Milano, Italy
| | - Massimilano Orsini
- Center for Advanced Studies, Research and Development in Sardinia (CRS4), 09010 Pula, CA Italy
| | - Stefania Merella
- Center for Translational Genomics and Bioinformatics, San Raffaele Scientific Institute, Via Olgettina 58, 20132 Milano, Italy
| | - Elia Stupka
- Center for Translational Genomics and Bioinformatics, San Raffaele Scientific Institute, Via Olgettina 58, 20132 Milano, Italy
| | - Gianluigi Zanetti
- Center for Advanced Studies, Research and Development in Sardinia (CRS4), 09010 Pula, CA Italy
| | - Eugenio Montini
- San Raffaele Telethon Institute for Gene Therapy (TIGET), San Raffaele Scientific Institute, 20132 Milano, Italy
|
20
|
Gao H, Hawkins T, Jasti A, Chen YH, Mockaitis K, Dinauer M, Cornetta K. Development and Evaluation of Quality Metrics for Bioinformatics Analysis of Viral Insertion Site Data Generated Using High Throughput Sequencing. Biomedicines 2014; 2:195-210. [PMID: 28548067 PMCID: PMC5423470 DOI: 10.3390/biomedicines2020195] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/25/2013] [Revised: 03/26/2014] [Accepted: 04/28/2014] [Indexed: 11/18/2022] Open
Abstract
Integration of viral vectors into a host genome is associated with insertional mutagenesis and subjects in clinical gene therapy trials must be monitored for this adverse event. Several PCR based methods such as ligase-mediated (LM) PCR, linear-amplification-mediated (LAM) PCR and non-restrictive (nr) LAM PCR were developed to identify sites of vector integration. Coupling the power of next-generation sequencing technologies with various PCR approaches will provide a comprehensive and genome-wide profiling of insertion sites and increase throughput. In this bioinformatics study, we aimed to develop and apply quality metrics to viral insertion data obtained using next-generation sequencing. We developed five simple metrics for assessing next-generation sequencing data from different PCR products and showed how the metrics can be used to objectively compare runs performed with the same methodology as well as data generated using different PCR techniques. The results will help researchers troubleshoot complex methodologies, understand the quality of sequencing data, and provide a starting point for developing standardization of vector insertion site data analysis.
Affiliation(s)
- Hongyu Gao
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, IB 130, 975 West Walnut Street, Indianapolis, IN 46202, USA.
| | - Troy Hawkins
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, IB 130, 975 West Walnut Street, Indianapolis, IN 46202, USA.
| | - Aparna Jasti
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, IB 130, 975 West Walnut Street, Indianapolis, IN 46202, USA.
| | - Yu-Hsiang Chen
- Department of Biology, Indiana University-Purdue University, Indianapolis, IN 46202, USA.
| | - Keithanne Mockaitis
- Department of Biology and Center for Genomics and Bioinformatics, Indiana University, Bloomington, IN 47405-3700, USA.
| | - Mary Dinauer
- Department of Pediatrics and Pathology and Immunology, Washington University, St. Louis, MO 63110, USA.
| | - Kenneth Cornetta
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, IB 130, 975 West Walnut Street, Indianapolis, IN 46202, USA.
- Departments of Medicine, Indiana University School of Medicine, Indianapolis, IN 46202, USA.
- Microbiology and Immunology, Indiana University School of Medicine, Indianapolis, IN 46202, USA.
|
21
|
Li JW, Wan R, Yu CS, Co NN, Wong N, Chan TF. ViralFusionSeq: accurately discover viral integration events and reconstruct fusion transcripts at single-base resolution. Bioinformatics 2013; 29:649-51. [PMID: 23314323 PMCID: PMC3582262 DOI: 10.1093/bioinformatics/btt011] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.0] [Indexed: 12/29/2022]
Abstract
SUMMARY Insertional mutagenesis from virus infection is an important pathogenic risk for the development of cancer. Despite the advent of high-throughput sequencing, discovery of viral integration sites and expressed viral fusion events is still limited. Here, we present ViralFusionSeq (VFS), which combines soft-clipping information, read-pair analysis and targeted de novo assembly to discover and annotate viral-human fusions. VFS was used in an RNA-Seq experiment, simulated DNA-Seq experiment and re-analysis of published DNA-Seq datasets. Our experiments demonstrated that VFS is both sensitive and highly accurate. AVAILABILITY VFS is distributed under GPL version 3 at http://hkbic.cuhk.edu.hk/software/viralfusionseq
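Soft-clipping information, one of the three signals named in the summary above, is recorded in the CIGAR string of a SAM alignment: an `S` operation at either end marks read bases that did not align and may therefore span a viral-human junction. The sketch below only extracts the lengths of those clipped ends; it is not the VFS implementation, and the function name `soft_clips` is hypothetical.

```python
import re

# One CIGAR element: a length followed by a SAM operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar):
    """Return (left, right) soft-clip lengths for a SAM CIGAR string."""
    ops = [(int(length), op) for length, op in CIGAR_RE.findall(cigar)]
    # Hard clips (H) may sit outside soft clips; ignore them here.
    core = [item for item in ops if item[1] != "H"]
    left = core[0][0] if core and core[0][1] == "S" else 0
    right = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return left, right


print(soft_clips("30S70M"))       # clipped at the start of the read
print(soft_clips("5H20S60M15S"))  # clipped at both ends
```

A read with a long soft clip at one end is a candidate breakpoint read; the clipped bases can then be realigned against the viral reference to test whether they support a fusion.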
Affiliation(s)
- Jing-Woei Li
- School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong
|
22
|
Cornetta K, Tessanne K, Long C, Yao J, Satterfield C, Westhusin M. Transgenic sheep generated by lentiviral vectors: safety and integration analysis of surrogates and their offspring. Transgenic Res 2012. [PMID: 23180364 DOI: 10.1007/s11248-012-9674-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/28/2022]
Abstract
The safety of HIV-1 based vectors was evaluated during the production of transgenic sheep. Vectors were introduced into the perivitelline space of in vivo derived one-cell sheep embryos by microinjection then transferred into the oviducts of recipient females. At 60-70 days of gestation, a portion of the recipients were euthanized and tissues collected from both surrogates and fetuses. Other ewes were allowed to carry lambs to term. Inadvertent transfer of vector from offspring to surrogates was evaluated in 330 blood and tissue samples collected from 57 ewes that served as embryo recipients. Excluding uterine contents, none of the samples tested positive for vector, indicating that the vector did not cross the fetal maternal interface and infect surrogate ewes. In the evaluation of ewes, fetuses and lambs for replication competent lentivirus (RCL), 84 serum samples analyzed for HIV-1 capsid by ELISA and over 600 blood and tissue samples analyzed by quantitative PCR for the VSV-G envelope revealed no evidence of RCL. Results of these experiments provide further evidence as to the safety of HIV-1 based vectors in animal and human applications.
Affiliation(s)
- Kenneth Cornetta
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, IB 130, 975 W. Walnut St., Indianapolis, IN, 4620, USA.
|
23
|
Tarantal AF, Skarlatos SI. Center for fetal monkey gene transfer for heart, lung, and blood diseases: an NHLBI resource for the gene therapy community. Hum Gene Ther 2012; 23:1130-5. [PMID: 22974119 PMCID: PMC3498881 DOI: 10.1089/hum.2012.178] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 09/08/2012] [Accepted: 09/12/2012] [Indexed: 12/17/2022] Open
Abstract
The goals of the National Heart, Lung, and Blood Institute (NHLBI) Center for Fetal Monkey Gene Transfer for Heart, Lung, and Blood Diseases are to conduct gene transfer studies in monkeys to evaluate safety and efficiency; and to provide NHLBI-supported investigators with expertise, resources, and services to actively pursue gene transfer approaches in monkeys in their research programs. NHLBI-supported projects span investigators throughout the United States and have addressed novel approaches to gene delivery; "proof-of-principle"; assessed whether findings in small-animal models could be demonstrated in a primate species; or were conducted to enable new grant or IND submissions. The Center for Fetal Monkey Gene Transfer for Heart, Lung, and Blood Diseases successfully aids the gene therapy community in addressing regulatory barriers, and serves as an effective vehicle for advancing the field.
Affiliation(s)
- Alice F Tarantal
- Center for Fetal Monkey Gene Transfer for Heart, Lung, and Blood Diseases, University of California, Davis, 95616, USA.
|
24
|
Gabriel R, Schmidt M, von Kalle C. Integration of retroviral vectors. Curr Opin Immunol 2012; 24:592-7. [DOI: 10.1016/j.coi.2012.08.006] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 06/01/2012] [Accepted: 08/23/2012] [Indexed: 11/26/2022]
|
25
|
Huston MW, Brugman MH, Horsman S, Stubbs A, van der Spek P, Wagemaker G. Comprehensive investigation of parameter choice in viral integration site analysis and its effects on the gene annotations produced. Hum Gene Ther 2012; 23:1209-19. [PMID: 22909036 DOI: 10.1089/hum.2011.037] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/12/2022]
Abstract
Introducing therapeutic genes into hematopoietic stem cells using retroviral vector-mediated gene transfer is an effective treatment for monogenic diseases. The risks of therapeutic gene integration include aberrant expression of a neighboring gene, resulting in oncogenesis at low frequencies (10^-7 to 10^-6 per transduced cell). Mechanisms governing insertional mutagenesis are the subject of intensive ongoing studies that produce large amounts of sequencing data representing genomic regions flanking viral integration sites (IS). Validating and analyzing these data require automated bioinformatics applications. The exact methods used vary between applications, based on the requirements and preferences of the designer. The parameters used to analyze sequence data are capable of shaping the resulting integration site annotations, but a comprehensive examination of these effects is lacking. Here we present a web-based tool for integration site analysis, called Methods for Analyzing ViRal Integration Collections (MAVRIC), and use its highly customizable interface to look at how IS annotations can vary based on the analysis parameters. We used the integration data of the previously published adenosine deaminase severe combined immunodeficiency (ADA-SCID) gene therapy trials for evaluation of MAVRIC. The output illustrates how MAVRIC allows for direct multiparameter comparison of integration patterns. Careful analysis of the SCID data and reanalyses using different parameters for trimming, alignment, and repeat masking revealed the degree of variation that can be expected to arise due to changes in these parameters. We observed mainly small differences in annotation, with the largest effects caused by masking repeat sequences and by changing the size of the window around the IS.
Affiliation(s)
- Marshall W Huston
- Department of Hematology, Erasmus University Medical Center, GE Rotterdam, The Netherlands
|
26
|
Arens A, Appelt JU, Bartholomae CC, Gabriel R, Paruzynski A, Gustafson D, Cartier N, Aubourg P, Deichmann A, Glimm H, von Kalle C, Schmidt M. Bioinformatic clonality analysis of next-generation sequencing-derived viral vector integration sites. Hum Gene Ther Methods 2012; 23:111-8. [PMID: 22559057 DOI: 10.1089/hgtb.2011.219] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Indexed: 01/15/2023]
Abstract
Clonality analysis of viral vector-transduced cell populations represents a convincing approach to dissect the physiology of tissue and organ regeneration, to monitor the fate of individual gene-corrected cells in vivo, and to assess vector biosafety. With the decoding of mammalian genomes and the introduction of next-generation sequencing technologies, the demand for automated bioinformatic analysis tools that can rapidly process and annotate vector integration sites is rising. Here, we provide a publicly accessible, graphical user interface-guided automated bioinformatic high-throughput integration site analysis pipeline. Its performance and key features are illustrated on pyrosequenced linear amplification-mediated PCR products derived from one patient previously enrolled in the first lentiviral vector clinical gene therapy study. Analysis includes trimming of vector genome junctions, alignment of genomic sequence fragments to the host genome for the identification of integration sites, and the annotation of nearby genomic elements. Most importantly, clinically relevant features comprise the determination of identical integration sites with respect to different time points or cell lineages, as well as the retrieval of the most prominent cell clones and common integration sites. The resulting output is summarized in tables within a convenient spreadsheet and can be further processed by researchers without profound bioinformatic knowledge.
Affiliation(s)
- Anne Arens
- Department of Translational Oncology, National Center for Tumor Diseases and German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
|
27
|
Urbański DF, Małolepszy A, Stougaard J, Andersen SU. Genome-wide LORE1 retrotransposon mutagenesis and high-throughput insertion detection in Lotus japonicus. Plant J 2012; 69:731-41. [PMID: 22014280 DOI: 10.1111/j.1365-313x.2011.04827.x] [Citation(s) in RCA: 107] [Impact Index Per Article: 8.9] [Indexed: 05/21/2023]
Abstract
Use of insertion mutants facilitates functional analysis of genes, but it has been difficult to identify a suitable mutagen and to establish large populations for reverse genetics in most plant species. The main challenge is developing efficient high-throughput procedures for both mutagenesis and identification of insertion sites. To date, only floral-dip T-DNA transformation of Arabidopsis has produced independent germinal insertions, thereby allowing generation of mutant populations from seeds of single plants. In addition, advances in insertion detection have been hampered by a lack of protocols, including software for automated data analysis, that take full advantage of high-throughput next-generation sequencing. We have addressed these challenges by developing the FSTpoolit protocol and software package, and here we demonstrate its efficacy by detecting 8935 LORE1 insertions in 3744 Lotus japonicus plants. The identified insertions show that the endogenous LORE1 retrotransposon is well suited for insertion mutagenesis due to homogenous gene targeting and exonic insertion preference. As LORE1 transposition occurs in the germline, harvesting seeds from a single founder line and cultivating progeny generates a complete mutant population. This ease of LORE1 mutagenesis, combined with the efficient FSTpoolit protocol, which exploits 2D pooling, Illumina sequencing and automated data analysis, allows highly cost-efficient development of a comprehensive reverse genetic resource.
Affiliation(s)
- Dorian Fabian Urbański
- Centre for Carbohydrate Recognition and Signalling, Department of Molecular Biology, Aarhus University, Gustav Wieds Vej 10, DK-8000 Aarhus C, Denmark
|
28
|
Bartholomae CC, Glimm H, von Kalle C, Schmidt M. Insertion site pattern: global approach by linear amplification-mediated PCR and mass sequencing. Methods Mol Biol 2012; 859:255-265. [PMID: 22367877 DOI: 10.1007/978-1-61779-603-6_15] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 05/31/2023]
Abstract
In gene therapy, viral or nonviral integrating vectors are used to deliver a corrected gene to replace the corresponding defective cellular gene. As vector delivery is not yet commonly targeted to a specific site in the host genome, and vector integration may lead to unwanted cellular gene deregulation, the comprehensive analysis of vector locations is a crucial approach to assess vector biosafety and to follow the fate of the gene-corrected cells in vivo. The retrieved vector integration sites are unique for each transduced cell clone, thereby serving as a molecular marker and allowing distinct cell clones to be tracked in various samples. Today, several PCR-based methods are available for the identification and characterization of unknown flanking DNA sequences (Mueller and Wold Science 246:780-786, 1989; Paruzynski et al. Nat Protoc 5:1379-1395, 2010; Schmidt et al. Nat Methods 4:1051-1057, 2007; Silver and Keerikatte J Virol 63:1924-1928, 1989). Of these, linear amplification-mediated PCR (LAM-PCR) has proved to exhibit the highest sensitivity, allowing the detection of miscellaneous vector integration sites in one sample. The broad application spectrum and robustness of LAM-PCR have been demonstrated by its use as a tool for the molecular follow-up of gene-modified cells in preclinical and clinical gene therapy trials (Li et al. Science 296:497, 2002; Cartier et al. Science 326:818-823, 2009; Ott et al. Nat Med 12:401-409, 2006; Deichmann et al. J Clin Invest 117:2225-2232, 2007). The combination of LAM-PCR and next-generation sequencing (NGS) platforms offers the opportunity to study the clonal inventory and pharmacokinetics in clinical gene therapy studies.
Affiliation(s)
- Cynthia C Bartholomae
- Department of Translational Oncology, National Center of Tumor Diseases and German Cancer Research Center, Heidelberg, Germany
|
29
|
Paruzynski A, Glimm H, Schmidt M, von Kalle C. Analysis of the clonal repertoire of gene-corrected cells in gene therapy. Methods Enzymol 2012; 507:59-87. [PMID: 22365769 DOI: 10.1016/b978-0-12-386509-0.00004-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 01/04/2023]
Abstract
Clinical phase I/II gene therapy studies using integrating retroviral vectors have successfully treated several monogenic inherited diseases. However, with increased efficiency of this therapy, severe side effects occurred in various gene therapy trials. In all cases, integration of the vector close to or within a proto-oncogene contributed substantially to the development of the malignancies. Thus, in-depth analysis of integration site patterns is highly important for uncovering potential clonal outgrowth and for assessing the safety of gene transfer vectors and gene therapy protocols. Standard linear amplification-mediated PCR (LAM-PCR) and nonrestrictive LAM-PCR (nrLAM-PCR), in combination with high-throughput sequencing, allow comprehensive analysis of the clonal repertoire of gene-corrected cells and assessment of the safety of the vector system at an early stage on the molecular level. They help clarify the biological consequences of the vector system for the fate of the transduced cell. Furthermore, downstream real-time PCR allows a quantitative estimation of the clonality of individual cells and their clonal progeny. Here, we present a guideline that should allow researchers to perform comprehensive integration site analysis in preclinical and clinical studies.
Affiliation(s)
- Anna Paruzynski
- Department of Translational Oncology, National Center for Tumor Diseases (NCT) and German Cancer Research Center (DKFZ), Im Neuenheimer Feld 581 and 460, Heidelberg, Germany
|
30
|
Zhao X, Liu Q, Cai Q, Li Y, Xu C, Li Y, Li Z, Zhang X. Dr.VIS: a database of human disease-related viral integration sites. Nucleic Acids Res 2011; 40:D1041-6. [PMID: 22135288 PMCID: PMC3245036 DOI: 10.1093/nar/gkr1142] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Indexed: 01/19/2023]
Abstract
Viral integration plays an important role in the development of malignant diseases. Viruses differ in preferred integration site and flanking sequence. Viral integration sites (VIS) have been found next to oncogenes and common fragile sites. Understanding the typical DNA features near VIS is useful for the identification of potential oncogenes, prediction of malignant disease development and assessing the probability of malignant transformation in gene therapy. Therefore, we have built a database of human disease-related VIS (Dr.VIS, http://www.scbit.org/dbmi/drvis) to collect and maintain human disease-related VIS data, including characteristics of the malignant disease, chromosome region, genomic position and viral–host junction sequence. The current build of Dr.VIS covers about 600 natural VIS of 5 oncogenic viruses representing 11 diseases. Among them, about 200 VIS have viral–host junction sequence.
Affiliation(s)
- Xin Zhao
- School of Life Sciences and Technology, Tongji University, 1239 Siping Road, Shanghai, China
|