1
Merelli I, Beretta S, Cesana D, Gennari A, Benedicenti F, Spinozzi G, Cesini D, Montini E, D’Agostino D, Calabria A. InCliniGene enables high-throughput and comprehensive in vivo clonal tracking toward clinical genomics data integration. Database (Oxford) 2023; 2023:baad069. [PMID: 37935583 PMCID: PMC10630073 DOI: 10.1093/database/baad069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 08/15/2023] [Accepted: 10/04/2023] [Indexed: 11/09/2023]
Abstract
High-throughput clonal tracking in patients under hematopoietic stem cell gene therapy with integrating vectors is instrumental in assessing bio-safety and efficacy. Monitoring the fate of millions of transplanted clones and their progeny across differentiation and proliferation over time leverages the identification of the vector integration sites, used as surrogates of clonal identity. Although γ-tracking retroviral insertion sites (γ-TRIS) is the state-of-the-art algorithm for clonal identification, the computational drawbacks of its tracking algorithm, which is based on a combinatorial all-versus-all strategy, limit its use in clinical studies with several thousands of samples per patient. We developed the first clonal tracking graph database, InCliniGene (https://github.com/calabrialab/InCliniGene), which imports the output files of γ-TRIS and generates a graph of clones (nodes) connected by arcs whenever two nodes share common genomic features as defined by the γ-TRIS rules. By embedding both clonal data and their connections in the graph, InCliniGene can track all clones longitudinally over samples through data queries that fully explore the graph. This approach proved highly accurate and scalable. We validated InCliniGene using an in vitro dataset, specifically designed to mimic clinical cases, and tested its accuracy and precision. InCliniGene allows extensive use of γ-TRIS in large gene therapy clinical applications and naturally realizes the full integration of molecular and genomics data, clinical and treatment measurements and genomic annotations. Further extensions of InCliniGene with data federation and with an application programming interface will support data mining toward precision, personalized and predictive medicine in gene therapy. Database URL: https://github.com/calabrialab/InCliniGene.
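The graph idea described in this abstract can be illustrated with a minimal Python sketch. It is not InCliniGene's implementation: plain dictionaries stand in for the graph database, the `IntegrationSite` fields and sample labels are hypothetical, and the linking rule (same chromosome and strand, positions within `window` bp) is an assumed stand-in for the γ-TRIS rules, which the abstract does not spell out. Tracking a clone across samples then amounts to collecting a connected component rather than comparing every sample against every other.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationSite:
    sample: str   # hypothetical sample label, e.g. a time point
    chrom: str
    pos: int
    strand: str


def build_graph(sites, window=3):
    """Link two sites on the same chromosome and strand within `window` bp.

    This rule is an assumption for illustration, not the gamma-TRIS rule set.
    """
    graph = {site: set() for site in sites}
    by_locus = defaultdict(list)
    for site in sites:
        by_locus[(site.chrom, site.strand)].append(site)
    for group in by_locus.values():
        group.sort(key=lambda s: s.pos)
        for i, a in enumerate(group):
            for b in group[i + 1:]:
                if b.pos - a.pos > window:
                    break  # sorted by position, so later sites are even farther
                graph[a].add(b)
                graph[b].add(a)
    return graph


def track_clones(graph):
    """Return connected components; each component is treated as one clone."""
    seen, clones = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clones.append(component)
    return clones


sites = [
    IntegrationSite("t1", "chr1", 1000, "+"),
    IntegrationSite("t2", "chr1", 1002, "+"),
    IntegrationSite("t3", "chr1", 1001, "+"),
    IntegrationSite("t1", "chr2", 500, "-"),
]
clones = track_clones(build_graph(sites))
# Samples in which each clone was observed, in a deterministic order.
samples_per_clone = sorted(sorted(s.sample for s in clone) for clone in clones)
```

In this toy input, the three chr1 sites collapse into one clone seen in samples t1, t2 and t3, while the chr2 site remains a separate clone seen only in t1.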
Affiliation(s)
- Stefano Beretta
- San Raffaele Telethon Institute for Gene Therapy, IRCCS Ospedale San Raffaele, Via Olgettina 60, Milano 20132, Italy
- Daniela Cesana
- San Raffaele Telethon Institute for Gene Therapy, IRCCS Ospedale San Raffaele, Via Olgettina 60, Milano 20132, Italy
- Alessandro Gennari
- San Raffaele Telethon Institute for Gene Therapy, IRCCS Ospedale San Raffaele, Via Olgettina 60, Milano 20132, Italy
- Fabrizio Benedicenti
- San Raffaele Telethon Institute for Gene Therapy, IRCCS Ospedale San Raffaele, Via Olgettina 60, Milano 20132, Italy
- Giulio Spinozzi
- San Raffaele Telethon Institute for Gene Therapy, IRCCS Ospedale San Raffaele, Via Olgettina 60, Milano 20132, Italy
- Daniele Cesini
- Centro Nazionale Analisi Fotogrammi (CNAF), Istituto Nazionale di Fisica Nucleare, Viale Carlo Berti Pichat 6/2, Bologna 40127, Italy
- Eugenio Montini
- San Raffaele Telethon Institute for Gene Therapy, IRCCS Ospedale San Raffaele, Via Olgettina 60, Milano 20132, Italy
- Daniele D’Agostino
- Dipartimento di Informatica, Bioingegneria, Robotica e Ingegneria dei Sistemi (DIBRIS), Università degli Studi di Genova, Viale Causa 13, Genoa 16145, Italy
- Institute of Biomedical Technologies, Italian National Research Council, Via Fratelli Cervi 93, Segrate (MI) 20054, Italy
- San Raffaele Telethon Institute for Gene Therapy, IRCCS Ospedale San Raffaele, Via Olgettina 60, Milano 20132, Italy
- Andrea Calabria
- San Raffaele Telethon Institute for Gene Therapy, IRCCS Ospedale San Raffaele, Via Olgettina 60, Milano 20132, Italy
2
Yan A, Baricordi C, Nguyen Q, Barbarossa L, Loperfido M, Biasco L. IS-Seq: a bioinformatics pipeline for integration sites analysis with comprehensive abundance quantification methods. BMC Bioinformatics 2023; 24:286. [PMID: 37464281 DOI: 10.1186/s12859-023-05390-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Accepted: 06/16/2023] [Indexed: 07/20/2023]
Abstract
BACKGROUND Integration site (IS) analysis is a fundamental analytical platform for evaluating the safety and efficacy of viral vector-based preclinical and clinical gene therapy (GT). A handful of groups have developed standardized bioinformatics pipelines to process IS sequencing data, to generate reports, and/or to perform comparative studies across different GT trials. Keeping up with the technological advances in the field of IS analysis, different computational pipelines have been published over the past decade. These pipelines focus on identifying IS from single-read or paired-end sequencing data using either read-based or sonication fragment-based methods, but there is no bioinformatics tool that automatically includes unique molecular identifiers (UMI) for IS abundance estimation and allows multiple quantification methods to be compared in one integrated pipeline. RESULTS Here we present IS-Seq, a bioinformatics pipeline that can process data from paired-end sequencing of both old restriction site-based IS collection methods and new sonication-based IS retrieval systems, while allowing the selection of different abundance estimation methods, including read-based, fragment-based and UMI-based systems. CONCLUSIONS We validated the performance of IS-Seq by testing it against the most popular analytical workflow available in the literature (INSPIIRED) under different scenarios. Lastly, by performing extensive simulation studies and a comprehensive wet-lab assessment of our IS-Seq pipeline, we could show that in clinically relevant scenarios, UMI quantification provides better accuracy than the currently most widely used sonication fragment counts as a method for IS abundance estimation.
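To make the three abundance estimation methods named in this abstract concrete, here is a simplified Python sketch. It is not IS-Seq code: the read tuples, site labels and the `estimate_abundance` helper are hypothetical, fragment-based abundance is approximated by counting distinct fragment lengths per site, and UMI-based abundance by counting distinct UMI sequences per site with no sequencing-error correction.

```python
from collections import defaultdict

# Hypothetical aligned reads: (integration site, fragment length, UMI).
reads = [
    ("chr1:1000+", 150, "AAGT"),
    ("chr1:1000+", 150, "AAGT"),  # PCR duplicate of the read above
    ("chr1:1000+", 150, "CCTA"),  # same fragment length, different UMI
    ("chr1:1000+", 210, "GGAC"),
    ("chr2:500-", 180, "TTGA"),
    ("chr2:500-", 180, "TTGA"),   # PCR duplicate
]


def estimate_abundance(reads):
    """Return read, fragment and UMI counts for each integration site."""
    read_counts = defaultdict(int)
    fragment_lengths = defaultdict(set)
    umis = defaultdict(set)
    for site, fragment_length, umi in reads:
        read_counts[site] += 1                      # read-based: every read counts
        fragment_lengths[site].add(fragment_length)  # fragment-based: distinct lengths
        umis[site].add(umi)                          # UMI-based: distinct identifiers
    return {
        site: {
            "reads": read_counts[site],
            "fragments": len(fragment_lengths[site]),
            "umis": len(umis[site]),
        }
        for site in read_counts
    }


abundance = estimate_abundance(reads)
```

In this toy input, the read count for `chr1:1000+` is inflated by a PCR duplicate, the fragment count merges two molecules that happen to share a length, and the UMI count separates them, which is the kind of difference between methods that the abstract compares.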
Affiliation(s)
- Luca Biasco
- AVROBIO, Inc., Cambridge, MA, USA.
- Infection, Immunity and Inflammation Department, Great Ormond Street Institute of Child Health, University College London, London, UK.
3
Longitudinal single-cell profiling of chemotherapy response in acute myeloid leukemia. Nat Commun 2023; 14:1285. [PMID: 36890137 PMCID: PMC9995364 DOI: 10.1038/s41467-023-36969-0] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 12/16/2021] [Accepted: 02/22/2023] [Indexed: 03/10/2023]
Abstract
Acute myeloid leukemia may be characterized by a fraction of leukemia stem cells (LSCs) that sustain disease propagation eventually leading to relapse. Yet, the contribution of LSCs to early therapy resistance and AML regeneration remains controversial. We prospectively identify LSCs in AML patients and xenografts by single-cell RNA sequencing coupled with functional validation by a microRNA-126 reporter enriching for LSCs. Through nucleophosmin 1 (NPM1) mutation calling or chromosomal monosomy detection in single-cell transcriptomes, we discriminate LSCs from regenerating hematopoiesis, and assess their longitudinal response to chemotherapy. Chemotherapy induced a generalized inflammatory and senescence-associated response. Moreover, we observe heterogeneity within progenitor AML cells, some of which proliferate and differentiate with expression of oxidative-phosphorylation (OxPhos) signatures, while others are OxPhos (low) miR-126 (high) and display enforced stemness and quiescence features. miR-126 (high) LSCs are enriched at diagnosis in chemotherapy-refractory AML and at relapse, and their transcriptional signature robustly stratifies patients for survival in large AML cohorts.
4
Dalwadi DA, Calabria A, Tiyaboonchai A, Posey J, Naugler WE, Montini E, Grompe M. AAV integration in human hepatocytes. Mol Ther 2021; 29:2898-2909. [PMID: 34461297 DOI: 10.1016/j.ymthe.2021.08.031] [Citation(s) in RCA: 82] [Impact Index Per Article: 20.5] [Received: 04/06/2021] [Revised: 08/01/2021] [Accepted: 08/24/2021] [Indexed: 12/17/2022]
Abstract
Recombinant adeno-associated viral (rAAV) vectors are considered promising tools for gene therapy directed at the liver. Whereas rAAV is thought to be an episomal vector, its single-stranded DNA genome is prone to intra- and inter-molecular recombination leading to rearrangements and integration into the host cell genome. Here, we ascertained the integration frequency of rAAV in human hepatocytes transduced either ex vivo or in vivo and subsequently expanded in a mouse model of xenogeneic liver regeneration. Chromosomal rAAV integration events and vector integrity were determined using the capture-PacBio sequencing approach, a long-read next-generation sequencing method that has not previously been used for this purpose. Chromosomal integrations were found at a surprisingly high frequency of 1%-3% both in vitro and in vivo. Importantly, most of the inserted rAAV sequences were heavily rearranged and were accompanied by deletions of the host genomic sequence at the integration site.
Affiliation(s)
- Dhwanil A Dalwadi
- Papé Family Pediatric Research Institute, Department of Pediatrics, Oregon Health and Science University, Portland, OR 97239, USA
- Andrea Calabria
- San Raffaele Telethon Institute for Gene Therapy, IRCCS Ospedale San Raffaele Scientific Institute, Milan, Italy
- Amita Tiyaboonchai
- Papé Family Pediatric Research Institute, Department of Pediatrics, Oregon Health and Science University, Portland, OR 97239, USA
- Jeffrey Posey
- Papé Family Pediatric Research Institute, Department of Pediatrics, Oregon Health and Science University, Portland, OR 97239, USA
- Willscott E Naugler
- Department of Medicine, Division of Gastroenterology and Hepatology, Oregon Health and Science University, Portland, OR 97239, USA
- Eugenio Montini
- San Raffaele Telethon Institute for Gene Therapy, IRCCS Ospedale San Raffaele Scientific Institute, Milan, Italy
- Markus Grompe
- Papé Family Pediatric Research Institute, Department of Pediatrics, Oregon Health and Science University, Portland, OR 97239, USA
5
Afzal S, Fronza R, Schmidt M. VSeq-Toolkit: Comprehensive Computational Analysis of Viral Vectors in Gene Therapy. Mol Ther Methods Clin Dev 2020; 17:752-757. [PMID: 32346552 PMCID: PMC7177155 DOI: 10.1016/j.omtm.2020.03.024] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/09/2020] [Accepted: 03/25/2020] [Indexed: 11/17/2022]
Abstract
Viral vector characterization and analysis are important components for the development of safe gene therapeutic products, elucidating the potential genotoxic and immunogenic effects of vectors and establishing their safety profiles. Here, we present VSeq-Toolkit, which offers varying analysis modes for viral gene therapy data. The first mode determines the undesirable known contaminants and their frequency in viral preparations or other sequencing data. The second mode is designed for the analysis of intra-vector fusion breakpoints and the third mode for unraveling the viral-host fusion events distribution. Analysis modes of our toolkit can be executed independently or together and allow the analysis of multiple viral vectors concurrently. It has been designed and evaluated for the analysis of short read high-throughput sequencing data, including whole-genome or targeted sequencing. VSeq-Toolkit is developed in Perl and Bash programming languages and is available at https://github.com/CompMeth/VSeq-Toolkit.
Affiliation(s)
- Saira Afzal
- Department of Translational Oncology, German Cancer Research Center (DKFZ), National Center for Tumor Diseases (NCT), Heidelberg, Germany
- Corresponding author: Saira Afzal, Department of Translational Oncology (G100), German Cancer Research Center (DKFZ), National Center for Tumor Diseases (NCT), Im Neuenheimer Feld 581, 69120 Heidelberg, Germany.
- Manfred Schmidt
- Department of Translational Oncology, German Cancer Research Center (DKFZ), National Center for Tumor Diseases (NCT), Heidelberg, Germany
- GeneWerk GmbH, Heidelberg, Germany