1
Bolduan F, Müller-Bötticher N, Debnath O, Eichhorn I, Giesecke Y, Wetzel A, Sahay S, Zemojtel T, Jaeger M, Ungethuem U, Roderburg C, Kunze CA, Lehmann A, Horst D, Tacke F, Eils R, Wiedenmann B, Sigal M, Ishaque N. Small intestinal neuroendocrine tumors lack early genomic drivers, acquire DNA repair defects and harbor hallmarks of low REST expression. Sci Rep 2025; 15:17969. [PMID: 40410286 PMCID: PMC12102166 DOI: 10.1038/s41598-025-01912-4] [Received: 12/08/2024] [Accepted: 05/09/2025] [Indexed: 05/25/2025]
Abstract
The tumorigenesis of small intestinal neuroendocrine tumors (siNETs) is not understood, and comprehensive genomic and transcriptomic data sets are limited. Therefore, we performed whole genome and transcriptome analysis of 39 well-differentiated siNET samples. Our genomic data revealed a lack of recurrent driver mutations and demonstrated that multifocal siNETs from individual patients can arise genetically independently. We detected germline mutations in Fanconi anemia DNA repair pathway (FANC) genes, involved in homologous recombination (HR) DNA repair, in 9% of patients and found mutational signatures of defective HR DNA repair in late-stage tumor evolution. Furthermore, transcriptomic analysis revealed low expression of the transcriptional repressor REST. In summary, we identify a novel common transcriptomic signature of siNETs and demonstrate that genomic alterations alone do not explain initial tumor formation, while impaired DNA repair likely contributes to tumor evolution and represents a potential pharmaceutical target in a subset of patients.
Affiliation(s)
- Felix Bolduan
  - Department of Hepatology & Gastroenterology, Charité Universitätsmedizin Berlin, Campus Virchow-Klinikum and Campus Charité Mitte, 13353, Berlin, Germany
  - BIH Charité Junior Digital Clinician Scientist Program, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, BIH Biomedical Innovation Academy, Charitéplatz 1, 10117, Berlin, Germany
- Niklas Müller-Bötticher
  - Center of Digital Health, Berlin Institute of Health at Charité Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 14, 14195, Berlin, Germany
- Olivia Debnath
  - Center of Digital Health, Berlin Institute of Health at Charité Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany
- Ines Eichhorn
  - Department of Hepatology & Gastroenterology, Charité Universitätsmedizin Berlin, Campus Virchow-Klinikum and Campus Charité Mitte, 13353, Berlin, Germany
- Yvonne Giesecke
  - Department of Hepatology & Gastroenterology, Charité Universitätsmedizin Berlin, Campus Virchow-Klinikum and Campus Charité Mitte, 13353, Berlin, Germany
- Alexandra Wetzel
  - Department of Hepatology & Gastroenterology, Charité Universitätsmedizin Berlin, Campus Virchow-Klinikum and Campus Charité Mitte, 13353, Berlin, Germany
- Shashwat Sahay
  - Center of Digital Health, Berlin Institute of Health at Charité Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 14, 14195, Berlin, Germany
- Tomasz Zemojtel
  - Core Facility Genomics, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany
- Marten Jaeger
  - Core Facility Genomics, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany
- Ute Ungethuem
  - Core Facility Genomics, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany
- Christoph Roderburg
  - Department of Gastroenterology, Hepatology and Infectious Diseases, University Hospital Düsseldorf, Medical Faculty of Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Catarina Alisa Kunze
  - Institute of Pathology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, 10117, Berlin, Germany
- Annika Lehmann
  - Institute of Pathology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, 10117, Berlin, Germany
- David Horst
  - Institute of Pathology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, 10117, Berlin, Germany
  - German Cancer Consortium (DKTK), Partner Site Berlin, CCCC (Campus Mitte), Berlin, Germany
- Frank Tacke
  - Department of Hepatology & Gastroenterology, Charité Universitätsmedizin Berlin, Campus Virchow-Klinikum and Campus Charité Mitte, 13353, Berlin, Germany
- Roland Eils
  - Center of Digital Health, Berlin Institute of Health at Charité Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 14, 14195, Berlin, Germany
- Bertram Wiedenmann
  - Department of Hepatology & Gastroenterology, Charité Universitätsmedizin Berlin, Campus Virchow-Klinikum and Campus Charité Mitte, 13353, Berlin, Germany
- Michael Sigal
  - Department of Hepatology & Gastroenterology, Charité Universitätsmedizin Berlin, Campus Virchow-Klinikum and Campus Charité Mitte, 13353, Berlin, Germany
  - Berlin Institute for Medical Systems Biology, Hannoversche Straße 28, 10115, Berlin, Germany
- Naveed Ishaque
  - Center of Digital Health, Berlin Institute of Health at Charité Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany
2
Cuenca-Guardiola J, de la Morena-Barrio B, Corral J, Fernández-Breis JT. Advanced analysis of retrotransposon variation in the human genome with nanopore sequencing using RetroInspector. Sci Rep 2025; 15:14489. [PMID: 40281075 PMCID: PMC12032414 DOI: 10.1038/s41598-025-98847-7] [Received: 06/18/2024] [Accepted: 04/15/2025] [Indexed: 04/29/2025]
Abstract
Transposable elements (TEs) make up 45% of the human genome, are a source of genetic variability that is difficult to detect, and are involved in processes related to gene regulation and disease. Nanopore sequencing is recognized as one of the best technologies for detecting TEs; however, tools for analyzing human TE insertions and deletions with nanopore-based data can be improved. RetroInspector is an easy-to-use, configurable Snakemake pipeline that performs detection, annotation, enrichment, and genotyping of TEs. RetroInspector requires the FASTQ files of the samples and the reference genome to start the identification and analysis of TEs. The user can also set the threshold for the number of supporting reads used in variant filtering. RetroInspector also allows users to compare the results of two samples. Different versions of the reference genome can be used, and the presence of retrotransposition features can be annotated. RetroInspector has been run on three nanopore sequencing datasets and validated experimentally using proprietary and public data with over 80% precision.
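The supporting-read threshold mentioned in this abstract can be illustrated with a minimal sketch. This is not RetroInspector's code: the `TEVariant` record, its fields, the `filter_by_support` helper, and the toy calls are all hypothetical and chosen only for illustration; the actual pipeline's data model and configuration keys may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TEVariant:
    """Hypothetical record for one TE insertion or deletion call."""
    chrom: str
    pos: int
    te_family: str
    supporting_reads: int


def filter_by_support(variants, min_support):
    """Keep only calls backed by at least `min_support` reads."""
    return [v for v in variants if v.supporting_reads >= min_support]


# Made-up calls, not results from any real sample.
calls = [
    TEVariant("chr1", 1_204_311, "Alu", 12),
    TEVariant("chr3", 88_410_027, "L1", 2),
    TEVariant("chr7", 5_610_499, "SVA", 5),
]

kept = filter_by_support(calls, min_support=5)
print([v.te_family for v in kept])  # prints ['Alu', 'SVA']
```

Raising `min_support` trades sensitivity for precision: fewer weakly supported calls survive, at the cost of possibly discarding true low-coverage variants.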
Affiliation(s)
- Javier Cuenca-Guardiola
  - Departamento de Informática y Sistemas, IMIB-Pascual Parrilla, CEIR Campus Mare Nostrum, Universidad de Murcia, 30100, Murcia, Spain
- Belén de la Morena-Barrio
  - Servicio de Hematología, CIBERER-ISCIII, IMIB-Pascual Parrilla, Centro Regional de Hemodonación, Hospital Universitario Morales Meseguer, Universidad de Murcia, 30003, Murcia, Spain
- Javier Corral
  - Servicio de Hematología, CIBERER-ISCIII, IMIB-Pascual Parrilla, Centro Regional de Hemodonación, Hospital Universitario Morales Meseguer, Universidad de Murcia, 30003, Murcia, Spain
- Jesualdo Tomás Fernández-Breis
  - Departamento de Informática y Sistemas, IMIB-Pascual Parrilla, CEIR Campus Mare Nostrum, Universidad de Murcia, 30100, Murcia, Spain
3
Porubsky D, Dashnow H, Sasani TA, Logsdon GA, Hallast P, Noyes MD, Kronenberg ZN, Mokveld T, Koundinya N, Nolan C, Steely CJ, Guarracino A, Dolzhenko E, Harvey WT, Rowell WJ, Grigorev K, Nicholas TJ, Goldberg ME, Oshima KK, Lin J, Ebert P, Watkins WS, Leung TY, Hanlon VCT, McGee S, Pedersen BS, Happ HC, Jeong H, Munson KM, Hoekzema K, Chan DD, Wang Y, Knuth J, Garcia GH, Fanslow C, Lambert C, Lee C, Smith JD, Levy S, Mason CE, Garrison E, Lansdorp PM, Neklason DW, Jorde LB, Quinlan AR, Eberle MA, Eichler EE. Human de novo mutation rates from a four-generation pedigree reference. Nature 2025:10.1038/s41586-025-08922-2. [PMID: 40269156 DOI: 10.1038/s41586-025-08922-2] [Received: 08/05/2024] [Accepted: 03/20/2025] [Indexed: 04/25/2025]
Abstract
Understanding the human de novo mutation (DNM) rate requires complete sequence information. Here, using five complementary short-read and long-read sequencing technologies, we phased and assembled more than 95% of each diploid human genome in a four-generation, twenty-eight-member family (CEPH 1463). We estimate 98-206 DNMs per transmission, including 74.5 de novo single-nucleotide variants, 7.4 non-tandem repeat indels, 65.3 de novo indels or structural variants originating from tandem repeats, and 4.4 centromeric DNMs. Among male individuals, we find 12.4 de novo Y chromosome events per generation. Short tandem repeats and variable-number tandem repeats are the most mutable, with 32 loci exhibiting recurrent mutation through the generations. We accurately assemble 288 centromeres and six Y chromosomes across the generations and demonstrate that the DNM rate varies by an order of magnitude depending on repeat content, length and sequence identity. We show a strong paternal bias (75-81%) for all forms of germline DNM, yet we estimate that 16% of de novo single-nucleotide variants are postzygotic in origin with no paternal bias, including early germline mosaic mutations. We place all this variation in the context of a high-resolution recombination map (~3.4 kb breakpoint resolution) and find no correlation between meiotic crossover and de novo structural variants. These near-telomere-to-telomere familial genomes provide a truth set to understand the most fundamental processes underlying human genetic variation.
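As a quick arithmetic check of the figures in this abstract, the four listed per-transmission components can be summed and compared with the quoted 98-206 range. The sketch assumes the four categories do not overlap and that the male-only Y chromosome events (12.4 per generation) are counted separately; the abstract does not state either point, so the result is only a consistency check, not the authors' calculation.

```python
# Per-transmission averages quoted in the abstract.
dnm_components = {
    "de novo single-nucleotide variants": 74.5,
    "non-tandem-repeat indels": 7.4,
    "tandem-repeat indels or structural variants": 65.3,
    "centromeric DNMs": 4.4,
}

# Assumption: the categories are disjoint, so a plain sum is meaningful.
total = round(sum(dnm_components.values()), 1)

low, high = 98, 206  # range quoted in the abstract
print(total, low <= total <= high)  # prints 151.6 True
```

Under these assumptions the components add up to about 151.6 events, which sits near the middle of the reported range.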
Affiliation(s)
- David Porubsky
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Harriet Dashnow
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
  - Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Thomas A Sasani
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Glennis A Logsdon
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Pille Hallast
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Michelle D Noyes
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Nidhi Koundinya
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Cody J Steely
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
  - Department of Internal Medicine, University of Kentucky College of Medicine, Lexington, KY, USA
- Andrea Guarracino
  - Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- William T Harvey
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Kirill Grigorev
  - Space Biosciences Research Branch, NASA Ames Research Center, Moffett Field, CA, USA
  - Blue Marble Space Institute of Science, Seattle, WA, USA
- Thomas J Nicholas
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Michael E Goldberg
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Keisuke K Oshima
  - Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Jiadong Lin
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Peter Ebert
  - Core Unit Bioinformatics, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University, Düsseldorf, Germany
  - Center for Digital Medicine, Heinrich Heine University, Düsseldorf, Germany
- W Scott Watkins
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Tiffany Y Leung
  - Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
- Vincent C T Hanlon
  - Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
- Sean McGee
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Brent S Pedersen
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Hannah C Happ
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Hyeonsoo Jeong
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Altos Labs, San Diego, CA, USA
- Katherine M Munson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Kendra Hoekzema
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Daniel D Chan
  - Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
- Yanni Wang
  - Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
- Jordan Knuth
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Gage H Garcia
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Charles Lee
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Joshua D Smith
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Christopher E Mason
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
  - The WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY, USA
- Erik Garrison
  - Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Peter M Lansdorp
  - Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
  - Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Deborah W Neklason
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Lynn B Jorde
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Aaron R Quinlan
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
4
Nummi P, Cajuso T, Norri T, Taira A, Kuisma H, Välimäki N, Lepistö A, Renkonen-Sinisalo L, Koskensalo S, Seppälä TT, Ristimäki A, Tahkola K, Mattila A, Böhm J, Mecklin JP, Siili E, Pasanen A, Heikinheimo O, Bützow R, Karhu A, Burns KH, Palin K, Aaltonen LA. Structural features of somatic and germline retrotransposition events in humans. Mob DNA 2025; 16:20. [PMID: 40264183 PMCID: PMC12016303 DOI: 10.1186/s13100-025-00357-w] [Received: 09/20/2024] [Accepted: 04/08/2025] [Indexed: 04/24/2025]
Abstract
BACKGROUND: Transposons are DNA sequences able to move or copy themselves to other genomic locations, leading to insertional mutagenesis. Although transposon-derived sequences account for half of the human genome, most elements are no longer transposition competent. Moreover, transposons are normally repressed through epigenetic silencing in healthy adult tissues but become derepressed in several human cancers, with high activity detected in colorectal cancer. Their impact on non-malignant and malignant tissue, as well as the differences between somatic and germline retrotransposition, remain poorly understood. With new sequencing technologies, including long-read sequencing, we can access intricacies of retrotransposition, such as insertion sequence details and nested repeats, that have previously been challenging to characterize.
RESULTS: In this study, we investigate somatic and germline retrotransposition by analyzing long-read sequencing from 56 colorectal cancers and 112 uterine leiomyomas. We identified 1495 somatic insertions in colorectal samples, while a striking lack of insertions was detected in uterine leiomyomas. Our findings highlight differences between somatic and germline events, such as transposon type distribution, insertion length, and target site preference. Leveraging long-read sequencing, we provide an in-depth analysis of the twin-priming phenomenon, detecting it across transposable element types that remain active in humans, including Alus. Additionally, we detect an abundance of germline transposons in repetitive DNA, along with a relationship between replication timing and insertion target site.
CONCLUSIONS: Our study reveals a stark contrast in somatic transposon activity between colorectal cancers and uterine leiomyomas, and highlights differences between somatic and germline transposition. This suggests potentially different conditions in malignant and non-malignant tissues, as well as in germline and somatic tissues, which could be involved in the transposition process. Long-read sequencing provided important insights into transposon behavior, allowing detailed examination of structural features such as twin priming and nested elements.
Affiliation(s)
- Päivi Nummi
  - Applied Tumor Genomics Research Program, Research Programs Unit, University of Helsinki, Helsinki, 00014, Finland
  - Department of Medical and Clinical Genetics, Medicum, University of Helsinki, Helsinki, 00014, Finland
- Tatiana Cajuso
  - Institute for Molecular Medicine Finland (FIMM), University of Helsinki, 00014, Helsinki, Finland
  - Department of Pathology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, 02115, USA
- Tuukka Norri
  - Applied Tumor Genomics Research Program, Research Programs Unit, University of Helsinki, Helsinki, 00014, Finland
  - Department of Medical and Clinical Genetics, Medicum, University of Helsinki, Helsinki, 00014, Finland
  - Department of Computer Science, University of Helsinki, Helsinki, 00014, Finland
- Aurora Taira
  - Applied Tumor Genomics Research Program, Research Programs Unit, University of Helsinki, Helsinki, 00014, Finland
  - Department of Medical and Clinical Genetics, Medicum, University of Helsinki, Helsinki, 00014, Finland
- Heli Kuisma
  - Applied Tumor Genomics Research Program, Research Programs Unit, University of Helsinki, Helsinki, 00014, Finland
  - Department of Medical and Clinical Genetics, Medicum, University of Helsinki, Helsinki, 00014, Finland
- Niko Välimäki
  - Applied Tumor Genomics Research Program, Research Programs Unit, University of Helsinki, Helsinki, 00014, Finland
  - Department of Medical and Clinical Genetics, Medicum, University of Helsinki, Helsinki, 00014, Finland
- Anna Lepistö
  - Department of Gastrointestinal Surgery, Helsinki University Central Hospital, University of Helsinki, Helsinki, 00290, Finland
- Laura Renkonen-Sinisalo
  - Department of Gastrointestinal Surgery, Helsinki University Central Hospital, University of Helsinki, Helsinki, 00290, Finland
- Selja Koskensalo
  - Department of Gastrointestinal Surgery, Helsinki University Central Hospital, University of Helsinki, Helsinki, 00290, Finland
- Toni T Seppälä
  - Applied Tumor Genomics Research Program, Research Programs Unit, University of Helsinki, Helsinki, 00014, Finland
  - Faculty of Medicine and Health Technology, University of Tampere and TAYS Cancer Centre, Tampere, 33100, Finland
  - Department of Gastroenterology and Alimentary Tract Surgery, Tampere University Hospital, Tampere, 33520, Finland
  - Abdominal Center, Helsinki University Hospital, Helsinki University, Helsinki, 00290, Finland
  - iCAN Digital Precision Cancer Medicine Flagship, University of Helsinki, Helsinki, 00290, Finland
- Ari Ristimäki
  - Applied Tumor Genomics Research Program, Research Programs Unit, University of Helsinki, Helsinki, 00014, Finland
  - Department of Pathology, HUS Diagnostic Center, Helsinki University Hospital and University of Helsinki, Helsinki, 00290, Finland
- Kyösti Tahkola
  - Department of Surgery, Wellbeing Services County of Central Finland / Hospital Nova of Central Finland, Jyväskylä, 40620, Finland
- Anne Mattila
  - Department of Surgery, Wellbeing Services County of Central Finland / Hospital Nova of Central Finland, Jyväskylä, 40620, Finland
- Jan Böhm
  - Department of Surgery, Wellbeing Services County of Central Finland / Hospital Nova of Central Finland, Jyväskylä, 40620, Finland
- Jukka-Pekka Mecklin
  - Department of Science, Well Being Services County of Central Finland, Jyväskylä, 40620, Finland
  - Department of Health Sciences, Faculty of Sport and Health Sciences, University of Jyväskylä, Jyväskylä, 40014, Finland
- Emma Siili
  - Department of Pathology, HUS Diagnostic Center, Helsinki University Hospital and University of Helsinki, Helsinki, 00290, Finland
- Annukka Pasanen
  - Department of Pathology, HUS Diagnostic Center, Helsinki University Hospital and University of Helsinki, Helsinki, 00290, Finland
- Oskari Heikinheimo
  - Department of Obstetrics and Gynecology, University of Helsinki and Helsinki University Hospital, Helsinki, 00290, Finland
- Ralf Bützow
  - Department of Pathology, HUS Diagnostic Center, Helsinki University Hospital and University of Helsinki, Helsinki, 00290, Finland
  - Department of Obstetrics and Gynecology, University of Helsinki and Helsinki University Hospital, Helsinki, 00290, Finland
- Auli Karhu
  - Applied Tumor Genomics Research Program, Research Programs Unit, University of Helsinki, Helsinki, 00014, Finland
  - Department of Medical and Clinical Genetics, Medicum, University of Helsinki, Helsinki, 00014, Finland
- Kathleen H Burns
  - Department of Pathology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, 02115, USA
  - Department of Pathology, Mass General Brigham and Harvard Medical School, Boston, MA, 02115, USA
- Kimmo Palin
  - Applied Tumor Genomics Research Program, Research Programs Unit, University of Helsinki, Helsinki, 00014, Finland
  - Department of Medical and Clinical Genetics, Medicum, University of Helsinki, Helsinki, 00014, Finland
  - iCAN Digital Precision Cancer Medicine Flagship, University of Helsinki, Helsinki, 00290, Finland
- Lauri A Aaltonen
  - Applied Tumor Genomics Research Program, Research Programs Unit, University of Helsinki, Helsinki, 00014, Finland
  - Department of Medical and Clinical Genetics, Medicum, University of Helsinki, Helsinki, 00014, Finland
  - iCAN Digital Precision Cancer Medicine Flagship, University of Helsinki, Helsinki, 00290, Finland
5
Ramirez P, Sun W, Dehkordi SK, Zare H, Pascarella G, Carninci P, Fongang B, Bieniek KF, Frost B. Nanopore Long-Read Sequencing Unveils Genomic Disruptions in Alzheimer's Disease. bioRxiv 2025:2024.02.01.578450. [PMID: 38370753 PMCID: PMC10871260 DOI: 10.1101/2024.02.01.578450] [Indexed: 02/20/2024]
Abstract
Studies in laboratory models and postmortem human brain tissue from patients with Alzheimer's disease have revealed disruption of basic cellular processes such as DNA repair and epigenetic control as drivers of neurodegeneration. While genomic alterations in regions of the genome that are rich in repetitive sequences, often termed "dark regions," are difficult to resolve using traditional sequencing approaches, long-read technologies offer promising new avenues to explore previously inaccessible regions of the genome. In the current study, we leverage nanopore-based long-read whole-genome sequencing of DNA extracted from postmortem human frontal cortex at early and late stages of Alzheimer's disease, as well as age-matched controls, to analyze retrotransposon insertion events, non-allelic homologous recombination (NAHR), structural variants and DNA methylation within retrotransposon loci and other repetitive/dark regions of the human genome. Interestingly, we find that retrotransposon insertion events and repetitive element-associated NAHR are particularly enriched within centromeric and pericentromeric regions of DNA in the aged human brain, and that ribosomal DNA (rDNA) is subject to a high degree of NAHR compared to other regions of the genome. We detect a trending increase in potential somatic retrotransposition events of the short interspersed nuclear element (SINE) AluY in late-stage Alzheimer's disease, and differential changes in methylation within repetitive elements and retrotransposons according to disease stage. Taken together, our analysis provides the first long-read DNA sequencing-based analysis of retrotransposon sequences, NAHR, structural variants, and DNA methylation in the aged brain, and points toward transposable elements, centromeric/pericentromeric regions and rDNA as hotspots for genomic variation.
Affiliation(s)
- Paulino Ramirez
  - Barshop Institute for Longevity and Aging Studies
  - Glenn Biggs Institute for Alzheimer’s and Neurodegenerative Diseases
  - Department of Cell Systems and Anatomy, University of Texas Health San Antonio, San Antonio, Texas
  - Brown University, Providence, Rhode Island
- Wenyan Sun
  - Barshop Institute for Longevity and Aging Studies
  - Glenn Biggs Institute for Alzheimer’s and Neurodegenerative Diseases
  - Department of Cell Systems and Anatomy, University of Texas Health San Antonio, San Antonio, Texas
  - Clinical Neuroscience Research Center, Department of Neurosurgery, School of Medicine, Tulane University, New Orleans, Louisiana
- Shiva Kazempour Dehkordi
  - Glenn Biggs Institute for Alzheimer’s and Neurodegenerative Diseases
  - Department of Cell Systems and Anatomy, University of Texas Health San Antonio, San Antonio, Texas
- Habil Zare
  - Glenn Biggs Institute for Alzheimer’s and Neurodegenerative Diseases
  - Department of Cell Systems and Anatomy, University of Texas Health San Antonio, San Antonio, Texas
- Piero Carninci
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Bernard Fongang
  - Glenn Biggs Institute for Alzheimer’s and Neurodegenerative Diseases
  - Department of Biochemistry & Structural Biology, University of Texas Health San Antonio, San Antonio, Texas
- Kevin F. Bieniek
  - Glenn Biggs Institute for Alzheimer’s and Neurodegenerative Diseases
  - Department of Pathology, University of Texas Health San Antonio, San Antonio, Texas
- Bess Frost
  - Barshop Institute for Longevity and Aging Studies
  - Glenn Biggs Institute for Alzheimer’s and Neurodegenerative Diseases
  - Department of Cell Systems and Anatomy, University of Texas Health San Antonio, San Antonio, Texas
  - Brown University, Providence, Rhode Island
6
An Z, Jiang A, Chen J. Toward understanding the role of genomic repeat elements in neurodegenerative diseases. Neural Regen Res 2025; 20:646-659. [PMID: 38886931 PMCID: PMC11433896 DOI: 10.4103/nrr.nrr-d-23-01568] [Received: 09/18/2023] [Revised: 12/21/2023] [Accepted: 03/02/2024] [Indexed: 06/20/2024]
Abstract
Neurodegenerative diseases cause great medical and economic burdens for both patients and society; however, the complex molecular mechanisms thereof are not yet well understood. With the development of high-coverage sequencing technology, researchers have started to notice that genomic repeat regions, previously neglected in search of disease culprits, are active contributors to multiple neurodegenerative diseases. In this review, we describe the association between repeat element variants and multiple degenerative diseases through genome-wide association studies and targeted sequencing. We discuss the identification of disease-relevant repeat element variants, further powered by the advancement of long-read sequencing technologies and their related tools, and summarize recent findings in the molecular mechanisms of repeat element variants in brain degeneration, such as those causing transcriptional silencing or RNA-mediated gain of toxic function. Furthermore, we describe how in silico predictions using innovative computational models, such as deep learning language models, could enhance and accelerate our understanding of the functional impact of repeat element variants. Finally, we discuss future directions to advance current findings for a better understanding of neurodegenerative diseases and the clinical applications of genomic repeat elements.
Affiliation(s)
- Zhengyu An
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
- Aidi Jiang
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
- Jingqi Chen
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
  - MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
  - MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Fudan University, Shanghai, China
  - Zhangjiang Fudan International Innovation Center, Shanghai, China
7
Solovyov A, Behr JM, Hoyos D, Banks E, Drong AW, Thornlow B, Zhong JZ, Garcia-Rivera E, McKerrow W, Chu C, Arisdakessian C, Zaller DM, Kamihara J, Diao L, Fromer M, Greenbaum BD. Pan-cancer multi-omic model of LINE-1 activity reveals locus heterogeneity of retrotransposition efficiency. Nat Commun 2025; 16:2049. [PMID: 40021663 PMCID: PMC11871128 DOI: 10.1038/s41467-025-57271-1] [Received: 05/02/2024] [Accepted: 02/12/2025] [Indexed: 03/03/2025]
Abstract
Somatic mobilization of LINE-1 (L1) has been implicated in cancer etiology. We analyzed a recent TCGA data release comprising nearly 5000 pan-cancer paired tumor-normal whole-genome sequencing (WGS) samples and ~9000 tumor RNA samples. We developed TotalReCall, an improved algorithm and pipeline for detection of L1 retrotransposition (RT), finding high correlation between L1 expression and "RT burden" per sample. Furthermore, we mathematically model the dual regulatory roles of p53, where mutations in TP53 disrupt regulation of both L1 expression and retrotransposition. We found that cancers from patients with Li-Fraumeni Syndrome (LFS), who carry heritable TP53 pathogenic and likely pathogenic variants, bear similarly high L1 activity compared to matched cancers from patients without LFS, suggesting this population be considered in attempts to target L1 therapeutically. Due to improved sensitivity, we detect over 10 genes beyond TP53 whose mutations correlate with L1, including ATRX, suggesting that other, potentially targetable, mechanisms underlying L1 regulation in cancer remain to be discovered.
Collapse
Affiliation(s)
- Alexander Solovyov
- Halvorsen Center for Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | | | - David Hoyos
- Halvorsen Center for Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Eric Banks
- ROME Therapeutics, Inc., Boston, MA, USA
- Acorn Biosciences, Cambridge, MA, USA
| | | | | | | | | | | | - Chong Chu
- ROME Therapeutics, Inc., Boston, MA, USA
| | | | | | - Junne Kamihara
- Division of Hematology/Oncology, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Division of Population Sciences, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
| | | | | | - Benjamin D Greenbaum
- Halvorsen Center for Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Physiology, Biophysics & Systems Biology, Weill Cornell Medical College, New York, NY, USA
| |
|
8
|
Daigle A, Whitehouse LS, Zhao R, Emerson JJ, Schrider DR. Leveraging long-read assemblies and machine learning to enhance short-read transposable element detection and genotyping. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2025:2025.02.11.637720. [PMID: 39990489 PMCID: PMC11844559 DOI: 10.1101/2025.02.11.637720] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2025]
Abstract
Transposable elements (TEs) are parasitic genomic elements that are ubiquitous across the tree of life and play a crucial role in genome evolution. Advances in long-read sequencing have allowed highly accurate TE detection, though at a higher cost than short-read sequencing. Recent studies using long reads have shown that existing short-read TE detection methods perform inadequately when applied to real data. In this study, we use a machine learning approach (called TEforest) to discover and genotype TE insertions and deletions with short-read data by using TEs detected from long-read genome assemblies as training data. Our method first uses a highly sensitive algorithm to discover potential TE insertion or deletion sites in the genome, extracting relevant features from short-read alignments. To discriminate between true and false TE insertions, we train a random forest model with a labeled ground-truth dataset for which we have calculated the same set of short-read features. We conduct a comprehensive benchmark of TEforest and traditional TE detection methods using real data, finding that TEforest identifies more true positives and fewer false positives across datasets with different read lengths and coverages, while also accurately inferring genotypes and the precise breakpoints of insertions. By learning short-read signatures of TEs previously only discoverable using long reads, our approach bridges the gap between large-scale population genetic studies and the accuracy of long-read assemblies. This work provides a user-friendly tool to study the prevalence and phenotypic effects of TE insertions across the genome.
Affiliation(s)
- Austin Daigle
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599
- Curriculum in Bioinformatics and Computational Biology, University of North Carolina, Chapel Hill, NC 27599
| | - Logan S. Whitehouse
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599
- Curriculum in Bioinformatics and Computational Biology, University of North Carolina, Chapel Hill, NC 27599
| | - Roy Zhao
- Department of Ecology and Evolutionary Biology, University of California, Irvine, CA 92697
| | - JJ Emerson
- Department of Ecology and Evolutionary Biology, University of California, Irvine, CA 92697
| | - Daniel R. Schrider
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599
| |
|
9
|
Zhou W, Mumm C, Gan Y, Switzenberg JA, Wang J, De Oliveira P, Kathuria K, Losh SJ, McDonald TL, Bessell B, Van Deynze K, McConnell MJ, Boyle AP, Mills RE. A personalized multi-platform assessment of somatic mosaicism in the human frontal cortex. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.12.18.629274. [PMID: 39763954 PMCID: PMC11702624 DOI: 10.1101/2024.12.18.629274] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2025]
Abstract
Somatic mutations in individual cells lead to genomic mosaicism, contributing to the intricate regulatory landscape of genetic disorders and cancers. To evaluate and refine the detection of somatic mosaicism across different technologies with personalized donor-specific assembly (DSA), we obtained tissue from the dorsolateral prefrontal cortex (DLPFC) of a post-mortem neurotypical 31-year-old individual. We sequenced bulk DLPFC tissue using Oxford Nanopore Technologies (~60X), NovaSeq (~30X), and linked-read sequencing (~28X). Additionally, we applied Cas9 capture methodology coupled with long-read sequencing (TEnCATS), targeting active transposable elements. We also isolated and amplified DNA from flow-sorted single DLPFC neurons using MALBAC, sequencing 115 of these MALBAC libraries on Nanopore and 94 on NovaSeq. We constructed a haplotype-resolved assembly with a total length of 5.77 Gb and a phase block length of 2.67 Mb (N50) to facilitate cross-platform analysis of somatic genetic variations. We observed an increase in the phasing rate from 11.6% to 38.0% between short-read and long-read technologies. By generating a catalog of phased germline SNVs, CNVs, and TEs from the assembled genome, we applied standard approaches to recall these variants across sequencing technologies. We achieved aggregated recall rates from 97.3% to 99.4% based on long-read bulk tissue data, setting an upper bound for detection limits. Moreover, utilizing haplotype-based analysis from DSA, we achieved a remarkable reduction in false positive somatic calls in bulk tissue, ranging from 14.9% to 72.4%. We developed pipelines leveraging DSA information to enhance somatic large genetic variant calling in long-read single cells. By examining somatic variation using long-reads in 115 individual neurons, we identified 468 candidate somatic heterozygous large deletions (1.5Mb - 20Mb), 137 of which intersected with short-read single-cell data. 
Additionally, we identified 61 putative somatic TEs (60 Alus, one LINE-1) in the single-cell data. Collectively, our analysis spans personalized assembly to single-cell somatic variant calling, providing a comprehensive ab initio ad finem approach and resource in real human tissue.
Affiliation(s)
- Weichen Zhou
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Camille Mumm
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
| | - Yanming Gan
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Jessica A. Switzenberg
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Jinhao Wang
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | | | - Kunal Kathuria
- Lieber Institute for Brain Development, Baltimore, MD, USA
| | - Steven J. Losh
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Torrin L. McDonald
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Brandt Bessell
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Kinsey Van Deynze
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | | | - Alan P. Boyle
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
| | - Ryan E. Mills
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
| |
|
10
|
Mendez-Dorantes C, Zeng X, Karlow JA, Schofield P, Turner S, Kalinowski J, Denisko D, Lee EA, Burns KH, Zhang CZ. Chromosomal rearrangements and instability caused by the LINE-1 retrotransposon. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.12.14.628481. [PMID: 39764018 PMCID: PMC11702581 DOI: 10.1101/2024.12.14.628481] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2025]
Abstract
LINE-1 (L1) retrotransposition is widespread in many cancers, especially those with a high burden of chromosomal rearrangements. However, whether and to what degree L1 activity directly impacts genome integrity is unclear. Here, we apply whole-genome sequencing to experimental models of L1 expression to comprehensively define the spectrum of genomic changes caused by L1. We provide definitive evidence that L1 expression frequently and directly causes both local and long-range chromosomal rearrangements, small and large segmental copy-number alterations, and subclonal copy-number heterogeneity due to ongoing chromosomal instability. Mechanistically, all these alterations arise from DNA double-strand breaks (DSBs) generated by L1-encoded ORF2p. The processing of ORF2p-generated DSB ends prior to their ligation can produce diverse rearrangements of the target sequences. Ligation between DSB ends generated at distal loci can generate either stable chromosomes or unstable dicentric, acentric, or ring chromosomes that undergo subsequent evolution through breakage-fusion bridge cycles or DNA fragmentation. Together, these findings suggest L1 is a potent mutagenic force capable of driving genome evolution beyond simple insertions.
Affiliation(s)
- Carlos Mendez-Dorantes
- Department of Pathology, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA
- Department of Pathology, Harvard Medical School, Boston, Massachusetts 02115, USA
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, Massachusetts 02142, USA
| | - Xi Zeng
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, Massachusetts, 02115, USA
- Department of Pediatrics, Harvard Medical School, Boston, Massachusetts 02115, USA
- Department of Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, Hubei 430070, PRC
| | - Jennifer A Karlow
- Department of Pathology, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA
- Department of Data Science, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA
- Department of Pathology, Harvard Medical School, Boston, Massachusetts 02115, USA
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, Massachusetts 02142, USA
| | - Phillip Schofield
- Department of Pathology, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA
| | - Serafina Turner
- Department of Data Science, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA
| | - Jupiter Kalinowski
- Department of Pathology, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA
| | - Danielle Denisko
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, Massachusetts, 02115, USA
- Department of Pediatrics, Harvard Medical School, Boston, Massachusetts 02115, USA
| | - Eunjung Alice Lee
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, Massachusetts, 02115, USA
- Department of Pediatrics, Harvard Medical School, Boston, Massachusetts 02115, USA
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, Massachusetts 02142, USA
| | - Kathleen H Burns
- Department of Pathology, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA
- Department of Pathology, Harvard Medical School, Boston, Massachusetts 02115, USA
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, Massachusetts 02142, USA
| | - Cheng-Zhong Zhang
- Department of Data Science, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA
- Department of Pathology, Harvard Medical School, Boston, Massachusetts 02115, USA
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, Massachusetts 02142, USA
| |
|
11
|
French CE, Andrews NC, Beggs AH, Boone PM, Brownstein CA, Chopra M, Chou J, Chung WK, D'Gama AM, Doan RN, Ebrahimi-Fakhari D, Goldstein RD, Irons M, Jacobsen C, Kenna M, Lee T, Madden JA, Majmundar AJ, Mann N, Morton SU, Poduri A, Randolph AG, Roberts AE, Roberts S, Sampson MG, Shao DD, Shao W, Sharma A, Shearer E, Shimamura A, Snapper SB, Srivastava S, Thiagarajah JR, Whitman MC, Wojcik MH, Rockowitz S, Sliz P. Hospital-wide access to genomic data advanced pediatric rare disease research and clinical outcomes. NPJ Genom Med 2024; 9:60. [PMID: 39622807 PMCID: PMC11612168 DOI: 10.1038/s41525-024-00441-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Accepted: 10/14/2024] [Indexed: 12/06/2024]
Abstract
Boston Children's Hospital has established a genomic sequencing and analysis research initiative to improve clinical care for pediatric rare disease patients. Through the Children's Rare Disease Collaborative (CRDC), the hospital offers CLIA-grade exome and genome sequencing, along with other sequencing types, to patients enrolled in specialized rare disease research studies. The data, consented for broad research use, are harmonized and analyzed with CRDC-supported variant interpretation tools. Since its launch, 66 investigators representing 26 divisions and 45 phenotype-based cohorts have joined the CRDC. These studies enrolled 4653 families, with 35% of analyzed cases having a finding either confirmed or under further investigation. This accessible and harmonized genomics platform also supports additional institutional data collections, research and clinical, and now encompasses 13,800+ patients and their families. This has fostered new research projects and collaborations, increased genetic diagnoses and accelerated innovative research via integration of genomics research with clinical care.
Affiliation(s)
- Courtney E French
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
| | - Nancy C Andrews
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
| | - Alan H Beggs
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Philip M Boone
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
| | - Catherine A Brownstein
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
| | - Maya Chopra
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Rosamund Stone Zander Translational Neuroscience Center, Boston Children's Hospital, Boston, MA, USA
| | - Janet Chou
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Division of Immunology, Boston Children's Hospital, Boston, MA, USA
| | - Wendy K Chung
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Alissa M D'Gama
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Division of Newborn Medicine, Boston Children's Hospital, Boston, MA, USA
- Department of Neurology, Boston Children's Hospital, Boston, MA, USA
| | - Ryan N Doan
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
| | - Darius Ebrahimi-Fakhari
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Department of Neurology, Boston Children's Hospital, Boston, MA, USA
- F.M. Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA, USA
| | - Richard D Goldstein
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Division of General Pediatrics, Boston Children's Hospital, Boston, MA, USA
| | - Mira Irons
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
| | - Christina Jacobsen
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Division of Endocrinology, Boston Children's Hospital, Boston, MA, USA
| | - Margaret Kenna
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Department of Otolaryngology and Communication Enhancement, Boston Children's Hospital, Boston, MA, USA
- Department of Otolaryngology Head and Neck Surgery, Harvard Medical School, Boston, MA, USA
| | - Ted Lee
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Department of Urology, Boston Children's Hospital, Boston, MA, USA
- Department of Surgery, Harvard Medical School, Boston, MA, USA
| | - Jill A Madden
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
| | - Amar J Majmundar
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Division of Nephrology, Boston Children's Hospital, Boston, MA, USA
| | - Nina Mann
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Division of Nephrology, Boston Children's Hospital, Boston, MA, USA
| | - Sarah U Morton
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Newborn Medicine, Boston Children's Hospital, Boston, MA, USA
| | - Annapurna Poduri
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Department of Neurology, Boston Children's Hospital, Boston, MA, USA
- Department of Neurology, Harvard Medical School, Boston, MA, USA
| | - Adrienne G Randolph
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Department of Anesthesiology, Critical Care and Pain Medicine, Boston Children's Hospital, Boston, MA, USA
- Department of Anaesthesia, Harvard Medical School, Boston, MA, USA
| | - Amy E Roberts
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Department of Cardiology, Boston Children's Hospital, Boston, MA, USA
| | - Stephanie Roberts
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Division of Endocrinology, Boston Children's Hospital, Boston, MA, USA
| | - Matthew G Sampson
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Nephrology, Boston Children's Hospital, Boston, MA, USA
- Division of Nephrology, Brigham and Women's Hospital, Boston, MA, USA
| | - Diane D Shao
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Department of Neurology, Boston Children's Hospital, Boston, MA, USA
| | - Wanqing Shao
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
| | - Aditi Sharma
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
| | - Eliot Shearer
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Department of Otolaryngology and Communication Enhancement, Boston Children's Hospital, Boston, MA, USA
- Department of Otolaryngology Head and Neck Surgery, Harvard Medical School, Boston, MA, USA
| | - Akiko Shimamura
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Department of Hematology and Oncology, Boston Children's Hospital, Boston, MA, USA
- Dana Farber Cancer Institute, Boston, MA, USA
| | - Scott B Snapper
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Division of Gastroenterology, Hepatology and Nutrition, Boston Children's Hospital, Boston, MA, USA
| | - Siddharth Srivastava
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Rosamund Stone Zander Translational Neuroscience Center, Boston Children's Hospital, Boston, MA, USA
- Department of Neurology, Boston Children's Hospital, Boston, MA, USA
| | - Jay R Thiagarajah
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Division of Gastroenterology, Hepatology and Nutrition, Boston Children's Hospital, Boston, MA, USA
| | - Mary C Whitman
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- F.M. Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA, USA
- Department of Ophthalmology, Boston Children's Hospital, Boston, MA, USA
- Department of Ophthalmology, Harvard Medical School, Boston, MA, USA
| | - Monica H Wojcik
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Newborn Medicine, Boston Children's Hospital, Boston, MA, USA
| | - Shira Rockowitz
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
| | - Piotr Sliz
- Children's Rare Disease Collaborative, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
- Division of Molecular Medicine, Boston Children's Hospital, Boston, MA, USA
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA, USA
| |
|
12
|
Bortoluzzi C, Mapel XM, Neuenschwander S, Janett F, Pausch H, Leonard AS. Genome assembly of wisent (Bison bonasus) uncovers a deletion that likely inactivates the THRSP gene. Commun Biol 2024; 7:1580. [PMID: 39604663 PMCID: PMC11603333 DOI: 10.1038/s42003-024-07295-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2024] [Accepted: 11/19/2024] [Indexed: 11/29/2024]
Abstract
The wisent (Bison bonasus) is Europe's largest land mammal. We produced a HiFi read-based wisent assembly with a contig N50 value of 91 Mb containing 99.7% of the highly conserved single copy mammalian genes which improves contiguity a thousand-fold over an existing assembly. Extended runs of homozygosity in the wisent genome compromised the separation of the HiFi reads into parental-specific read sets, which resulted in inferior haplotype assemblies. A bovine super-pangenome built with assemblies from wisent, bison, gaur, yak, taurine and indicine cattle identified a 1580 bp deletion removing the protein-coding sequence of THRSP encoding thyroid hormone-responsive protein from the wisent and bison genomes. Analysis of 725 sequenced samples across the Bovinae subfamily showed that the deletion is fixed in both Bison species but absent in Bos and Bubalus. The THRSP transcript is abundant in adipose, fat, liver, muscle, and mammary gland tissue of Bos and Bubalus, but absent in bison. This indicates that the deletion likely inactivates THRSP in bison. We show that super-pangenomes can reveal potentially trait-associated variation across phylogenies, but also demonstrate that haplotype assemblies from species that went through population bottlenecks warrant scrutiny, as they may have accumulated long runs of homozygosity that complicate phasing.
Affiliation(s)
| | | | | | - Fredi Janett
- Clinic of Reproductive Medicine, University of Zurich, Zurich, Switzerland
| | | | | |
|
13
|
Lee JO, Lee S, Lee D, Hwang T, Joe S, Yang JO, Jeong J, Ohn JH, Kim JH. KTED: a comprehensive web-based database for transposable elements in the Korean genome. BIOINFORMATICS ADVANCES 2024; 4:vbae179. [PMID: 39697868 PMCID: PMC11652267 DOI: 10.1093/bioadv/vbae179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2024] [Revised: 10/08/2024] [Accepted: 11/15/2024] [Indexed: 12/20/2024]
Abstract
Summary: Transposable elements (TEs), commonly referred to as "mobile elements," constitute DNA segments capable of relocating within a genome. Initially disregarded as "junk DNA" devoid of specific functionality, it has become evident that TEs have diverse influences on an organism's biology and health. The impact of these elements varies according to their location, classification, and their effects on specific genes or regulatory components. Despite their significant roles, a paucity of resources concerning TEs in population-scale genome sequencing remains. Herein, we analyze whole-genome sequencing data sourced from the Korean Genome and Epidemiology Study, encompassing 2500 Korean individuals. To facilitate convenient data access and observation, we developed a web-based database, KTED. Additionally, we scrutinized the differential distributions of TEs across five distinct common disease groups: dyslipidemia, hypertension, diabetes, thyroid disease, and cancer. Availability and implementation: https://snubh.shinyapps.io/KTED.
Affiliation(s)
- Jin-Ok Lee
- Department of Health Science and Technology, Graduate School of Convergence Science and Technology, Seoul National University, Seoul 13605, Republic of Korea
| | - Sejoon Lee
- Precision Medicine Center, Seoul National University Bundang Hospital, Seongnam 13620, Republic of Korea
- Department of Genomic Medicine, Seoul National University Bundang Hospital, Seongnam 13620, Republic of Korea
| | - Dongyoon Lee
- Korea Bioinformation Center, KRIBB, Daejeon 34141, Republic of Korea
| | - Taeyeon Hwang
- Korea Bioinformation Center, KRIBB, Daejeon 34141, Republic of Korea
| | - Soobok Joe
- Korea Bioinformation Center, KRIBB, Daejeon 34141, Republic of Korea
| | - Jin Ok Yang
- Korea Bioinformation Center, KRIBB, Daejeon 34141, Republic of Korea
| | - Jibin Jeong
- Department of Genomic Medicine, Seoul National University Bundang Hospital, Seongnam 13620, Republic of Korea
| | - Jung Hun Ohn
- Precision Medicine Center, Seoul National University Bundang Hospital, Seongnam 13620, Republic of Korea
- Department of Internal Medicine, Seoul National University Bundang Hospital, Seongnam 13620, Republic of Korea
- Department of Internal Medicine, Seoul National University College of Medicine, Seoul 03080, Republic of Korea
| | - Jee Hyun Kim
- Department of Genomic Medicine, Seoul National University Bundang Hospital, Seongnam 13620, Republic of Korea
- Department of Internal Medicine, Seoul National University Bundang Hospital, Seongnam 13620, Republic of Korea
- Department of Internal Medicine, Seoul National University College of Medicine, Seoul 03080, Republic of Korea
| |
|
14
|
Bilgrav Saether K, Eisfeldt J. Detecting transposable elements in long-read genomes using sTELLeR. Bioinformatics 2024; 40:btae686. [PMID: 39558574 PMCID: PMC11601167 DOI: 10.1093/bioinformatics/btae686] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2024] [Revised: 11/05/2024] [Accepted: 11/14/2024] [Indexed: 11/20/2024]
Abstract
MOTIVATION Repeat elements, such as transposable elements (TE), are highly repetitive DNA sequences that compose around 50% of the genome. TEs such as Alu, SVA, HERV, and L1 elements can cause disease through disrupting genes, causing frameshift mutations or altering splicing patters. These are elements challenging to characterize using short-read genome sequencing, due to its read length and TEs repetitive nature. Long-read genome sequencing (lrGS) enables bridging of TEs, allowing increased resolution across repetitive DNA sequences. lrGS therefore present an opportunity for improved TE detection and analysis not only from a research perspective but also for future clinical detection. When choosing an lrGS TE caller, parameters such as runtime, CPU hours, sensitivity, precision, and compatibility with inclusion into pipelines are crucial for efficient detection. RESULTS We therefore developed sTELLeR, (s) Transposable ELement in Long (e) Read, for accurate, fast, and effective TE detection. Particularly, sTELLeR exhibit higher precision and sensitivity for calling of Alu elements than similar tools. The caller is 5-48× as fast and uses <2% of the CPU hours compared to competitive callers. The caller is haplotype aware and output results in a variant call format (VCF) file, enabling compatibility with other variant callers and downstream analysis. AVAILABILITY AND IMPLEMENTATION sTELLeR is a python-based tool and is available at https://github.com/kristinebilgrav/sTELLeR. Altogether, we show that sTELLeR is a fast, sensitive, and precise caller for detection of TE elements, and can easily be implemented into variant calling workflows.
Affiliation(s)
- Kristine Bilgrav Saether
- Department of Molecular Medicine and Surgery, Karolinska Institute, Stockholm 171 76, Sweden
- Clinical Genomics Facility, Science for Life Laboratory, Stockholm 171 76, Sweden
| | - Jesper Eisfeldt
- Department of Molecular Medicine and Surgery, Karolinska Institute, Stockholm 171 76, Sweden
- Clinical Genomics Facility, Science for Life Laboratory, Stockholm 171 76, Sweden
- Department of Clinical Genetics and Genomics, Karolinska University Hospital, Stockholm 171 77, Sweden
| |
|
15
|
Groza C, Chen X, Wheeler TJ, Bourque G, Goubert C. A unified framework to analyze transposable element insertion polymorphisms using graph genomes. Nat Commun 2024; 15:8915. [PMID: 39414821 PMCID: PMC11484939 DOI: 10.1038/s41467-024-53294-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 10/02/2024] [Indexed: 10/18/2024]
Abstract
Transposable elements are ubiquitous mobile DNA sequences generating insertion polymorphisms, contributing to genomic diversity. We present GraffiTE, a flexible pipeline to analyze polymorphic mobile element insertions. By integrating state-of-the-art structural variant detection algorithms and graph genomes, GraffiTE identifies polymorphic mobile elements from genomic assemblies or long-read sequencing data, and genotypes these variants using short or long read sets. Benchmarking on simulated and real datasets reports high precision and recall rates. GraffiTE is designed to allow non-expert users to perform comprehensive analyses, including in models with limited transposable element knowledge, and is compatible with various sequencing technologies. Here, we demonstrate the versatility of GraffiTE by analyzing human, Drosophila melanogaster, maize, and Cannabis sativa pangenome data. These analyses reveal the landscapes of polymorphic mobile elements and their frequency variations across individuals, strains, and cultivars.
Affiliation(s)
- Cristian Groza
- Quantitative Life Sciences, McGill University, Montréal, QC, Canada
| | - Xun Chen
- Institute for the Advanced Study of Human Biology (ASHBi), Kyoto University, Kyoto, Japan
| | - Travis J Wheeler
- R. Ken Coit College of Pharmacy, University of Arizona, Tucson, AZ, USA
| | - Guillaume Bourque
- Institute for the Advanced Study of Human Biology (ASHBi), Kyoto University, Kyoto, Japan
- Canadian Centre for Computational Genomics, McGill University, Montréal, QC, Canada
- Victor Phillip Dahdaleh Institute of Genomic Medicine at McGill University, Montréal, QC, Canada
- Human Genetics, McGill University, Montréal, QC, Canada
| | - Clément Goubert
- Human Genetics, McGill University, Montréal, QC, Canada.
- R. Ken Coit College of Pharmacy, University of Arizona, Tucson, AZ, USA.
| |
|
16
|
Song Y, Wen H, Zhai X, Jia L, Li L. Functional Bidirectionality of ERV-Derived Long Non-Coding RNAs in Humans. Int J Mol Sci 2024; 25:10481. [PMID: 39408810 PMCID: PMC11476766 DOI: 10.3390/ijms251910481] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2024] [Revised: 09/25/2024] [Accepted: 09/26/2024] [Indexed: 10/20/2024]
Abstract
Human endogenous retroviruses (HERVs) are widely recognized as the result of exogenous retroviruses infecting the ancestral germline, followed by stable integration and vertical transmission during human genetic evolution. To date, endogenous retroviruses (ERVs) appear to have been selected for human physiological functions with the loss of retrotransposable capabilities. ERV elements were long regarded as junk DNA. Since then, the aberrant activation and expression of ERVs have been observed in the development of many kinds of human diseases, and their role has been explored in a variety of human disorders such as cancer. The results show that specific ERV elements each play crucial roles. Among them, long non-coding RNAs (lncRNAs) transcribed from specific long-terminal repeat regions of ERVs are often key factors. lncRNAs are over 200 nucleotides in size and typically bind to DNA, RNA, and proteins to perform biological functions. Dysregulated lncRNAs have been implicated in a variety of diseases. In particular, studies have shown that the aberrant expression of some ERV-derived lncRNAs has a tumor-suppressive or oncogenic effect, displaying significant functional bidirectionality. Therefore, these lncRNAs have a promising future as novel biomarkers and therapeutic targets to explore the concise relationship between ERVs and cancers. In this review, we first summarize the role of ERV-derived lncRNAs in physiological regulation, mainly including immunomodulation, the maintenance of pluripotency, and erythropoiesis. In addition, pathological regulation examples of their aberrant activation and expression leading to carcinogenesis are highlighted, and specific mechanisms of occurrence are discussed.
Affiliation(s)
- Yanmei Song
- Department of Microbiological Laboratory Technology, School of Public Health, Cheeloo College of Medicine, Shandong University, Key Laboratory for the Prevention and Control of Emerging Infectious Diseases and Biosafety, Jinan 250012, China
- State Key Laboratory of Pathogen and Biosecurity, Academy of Military Medical Sciences, Beijing 100850, China
| | - Hongling Wen
- Department of Microbiological Laboratory Technology, School of Public Health, Cheeloo College of Medicine, Shandong University, Key Laboratory for the Prevention and Control of Emerging Infectious Diseases and Biosafety, Jinan 250012, China
| | - Xiuli Zhai
- State Key Laboratory of Pathogen and Biosecurity, Academy of Military Medical Sciences, Beijing 100850, China
- Department of Microbiology, School of Basic Medicine, Anhui Medical University, Hefei 230000, China
| | - Lei Jia
- State Key Laboratory of Pathogen and Biosecurity, Academy of Military Medical Sciences, Beijing 100850, China
| | - Lin Li
- State Key Laboratory of Pathogen and Biosecurity, Academy of Military Medical Sciences, Beijing 100850, China
| |
|
17
|
Lee AS, Ayers LJ, Kosicki M, Chan WM, Fozo LN, Pratt BM, Collins TE, Zhao B, Rose MF, Sanchis-Juan A, Fu JM, Wong I, Zhao X, Tenney AP, Lee C, Laricchia KM, Barry BJ, Bradford VR, Jurgens JA, England EM, Lek M, MacArthur DG, Lee EA, Talkowski ME, Brand H, Pennacchio LA, Engle EC. A cell type-aware framework for nominating non-coding variants in Mendelian regulatory disorders. Nat Commun 2024; 15:8268. [PMID: 39333082 PMCID: PMC11436875 DOI: 10.1038/s41467-024-52463-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Accepted: 09/04/2024] [Indexed: 09/29/2024]
Abstract
Unsolved Mendelian cases often lack obvious pathogenic coding variants, suggesting potential non-coding etiologies. Here, we present a single cell multi-omic framework integrating embryonic mouse chromatin accessibility, histone modification, and gene expression assays to discover cranial motor neuron (cMN) cis-regulatory elements and subsequently nominate candidate non-coding variants in the congenital cranial dysinnervation disorders (CCDDs), a set of Mendelian disorders altering cMN development. We generate single cell epigenomic profiles for ~86,000 cMNs and related cell types, identifying ~250,000 accessible regulatory elements with cognate gene predictions for ~145,000 putative enhancers. We evaluate enhancer activity for 59 elements using an in vivo transgenic assay and validate 44 (75%), demonstrating that single cell accessibility can be a strong predictor of enhancer activity. Applying our cMN atlas to 899 whole genome sequences from 270 genetically unsolved CCDD pedigrees, we achieve significant reduction in our variant search space and nominate candidate variants predicted to regulate known CCDD disease genes MAFB, PHOX2A, CHN1, and EBF3 - as well as candidates in recurrently mutated enhancers through peak- and gene-centric allelic aggregation. This work delivers non-coding variant discoveries of relevance to CCDDs and a generalizable framework for nominating non-coding variants of potentially high functional impact in other Mendelian disorders.
Affiliation(s)
- Arthur S Lee
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA.
- Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA, USA.
- Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
| | - Lauren J Ayers
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
| | - Michael Kosicki
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Wai-Man Chan
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | - Lydia N Fozo
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
| | - Brandon M Pratt
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
| | - Thomas E Collins
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
| | - Boxun Zhao
- Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
| | - Matthew F Rose
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Pathology, Boston Children's Hospital, Boston, MA, USA
- Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Medical Genetics Training Program, Harvard Medical School, Boston, MA, USA
| | - Alba Sanchis-Juan
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Jack M Fu
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Isaac Wong
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Xuefang Zhao
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Alan P Tenney
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Cassia Lee
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Harvard College, Cambridge, MA, USA
| | - Kristen M Laricchia
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Brenda J Barry
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | - Victoria R Bradford
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
| | - Julie A Jurgens
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Eleina M England
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Monkol Lek
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Daniel G MacArthur
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Centre for Population Genomics, Garvan Institute of Medical Research and UNSW Sydney, Sydney, NSW, Australia
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, VIC, Australia
| | - Eunjung Alice Lee
- Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
| | - Michael E Talkowski
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Harrison Brand
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Pediatric Surgical Research Laboratories, Massachusetts General Hospital, Boston, MA, USA
| | - Len A Pennacchio
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Elizabeth C Engle
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA.
- Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA, USA.
- Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Howard Hughes Medical Institute, Chevy Chase, MD, USA.
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA.
- Medical Genetics Training Program, Harvard Medical School, Boston, MA, USA.
- Department of Ophthalmology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA.
| |
|
18
|
Panhwar SA, Wang D, Lin F, Wang Y, Liu M, Chen R, Huang Y, Wu W, Huang D, Xiao Y, Xia W. Characterization of active transposable elements and their new insertions in tuber propagated greater yam (Dioscorea alata). BMC Genomics 2024; 25:864. [PMID: 39285286 PMCID: PMC11403837 DOI: 10.1186/s12864-024-10779-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2024] [Accepted: 09/04/2024] [Indexed: 09/20/2024]
Abstract
BACKGROUND Greater yam is a key staple crop grown in tropical and subtropical regions, while its asexual propagation mode has led to non-flowering mutations. How transposable elements contribute to its genetic variations is rarely analyzed. We used transcriptome and whole genome sequencing data to identify active transposable elements (TEs) and genetic variation caused by these active TEs. Our aim was to shed light on which TEs would lead to its intraspecies variation. RESULTS Annotation of de novo assembled transcripts indicated that 0.8-0.9% of transcripts were TE related, with LTR retrotransposons (LTR-RTs) accounting for 65% of TE transcripts. A large portion of these transcripts were non-autonomous TEs, which had incomplete functional domains. The majority of mapped transcripts were distributed in gene-deficient regions, with 9% of TEs overlapping with genic regions. Moreover, over 90% of TE transcripts exhibited low expression levels and insufficient read coverage to support full-length structure assembly. Subfamily analysis of Copia and Gypsy, the two LTR-RT superfamilies, revealed that a small number of subfamilies contained a significantly larger number of members, which play a key role in generating TE transcripts. Based on resequencing data, 15,002 LTR-RT insertion loci were detected for active LTR-RT members. The insertion loci of LTR-RTs were highly divergent among greater yam accessions. CONCLUSIONS This study showed the ongoing transcription and transposition of TEs in greater yam, despite low transcription levels and incomplete proteins insufficient for autonomous transposition. While our research did not directly link these TEs to specific yam traits such as tuber yield and propagation mode, it lays a crucial foundation for further research on how these TE insertion polymorphisms (TIPs) might be related to variations in greater yam traits and its tuber propagation mode. Future research may explore the potential roles of TEs in trait variations, such as tuber yield and stress resistance, in greater yam.
Affiliation(s)
- Sajjad Ali Panhwar
- School of Breeding and Multiplication (Sanya Institute of Breeding and Multiplication), School of Tropical Agriculture and Forestry, Hainan University, 570228, Haikou, P.R. China
| | - Dandan Wang
- School of Breeding and Multiplication (Sanya Institute of Breeding and Multiplication), School of Tropical Agriculture and Forestry, Hainan University, 570228, Haikou, P.R. China
| | - Fanhui Lin
- School of Breeding and Multiplication (Sanya Institute of Breeding and Multiplication), School of Tropical Agriculture and Forestry, Hainan University, 570228, Haikou, P.R. China
| | - Ying Wang
- School of Breeding and Multiplication (Sanya Institute of Breeding and Multiplication), School of Tropical Agriculture and Forestry, Hainan University, 570228, Haikou, P.R. China
| | - Mengli Liu
- School of Breeding and Multiplication (Sanya Institute of Breeding and Multiplication), School of Tropical Agriculture and Forestry, Hainan University, 570228, Haikou, P.R. China
| | - Runan Chen
- School of Breeding and Multiplication (Sanya Institute of Breeding and Multiplication), School of Tropical Agriculture and Forestry, Hainan University, 570228, Haikou, P.R. China
| | - Yonglan Huang
- School of Breeding and Multiplication (Sanya Institute of Breeding and Multiplication), School of Tropical Agriculture and Forestry, Hainan University, 570228, Haikou, P.R. China
| | - Wenqiang Wu
- School of Life Sciences, Hainan University, 570228, Haikou, P.R. China
| | - Dongyi Huang
- School of Breeding and Multiplication (Sanya Institute of Breeding and Multiplication), School of Tropical Agriculture and Forestry, Hainan University, 570228, Haikou, P.R. China.
| | - Yong Xiao
- School of Breeding and Multiplication (Sanya Institute of Breeding and Multiplication), School of Tropical Agriculture and Forestry, Hainan University, 570228, Haikou, P.R. China.
| | - Wei Xia
- School of Breeding and Multiplication (Sanya Institute of Breeding and Multiplication), School of Tropical Agriculture and Forestry, Hainan University, 570228, Haikou, P.R. China.
| |
|
19
|
Wu Y, Wang F, Lyu K, Liu R. Comparative Analysis of Transposable Elements in the Genomes of Citrus and Citrus-Related Genera. Plants (Basel) 2024; 13:2462. [PMID: 39273946 PMCID: PMC11397423 DOI: 10.3390/plants13172462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2024] [Revised: 08/28/2024] [Accepted: 09/01/2024] [Indexed: 09/15/2024]
Abstract
Transposable elements (TEs) significantly contribute to the evolution and diversity of plant genomes. In this study, we explored the roles of TEs in the genomes of Citrus and Citrus-related genera by constructing a pan-genome TE library from 20 published genomes of Citrus and Citrus-related accessions. Our results revealed an increase in TE content and the number of TE types compared to the original annotations, as well as a decrease in the content of unclassified TEs. The average length of TEs per assembly was approximately 194.23 Mb, representing 41.76% (Murraya paniculata) to 64.76% (Citrus gilletiana) of the genomes, with a mean value of 56.95%. A significant positive correlation was found between genome size and both the number of TE types and TE content. Consistent with the difference in mean whole-genome size (39.83 Mb) between Citrus and Citrus-related genera, Citrus genomes contained an average of 34.36 Mb more TE sequences than Citrus-related genomes. Analysis of the estimated insertion time and half-life of long terminal repeat retrotransposons (LTR-RTs) suggested that TE removal was not the primary factor contributing to the differences among genomes. These findings collectively indicate that TEs are the primary determinants of genome size and play a major role in shaping genome structures. Principal coordinate analysis (PCoA) of Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) identifiers revealed that the fragmented TEs were predominantly derived from ancestral genomes, while intact TEs were crucial in the recent evolutionary diversification of Citrus. Moreover, the presence or absence of intact TEs near the AdhE superfamily was closely associated with the bitterness trait in the Citrus species. Overall, this study enhances TE annotation in Citrus and Citrus-related genomes and provides valuable data for future genetic breeding and agronomic trait research in Citrus.
Affiliation(s)
- Yilei Wu
- College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Center for Agroforestry Mega Data Science, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, Fuzhou 350002, China
| | - Fusheng Wang
- National Citrus Engineering Research Center, Citrus Research Institute, Southwest University, Chongqing 400712, China
| | - Keliang Lyu
- Center for Agroforestry Mega Data Science, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, Fuzhou 350002, China
| | - Renyi Liu
- Center for Agroforestry Mega Data Science, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, Fuzhou 350002, China
| |
|
20
|
Jiang T, Zhou Z, Zhang Z, Cao S, Wang Y, Liu Y. MEHunter: transformer-based mobile element variant detection from long reads. Bioinformatics 2024; 40:btae557. [PMID: 39287014 DOI: 10.1093/bioinformatics/btae557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2024] [Revised: 09/03/2024] [Accepted: 09/13/2024] [Indexed: 09/19/2024]
Abstract
SUMMARY Mobile genetic elements (MEs) are heritable mutagens that significantly contribute to genetic diseases. The advent of long-read sequencing technologies, capable of resolving large DNA fragments, offers promising prospects for the comprehensive detection of ME variants (MEVs). However, achieving high precision while maintaining recall remains challenging, mainly because of the variable length and similar content of MEV signatures, which are often obscured by the noise in long reads. Here, we propose MEHunter, a high-performance MEV detection approach utilizing a fine-tuned transformer model adept at identifying potential MEVs with fragmented features. Benchmark experiments on both simulated and real datasets demonstrate that MEHunter consistently achieves higher accuracy and sensitivity than state-of-the-art tools. Furthermore, it is capable of detecting novel, potentially individual-specific MEVs that have been overlooked in published population projects. AVAILABILITY AND IMPLEMENTATION MEHunter is available from https://github.com/120L021101/MEHunter.
Affiliation(s)
- Tao Jiang
- Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, Henan 450000, China
| | - Zuji Zhou
- Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| | - Zhendong Zhang
- Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| | - Shuqi Cao
- Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| | - Yadong Wang
- Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, Henan 450000, China
| | - Yadong Liu
- Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, Henan 450000, China
| |
|
21
|
Cornish AJ, Gruber AJ, Kinnersley B, Chubb D, Frangou A, Caravagna G, Noyvert B, Lakatos E, Wood HM, Thorn S, Culliford R, Arnedo-Pac C, Househam J, Cross W, Sud A, Law P, Leathlobhair MN, Hawari A, Woolley C, Sherwood K, Feeley N, Gül G, Fernandez-Tajes J, Zapata L, Alexandrov LB, Murugaesu N, Sosinsky A, Mitchell J, Lopez-Bigas N, Quirke P, Church DN, Tomlinson IPM, Sottoriva A, Graham TA, Wedge DC, Houlston RS. The genomic landscape of 2,023 colorectal cancers. Nature 2024; 633:127-136. [PMID: 39112709 PMCID: PMC11374690 DOI: 10.1038/s41586-024-07747-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Accepted: 06/24/2024] [Indexed: 08/17/2024]
Abstract
Colorectal carcinoma (CRC) is a common cause of mortality¹, but a comprehensive description of its genomic landscape is lacking²⁻⁹. Here we perform whole-genome sequencing of 2,023 CRC samples from participants in the UK 100,000 Genomes Project, thereby providing a highly detailed somatic mutational landscape of this cancer. Integrated analyses identify more than 250 putative CRC driver genes, many not previously implicated in CRC or other cancers, including several recurrent changes outside the coding genome. We extend the molecular pathways involved in CRC development, define four new common subgroups of microsatellite-stable CRC based on genomic features and show that these groups have independent prognostic associations. We also characterize several rare molecular CRC subgroups, some with potential clinical relevance, including cancers with both microsatellite and chromosomal instability. We demonstrate a spectrum of mutational profiles across the colorectum, which reflect aetiological differences. These include the role of Escherichia coli pks+ colibactin in rectal cancers¹⁰ and the importance of the SBS93 signature¹¹⁻¹³, which suggests that diet or smoking is a risk factor. Immune-escape driver mutations¹⁴ are near-ubiquitous in hypermutant tumours and occur in about half of microsatellite-stable CRCs, often in the form of HLA copy number changes. Many driver mutations are actionable, including those associated with rare subgroups (for example, BRCA1 and IDH1), highlighting the role of whole-genome sequencing in optimizing patient care.
Affiliation(s)
- Alex J Cornish
- Division of Genetics and Epidemiology, Institute of Cancer Research, London, UK
| | - Andreas J Gruber
- Department of Biology, University of Konstanz, Konstanz, Germany
- Manchester Cancer Research Centre, Division of Cancer Sciences, University of Manchester, Manchester, UK
| | - Ben Kinnersley
- Division of Genetics and Epidemiology, Institute of Cancer Research, London, UK
- University College London Cancer Institute, London, UK
| | - Daniel Chubb
- Division of Genetics and Epidemiology, Institute of Cancer Research, London, UK
| | - Anna Frangou
- Big Data Institute, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Max Planck Institute for Molecular Cell Biology and Genetics, Dresden, Germany
| | - Giulio Caravagna
- Department of Mathematics and Geosciences, University of Trieste, Trieste, Italy
- Centre for Evolution and Cancer, Institute of Cancer Research, London, UK
| | - Boris Noyvert
- Cancer Research UK Centre and Centre for Computational Biology, Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, UK
| | - Eszter Lakatos
- Centre for Evolution and Cancer, Institute of Cancer Research, London, UK
- Department of Mathematical Sciences, Chalmers University of Technology, Gothenburg, Sweden
| | - Henry M Wood
- Pathology and Data Analytics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, UK
| | - Steve Thorn
- Department of Oncology, University of Oxford, Oxford, UK
| | - Richard Culliford
- Division of Genetics and Epidemiology, Institute of Cancer Research, London, UK
| | - Claudia Arnedo-Pac
- Institute for Research in Biomedicine Barcelona, The Barcelona Institute of Science and Technology, Barcelona, Spain
- Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
| | - Jacob Househam
- Centre for Evolution and Cancer, Institute of Cancer Research, London, UK
| | - William Cross
- Centre for Evolution and Cancer, Institute of Cancer Research, London, UK
- Research Department of Pathology, University College London, UCL Cancer Institute, London, UK
| | - Amit Sud
- Division of Genetics and Epidemiology, Institute of Cancer Research, London, UK
| | - Philip Law
- Division of Genetics and Epidemiology, Institute of Cancer Research, London, UK
| | | | - Aliah Hawari
- Manchester Cancer Research Centre, Division of Cancer Sciences, University of Manchester, Manchester, UK
| | - Connor Woolley
- Department of Oncology, University of Oxford, Oxford, UK
| | - Kitty Sherwood
- Department of Oncology, University of Oxford, Oxford, UK
- Edinburgh Cancer Research, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
| | - Nathalie Feeley
- Department of Oncology, University of Oxford, Oxford, UK
- Edinburgh Cancer Research, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
| | - Güler Gül
- Edinburgh Cancer Research, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
| | | | - Luis Zapata
- Centre for Evolution and Cancer, Institute of Cancer Research, London, UK
| | - Ludmil B Alexandrov
- Department of Cellular and Molecular Medicine, UC San Diego, La Jolla, CA, USA
- Department of Bioengineering, UC San Diego, La Jolla, CA, USA
- Moores Cancer Center, UC San Diego, La Jolla, CA, USA
| | - Nirupa Murugaesu
- Genomics England, William Harvey Research Institute, Queen Mary University of London, London, UK
| | - Alona Sosinsky
- Genomics England, William Harvey Research Institute, Queen Mary University of London, London, UK
| | - Jonathan Mitchell
- Genomics England, William Harvey Research Institute, Queen Mary University of London, London, UK
| | - Nuria Lopez-Bigas
- Institute for Research in Biomedicine Barcelona, The Barcelona Institute of Science and Technology, Barcelona, Spain
- Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
| | - Philip Quirke
- Pathology and Data Analytics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, UK
| | - David N Church
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Oxford NIHR Comprehensive Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
| | | | - Andrea Sottoriva
- Centre for Evolution and Cancer, Institute of Cancer Research, London, UK
- Computational Biology Research Centre, Human Technopole, Milan, Italy
| | - Trevor A Graham
- Centre for Evolution and Cancer, Institute of Cancer Research, London, UK
| | - David C Wedge
- Manchester Cancer Research Centre, Division of Cancer Sciences, University of Manchester, Manchester, UK
| | - Richard S Houlston
- Division of Genetics and Epidemiology, Institute of Cancer Research, London, UK
|
22
|
Wang ZY, Ge LP, Ouyang Y, Jin X, Jiang YZ. Targeting transposable elements in cancer: developments and opportunities. Biochim Biophys Acta Rev Cancer 2024; 1879:189143. [PMID: 38936517 DOI: 10.1016/j.bbcan.2024.189143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 05/23/2024] [Accepted: 06/19/2024] [Indexed: 06/29/2024]
Abstract
Transposable elements (TEs), comprising nearly 50% of the human genome, have transitioned from being perceived as "genomic junk" to key players in cancer progression. Contemporary research links TE regulatory disruptions with cancer development, underscoring their therapeutic potential. Advances in long-read sequencing, computational analytics, single-cell sequencing, proteomics, and CRISPR-Cas9 technologies have enriched our understanding of TEs' clinical implications, notably their impact on genome architecture, gene regulation, and evolutionary processes. In cancer, TEs, including long interspersed element-1 (LINE-1), Alus, and long terminal repeat (LTR) elements, demonstrate altered patterns, influencing both tumorigenic and tumor-suppressive mechanisms. TE-derived nucleic acids and tumor antigens play critical roles in tumor immunity, bridging innate and adaptive responses. Given their central role in oncology, TE-targeted therapies, particularly through reverse transcriptase inhibitors and epigenetic modulators, represent a novel avenue in cancer treatment. Combining these TE-focused strategies with existing chemotherapy or immunotherapy regimens could enhance efficacy and offer a new dimension in cancer treatment. This review delves into recent TE detection advancements, explores their multifaceted roles in tumorigenesis and immune regulation, discusses emerging diagnostic and therapeutic approaches centered on TEs, and anticipates future directions in cancer research.
Affiliation(s)
- Zi-Yu Wang
- Department of Breast Surgery, Fudan University Shanghai Cancer Center; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
| | - Li-Ping Ge
- Department of Breast Surgery, Fudan University Shanghai Cancer Center; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
| | - Yang Ouyang
- Department of Breast Surgery, Fudan University Shanghai Cancer Center; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
| | - Xi Jin
- Department of Breast Surgery, Fudan University Shanghai Cancer Center; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
| | - Yi-Zhou Jiang
- Department of Breast Surgery, Fudan University Shanghai Cancer Center; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China.
|
23
|
Porubsky D, Dashnow H, Sasani TA, Logsdon GA, Hallast P, Noyes MD, Kronenberg ZN, Mokveld T, Koundinya N, Nolan C, Steely CJ, Guarracino A, Dolzhenko E, Harvey WT, Rowell WJ, Grigorev K, Nicholas TJ, Oshima KK, Lin J, Ebert P, Watkins WS, Leung TY, Hanlon VCT, McGee S, Pedersen BS, Goldberg ME, Happ HC, Jeong H, Munson KM, Hoekzema K, Chan DD, Wang Y, Knuth J, Garcia GH, Fanslow C, Lambert C, Lee C, Smith JD, Levy S, Mason CE, Garrison E, Lansdorp PM, Neklason DW, Jorde LB, Quinlan AR, Eberle MA, Eichler EE. A familial, telomere-to-telomere reference for human de novo mutation and recombination from a four-generation pedigree. bioRxiv 2024:2024.08.05.606142. [PMID: 39149261 PMCID: PMC11326147 DOI: 10.1101/2024.08.05.606142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Abstract
Using five complementary short- and long-read sequencing technologies, we phased and assembled >95% of each diploid human genome in a four-generation, 28-member family (CEPH 1463) allowing us to systematically assess de novo mutations (DNMs) and recombination. From this family, we estimate an average of 192 DNMs per generation, including 75.5 de novo single-nucleotide variants (SNVs), 7.4 non-tandem repeat indels, 79.6 de novo indels or structural variants (SVs) originating from tandem repeats, 7.7 centromeric de novo SVs and SNVs, and 12.4 de novo Y chromosome events per generation. STRs and VNTRs are the most mutable with 32 loci exhibiting recurrent mutation through the generations. We accurately assemble 288 centromeres and six Y chromosomes across the generations, documenting de novo SVs, and demonstrate that the DNM rate varies by an order of magnitude depending on repeat content, length, and sequence identity. We show a strong paternal bias (75-81%) for all forms of germline DNM, yet we estimate that 17% of de novo SNVs are postzygotic in origin with no paternal bias. We place all this variation in the context of a high-resolution recombination map (~3.5 kbp breakpoint resolution). We observe a strong maternal recombination bias (1.36 maternal:paternal ratio) with a consistent reduction in the number of crossovers with increasing paternal (r=0.85) and maternal (r=0.65) age. However, we observe no correlation between meiotic crossover locations and de novo SVs, arguing against non-allelic homologous recombination as a predominant mechanism. The use of multiple orthogonal technologies, near-telomere-to-telomere phased genome assemblies, and a multi-generation family to assess transmission has created the most comprehensive, publicly available "truth set" of all classes of genomic variants. The resource can be used to test and benchmark new algorithms and technologies to understand the most fundamental processes underlying human genetic variation.
Affiliation(s)
- David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Harriet Dashnow
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Thomas A Sasani
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Glennis A Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Present address: Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Pille Hallast
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Michelle D Noyes
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | | | | | - Nidhi Koundinya
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | | | - Cody J Steely
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Department of Internal Medicine, University of Kentucky College of Medicine, Lexington, KY, USA
| | - Andrea Guarracino
- Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
| | | | - William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - William J Rowell
- Department of Internal Medicine, University of Kentucky College of Medicine, Lexington, KY, USA
| | - Kirill Grigorev
- Blue Marble Space Institute of Science, Seattle, WA, USA
- Core Unit Bioinformatics, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University, Düsseldorf, Germany
| | - Thomas J Nicholas
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Keisuke K Oshima
- Present address: Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Jiadong Lin
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Peter Ebert
- Core Unit Bioinformatics, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University, Düsseldorf, Germany
- Center for Digital Medicine, Heinrich Heine University, Düsseldorf, Germany
| | - W Scott Watkins
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Tiffany Y Leung
- Terry Fox Laboratory, BC Cancer Agency, Vancouver, BC, Canada
| | | | - Sean McGee
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Brent S Pedersen
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Michael E Goldberg
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Hannah C Happ
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Hyeonsoo Jeong
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Present address: Altos Labs, San Diego, CA, USA
| | - Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Daniel D Chan
- Terry Fox Laboratory, BC Cancer Agency, Vancouver, BC, Canada
| | - Yanni Wang
- Terry Fox Laboratory, BC Cancer Agency, Vancouver, BC, Canada
| | - Jordan Knuth
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Gage H Garcia
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | | | | | - Charles Lee
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Joshua D Smith
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Shawn Levy
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
| | - Christopher E Mason
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- The WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY, USA
| | - Erik Garrison
- Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
| | | | - Deborah W Neklason
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Lynn B Jorde
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Aaron R Quinlan
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | | | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
|
24
|
Kojima S. Investigating mobile element variations by statistical genetics. Hum Genome Var 2024; 11:23. [PMID: 38816353 PMCID: PMC11140006 DOI: 10.1038/s41439-024-00280-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 04/17/2024] [Accepted: 04/24/2024] [Indexed: 06/01/2024] Open
Abstract
The integration of structural variations (SVs) in statistical genetics provides an opportunity to understand the genetic factors influencing complex human traits and disease. Recent advances in long-read technology and variant calling methods for short reads have improved the accurate discovery and genotyping of SVs, enabling their use in expression quantitative trait loci (eQTL) analysis and genome-wide association studies (GWAS). Mobile elements are DNA sequences that insert themselves into various genome locations. Insertional polymorphisms of mobile elements between humans, called mobile element variations (MEVs), contribute to approximately 25% of human SVs. We recently developed a variant caller that can accurately identify and genotype MEVs from biobank-scale short-read whole-genome sequencing (WGS) datasets and integrate them into statistical genetics. The use of MEVs in eQTL analysis and GWAS has a minimal impact on the discovery of genome loci associated with gene expression and disease; most disease-associated haplotypes can be identified by single nucleotide variations (SNVs). On the other hand, it helps make hypotheses about causal variants or effector variants. Focusing on MEVs, we identified multiple MEVs that contribute to differential gene expression and one of them is a potential cause of skin disease, emphasizing the importance of the integration of MEVs in medical genetics. Here, I will provide an overview of MEVs, MEV calling from WGS, and the integration of MEVs in statistical genetics. Finally, I will discuss the unanswered questions about MEVs, such as rare variants.
Affiliation(s)
- Shohei Kojima
- Genome Immunobiology RIKEN Hakubi Research Team, RIKEN Center for Integrative Medical Sciences, Yokohama, 230-0045, Japan.
|
25
|
Chu C, Ljungström V, Tran A, Jin H, Park PJ. Contribution of de novo retroelements to birth defects and childhood cancers. medRxiv 2024:2024.04.15.24305733. [PMID: 38699361 PMCID: PMC11065029 DOI: 10.1101/2024.04.15.24305733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]
Abstract
Insertion of active retroelements (L1s, Alus, and SVAs) can disrupt proper genome function and lead to various disorders including cancer. However, the role of de novo retroelements (DNRTs) in birth defects and childhood cancers has not been well characterized due to the lack of adequate data and efficient computational tools. Here, we examine whole-genome sequencing data of 3,244 trios from 12 birth defect and childhood cancer cohorts in the Gabriella Miller Kids First Pediatric Research Program. Using an improved version of our tool xTea (x-Transposable element analyzer) that incorporates a deep-learning module, we identified 162 DNRTs, as well as 2 pseudogene insertions. Several variants are likely to be causal, such as a de novo Alu insertion that led to the ablation of a whole exon in the NF1 gene in a proband with brain tumor. We observe a high de novo SVA insertion burden in both high-intolerance loss-of-function genes and exons as well as more frequent de novo Alu insertions of paternal origin. We also identify potential mosaic DNRTs from embryonic stages. Our study reveals the important roles of DNRTs in causing birth defects and predisposition to childhood cancers.
Affiliation(s)
- Chong Chu
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Viktor Ljungström
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Antuan Tran
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Hu Jin
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Peter J. Park
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
|
26
|
Lee M, Ahmad SF, Xu J. Regulation and function of transposable elements in cancer genomes. Cell Mol Life Sci 2024; 81:157. [PMID: 38556602 PMCID: PMC10982106 DOI: 10.1007/s00018-024-05195-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2023] [Revised: 02/28/2024] [Accepted: 03/01/2024] [Indexed: 04/02/2024]
Abstract
Over half of human genomic DNA is composed of repetitive sequences generated throughout evolution by prolific mobile genetic parasites called transposable elements (TEs). Long disregarded as "junk" or "selfish" DNA, TEs are increasingly recognized as formative elements in genome evolution, wired intimately into the structure and function of the human genome. Advances in sequencing technologies and computational methods have ushered in an era of unprecedented insight into how TE activity impacts human biology in health and disease. Here we discuss the current views on how TEs have shaped the regulatory landscape of the human genome, how TE activity is implicated in human cancers, and how recent findings motivate novel strategies to leverage TE activity for improved cancer therapy. Given the crucial role of methodological advances in TE biology, we pair our conceptual discussions with an in-depth review of the inherent technical challenges in studying repeats, specifically related to structural variation, expression analyses, and chromatin regulation. Lastly, we provide a catalog of existing and emerging assays and bioinformatic software that altogether are enabling the most sophisticated and comprehensive investigations yet into the regulation and function of interspersed repeats in cancer genomes.
Affiliation(s)
- Michael Lee
- Department of Pediatrics, Children's Medical Center Research Institute, University of Texas Southwestern Medical Center, 6000 Harry Hines Blvd., Dallas, TX, 75390, USA.
| | - Syed Farhan Ahmad
- Department of Pathology, Center of Excellence for Leukemia Studies, St. Jude Children's Research Hospital, 262 Danny Thomas Place - MS 345, Memphis, TN, 38105, USA
| | - Jian Xu
- Department of Pathology, Center of Excellence for Leukemia Studies, St. Jude Children's Research Hospital, 262 Danny Thomas Place - MS 345, Memphis, TN, 38105, USA.
|
27
|
Lebreton J, Colin L, Chatre E, Bernard P. RNAP II antagonizes mitotic chromatin folding and chromosome segregation by condensin. Cell Rep 2024; 43:113901. [PMID: 38446663 DOI: 10.1016/j.celrep.2024.113901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2023] [Revised: 12/07/2023] [Accepted: 02/16/2024] [Indexed: 03/08/2024] Open
Abstract
Condensin shapes mitotic chromosomes by folding chromatin into loops, but whether it does so by DNA-loop extrusion remains speculative. Although loop-extruding cohesin is stalled by transcription, the impact of transcription on condensin, which is enriched at highly expressed genes in many species, remains unclear. Using degrons of Rpb1 or the torpedo nuclease Dhp1 (XRN2) to either deplete or displace RNAPII on chromatin in fission yeast metaphase cells, we show that RNAPII does not load condensin on DNA. Instead, RNAPII retains condensin in cis and hinders its ability to fold mitotic chromatin and to support chromosome segregation, consistent with the stalling of a loop extruder. Transcription termination by Dhp1 limits such a hindrance. Our results shed light on the integrated functioning of condensin, and we argue that a tight control of transcription underlies mitotic chromosome assembly by loop-extruding condensin.
Affiliation(s)
- Jérémy Lebreton
- ENS de Lyon, University Lyon, 46 allée d'Italie, 69007 Lyon, France
| | - Léonard Colin
- CNRS Laboratory of Biology and Modelling of the Cell, UMR 5239, ENS de Lyon, 46 allée d'Italie, 69007 Lyon, France
| | - Elodie Chatre
- Lymic-Platim, University Lyon, Université Claude Bernard Lyon 1, ENS de Lyon, CNRS UAR3444, Inserm US8, SFR Biosciences, 50 Avenue Tony Garnier, 69007 Lyon, France
| | - Pascal Bernard
- ENS de Lyon, University Lyon, 46 allée d'Italie, 69007 Lyon, France; CNRS Laboratory of Biology and Modelling of the Cell, UMR 5239, ENS de Lyon, 46 allée d'Italie, 69007 Lyon, France.
|
28
|
Wu Z, Li T, Jiang Z, Zheng J, Gu Y, Liu Y, Liu Y, Xie Z. Human pangenome analysis of sequences missing from the reference genome reveals their widespread evolutionary, phenotypic, and functional roles. Nucleic Acids Res 2024; 52:2212-2230. [PMID: 38364871 PMCID: PMC10954445 DOI: 10.1093/nar/gkae086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 01/18/2024] [Accepted: 01/27/2024] [Indexed: 02/18/2024] Open
Abstract
Nonreference sequences (NRSs) are DNA sequences present in global populations but absent in the current human reference genome. However, the extent and functional significance of NRSs in human genomes and populations remain unclear. Here, we de novo assembled 539 genomes from five genetically divergent human populations using long-read sequencing technology, resulting in the identification of 5.1 million NRSs. These were merged into 45,284 unique NRSs, with 29.7% being novel discoveries. Among these NRSs, 38.7% were common across the five populations, and 35.6% were population specific. The use of a graph-based pangenome approach allowed for the detection of 565 transcript expression quantitative trait loci on NRSs, with 426 of these being novel findings. Moreover, 26 NRS candidates displayed evidence of adaptive selection within human populations. Genes situated in close proximity to or intersecting with these candidates may be associated with metabolism and type 2 diabetes. Genome-wide association studies revealed 14 NRSs to be significantly associated with eight phenotypes. Additionally, 154 NRSs were found to be in strong linkage disequilibrium with 258 phenotype-associated SNPs in the GWAS catalogue. Our work expands the understanding of human NRSs and provides novel insights into their functions, facilitating evolutionary and biomedical research.
Affiliation(s)
- Zhikun Wu
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Tong Li
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Zehang Jiang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Jingjing Zheng
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Yizhou Gu
- Center for Precision Medicine, Sun Yat-sen University, Guangzhou, China
- University of Wisconsin-Madison, WI, USA
| | - Yizhi Liu
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Yun Liu
- MOE Key Laboratory of Metabolism and Molecular Medicine, Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences and Shanghai Xuhui Central Hospital, Fudan University, Shanghai, China
| | - Zhi Xie
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Center for Precision Medicine, Sun Yat-sen University, Guangzhou, China
|
29
|
Fukuda K. The role of transposable elements in human evolution and methods for their functional analysis: current status and future perspectives. Genes Genet Syst 2024; 98:289-304. [PMID: 37866889 DOI: 10.1266/ggs.23-00140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2023] Open
Abstract
Transposable elements (TEs) are mobile DNA sequences that can insert themselves into various locations within the genome, causing mutations that may provide advantages or disadvantages to individuals and species. The insertion of TEs can result in genetic variation that may affect a wide range of human traits including genetic disorders. Understanding the role of TEs in human biology is crucial for both evolutionary and medical research. This review discusses the involvement of TEs in human traits and disease susceptibility, as well as methods for functional analysis of TEs.
Affiliation(s)
- Kei Fukuda
- Integrative Genomics Unit, The University of Melbourne
|
30
|
Wijngaard R, Demidov G, O'Gorman L, Corominas-Galbany J, Yaldiz B, Steyaert W, de Boer E, Vissers LELM, Kamsteeg EJ, Pfundt R, Swinkels H, den Ouden A, Te Paske IBAW, de Voer RM, Faivre L, Denommé-Pichon AS, Duffourd Y, Vitobello A, Chevarin M, Straub V, Töpf A, van der Kooi AJ, Magrinelli F, Rocca C, Hanna MG, Vandrovcova J, Ossowski S, Laurie S, Gilissen C. Mobile element insertions in rare diseases: a comparative benchmark and reanalysis of 60,000 exome samples. Eur J Hum Genet 2024; 32:200-208. [PMID: 37853102 PMCID: PMC10853235 DOI: 10.1038/s41431-023-01478-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/01/2022] [Revised: 08/29/2023] [Accepted: 10/04/2023] [Indexed: 10/20/2023] Open
Abstract
Mobile element insertions (MEIs) are a known cause of genetic disease but have been underexplored due to technical limitations of genetic testing methods. Various bioinformatic tools have been developed to identify MEIs in next-generation sequencing data. However, most tools have been developed specifically for genome sequencing (GS) data rather than exome sequencing (ES) data, which remains more widely used for routine diagnostic testing. In this study, we benchmarked six MEI detection tools (ERVcaller, MELT, Mobster, SCRAMble, TEMP2 and xTea) on ES data and on GS data from publicly available genomic samples (HG002, NA12878). For all the tools we evaluated sensitivity and precision of different filtering strategies. Results show that there were substantial differences in tool performance between ES and GS data. MELT performed best with ES data, and its combination with SCRAMble substantially increased the detection rate of MEIs. By applying both tools to 10,890 ES samples from Solve-RD and 52,624 samples from Radboudumc we were able to diagnose 10 patients who had remained undiagnosed by conventional ES analysis until now. Our study shows that MELT and SCRAMble can be used reliably to identify clinically relevant MEIs in ES data. This may lead to an additional diagnosis for 1 in 3000 to 4000 patients in routine clinical ES.
Affiliation(s)
- Robin Wijngaard
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
| | - German Demidov
- Universitätsklinikum Tübingen - Institut für Medizinische Genetik und angewandte Genomik, Tübingen, Germany
| | - Luke O'Gorman
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
| | | | - Burcu Yaldiz
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Wouter Steyaert
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
- Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Elke de Boer
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
| | - Lisenka E L M Vissers
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
| | - Erik-Jan Kamsteeg
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
| | - Rolph Pfundt
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Hilde Swinkels
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Amber den Ouden
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Iris B A W Te Paske
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
- Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Richarda M de Voer
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
- Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Laurence Faivre
- Centre de Référence Maladies Rares "Anomalies du développement et syndromes malformatifs", Centre de Génétique, FHU-TRANSLAD et Institut GIMI, CHU Dijon Bourgogne, Dijon, France
| | - Anne-Sophie Denommé-Pichon
- UMR1231-Inserm, Génétique des Anomalies du développement, Université de Bourgogne Franche-Comté, Dijon, France
- Laboratoire de Génétique chromosomique et moléculaire, UF6254 Innovation en diagnostic génomique des maladies rares, Centre Hospitalier Universitaire de Dijon, Dijon, France
| | - Yannis Duffourd
- UMR1231-Inserm, Génétique des Anomalies du développement, Université de Bourgogne Franche-Comté, Dijon, France
- Laboratoire de Génétique chromosomique et moléculaire, UF6254 Innovation en diagnostic génomique des maladies rares, Centre Hospitalier Universitaire de Dijon, Dijon, France
| | - Antonio Vitobello
- UMR1231-Inserm, Génétique des Anomalies du développement, Université de Bourgogne Franche-Comté, Dijon, France
- Laboratoire de Génétique chromosomique et moléculaire, UF6254 Innovation en diagnostic génomique des maladies rares, Centre Hospitalier Universitaire de Dijon, Dijon, France
| | - Martin Chevarin
- UMR1231-Inserm, Génétique des Anomalies du développement, Université de Bourgogne Franche-Comté, Dijon, France
- Laboratoire de Génétique chromosomique et moléculaire, UF6254 Innovation en diagnostic génomique des maladies rares, Centre Hospitalier Universitaire de Dijon, Dijon, France
| | - Volker Straub
- John Walton Muscular Dystrophy Research Centre, Translational and Clinical Research Institute, Newcastle University and Newcastle Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK
| | - Ana Töpf
- John Walton Muscular Dystrophy Research Centre, Translational and Clinical Research Institute, Newcastle University and Newcastle Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK
| | - Anneke J van der Kooi
- Department of Neurology, Amsterdam UMC, University of Amsterdam, Amsterdam Neuroscience, Amsterdam, The Netherlands
| | - Francesca Magrinelli
- Department of Clinical and Movement Neurosciences, UCL Queen Square Institute of Neurology, London, UK
| | - Clarissa Rocca
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, London, UK
- Clinical Pharmacology, William Harvey Research Institute, School of Medicine and Dentistry, Queen Mary University of London, London, UK
| | - Michael G Hanna
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, London, UK
| | - Jana Vandrovcova
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, London, UK
| | - Stephan Ossowski
- Universitätsklinikum Tübingen - Institut für Medizinische Genetik und angewandte Genomik, Tübingen, Germany
| | - Steven Laurie
- Centro Nacional de Análisis Genómico (CNAG), Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Christian Gilissen
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands.
- Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands.
|
31
|
Liang Y, Qu X, Shah NM, Wang T. Towards targeting transposable elements for cancer therapy. Nat Rev Cancer 2024; 24:123-140. [PMID: 38228901 DOI: 10.1038/s41568-023-00653-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/04/2023] [Indexed: 01/18/2024]
Abstract
Transposable elements (TEs) represent almost half of the human genome. Although TEs were historically deemed 'junk DNA', recent technological advancements have stimulated a wave of research into their functional impact on gene-regulatory networks in evolution and development, as well as in diseases including cancer. The genetic and epigenetic evolution of cancer involves the exploitation of TEs, whereby TEs contribute directly to cancer-specific gene activities. This Review provides a perspective on the role of TEs in cancer as being a 'double-edged sword', both promoting cancer evolution and representing a vulnerability that could be exploited in cancer therapy. We discuss how TEs affect transcriptome regulation and other cellular processes in cancer. We highlight the potential of TEs as therapeutic targets for cancer. We also summarize technical hurdles in the characterization of TEs with genomic assays. Last, we outline open questions and exciting future research avenues.
Affiliation(s)
- Yonghao Liang
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO, USA
- Center for Genome Sciences and Systems Biology, Washington University School of Medicine, Saint Louis, MO, USA
| | - Xuan Qu
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO, USA
- Center for Genome Sciences and Systems Biology, Washington University School of Medicine, Saint Louis, MO, USA
| | - Nakul M Shah
- Division of Cancer Medicine, University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Ting Wang
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO, USA.
- Center for Genome Sciences and Systems Biology, Washington University School of Medicine, Saint Louis, MO, USA.
- McDonnell Genome Institute, Washington University School of Medicine, Saint Louis, MO, USA.
|
32
|
Mandal AK. Recent insights into crosstalk between genetic parasites and their host genome. Brief Funct Genomics 2024; 23:15-23. [PMID: 36307128 PMCID: PMC10799329 DOI: 10.1093/bfgp/elac032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2022] [Revised: 09/14/2022] [Accepted: 09/21/2022] [Indexed: 01/21/2024] Open
Abstract
The bulk of higher-order organismal genomes is composed of transposable element (TE) copies, i.e. genetic parasites. The host-parasite relation is multi-faceted, varying across genomic region (genic versus intergenic), life-cycle stage, tissue type and, of course, health versus pathological state. The reach of functional genomics in investigating genotype-to-phenotype relations, though, has been limited when TEs are involved. The aim of this review is to highlight recent progress made in understanding how biochemical activity of TE origin interacts with the central dogma stages of the host genome. Such interaction can also modulate the immune context, which could have important repercussions in disease states where immunity has a role to play. Thus, the review aims to instigate ideas and action points around identifying the evolutionary adaptations that the host genome and the genetic parasite have evolved and why they could be relevant.
Affiliation(s)
- Amit K Mandal
- Corresponding author: A.K. Mandal, Nuffield Department of Surgical Sciences (NDS), University of Oxford, Old Road Campus Research building (ORCRB), Oxford OX3 7DQ, UK. Tel: +44 (0)1865 617123; Fax: +44 (0)1865 768876; E-mail:
|
33
|
Lee AS, Ayers LJ, Kosicki M, Chan WM, Fozo LN, Pratt BM, Collins TE, Zhao B, Rose MF, Sanchis-Juan A, Fu JM, Wong I, Zhao X, Tenney AP, Lee C, Laricchia KM, Barry BJ, Bradford VR, Lek M, MacArthur DG, Lee EA, Talkowski ME, Brand H, Pennacchio LA, Engle EC. A cell type-aware framework for nominating non-coding variants in Mendelian regulatory disorders. medRxiv 2023:2023.12.22.23300468. [PMID: 38234731 PMCID: PMC10793524 DOI: 10.1101/2023.12.22.23300468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2024]
Abstract
Unsolved Mendelian cases often lack obvious pathogenic coding variants, suggesting potential non-coding etiologies. Here, we present a single cell multi-omic framework integrating embryonic mouse chromatin accessibility, histone modification, and gene expression assays to discover cranial motor neuron (cMN) cis-regulatory elements and subsequently nominate candidate non-coding variants in the congenital cranial dysinnervation disorders (CCDDs), a set of Mendelian disorders altering cMN development. We generated single cell epigenomic profiles for ~86,000 cMNs and related cell types, identifying ~250,000 accessible regulatory elements with cognate gene predictions for ~145,000 putative enhancers. Seventy-five percent of elements (44 of 59) validated in an in vivo transgenic reporter assay, demonstrating that single cell accessibility is a strong predictor of enhancer activity. Applying our cMN atlas to 899 whole genome sequences from 270 genetically unsolved CCDD pedigrees, we achieved significant reduction in our variant search space and nominated candidate variants predicted to regulate known CCDD disease genes MAFB, PHOX2A, CHN1, and EBF3 - as well as new candidates in recurrently mutated enhancers through peak- and gene-centric allelic aggregation. This work provides novel non-coding variant discoveries of relevance to CCDDs and a generalizable framework for nominating non-coding variants of potentially high functional impact in other Mendelian disorders.
Affiliation(s)
- Arthur S. Lee
- Department of Neurology, Boston Children’s Hospital and Harvard Medical School, Boston, MA
- Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA
- Manton Center for Orphan Disease Research, Boston Children’s Hospital, Boston, MA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
| | - Lauren J. Ayers
- Department of Neurology, Boston Children’s Hospital and Harvard Medical School, Boston, MA
| | - Michael Kosicki
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA
| | - Wai-Man Chan
- Department of Neurology, Boston Children’s Hospital and Harvard Medical School, Boston, MA
- Howard Hughes Medical Institute, Chevy Chase, MD
| | - Lydia N. Fozo
- Department of Neurology, Boston Children’s Hospital and Harvard Medical School, Boston, MA
| | - Brandon M. Pratt
- Department of Neurology, Boston Children’s Hospital and Harvard Medical School, Boston, MA
| | - Thomas E. Collins
- Department of Neurology, Boston Children’s Hospital and Harvard Medical School, Boston, MA
| | - Boxun Zhao
- Manton Center for Orphan Disease Research, Boston Children’s Hospital, Boston, MA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA
| | - Matthew F. Rose
- Department of Neurology, Boston Children’s Hospital and Harvard Medical School, Boston, MA
- Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Department of Pathology, Boston Children's Hospital, Boston, MA
- Department of Pathology, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA
- Medical Genetics Training Program, Harvard Medical School, Boston, MA
| | - Alba Sanchis-Juan
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
| | - Jack M. Fu
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA
| | - Isaac Wong
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
| | - Xuefang Zhao
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA
| | - Alan P. Tenney
- Department of Neurology, Boston Children’s Hospital and Harvard Medical School, Boston, MA
- Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
| | - Cassia Lee
- Department of Neurology, Boston Children’s Hospital and Harvard Medical School, Boston, MA
- Harvard College, Cambridge, MA
| | - Kristen M. Laricchia
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
| | - Brenda J. Barry
- Department of Neurology, Boston Children’s Hospital and Harvard Medical School, Boston, MA
- Howard Hughes Medical Institute, Chevy Chase, MD
| | - Victoria R. Bradford
- Department of Neurology, Boston Children’s Hospital and Harvard Medical School, Boston, MA
| | - Monkol Lek
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
| | - Daniel G. MacArthur
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Centre for Population Genomics, Garvan Institute of Medical Research and UNSW Sydney, Sydney, NSW, Australia
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, VIC, Australia
| | - Eunjung Alice Lee
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA
- Department of Genetics, Harvard Medical School, Boston, MA
| | - Michael E. Talkowski
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA
| | - Harrison Brand
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA
- Pediatric Surgical Research Laboratories, Massachusetts General Hospital, Boston, MA
| | - Len A. Pennacchio
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA
| | - Elizabeth C. Engle
- Department of Neurology, Boston Children’s Hospital and Harvard Medical School, Boston, MA
- Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA
- Manton Center for Orphan Disease Research, Boston Children’s Hospital, Boston, MA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Howard Hughes Medical Institute, Chevy Chase, MD
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA
- Medical Genetics Training Program, Harvard Medical School, Boston, MA
- Department of Ophthalmology, Boston Children’s Hospital and Harvard Medical School, Boston, MA
|
34
|
Chu C, Lin EW, Tran A, Jin H, Ho NI, Veit A, Cortes-Ciriano I, Burns KH, Ting DT, Park PJ. The landscape of human SVA retrotransposons. Nucleic Acids Res 2023; 51:11453-11465. [PMID: 37823611 PMCID: PMC10681720 DOI: 10.1093/nar/gkad821] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 10/26/2022] [Revised: 09/12/2023] [Accepted: 09/20/2023] [Indexed: 10/13/2023] Open
Abstract
SINE-VNTR-Alu (SVA) retrotransposons are evolutionarily young and still-active transposable elements (TEs) in the human genome. Several pathogenic SVA insertions have been identified that directly mutate host genes to cause neurodegenerative and other types of diseases. However, due to their sequence heterogeneity and complex structures as well as limitations in sequencing techniques and analysis, SVA insertions have been less well studied compared to other mobile element insertions. Here, we identified polymorphic SVA insertions from 3646 whole-genome sequencing (WGS) samples of >150 diverse populations and constructed a polymorphic SVA insertion reference catalog. Using 20 long-read samples, we also assembled reference and polymorphic SVA sequences and characterized the internal hexamer/variable-number-tandem-repeat (VNTR) expansions as well as differing SVA activity for SVA subfamilies and human populations. In addition, we developed a module to annotate both reference and polymorphic SVA copies. By characterizing the landscape of both reference and polymorphic SVA retrotransposons, our study enables more accurate genotyping of these elements and facilitates the discovery of pathogenic SVA insertions.
Affiliation(s)
- Chong Chu
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
| | - Eric W Lin
- Massachusetts General Hospital Cancer Center, Harvard Medical School, Charlestown, MA 02129, USA
- Department of Medicine, Massachusetts General Hospital Harvard Medical School, Boston, MA 02114, USA
| | - Antuan Tran
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
| | - Hu Jin
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
| | - Natalie I Ho
- Massachusetts General Hospital Cancer Center, Harvard Medical School, Charlestown, MA 02129, USA
- Department of Medicine, Massachusetts General Hospital Harvard Medical School, Boston, MA 02114, USA
| | - Alexander Veit
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
| | - Isidro Cortes-Ciriano
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK
| | - Kathleen H Burns
- Department of Pathology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215, USA
| | - David T Ting
- Massachusetts General Hospital Cancer Center, Harvard Medical School, Charlestown, MA 02129, USA
- Department of Medicine, Massachusetts General Hospital Harvard Medical School, Boston, MA 02114, USA
| | - Peter J Park
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
|
35
|
Cuenca-Guardiola J, Morena-Barrio BDL, Navarro-Manzano E, Stevens J, Ouwehand WH, Gleadall NS, Corral J, Fernández-Breis JT. Detection and annotation of transposable element insertions and deletions on the human genome using nanopore sequencing. iScience 2023; 26:108214. [PMID: 37953943 PMCID: PMC10638045 DOI: 10.1016/j.isci.2023.108214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 07/28/2023] [Accepted: 10/11/2023] [Indexed: 11/14/2023] Open
Abstract
Repetitive sequences represent about 45% of the human genome. Some are transposable elements (TEs) with the ability to change their position in the genome, creating genetic variability through both insertions and deletions, with potential pathogenic consequences. We used long-read nanopore sequencing to identify TE variants in the genomes of 24 patients with antithrombin deficiency. We identified 7 344 TE insertions and 3 056 TE deletions, of which 2 926 were not previously described in publicly available databases. The insertions affected 3 955 genes, with 6 insertions located in exons, 3 929 in introns, and 147 in promoters. Potential functional impact was evaluated with gene annotation and enrichment analysis, which suggested a strong relationship with neuron-related functions and autism. We conclude that this study encourages the generation of a complete map of TEs in the human genome, which will be useful for identifying new TEs involved in genetic disorders.
Affiliation(s)
- Javier Cuenca-Guardiola
- Departamento de Informática y Sistemas, Universidad de Murcia, CEIR Campus Mare Nostrum, IMIB-Pascual Parrilla, Facultad de Informática, Campus de Espinardo, Murcia 30100, Spain
| | - Belén de la Morena-Barrio
- Servicio de Hematología, Hospital Universitario Morales Meseguer, Centro Regional de Hemodonación, Universidad de Murcia, IMIB-Pascual Parrilla, CIBERER-III, Ronda de Garay S/N, Murcia 30003, Spain
| | - Esther Navarro-Manzano
- Servicio de Hematología, Hospital Universitario Morales Meseguer, Centro Regional de Hemodonación, Universidad de Murcia, IMIB-Pascual Parrilla, CIBERER-III, Ronda de Garay S/N, Murcia 30003, Spain
| | - Jonathan Stevens
- Department of Haematology, University of Cambridge, CB2 0PT, Cambridge Biomedical Campus, Cambridge, Cambridge, England, UK
- Blood and Transplant, National Health Service (NHS), CB2 0QQ, Cambridge Biomedical Campus, Cambridge, England, UK
| | - Willem H. Ouwehand
- Department of Haematology, University of Cambridge, CB2 0PT, Cambridge Biomedical Campus, Cambridge, Cambridge, England, UK
- Blood and Transplant, National Health Service (NHS), CB2 0QQ, Cambridge Biomedical Campus, Cambridge, England, UK
- British Heart Foundation Cambridge Centre of Excellence, Division of Cardiovascular Medicine, Cambridge Heart and Lung Research Institute, Cambridge Biomedical Campus, Cambridge, England CB2 0AY, UK
- University College London Hospitals, NHS Foundation Trust, London, England, UK
| | - Nicholas S. Gleadall
- Department of Haematology, University of Cambridge, CB2 0PT, Cambridge Biomedical Campus, Cambridge, Cambridge, England, UK
- Blood and Transplant, National Health Service (NHS), CB2 0QQ, Cambridge Biomedical Campus, Cambridge, England, UK
| | - Javier Corral
- Servicio de Hematología, Hospital Universitario Morales Meseguer, Centro Regional de Hemodonación, Universidad de Murcia, IMIB-Pascual Parrilla, CIBERER-III, Ronda de Garay S/N, Murcia 30003, Spain
| | - Jesualdo Tomás Fernández-Breis
- Departamento de Informática y Sistemas, Universidad de Murcia, CEIR Campus Mare Nostrum, IMIB-Pascual Parrilla, Facultad de Informática, Campus de Espinardo, Murcia 30100, Spain
|
36
|
Lee B, Park J, Voshall A, Maury E, Kang Y, Kim YJ, Lee JY, Shim HR, Kim HJ, Lee JW, Jung MH, Kim SC, Chu HBK, Kim DW, Kim M, Choi EJ, Hwang OK, Lee HW, Ha K, Choi JK, Kim Y, Choi Y, Park WY, Lee EA. Pan-cancer analysis reveals multifaceted roles of retrotransposon-fusion RNAs. bioRxiv 2023:2023.10.16.562422. [PMID: 37905014 PMCID: PMC10614793 DOI: 10.1101/2023.10.16.562422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
Transposon-derived transcripts are abundant in RNA sequences, yet their landscape and function, especially for fusion transcripts derived from unannotated or somatically acquired transposons, remain underexplored. Here, we developed a new bioinformatic tool to detect transposon-fusion transcripts in RNA-sequencing data and performed a pan-cancer analysis of 10,257 cancer samples across 34 cancer types as well as 3,088 normal tissue samples. We identified 52,277 cancer-specific fusions with ~30 events per cancer and hotspot loci within transposons vulnerable to fusion formation. Exonization of intronic transposons was the most prevalent type of genic fusion, while somatic L1 insertions constituted a small fraction of cancer-specific fusions. Source L1s and HERVs, but not Alus, showed decreased DNA methylation in cancer upon fusion formation. Overall, cancer-specific L1 fusions were enriched in tumor suppressors while Alu fusions were enriched in oncogenes, including recurrent Alu fusions in EZH2 predictive of patient survival. We also demonstrated that transposon-derived peptides triggered CD8+ T-cell activation to an extent comparable to EBV. Our findings reveal distinct epigenetic and tumorigenic mechanisms underlying transposon fusions across different families and highlight transposons as novel therapeutic targets and the source of potent neoantigens.
Affiliation(s)
- Boram Lee
- Samsung Genome Institute, Samsung Medical Center, Seoul, Republic of Korea
- Department of Pathology and Translational Genomics, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
| | - Junseok Park
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
| | - Adam Voshall
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
| | - Eduardo Maury
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Bioinformatics and Integrative Genomics Program; Harvard/MIT MD-PhD Program, Harvard Medical School, Boston, MA, USA
| | - Yeeok Kang
- Department of Bio and Brain Engineering, KAIST, Daejeon, Republic of Korea
| | - Yoen Jeong Kim
- Samsung Genome Institute, Samsung Medical Center, Seoul, Republic of Korea
| | - Jin-Young Lee
- Cancer Genome Research Center (CGRC), Yonsei University, Seoul, Republic of Korea
| | - Hye-Ran Shim
- Cancer Genome Research Center (CGRC), Yonsei University, Seoul, Republic of Korea
| | - Hyo-Ju Kim
- Cancer Genome Research Center (CGRC), Yonsei University, Seoul, Republic of Korea
| | - Jung-Woo Lee
- Cancer Genome Research Center (CGRC), Yonsei University, Seoul, Republic of Korea
| | - Min-Hyeok Jung
- Cancer Genome Research Center (CGRC), Yonsei University, Seoul, Republic of Korea
| | - Si-Cho Kim
- Cancer Genome Research Center (CGRC), Yonsei University, Seoul, Republic of Korea
| | - Hoang Bao Khanh Chu
- Cancer Genome Research Center (CGRC), Yonsei University, Seoul, Republic of Korea
| | - Da-Won Kim
- Cancer Genome Research Center (CGRC), Yonsei University, Seoul, Republic of Korea
| | - Minjeong Kim
- Cancer Genome Research Center (CGRC), Yonsei University, Seoul, Republic of Korea
| | - Eun-Ji Choi
- Cancer Genome Research Center (CGRC), Yonsei University, Seoul, Republic of Korea
| | - Ok Kyung Hwang
- New Drug Development Center, KBiohealth, Cheongju-Si, Chungbuk, Republic of Korea
| | - Ho Won Lee
- New Drug Development Center, KBiohealth, Cheongju-Si, Chungbuk, Republic of Korea
| | - Kyungsoo Ha
- New Drug Development Center, KBiohealth, Cheongju-Si, Chungbuk, Republic of Korea
| | - Jung Kyoon Choi
- Department of Bio and Brain Engineering, KAIST, Daejeon, Republic of Korea
| | - Yongjoon Kim
- Cancer Genome Research Center (CGRC), Yonsei University, Seoul, Republic of Korea
| | - Yoonjoo Choi
- Combinatorial Tumor Immunotherapy MRC, Chonnam National University Medical School, Hwasun, Republic of Korea
| | - Woong-Yang Park
- Samsung Genome Institute, Samsung Medical Center, Seoul, Republic of Korea
| | - Eunjung Alice Lee
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
|
37
|
Zhao P, Peng C, Fang L, Wang Z, Liu GE. Taming transposable elements in livestock and poultry: a review of their roles and applications. Genet Sel Evol 2023; 55:50. [PMID: 37479995 PMCID: PMC10362595 DOI: 10.1186/s12711-023-00821-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/16/2023] [Accepted: 06/30/2023] [Indexed: 07/23/2023] Open
Abstract
Livestock and poultry play a significant role in human nutrition by converting agricultural by-products into high-quality proteins. To meet the growing demand for safe animal protein, genetic improvement of livestock must be done sustainably while minimizing negative environmental impacts. Transposable elements (TE) are important components of livestock and poultry genomes, contributing to their genetic diversity, chromatin states, gene regulatory networks, and complex traits of economic value. However, compared to other species, research on TE in livestock and poultry is still in its early stages. In this review, we analyze 72 studies published in the past 20 years, summarize the TE composition in livestock and poultry genomes, and focus on their potential roles in functional genomics. We also discuss bioinformatic tools and strategies for integrating multi-omics data with TE, and explore future directions, feasibility, and challenges of TE research in livestock and poultry. In addition, we suggest strategies to apply TE in basic biological research and animal breeding. Our goal is to provide a new perspective on the importance of TE in livestock and poultry genomes.
Affiliation(s)
- Pengju Zhao
- Hainan Institute of Zhejiang University, Hainan Sanya, 572000, China
- College of Animal Sciences, Zhejiang University, Zhejiang, Hangzhou, People's Republic of China
| | - Chen Peng
- Hainan Institute of Zhejiang University, Hainan Sanya, 572000, China
- College of Animal Sciences, Zhejiang University, Zhejiang, Hangzhou, People's Republic of China
| | - Lingzhao Fang
- Center for Quantitative Genetics and Genomics, Aarhus University, 8000, Aarhus, Denmark.
| | - Zhengguang Wang
- Hainan Institute of Zhejiang University, Hainan Sanya, 572000, China.
- College of Animal Sciences, Zhejiang University, Zhejiang, Hangzhou, People's Republic of China.
| | - George E Liu
- Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, MD, 20705, USA.
|
38
|
Wolf MM, Rathmell WK, de Cubas AA. Immunogenicity in renal cell carcinoma: shifting focus to alternative sources of tumour-specific antigens. Nat Rev Nephrol 2023; 19:440-450. [PMID: 36973495 PMCID: PMC10801831 DOI: 10.1038/s41581-023-00700-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/27/2023] [Indexed: 03/29/2023]
Abstract
Renal cell carcinoma (RCC) comprises a group of malignancies arising from the kidney with unique tumour-specific antigen (TSA) signatures that can trigger cytotoxic immunity. Two classes of TSAs are now considered potential drivers of immunogenicity in RCC: small-scale insertions and deletions (INDELs) that result in coding frameshift mutations, and activation of human endogenous retroviruses. The presence of neoantigen-specific T cells is a hallmark of solid tumours with a high mutagenic burden, which typically have abundant TSAs owing to non-synonymous single nucleotide variations within the genome. However, RCC exhibits high cytotoxic T cell reactivity despite only having an intermediate non-synonymous single nucleotide variation mutational burden. Instead, RCC tumours have a high pan-cancer proportion of INDEL frameshift mutations, and coding frameshift INDELs are associated with high immunogenicity. Moreover, cytotoxic T cells in RCC subtypes seem to recognize tumour-specific endogenous retrovirus epitopes, whose presence is associated with clinical responses to immune checkpoint blockade therapy. Here, we review the distinct molecular landscapes in RCC that promote immunogenic responses, discuss clinical opportunities for discovery of biomarkers that can inform therapeutic immune checkpoint blockade strategies, and identify gaps in knowledge for future investigations.
Affiliation(s)
- Melissa M Wolf
- Department of Medicine, Program in Cancer Biology, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - W Kimryn Rathmell
- Department of Medicine, Program in Cancer Biology, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA.
| | - Aguirre A de Cubas
- Department of Microbiology and Immunology, Medical University of South Carolina, Charleston, SC, USA.
- Hollings Cancer Center, Medical University of South Carolina, Charleston, SC, USA.
|
39
|
Kim J, Woo S, de Gusmao CM, Zhao B, Chin DH, DiDonato RL, Nguyen MA, Nakayama T, Hu CA, Soucy A, Kuniholm A, Thornton JK, Riccardi O, Friedman DA, El Achkar CM, Dash Z, Cornelissen L, Donado C, Faour KNW, Bush LW, Suslovitch V, Lentucci C, Park PJ, Lee EA, Patterson A, Philippakis AA, Margus B, Berde CB, Yu TW. A framework for individualized splice-switching oligonucleotide therapy. Nature 2023; 619:828-836. [PMID: 37438524 PMCID: PMC10371869 DOI: 10.1038/s41586-023-06277-0] [Citation(s) in RCA: 57] [Impact Index Per Article: 28.5] [Received: 03/22/2022] [Accepted: 05/25/2023] [Indexed: 07/14/2023]
Abstract
Splice-switching antisense oligonucleotides (ASOs) could be used to treat a subset of individuals with genetic diseases [1], but the systematic identification of such individuals remains a challenge. Here we performed whole-genome sequencing analyses to characterize genetic variation in 235 individuals (from 209 families) with ataxia-telangiectasia, a severely debilitating and life-threatening recessive genetic disorder [2,3], yielding a complete molecular diagnosis in almost all individuals. We developed a predictive taxonomy to assess the amenability of each individual to splice-switching ASO intervention; 9% and 6% of the individuals had variants that were 'probably' or 'possibly' amenable to ASO splice modulation, respectively. Most amenable variants were in deep intronic regions that are inaccessible to exon-targeted sequencing. We developed ASOs that successfully rescued mis-splicing and ATM cellular signalling in patient fibroblasts for two recurrent variants. In a pilot clinical study, one of these ASOs was used to treat a child who had been diagnosed with ataxia-telangiectasia soon after birth, and showed good tolerability without serious adverse events for three years. Our study provides a framework for the prospective identification of individuals with genetic diseases who might benefit from a therapeutic approach involving splice-switching ASOs.
Affiliation(s)
- Jinkuk Kim
- Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea.
- Biomedical Research Center, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea.
- KI for Health Science and Technology, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea.
- Center for Epidemic Preparedness, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea.
- Sijae Woo
- Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
- Claudio M de Gusmao
- Department of Neurology, Boston Children's Hospital, Boston, MA, USA
- Postgraduate School of Medical Science, University of Campinas (UNICAMP), São Paulo, Brazil
- Boxun Zhao
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Boston Children's Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Medical School, Boston, MA, USA
- Diana H Chin
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Renata L DiDonato
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Minh A Nguyen
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Tojo Nakayama
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Chunguang April Hu
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Aubrie Soucy
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Ashley Kuniholm
- Institutional Center for Clinical and Translational Research, Boston Children's Hospital, Boston, MA, USA
- Olivia Riccardi
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Danielle A Friedman
- Department of Neurology, Boston Children's Hospital, Boston, MA, USA
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Zane Dash
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Laura Cornelissen
- Department of Anesthesiology, Critical Care and Pain Medicine, Boston Children's Hospital, Boston, MA, USA
- Carolina Donado
- Department of Anesthesiology, Critical Care and Pain Medicine, Boston Children's Hospital, Boston, MA, USA
- Kamli N W Faour
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Lynn W Bush
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Boston Children's Hospital, Boston, MA, USA
- Center for Bioethics, Harvard Medical School, Boston, MA, USA
- Victoria Suslovitch
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Claudia Lentucci
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Peter J Park
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Eunjung Alice Lee
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Boston Children's Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Medical School, Boston, MA, USA
- Al Patterson
- Harvard Medical School, Boston, MA, USA
- Department of Pharmacy, Boston Children's Hospital, Boston, MA, USA
- Anthony A Philippakis
- Eric and Wendy Schmidt Center, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Brad Margus
- Ataxia Telangiectasia Children's Project, Coconut Creek, FL, USA
- Charles B Berde
- Harvard Medical School, Boston, MA, USA
- Department of Anesthesiology, Critical Care and Pain Medicine, Boston Children's Hospital, Boston, MA, USA
- Timothy W Yu
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA.
- Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA.
- Department of Pediatrics, Boston Children's Hospital, Boston, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Harvard Medical School, Boston, MA, USA.
40
|
Rajaby R, Liu DX, Au CH, Cheung YT, Lau AYT, Yang QY, Sung WK. INSurVeyor: improving insertion calling from short read sequencing data. Nat Commun 2023; 14:3243. [PMID: 37277343 DOI: 10.1038/s41467-023-38870-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 08/29/2022] [Accepted: 05/18/2023] [Indexed: 06/07/2023] Open
Abstract
Insertions are one of the major types of structural variations and are defined as the addition of 50 nucleotides or more into a DNA sequence. Several methods exist to detect insertions from next-generation sequencing short read data, but they generally have low sensitivity. Our contribution is two-fold. First, we introduce INSurVeyor, a fast, sensitive and precise method that detects insertions from next-generation sequencing paired-end data. Using publicly available benchmark datasets (both human and non-human), we show that INSurVeyor is not only more sensitive than any individual caller we tested, but also more sensitive than all of them combined. Furthermore, for most types of insertions, INSurVeyor is almost as sensitive as long-read callers. Second, we provide state-of-the-art catalogues of insertions for 1047 Arabidopsis thaliana genomes from the 1001 Genomes Project and 3202 human genomes from the 1000 Genomes Project, both generated with INSurVeyor. We show that they are more complete and precise than existing resources, and that important insertions are missed by existing methods.
Affiliation(s)
- Ramesh Rajaby
- Hong Kong Genome Institute, Hong Kong Science Park, Shatin, Hong Kong, China
- A*STAR Genome Institute of Singapore, 60 Biopolis Street, Singapore, 138672, Singapore
- Dong-Xu Liu
- National Key Laboratory of Crop Genetic Improvement, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Chun Hang Au
- Hong Kong Genome Institute, Hong Kong Science Park, Shatin, Hong Kong, China
- Yuen-Ting Cheung
- Hong Kong Genome Institute, Hong Kong Science Park, Shatin, Hong Kong, China
- Amy Yuet Ting Lau
- Hong Kong Genome Institute, Hong Kong Science Park, Shatin, Hong Kong, China
- Qing-Yong Yang
- National Key Laboratory of Crop Genetic Improvement, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Wing-Kin Sung
- Hong Kong Genome Institute, Hong Kong Science Park, Shatin, Hong Kong, China.
- A*STAR Genome Institute of Singapore, 60 Biopolis Street, Singapore, 138672, Singapore.
- National Key Laboratory of Crop Genetic Improvement, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China.
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China.
- Department of Chemical Pathology, The Chinese University of Hong Kong, Hong Kong, China.
- Laboratory of Computational Genomics, Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Hong Kong, China.
- School of Computing, National University of Singapore, 13 Computing Drive, Singapore, 117417, Singapore.
41
|
Lee JJK, Jung YL, Cheong TC, Espejo Valle-Inclan J, Chu C, Gulhan DC, Ljungström V, Jin H, Viswanadham VV, Watson EV, Cortés-Ciriano I, Elledge SJ, Chiarle R, Pellman D, Park PJ. ERα-associated translocations underlie oncogene amplifications in breast cancer. Nature 2023; 618:1024-1032. [PMID: 37198482 PMCID: PMC10307628 DOI: 10.1038/s41586-023-06057-w] [Citation(s) in RCA: 54] [Impact Index Per Article: 27.0] [Received: 05/20/2021] [Accepted: 04/05/2023] [Indexed: 05/19/2023]
Abstract
Focal copy-number amplification is an oncogenic event. Although recent studies have revealed the complex structure [1-3] and the evolutionary trajectories [4] of oncogene amplicons, their origin remains poorly understood. Here we show that focal amplifications in breast cancer frequently derive from a mechanism, which we term translocation-bridge amplification, involving inter-chromosomal translocations that lead to dicentric chromosome bridge formation and breakage. In 780 breast cancer genomes, we observe that focal amplifications are frequently connected to each other by inter-chromosomal translocations at their boundaries. Subsequent analysis indicates the following model: the oncogene neighbourhood is translocated in G1 creating a dicentric chromosome, the dicentric chromosome is replicated, and as dicentric sister chromosomes segregate during mitosis, a chromosome bridge is formed and then broken, with fragments often being circularized in extrachromosomal DNAs. This model explains the amplifications of key oncogenes, including ERBB2 and CCND1. Recurrent amplification boundaries and rearrangement hotspots correlate with oestrogen receptor binding in breast cancer cells. Experimentally, oestrogen treatment induces DNA double-strand breaks in the oestrogen receptor target regions that are repaired by translocations, suggesting a role of oestrogen in generating the initial translocations. A pan-cancer analysis reveals tissue-specific biases in mechanisms initiating focal amplifications, with the breakage-fusion-bridge cycle prevalent in some and the translocation-bridge amplification in others, probably owing to the different timing of DNA break repair. Our results identify a common mode of oncogene amplification and propose oestrogen as its mechanistic origin in breast cancer.
Affiliation(s)
- Jake June-Koo Lee
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
- Ludwig Center at Harvard, Harvard Medical School, Boston, MA, USA.
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
- Youngsook Lucy Jung
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Taek-Chin Cheong
- Department of Pathology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Chong Chu
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Doga C Gulhan
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Ludwig Center at Harvard, Harvard Medical School, Boston, MA, USA
- Viktor Ljungström
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Hu Jin
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Emma V Watson
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Isidro Cortés-Ciriano
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Stephen J Elledge
- Ludwig Center at Harvard, Harvard Medical School, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Roberto Chiarle
- Department of Pathology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Torino, Italy
- David Pellman
- Ludwig Center at Harvard, Harvard Medical School, Boston, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Peter J Park
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
- Ludwig Center at Harvard, Harvard Medical School, Boston, MA, USA.
42
|
Groza C, Chen X, Pacis A, Simon MM, Pramatarova A, Aracena KA, Pastinen T, Barreiro LB, Bourque G. Genome graphs detect human polymorphisms in active epigenomic state during influenza infection. Cell Genomics 2023; 3:100294. [PMID: 37228750 PMCID: PMC10203048 DOI: 10.1016/j.xgen.2023.100294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Revised: 07/26/2022] [Accepted: 03/09/2023] [Indexed: 05/27/2023]
Abstract
Genetic variants, including mobile element insertions (MEIs), are known to impact the epigenome. We hypothesized that genome graphs, which encapsulate genetic diversity, could reveal missing epigenomic signals. To test this, we sequenced the epigenome of monocyte-derived macrophages from 35 ancestrally diverse individuals before and after influenza infection, allowing us to investigate the role of MEIs in immunity. We characterized genetic variants and MEIs using linked reads and built a genome graph. Mapping epigenetic data revealed 2.3%-3% novel peaks for H3K4me1, H3K27ac chromatin immunoprecipitation sequencing (ChIP-seq), and ATAC-seq. Additionally, the use of a genome graph modified some quantitative trait loci estimates and revealed 375 polymorphic MEIs in an active epigenomic state. Among these is an AluYh3 polymorphism whose chromatin state changed after infection and was associated with the expression of TRIM25, a gene that restricts influenza RNA synthesis. Our results demonstrate that graph genomes can reveal regulatory regions that would have been overlooked by other approaches.
Affiliation(s)
- Cristian Groza
- Quantitative Life Sciences, McGill University, Montréal, QC, Canada
- Xun Chen
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
- Alain Pacis
- Canadian Centre for Computational Genomics, McGill University, Montréal, QC, Canada
- Marie-Michelle Simon
- Victor Phillip Dahdaleh Institute of Genomic Medicine at McGill University, Montréal, QC, Canada
- Albena Pramatarova
- Victor Phillip Dahdaleh Institute of Genomic Medicine at McGill University, Montréal, QC, Canada
- Tomi Pastinen
- Genomic Medicine Center, Children’s Mercy Hospital and Research Institute, Kansas City, MO, USA
- Luis B. Barreiro
- Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL, USA
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
- Committee on Immunology, University of Chicago, Chicago, IL, USA
- Guillaume Bourque
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
- Canadian Centre for Computational Genomics, McGill University, Montréal, QC, Canada
- Victor Phillip Dahdaleh Institute of Genomic Medicine at McGill University, Montréal, QC, Canada
- Human Genetics, McGill University, Montréal, QC, Canada
43
|
Nam CH, Youk J, Kim JY, Lim J, Park JW, Oh SA, Lee HJ, Park JW, Won H, Lee Y, Jeong SY, Lee DS, Oh JW, Han J, Lee J, Kwon HW, Kim MJ, Ju YS. Widespread somatic L1 retrotransposition in normal colorectal epithelium. Nature 2023; 617:540-547. [PMID: 37165195 PMCID: PMC10191854 DOI: 10.1038/s41586-023-06046-z] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 05/18/2022] [Accepted: 04/04/2023] [Indexed: 05/12/2023]
Abstract
Throughout an individual's lifetime, genomic alterations accumulate in somatic cells [1-11]. However, the mutational landscape induced by retrotransposition of long interspersed nuclear element-1 (L1), a widespread mobile element in the human genome [12-14], is poorly understood in normal cells. Here we explored the whole-genome sequences of 899 single-cell clones established from three different cell types collected from 28 individuals. We identified 1,708 somatic L1 retrotransposition events that were enriched in colorectal epithelium and showed a positive relationship with age. Fingerprinting of source elements showed 34 retrotransposition-competent L1s. Multidimensional analysis demonstrated that (1) somatic L1 retrotranspositions occur from early embryogenesis at a substantial rate, (2) epigenetic on/off of a source element is preferentially determined in the early organogenesis stage, (3) retrotransposition-competent L1s with a lower population allele frequency have higher retrotransposition activity and (4) only a small fraction of L1 transcripts in the cytoplasm are finally retrotransposed in somatic cells. Analysis of matched cancers further suggested that somatic L1 retrotransposition rate is substantially increased during colorectal tumourigenesis. In summary, this study illustrates L1 retrotransposition-induced somatic mosaicism in normal cells and provides insights into the genomic and epigenomic regulation of transposable elements over the human lifetime.
Affiliation(s)
- Chang Hyun Nam
- Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- Jeonghwan Youk
- Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- Genome Insight, Inc., Daejeon, Republic of Korea
- Department of Internal Medicine, Seoul National University Hospital, Seoul, Republic of Korea
- Joonoh Lim
- Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- Genome Insight, Inc., Daejeon, Republic of Korea
- Jung Woo Park
- Korea Institute of Science and Technology Information, Daejeon, Republic of Korea
- Soo A Oh
- Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- Hyun Jung Lee
- Department of Internal Medicine, Seoul National University Hospital, Seoul, Republic of Korea
- Ji Won Park
- Department of Surgery, Seoul National University College of Medicine, Seoul, Republic of Korea
- Hyein Won
- Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- Yunah Lee
- Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- Seung-Yong Jeong
- Department of Surgery, Seoul National University College of Medicine, Seoul, Republic of Korea
- Dong-Sung Lee
- Department of Life Science, University of Seoul, Seoul, Republic of Korea
- Ji Won Oh
- Department of Anatomy, School of Medicine, Kyungpook National University, Daegu, Republic of Korea
- Department of Anatomy, Yonsei University College of Medicine, Seoul, Republic of Korea
- Jinju Han
- Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- Junehawk Lee
- Korea Institute of Science and Technology Information, Daejeon, Republic of Korea
- Hyun Woo Kwon
- Department of Nuclear Medicine, Korea University College of Medicine, Seoul, Republic of Korea.
- Min Jung Kim
- Department of Surgery, Seoul National University College of Medicine, Seoul, Republic of Korea.
- Young Seok Ju
- Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea.
- Genome Insight, Inc., Daejeon, Republic of Korea.
44
|
Samelak-Czajka A, Wojciechowski P, Marszalek-Zenczak M, Figlerowicz M, Zmienko A. Differences in the intraspecies copy number variation of Arabidopsis thaliana conserved and nonconserved miRNA genes. Funct Integr Genomics 2023; 23:120. [PMID: 37036577 PMCID: PMC10085913 DOI: 10.1007/s10142-023-01043-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2023] [Revised: 03/23/2023] [Accepted: 03/25/2023] [Indexed: 04/11/2023]
Abstract
MicroRNAs (miRNAs) regulate gene expression through the RNA interference mechanism. In plants, miRNA genes (MIRs) that are grouped into conserved families (i.e., present across different plant taxa) are involved in the regulation of many developmental and physiological processes. The roles of the nonconserved MIRs (MIRs restricted to one plant family, genus, or even species) are less recognized; however, many of them participate in the responses to biotic and abiotic stresses. Both over- and underproduction of miRNAs may influence various biological processes. Consequently, maintaining intracellular miRNA homeostasis seems to be crucial for the organism. Deletions and duplications in the genomic sequence may alter gene dosage and/or activity. We evaluated the extent of copy number variations (CNVs) among Arabidopsis thaliana (Arabidopsis) MIRs in over 1000 natural accessions, using population-based analysis of the short-read sequencing data. We showed that the conserved MIRs were unlikely to display CNVs and their deletions were extremely rare, whereas nonconserved MIRs presented moderate variation. Transposon-derived MIRs displayed exceptionally high diversity. Conversely, MIRs involved in the epigenetic control of transposons reactivated during development were mostly invariable. MIR overlap with the protein-coding genes also limited their variability. At the expression level, a higher rate of nonvariable, nonconserved miRNAs was detectable in Col-0 leaves, inflorescence, and siliques compared to nonconserved variable miRNAs, although the expression of both groups was much lower than that of the conserved MIRs. Our data indicate that the CNV rate of Arabidopsis MIRs is related to their age, function, and genomic localization.
Affiliation(s)
- Anna Samelak-Czajka
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704, Poznan, Poland
- Pawel Wojciechowski
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704, Poznan, Poland
- Institute of Computing Science, Faculty of Computing and Telecommunications, Poznan University of Technology, 60-965, Poznan, Poland
- Marek Figlerowicz
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704, Poznan, Poland.
- Agnieszka Zmienko
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704, Poznan, Poland.
45
|
Bowles H, Kabiljo R, Al Khleifat A, Jones A, Quinn JP, Dobson RJB, Swanson CM, Al-Chalabi A, Iacoangeli A. An assessment of bioinformatics tools for the detection of human endogenous retroviral insertions in short-read genome sequencing data. Frontiers in Bioinformatics 2023; 2:1062328. [PMID: 36845320 PMCID: PMC9945273 DOI: 10.3389/fbinf.2022.1062328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2022] [Accepted: 12/12/2022] [Indexed: 02/10/2023] Open
Abstract
There is a growing interest in the study of human endogenous retroviruses (HERVs) given the substantial body of evidence that implicates them in many human diseases. Although their genomic characterization presents numerous technical challenges, next-generation sequencing (NGS) has shown potential to detect HERV insertions and their polymorphisms in humans. Currently, a number of computational tools to detect them in short-read NGS data exist. In order to design optimal analysis pipelines, an independent evaluation of the available tools is required. We evaluated the performance of a set of such tools using a variety of experimental designs and datasets. These included 50 human short-read whole-genome sequencing samples, matching long and short-read sequencing data, and simulated short-read NGS data. Our results highlight a great performance variability of the tools across the datasets and suggest that different tools might be suitable for different study designs. However, specialized tools designed to detect exclusively human endogenous retroviruses consistently outperformed generalist tools that detect a wider range of transposable elements. We suggest that, if sufficient computing resources are available, using multiple HERV detection tools to obtain a consensus set of insertion loci may be ideal. Furthermore, given that the false positive discovery rate of the tools varied between 8% and 55% across tools and datasets, we recommend the wet lab validation of predicted insertions if DNA samples are available.
Affiliation(s)
- Harry Bowles
- Department of Basic and Clinical Neuroscience, King’s College London, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom
- Renata Kabiljo
- Department of Basic and Clinical Neuroscience, King’s College London, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom
- Department of Biostatistics and Health Informatics, King’s College London, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom
- Ahmad Al Khleifat
- Department of Basic and Clinical Neuroscience, King’s College London, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom
- Ashley Jones
- Department of Basic and Clinical Neuroscience, King’s College London, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom
- John P. Quinn
- Department of Pharmacology and Therapeutics, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, United Kingdom
- Richard J. B. Dobson
- Department of Biostatistics and Health Informatics, King’s College London, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom
- NIHR Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London, London, United Kingdom
- Institute of Health Informatics, University College London, London, United Kingdom
- NIHR Biomedical Research Centre, University College London Hospitals NHS Foundation Trust, London, United Kingdom
- Chad M. Swanson
- Department of Infectious Diseases, School of Immunology and Microbial Sciences, King’s College London, London, United Kingdom
- Ammar Al-Chalabi
- Department of Basic and Clinical Neuroscience, King’s College London, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom
- Department of Neurology, King’s College Hospital, London, United Kingdom
- Alfredo Iacoangeli
- Department of Basic and Clinical Neuroscience, King’s College London, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom
- Department of Biostatistics and Health Informatics, King’s College London, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom
- NIHR Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London, London, United Kingdom
46
|
Chen X, Bourque G, Goubert C. Genotyping of Transposable Element Insertions Segregating in Human Populations Using Short-Read Realignments. Methods Mol Biol 2023; 2607:63-83. [PMID: 36449158 DOI: 10.1007/978-1-0716-2883-6_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/17/2023]
Abstract
Transposable element (TE) insertions are a major source of structural variation in the human genome. Due to the repetitive nature and biological importance of TEs, many bioinformatic tools have been developed to identify and genotype TE insertion polymorphisms using high-throughput short reads. In this chapter, we outline recently developed methods to characterize TE insertion polymorphisms in human populations. We also provide detailed protocols to tackle this question, primarily using three software tools: MELT2, ERVcaller, and TypeREF.
Affiliation(s)
- Xun Chen
- Institute for the Advanced Study of Human Biology (ASHBi), Kyoto University, Kyoto, Japan.
- Guillaume Bourque
- Institute for the Advanced Study of Human Biology (ASHBi), Kyoto University, Kyoto, Japan
- Canadian Centre for Computational Genomics, McGill University, Montreal, QC, Canada
- McGill Genome Centre, Montreal, QC, Canada
- Human Genetics, McGill University, Montreal, QC, Canada
- Clément Goubert
- Canadian Centre for Computational Genomics, McGill University, Montreal, QC, Canada.
- McGill Genome Centre, Montreal, QC, Canada.
- Human Genetics, McGill University, Montreal, QC, Canada.
47
|
Angileri KM, Bagia NA, Feschotte C. Transposon control as a checkpoint for tissue regeneration. Development 2022; 149:dev191957. [PMID: 36440631 PMCID: PMC10655923 DOI: 10.1242/dev.191957] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2022] [Accepted: 10/03/2022] [Indexed: 11/29/2022]
Abstract
Tissue regeneration requires precise temporal control of cellular processes such as inflammatory signaling, chromatin remodeling and proliferation. The combination of these processes forms a unique microenvironment permissive to the expression, and potential mobilization of, transposable elements (TEs). Here, we develop the hypothesis that TE activation creates a barrier to tissue repair that must be overcome to achieve successful regeneration. We discuss how uncontrolled TE activity may impede tissue restoration and review mechanisms by which TE activity may be controlled during regeneration. We posit that the diversification and co-evolution of TEs and host control mechanisms may contribute to the wide variation in regenerative competency across tissues and species.
Affiliation(s)
- Krista M. Angileri
- Department of Molecular Biology and Genetics, Cornell University, 526 Campus Rd, Ithaca, NY 14850, USA
- Nornubari A. Bagia
- Department of Molecular Biology and Genetics, Cornell University, 526 Campus Rd, Ithaca, NY 14850, USA
- Cedric Feschotte
- Department of Molecular Biology and Genetics, Cornell University, 526 Campus Rd, Ithaca, NY 14850, USA
48
|
Lee H, Min JW, Mun S, Han K. Human Retrotransposons and Effective Computational Detection Methods for Next-Generation Sequencing Data. Life (Basel, Switzerland) 2022; 12:life12101583. [PMID: 36295018 PMCID: PMC9605557 DOI: 10.3390/life12101583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2022] [Revised: 10/03/2022] [Accepted: 10/10/2022] [Indexed: 11/16/2022]
Abstract
Transposable elements (TEs) are classified into two classes according to their mobilization mechanism. Compared to DNA transposons that move by the "cut and paste" mechanism, retrotransposons mobilize via the "copy and paste" method. They have been an essential research topic because some of the active elements, such as Long interspersed element 1 (LINE-1), Alu, and SVA elements, have contributed to the genetic diversity of primates beyond humans. In addition, they can cause genetic disorders by altering gene expression and generating structural variations (SVs). The development and rapid technological advances in next-generation sequencing (NGS) have led to new perspectives on detecting retrotransposon-mediated SVs, especially insertions. Moreover, various computational methods have been developed based on NGS data to precisely detect the insertions and deletions in the human genome. Therefore, this review discusses details about the recently studied and utilized NGS technologies and the effective computational approaches for discovering retrotransposons through it. The final part covers a diverse range of computational methods for detecting retrotransposon insertions with human NGS data. This review will give researchers insights into understanding the TEs and how to investigate them and find connections with research interests.
Affiliation(s)
- Haeun Lee
- Department of Bioconvergence Engineering, Dankook University, Yongin 16890, Korea
- Jun Won Min
- Department of Surgery, Dankook University College of Medicine, Cheonan 31116, Korea
- Seyoung Mun
- Department of Microbiology, College of Science & Technology, Dankook University, Cheonan 31116, Korea
- Center for Bio Medical Engineering Core Facility, Dankook University, Cheonan 31116, Korea
- Correspondence: (S.M.); (K.H.)
- Kyudong Han
- Department of Bioconvergence Engineering, Dankook University, Yongin 16890, Korea
- Department of Microbiology, College of Science & Technology, Dankook University, Cheonan 31116, Korea
- Center for Bio Medical Engineering Core Facility, Dankook University, Cheonan 31116, Korea
- HuNbiome Co., Ltd., R&D Center, Seoul 08507, Korea
- Correspondence: (S.M.); (K.H.)
|
49
|
Han S, Dias GB, Basting PJ, Viswanatha R, Perrimon N, Bergman C. Local assembly of long reads enables phylogenomics of transposable elements in a polyploid cell line. Nucleic Acids Res 2022; 50:e124. [PMID: 36156149 PMCID: PMC9757076 DOI: 10.1093/nar/gkac794] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2022] [Revised: 07/21/2022] [Accepted: 09/16/2022] [Indexed: 12/24/2022]
Abstract
Animal cell lines often undergo extreme genome restructuring events, including polyploidy and segmental aneuploidy that can impede de novo whole-genome assembly (WGA). In some species like Drosophila, cell lines also exhibit massive proliferation of transposable elements (TEs). To better understand the role of transposition during animal cell culture, we sequenced the genome of the tetraploid Drosophila S2R+ cell line using long-read and linked-read technologies. WGAs for S2R+ were highly fragmented and generated variable estimates of TE content across sequencing and assembly technologies. We therefore developed a novel WGA-independent bioinformatics method called TELR that identifies, locally assembles, and estimates allele frequency of TEs from long-read sequence data (https://github.com/bergmanlab/telr). Application of TELR to a ∼130x PacBio dataset for S2R+ revealed many haplotype-specific TE insertions that arose by transposition after initial cell line establishment and subsequent tetraploidization. Local assemblies from TELR also allowed phylogenetic analysis of paralogous TEs, which revealed that proliferation of TE families in vitro can be driven by single or multiple source lineages. Our work provides a model for the analysis of TEs in complex heterozygous or polyploid genomes that are recalcitrant to WGA and yields new insights into the mechanisms of genome evolution in animal cell culture.
Affiliation(s)
- Preston J Basting
- Institute of Bioinformatics, University of Georgia, 120 E. Green St., Athens, GA, USA
- Raghuvir Viswanatha
- Department of Genetics, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA, USA
- Norbert Perrimon
- Department of Genetics, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA, USA
- Howard Hughes Medical Institute, Boston, MA, USA
- Casey M Bergman
- To whom correspondence should be addressed. Tel: +1 706 542 1764; Fax: +1 706 542 3910;
|
50
|
Sources of Cancer Neoantigens beyond Single-Nucleotide Variants. Int J Mol Sci 2022; 23:ijms231710131. [PMID: 36077528 PMCID: PMC9455963 DOI: 10.3390/ijms231710131] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 07/01/2022] [Revised: 09/01/2022] [Accepted: 09/02/2022] [Indexed: 11/17/2022]
Abstract
The success of checkpoint blockade therapy against cancer has unequivocally shown that cancer cells can be effectively recognized by the immune system and eliminated. However, the identity of the cancer antigens that elicit protective immunity remains to be fully explored. Over the last decade, most of the focus has been on somatic mutations derived from non-synonymous single-nucleotide variants (SNVs) and small insertion/deletion mutations (indels) that accumulate during cancer progression. Mutated peptides can be presented on MHC molecules and give rise to novel antigens or neoantigens, which have been shown to induce potent anti-tumor immune responses. A limitation with SNV-neoantigens is that they are patient-specific and their accurate prediction is critical for the development of effective immunotherapies. In addition, cancer types with low mutation burden may not display sufficient high-quality [SNV/small indels] neoantigens to alone stimulate effective T cell responses. Accumulating evidence suggests the existence of alternative sources of cancer neoantigens, such as gene fusions, alternative splicing variants, post-translational modifications, and transposable elements, which may be attractive novel targets for immunotherapy. In this review, we describe the recent technological advances in the identification of these novel sources of neoantigens, the experimental evidence for their presentation on MHC molecules and their immunogenicity, as well as the current clinical development stage of immunotherapy targeting these neoantigens.
|