1
Auxillos J, Stigliani A, Vaagensø C, Garland W, Niazi A, Valen E, Jensen T, Sandelin A. True length of diverse capped RNA sequencing (TLDR-seq): 5'-3'-end sequencing of capped RNAs regardless of 3'-end status. Nucleic Acids Res 2025; 53:gkaf240. [PMID: 40183637 PMCID: PMC11969664 DOI: 10.1093/nar/gkaf240] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2024] [Revised: 02/20/2025] [Accepted: 03/14/2025] [Indexed: 04/05/2025] Open
Abstract
Analysis of transcript function is greatly aided by knowledge of the full-length RNA sequence. New long-read sequencing enabled by Oxford Nanopore and PacBio devices has the potential to provide full-length transcript information; however, standard methods still lack the ability to capture true RNA 5' ends and select for polyadenylated (pA+) transcripts only. Here, we present a method that, by utilizing cap trapping and 3'-end adapter ligation, sequences transcripts between their exact 5' and 3' ends regardless of polyadenylation status and without the need for ribosomal RNA depletion, with the ability to characterize polyadenylation length of RNAs, if any. The method shows high reproducibility, can faithfully detect 5' ends, 3' ends and splice junctions, and produces gene-expression estimates that are highly correlated to those of short-read sequencing techniques. We also demonstrate that the method can detect and sequence full-length nonadenylated (pA-) RNAs, including long noncoding RNAs, promoter upstream transcripts, and enhancer RNAs, and present cases where pA+ and pA- RNAs show preferences for different but closely located transcription start sites. Our method is therefore useful for the characterization of diverse capped RNA species and analysis of relationships between transcription initiation, termination, and RNA processing.
Affiliation(s)
- Jamie Auxillos
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, DK2200 Copenhagen, Denmark
- Biotech Research and Innovation Centre, University of Copenhagen, DK2200 Copenhagen, Denmark
- Arnaud Stigliani
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, DK2200 Copenhagen, Denmark
- Biotech Research and Innovation Centre, University of Copenhagen, DK2200 Copenhagen, Denmark
- Christian Skov Vaagensø
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, DK2200 Copenhagen, Denmark
- Biotech Research and Innovation Centre, University of Copenhagen, DK2200 Copenhagen, Denmark
- William Garland
- Department of Molecular Biology and Genetics, Aarhus University, DK8000 Aarhus, Denmark
- Adnan Muhammed Niazi
- Computational Biology Unit, Department of Informatics, University of Bergen, N-5008 Bergen, Norway
- Eivind Valen
- Computational Biology Unit, Department of Informatics, University of Bergen, N-5008 Bergen, Norway
- Department of Biosciences, University of Oslo, N-0371 Oslo, Norway
- Torben Heick Jensen
- Department of Molecular Biology and Genetics, Aarhus University, DK8000 Aarhus, Denmark
- Albin Sandelin
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, DK2200 Copenhagen, Denmark
- Biotech Research and Innovation Centre, University of Copenhagen, DK2200 Copenhagen, Denmark
2
Fu ZH, Cheng S, Li JW, Zhang N, Wu Y, Zhao GR. Synthetic tunable promoters for flexible control of multi-gene expression in mammalian cells. J Adv Res 2025:S2090-1232(25)00106-7. [PMID: 39938795 DOI: 10.1016/j.jare.2025.02.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2024] [Revised: 02/06/2025] [Accepted: 02/07/2025] [Indexed: 02/14/2025] Open
Abstract
INTRODUCTION: Synthetic biology revolutionizes our ability to decode and recode genetic systems. The capability to reconstruct and flexibly manipulate multi-gene systems is critical for understanding cellular behaviors and has significant applications in therapeutics. OBJECTIVES: This study aims to construct a diverse library of synthetic tunable promoters (STPs) to enable flexible control of multi-gene expression in mammalian cells. METHODS: We designed and constructed synthetic tunable promoters (STPs) that incorporate both a universal activation site (UAS) and a specific activation site (SAS), enabling multi-level expression control via the CRISPR activation (CRISPRa) system. To evaluate promoter activity, we utilized Massively Parallel Reporter Assays (MPRA) to assess the basal strengths of the STPs and their activation responses. Next, we constructed a three-gene reporter system to assess the capacity of the synthetic promoters for achieving multilevel control of single-gene expression within multi-gene systems. RESULTS: The promoter library contains 24,960 unique non-redundant promoters with distinct sequence characteristics. MPRA revealed a wide range of promoter activities, showing different basal strengths and distinct activation levels when activated by the CRISPRa system. When regulated by targeting the SAS, the STPs exhibited orthogonality, allowing multilevel control of single-gene expression within multi-gene systems without cross-interference. Furthermore, the combinatorial activation of STPs in a multi-gene system enlarged the scope of expression levels achievable, providing fine-tuned control over gene expression. CONCLUSION: We provide a diverse collection of synthetic tunable promoters, offering a valuable toolkit for the construction and manipulation of multi-gene systems in mammalian cells, with applications in gene therapy and biotechnology.
Affiliation(s)
- Zong-Heng Fu
- State Key Laboratory of Synthetic Biology, Tianjin University, Tianjin 300072, China; Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China; Frontiers Research Institute for Synthetic Biology, Tianjin University, Tianjin, 300072, China
- Si Cheng
- State Key Laboratory of Synthetic Biology, Tianjin University, Tianjin 300072, China; Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China; Frontiers Research Institute for Synthetic Biology, Tianjin University, Tianjin, 300072, China
- Jia-Wei Li
- State Key Laboratory of Synthetic Biology, Tianjin University, Tianjin 300072, China; Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China; Frontiers Research Institute for Synthetic Biology, Tianjin University, Tianjin, 300072, China
- Nan Zhang
- State Key Laboratory of Synthetic Biology, Tianjin University, Tianjin 300072, China; Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China; Frontiers Research Institute for Synthetic Biology, Tianjin University, Tianjin, 300072, China
- Yi Wu
- State Key Laboratory of Synthetic Biology, Tianjin University, Tianjin 300072, China; Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China; Frontiers Research Institute for Synthetic Biology, Tianjin University, Tianjin, 300072, China.
- Guang-Rong Zhao
- State Key Laboratory of Synthetic Biology, Tianjin University, Tianjin 300072, China; Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China; Frontiers Research Institute for Synthetic Biology, Tianjin University, Tianjin, 300072, China.
3
Solovyeva AI, Afanasev RV, Popova MA, Enukashvily NI. Dysregulation of Transposon Transcription Profiles in Cancer Cells Resembles That of Embryonic Stem Cells. Curr Issues Mol Biol 2024; 46:8576-8599. [PMID: 39194722 DOI: 10.3390/cimb46080505] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2024] [Revised: 07/30/2024] [Accepted: 08/01/2024] [Indexed: 08/29/2024] Open
Abstract
Transposable elements (TEs) comprise a substantial portion of the mammalian genome, with potential implications for both embryonic development and cancer. This study aimed to characterize the expression profiles of TEs in embryonic stem cells (ESCs), cancer cell lines, tumor tissues, and the tumor microenvironment (TME). We observed similarities in TE expression profiles between cancer cells and ESCs, suggesting potential parallels in regulatory mechanisms. Notably, four TE RNAs (HERVH, LTR7, HERV-Fc1, HERV-Fc2) exhibited significant downregulation across cancer cell lines and tumor tissues compared to ESCs, highlighting potential roles in pluripotency regulation. The strong up-regulation of the latter two TEs (HERV-Fc1, HERV-Fc2) in ESCs has not been previously demonstrated and may be a first indication of their role in the regulation of pluripotency. Conversely, tandemly repeated sequences (MSR1, CER, ALR) showed up-regulation in cancer contexts. Moreover, a difference in TE expression was observed between the TME and the tumor bulk transcriptome, with distinct dysregulated TE profiles. Some TME-specific TEs were absent in normal tissues, predominantly belonging to LTR and L1 retrotransposon families. These findings not only shed light on the regulatory roles of TEs in both embryonic development and cancer but also suggest novel targets for anti-cancer therapy. Understanding the interplay between cancer cells and the TME at the TE level may pave the way for further research into therapeutic interventions.
Affiliation(s)
- Anna I Solovyeva
- Lab of the Non-Coding DNA Studies, Institute of Cytology, Russian Academy of Sciences, 194064 St. Petersburg, Russia
- Zoological Institute of Russian Academy of Sciences, 199034 St. Petersburg, Russia
- Roman V Afanasev
- Lab of the Non-Coding DNA Studies, Institute of Cytology, Russian Academy of Sciences, 194064 St. Petersburg, Russia
- Marina A Popova
- Lab of the Non-Coding DNA Studies, Institute of Cytology, Russian Academy of Sciences, 194064 St. Petersburg, Russia
- Applied Genomics Laboratory, SCAMT Institute, ITMO University, 191002 St. Petersburg, Russia
- Natella I Enukashvily
- Lab of the Non-Coding DNA Studies, Institute of Cytology, Russian Academy of Sciences, 194064 St. Petersburg, Russia
- Department of Cytology and Histology, St. Petersburg State University, 199034 St. Petersburg, Russia
4
Tanudisastro HA, Deveson IW, Dashnow H, MacArthur DG. Sequencing and characterizing short tandem repeats in the human genome. Nat Rev Genet 2024; 25:460-475. [PMID: 38366034 DOI: 10.1038/s41576-024-00692-3] [Citation(s) in RCA: 34] [Impact Index Per Article: 34.0] [Accepted: 12/06/2023] [Indexed: 02/18/2024]
Abstract
Short tandem repeats (STRs) are highly polymorphic sequences throughout the human genome that are composed of repeated copies of a 1-6-bp motif. Over 1 million variable STR loci are known, some of which regulate gene expression and influence complex traits, such as height. Moreover, variants in at least 60 STR loci cause genetic disorders, including Huntington disease and fragile X syndrome. Accurately identifying and genotyping STR variants is challenging, in particular mapping short reads to repetitive regions and inferring expanded repeat lengths. Recent advances in sequencing technology and computational tools for STR genotyping from sequencing data promise to help overcome this challenge and solve genetically unresolved cases and the 'missing heritability' of polygenic traits. Here, we compare STR genotyping methods, analytical tools and their applications to understand the effect of STR variation on health and disease. We identify emergent opportunities to refine genotyping and quality-control approaches as well as to integrate STRs into variant-calling workflows and large cohort analyses.
Affiliation(s)
- Hope A Tanudisastro
- Centre for Population Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
- Faculty of Medicine and Health, University of New South Wales, Sydney, New South Wales, Australia
- Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia
- Ira W Deveson
- Faculty of Medicine and Health, University of New South Wales, Sydney, New South Wales, Australia
- Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Harriet Dashnow
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA.
- Daniel G MacArthur
- Centre for Population Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia.
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia.
- Faculty of Medicine and Health, University of New South Wales, Sydney, New South Wales, Australia.
5
Carbonell-Sala S, Perteghella T, Lagarde J, Nishiyori H, Palumbo E, Arnan C, Takahashi H, Carninci P, Uszczynska-Ratajczak B, Guigó R. CapTrap-seq: a platform-agnostic and quantitative approach for high-fidelity full-length RNA sequencing. Nat Commun 2024; 15:5278. [PMID: 38937428 PMCID: PMC11211341 DOI: 10.1038/s41467-024-49523-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Accepted: 06/10/2024] [Indexed: 06/29/2024] Open
Abstract
Long-read RNA sequencing is essential to produce accurate and exhaustive annotation of eukaryotic genomes. Despite advancements in throughput and accuracy, achieving reliable end-to-end identification of RNA transcripts remains a challenge for long-read sequencing methods. To address this limitation, we develop CapTrap-seq, a cDNA library preparation method, which combines the Cap-trapping strategy with oligo(dT) priming to detect 5' capped, full-length transcripts. In our study, we evaluate the performance of CapTrap-seq alongside other widely used RNA-seq library preparation protocols in human and mouse tissues, employing both ONT and PacBio sequencing technologies. To explore the quantitative capabilities of CapTrap-seq and its accuracy in reconstructing full-length RNA molecules, we implement a capping strategy for synthetic RNA spike-in sequences that mimics the natural 5'cap formation. Our benchmarks, incorporating the Long-read RNA-seq Genome Annotation Assessment Project (LRGASP) data, demonstrate that CapTrap-seq is a competitive, platform-agnostic RNA library preparation method for generating full-length transcript sequences.
Affiliation(s)
- Sílvia Carbonell-Sala
- Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Tamara Perteghella
- Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Julien Lagarde
- Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Flomics Biotech, SL, Carrer de Roc Boronat 31, 08005, Barcelona, Catalonia, Spain
- Hiromi Nishiyori
- Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences (IMS), Yokohama, Kanagawa, Japan
- Emilio Palumbo
- Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Carme Arnan
- Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Hazuki Takahashi
- Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences (IMS), Yokohama, Kanagawa, Japan
- Piero Carninci
- Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences (IMS), Yokohama, Kanagawa, Japan
- Human Technopole, Milan, Italy
- Barbara Uszczynska-Ratajczak
- Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain.
- Department of Computational Biology of Noncoding RNA, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland.
- Roderic Guigó
- Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain.
- Universitat Pompeu Fabra, Barcelona, Catalonia, Spain.
6
Al-Chalabi A, Andrews J, Farhan S. Recent advances in the genetics of familial and sporadic ALS. Int Rev Neurobiol 2024; 176:49-74. [PMID: 38802182 DOI: 10.1016/bs.irn.2024.04.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
ALS shows complex genetic inheritance patterns. In about 5% to 10% of cases, there is a family history of ALS or a related condition such as frontotemporal dementia in a first- or second-degree relative, and for about 80% of such people a pathogenic gene variant can be identified. Such variants are also seen in people with no family history because of factors influencing the expression of genes, such as age. Genetic susceptibility factors also contribute to risk, and the heritability of ALS is between 40% and 60%. The genetic variants influencing ALS risk include single base changes, repeat expansions, copy number variants, and others. Here we review what is known of the genetic landscape and architecture of ALS.
Affiliation(s)
- Ammar Al-Chalabi
- Department of Basic and Clinical Neuroscience, King's College London, London, United Kingdom.
- Jinsy Andrews
- Department of Neurology, Columbia University, New York, NY, United States
- Sali Farhan
- Department of Neurology and Neurosurgery, Montreal Neurological Institute-Hospital, Montreal, QC, Canada; Department of Human Genetics, Montreal Neurological Institute-Hospital, Montreal, QC, Canada
7
Bhati M, Mapel XM, Lloret-Villas A, Pausch H. Structural variants and short tandem repeats impact gene expression and splicing in bovine testis tissue. Genetics 2023; 225:iyad161. [PMID: 37655920 PMCID: PMC10627265 DOI: 10.1093/genetics/iyad161] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/05/2023] [Revised: 06/05/2023] [Accepted: 08/24/2023] [Indexed: 09/02/2023] Open
Abstract
Structural variants (SVs) and short tandem repeats (STRs) are significant sources of genetic variation. However, the impacts of these variants on gene regulation have not been investigated in cattle. Here, we genotyped and characterized 19,408 SVs and 374,821 STRs in 183 bovine genomes and investigated their impact on molecular phenotypes derived from testis transcriptomes. We found that 71% of STRs were multiallelic. The vast majority (95%) of STRs and SVs were in intergenic and intronic regions. Only 37% of SVs and 40% of STRs were in high linkage disequilibrium (LD) (R² > 0.8) with surrounding SNPs/insertions and deletions (Indels), indicating that SNP-based association testing and genomic prediction are blind to a nonnegligible portion of genetic variation. We showed that both SVs and STRs were more than 2-fold enriched among expression and splicing QTL (e/sQTL) relative to SNPs/Indels and were often associated with differential expression and splicing of multiple genes. Deletions and duplications had larger impacts on splicing and expression than any other type of SV. Exonic duplications predominantly increased gene expression either through alternative splicing or other mechanisms, whereas expression- and splicing-associated STRs primarily resided in intronic regions and exhibited bimodal effects on the molecular phenotypes investigated. Most e/sQTL resided within 100 kb of the affected genes or splicing junctions. We pinpoint candidate causal STRs and SVs associated with the expression of SLC13A4 and TTC7B and alternative splicing of a lncRNA and CAPP1. We provide a catalog of STRs and SVs for taurine cattle and show that these variants contribute substantially to gene expression and splicing variation.
Affiliation(s)
- Meenu Bhati
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8092, Zurich, Switzerland
- Xena Marie Mapel
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8092, Zurich, Switzerland
- Hubert Pausch
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8092, Zurich, Switzerland
8
Amaral P, Carbonell-Sala S, De La Vega FM, Faial T, Frankish A, Gingeras T, Guigo R, Harrow JL, Hatzigeorgiou AG, Johnson R, Murphy TD, Pertea M, Pruitt KD, Pujar S, Takahashi H, Ulitsky I, Varabyou A, Wells CA, Yandell M, Carninci P, Salzberg SL. The status of the human gene catalogue. Nature 2023; 622:41-47. [PMID: 37794265 PMCID: PMC10575709 DOI: 10.1038/s41586-023-06490-x] [Citation(s) in RCA: 64] [Impact Index Per Article: 32.0] [Received: 03/09/2023] [Accepted: 07/27/2023] [Indexed: 10/06/2023]
Abstract
Scientists have been trying to identify every gene in the human genome since the initial draft was published in 2001. In the years since, much progress has been made in identifying protein-coding genes, currently estimated to number fewer than 20,000, with an ever-expanding number of distinct protein-coding isoforms. Here we review the status of the human gene catalogue and the efforts to complete it in recent years. Besides the ongoing annotation of protein-coding genes, their isoforms and pseudogenes, the invention of high-throughput RNA sequencing and other technological breakthroughs have led to a rapid growth in the number of reported non-coding RNA genes. For most of these non-coding RNAs, the functional relevance is currently unclear; we look at recent advances that offer paths forward to identifying their functions and towards eventually completing the human gene catalogue. Finally, we examine the need for a universal annotation standard that includes all medically significant genes and maintains their relationships with different reference genomes for the use of the human gene catalogue in clinical settings.
Affiliation(s)
- Paulo Amaral
- INSPER Institute of Education and Research, Sao Paulo, Brazil
- Francisco M De La Vega
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
- Tempus Labs, Chicago, IL, USA
- Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Thomas Gingeras
- Department of Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Roderic Guigo
- Centre for Genomic Regulation (CRG), Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Jennifer L Harrow
- Centre for Genomics Research, Discovery Sciences, AstraZeneca, Royston, UK
- Artemis G Hatzigeorgiou
- Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece
- Hellenic Pasteur Institute, Athens, Greece
- Rory Johnson
- School of Biology and Environmental Science, University College Dublin, Dublin, Ireland
- Conway Institute of Biomedical and Biomolecular Research, University College Dublin, Dublin, Ireland
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Department for BioMedical Research, University of Bern, Bern, Switzerland
- Terence D Murphy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Mihaela Pertea
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Kim D Pruitt
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Shashikant Pujar
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Hazuki Takahashi
- Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Igor Ulitsky
- Department of Immunology and Regenerative Biology, Weizmann Institute of Science, Rehovot, Israel
- Department of Molecular Neuroscience, Weizmann Institute of Science, Rehovot, Israel
- Ales Varabyou
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Christine A Wells
- Stem Cell Systems, Department of Anatomy and Physiology, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Parkville, Victoria, Australia
- Mark Yandell
- Department of Human Genetics, Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT, USA
- Piero Carninci
- Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan.
- Human Technopole, Milan, Italy.
- Steven L Salzberg
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA.
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA.
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.
- Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA.
9
Guo LT, Pyle AM. RT-based Sanger sequencing of RNAs containing complex RNA repetitive elements. Methods Enzymol 2023; 691:17-27. [PMID: 37914445 DOI: 10.1016/bs.mie.2023.07.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2023]
Abstract
Although next-generation sequencing (NGS) technologies have revolutionized our ability to sequence DNA with high throughput, the chain termination-based Sanger sequencing method remains a widely used approach for DNA sequence analysis due to its simplicity, low cost and high accuracy. In particular, high accuracy makes Sanger sequencing the "gold standard" for sequence validation in basic research and clinical applications. During the early days of Sanger sequencing development, reverse transcriptase (RT)-based RNA sequencing was also explored and showed great promise, but the approach did not acquire popularity over time due to the limited processivity and low template unwinding capability of Avian Myeloblastosis Virus (AMV) RT, and other RT enzymes available at the time. RNA molecules have complex features, often containing repetitive sequences and stable secondary or tertiary structures. While these features are required for RNA biological function, they represent strong obstacles for retroviral RTs. Repetitive sequences and stable structures cause reverse transcription errors and premature primer extension stops, making chain termination-based methods unfeasible. MarathonRT is an ultra-processive, group II intron-encoded RT that can copy RNA molecules of any sequence and structure in a single cycle, making it an ideal RT enzyme for Sanger RNA sequencing. In this chapter, we upgrade the Sanger RNA sequencing method by replacing AMV RT with MarathonRT, providing a simple, robust method for direct RNA sequence analysis. Guidance for troubleshooting and further optimization is also provided.
Affiliation(s)
- Li-Tao Guo
- Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT, United States
- Anna Marie Pyle
- Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT, United States; Department of Chemistry, Yale University, New Haven, CT, United States; Howard Hughes Medical Institute, Chevy Chase, MD, United States.
10
Carbonell-Sala S, Lagarde J, Nishiyori H, Palumbo E, Arnan C, Takahashi H, Carninci P, Uszczynska-Ratajczak B, Guigó R. CapTrap-Seq: A platform-agnostic and quantitative approach for high-fidelity full-length RNA transcript sequencing. bioRxiv 2023:2023.06.16.543444. [PMID: 37398314 PMCID: PMC10312720 DOI: 10.1101/2023.06.16.543444] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 07/04/2023]
Abstract
Long-read RNA sequencing is essential to produce accurate and exhaustive annotation of eukaryotic genomes. Despite advancements in throughput and accuracy, achieving reliable end-to-end identification of RNA transcripts remains a challenge for long-read sequencing methods. To address this limitation, we developed CapTrap-seq, a cDNA library preparation method, which combines the Cap-trapping strategy with oligo(dT) priming to detect 5'capped, full-length transcripts, together with the data processing pipeline LyRic. We benchmarked CapTrap-seq and other popular RNA-seq library preparation protocols in a number of human tissues using both ONT and PacBio sequencing. To assess the accuracy of the transcript models produced, we introduced a capping strategy for synthetic RNA spike-in sequences that mimics the natural 5'cap formation in RNA spike-in molecules. We found that the vast majority (up to 90%) of transcript models that LyRic derives from CapTrap-seq reads are full-length. This makes it possible to produce highly accurate annotations with minimal human intervention.
11
Hussain S, Sadouni N, van Essen D, Dao LTM, Ferré Q, Charbonnier G, Torres M, Gallardo F, Lecellier CH, Sexton T, Saccani S, Spicuglia S. Short tandem repeats are important contributors to silencer elements in T cells. Nucleic Acids Res 2023; 51:4845-4866. [PMID: 36929452 PMCID: PMC10250210 DOI: 10.1093/nar/gkad187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2022] [Revised: 02/26/2023] [Accepted: 03/15/2023] [Indexed: 03/18/2023] Open
Abstract
The action of cis-regulatory elements with either activation or repression functions underpins the precise regulation of gene expression during normal development and cell differentiation. Gene activation by the combined activities of promoters and distal enhancers has been extensively studied in normal and pathological contexts. In sharp contrast, gene repression by cis-acting silencers, defined as genetic elements that negatively regulate gene transcription in a position-independent fashion, is less well understood. Here, we repurpose the STARR-seq approach as a novel high-throughput reporter strategy to quantitatively assess silencer activity in mammals. We assessed silencer activity from DNase I hypersensitive sites in a mouse T cell line. Identified silencers were associated with either repressive or active chromatin marks and enriched for binding motifs of known transcriptional repressors. CRISPR-mediated genomic deletions validated the repressive function of distinct silencers involved in the repression of non-T cell genes and genes regulated during T cell differentiation. Finally, we unravel an association of silencer activity with short tandem repeats, highlighting the role of repetitive elements in silencer activity. Our results provide a general strategy for genome-wide identification and characterization of silencer elements.
Affiliation(s)
- Saadat Hussain
- Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
- Equipe Labélisée Ligue Contre le Cancer, Marseille, France
| | - Nori Sadouni
- Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
- Equipe Labélisée Ligue Contre le Cancer, Marseille, France
| | - Dominic van Essen
- Institute for Research on Cancer and Ageing, IRCAN, 06107 Nice, France
| | - Lan T M Dao
- Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
- Equipe Labélisée Ligue Contre le Cancer, Marseille, France
| | - Quentin Ferré
- Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
- Equipe Labélisée Ligue Contre le Cancer, Marseille, France
| | - Guillaume Charbonnier
- Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
- Equipe Labélisée Ligue Contre le Cancer, Marseille, France
| | - Magali Torres
- Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
- Equipe Labélisée Ligue Contre le Cancer, Marseille, France
| | - Frederic Gallardo
- Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
- Equipe Labélisée Ligue Contre le Cancer, Marseille, France
| | - Charles-Henri Lecellier
- Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France
- LIRMM, University of Montpellier, CNRS, Montpellier, France
| | - Tom Sexton
- Institut de Génétique et de Biologie Moléculaire et Cellulaire – IGBMC (CNRS UMR 7104, INSERM U1258, Université de Strasbourg), 67404 Illkirch, France
| | - Simona Saccani
- Institute for Research on Cancer and Ageing, IRCAN, 06107 Nice, France
| | - Salvatore Spicuglia
- Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
- Equipe Labélisée Ligue Contre le Cancer, Marseille, France
|
12
|
Shi Y, Niu Y, Zhang P, Luo H, Liu S, Zhang S, Wang J, Li Y, Liu X, Song T, Xu T, He S. Characterization of genome-wide STR variation in 6487 human genomes. Nat Commun 2023; 14:2092. [PMID: 37045857 PMCID: PMC10097659 DOI: 10.1038/s41467-023-37690-8] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 08/04/2022] [Accepted: 03/27/2023] [Indexed: 04/14/2023]
Abstract
Short tandem repeats (STRs) are abundant and highly mutagenic in the human genome. Many STR loci have been associated with a range of human genetic disorders. However, most population-scale studies on STR variation in humans have focused on European ancestry cohorts or are limited by sequencing depth. Here, we depicted a comprehensive map of 366,013 polymorphic STRs (pSTRs) constructed from 6487 deeply sequenced genomes, comprising 3983 Chinese samples (~31.5x, NyuWa) and 2504 samples from the 1000 Genomes Project (~33.3x, 1KGP). We found that STR mutations were affected by motif length, chromosome context and epigenetic features. We identified 3273 and 1117 pSTRs whose repeat numbers were associated with gene expression and 3'UTR alternative polyadenylation, respectively. We also implemented population analysis, investigated population differentiated signatures, and genotyped 60 known disease-causing STRs. Overall, this study further extends the scale of STR variation in humans and propels our understanding of the semantics of STRs.
Affiliation(s)
- Yirong Shi
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Yiwei Niu
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Peng Zhang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Huaxia Luo
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Shuai Liu
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Sijia Zhang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Jiajia Wang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Yanyan Li
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Xinyue Liu
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Tingrui Song
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Tao Xu
- National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan, 250117, Shandong, China
| | - Shunmin He
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
|
13
|
Amaral P, Carbonell-Sala S, De La Vega FM, Faial T, Frankish A, Gingeras T, Guigo R, Harrow JL, Hatzigeorgiou AG, Johnson R, Murphy TD, Pertea M, Pruitt KD, Pujar S, Takahashi H, Ulitsky I, Varabyou A, Wells CA, Yandell M, Carninci P, Salzberg SL. The status of the human gene catalogue. ARXIV 2023:arXiv:2303.13996v1. [PMID: 36994150 PMCID: PMC10055485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/31/2023]
Abstract
Scientists have been trying to identify all of the genes in the human genome since the initial draft of the genome was published in 2001. Over the intervening years, much progress has been made in identifying protein-coding genes, and the estimated number has shrunk to fewer than 20,000, although the number of distinct protein-coding isoforms has expanded dramatically. The invention of high-throughput RNA sequencing and other technological breakthroughs have led to an explosion in the number of reported non-coding RNA genes, although most of them do not yet have any known function. A combination of recent advances offers a path forward to identifying these functions and towards eventually completing the human gene catalogue. However, much work remains to be done before we have a universal annotation standard that includes all medically significant genes, maintains their relationships with different reference genomes, and describes clinically relevant genetic variants.
Affiliation(s)
- Paulo Amaral
- INSPER Institute of Education and Research, São Paulo, SP, Brasil
| | - Silvia Carbonell-Sala
- Centre for Genomic Regulation (CRG), Dr. Aiguader 88, 08003, Barcelona, Catalonia, Spain
| | - Francisco M. De La Vega
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA; Tempus Labs, Inc., Chicago, IL
| | | | - Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Thomas Gingeras
- Department of Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
| | - Roderic Guigo
- Centre for Genomic Regulation (CRG), Dr. Aiguader 88, 08003, Barcelona, Catalonia, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Catalonia, Spain
| | - Jennifer L Harrow
- Centre for Genomics Research, Discovery Sciences, AstraZeneca, Da Vinci Building. Melbourn Science Park, Royston UK SG8 6HB
| | - Artemis G. Hatzigeorgiou
- University of Thessaly, Department of Computer Science and Biomedical Informatics, Lamia, Greece; Hellenic Pasteur Institute, Athens, Greece
| | - Rory Johnson
- School of Biology and Environmental Science, University College Dublin, D04 V1W8 Dublin, Ireland; Conway Institute of Biomedical and Biomolecular Research, University College Dublin, D04 V1W8 Dublin, Ireland; Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, 3010 Bern, Switzerland; Department for BioMedical Research, University of Bern, 3008 Bern, Switzerland
| | - Terence D. Murphy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Mihaela Pertea
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Kim D. Pruitt
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Shashikant Pujar
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Hazuki Takahashi
- Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences, Yokohama Kanagawa 230-0045 Japan
| | - Igor Ulitsky
- Department of Immunology and Regenerative Biology; Department of Molecular Neuroscience, Weizmann Institute of Science, Rehovot 76100, Israel
| | - Ales Varabyou
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Christine A. Wells
- Stem Cell Systems, Department of Anatomy and Physiology, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Parkville 3010 Vic Australia
| | - Mark Yandell
- Department of Human Genetics, Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT, USA
| | - Piero Carninci
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Human Technopole, via Rita Levi Montalcini 1, Milan 20157 Italy
| | - Steven L. Salzberg
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Immunology and Regenerative Biology; Department of Molecular Neuroscience, Weizmann Institute of Science, Rehovot 76100, Israel
- Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA
|
14
|
Yoshida Y, Shaikhutdinov N, Kozlova O, Itoh M, Tagami M, Murata M, Nishiyori-Sueki H, Kojima-Ishiyama M, Noma S, Cherkasov A, Gazizova G, Nasibullina A, Deviatiiarov R, Shagimardanova E, Ryabova A, Yamaguchi K, Bino T, Shigenobu S, Tokumoto S, Miyata Y, Cornette R, Yamada TG, Funahashi A, Tomita M, Gusev O, Kikawada T. High quality genome assembly of the anhydrobiotic midge provides insights on a single chromosome-based emergence of extreme desiccation tolerance. NAR Genom Bioinform 2022; 4:lqac029. [PMID: 35387384 PMCID: PMC8982440 DOI: 10.1093/nargab/lqac029] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/29/2021] [Revised: 03/08/2022] [Accepted: 03/18/2022] [Indexed: 12/13/2022]
Abstract
Non-biting midges (Chironomidae) are known to inhabit a wide range of environments, and certain species can tolerate extreme conditions, where the rest of insects cannot survive. In particular, the sleeping chironomid Polypedilum vanderplanki is known for the remarkable ability of its larvae to withstand almost complete desiccation by entering a state called anhydrobiosis. Chromosome numbers in chironomids are higher than in other dipterans and this extra genomic resource might facilitate rapid adaptation to novel environments. We used improved sequencing strategies to assemble a chromosome-level genome sequence for P. vanderplanki for deep comparative analysis of genomic location of genes associated with desiccation tolerance. Using whole genome-based cross-species and intra-species analysis, we provide evidence for the unique functional specialization of Chromosome 4 through extensive acquisition of novel genes. In contrast to other insect genomes, in the sleeping chironomid a uniquely high degree of subfunctionalization in paralogous anhydrobiosis genes occurs in this chromosome, as well as pseudogenization in a highly duplicated gene family. Our findings suggest that the Chromosome 4 in Polypedilum is a site of high genetic turnover, allowing it to act as a 'sandbox' for evolutionary experiments, thus facilitating the rapid adaptation of midges to harsh environments.
Affiliation(s)
- Yuki Yoshida
- Institute for Advanced Biosciences, Keio University, Tsuruoka, Yamagata 997-0035, Japan
- Graduate School of Media and Governance, Systems Biology Program, Keio University, Fujisawa, Kanagawa 252-0882, Japan
| | - Nurislam Shaikhutdinov
- Regulatory Genomics Research Center, Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan 420012, Russian Federation
- Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow, 21205, Russian Federation
| | - Olga Kozlova
- Regulatory Genomics Research Center, Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan 420012, Russian Federation
| | - Masayoshi Itoh
- Preventive Medicine & Diagnosis Innovation Program (PMI), RIKEN, Wako, Saitama 351-0198, Japan
- Center for Integrative Medical Sciences, RIKEN, Yokohama, Kanagawa 230-0045, Japan
| | - Michihira Tagami
- Center for Integrative Medical Sciences, RIKEN, Yokohama, Kanagawa 230-0045, Japan
| | - Mitsuyoshi Murata
- Center for Integrative Medical Sciences, RIKEN, Yokohama, Kanagawa 230-0045, Japan
| | | | - Miki Kojima-Ishiyama
- Center for Integrative Medical Sciences, RIKEN, Yokohama, Kanagawa 230-0045, Japan
| | - Shohei Noma
- Center for Integrative Medical Sciences, RIKEN, Yokohama, Kanagawa 230-0045, Japan
| | - Alexander Cherkasov
- Regulatory Genomics Research Center, Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan 420012, Russian Federation
| | - Guzel Gazizova
- Regulatory Genomics Research Center, Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan 420012, Russian Federation
| | - Aigul Nasibullina
- Regulatory Genomics Research Center, Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan 420012, Russian Federation
| | - Ruslan Deviatiiarov
- Regulatory Genomics Research Center, Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan 420012, Russian Federation
| | - Elena Shagimardanova
- Regulatory Genomics Research Center, Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan 420012, Russian Federation
| | - Alina Ryabova
- Regulatory Genomics Research Center, Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan 420012, Russian Federation
| | - Katsushi Yamaguchi
- Functional Genomics Facility, National Institute for Basic Biology, Okazaki, Aichi 444-8585, Japan
| | - Takahiro Bino
- Functional Genomics Facility, National Institute for Basic Biology, Okazaki, Aichi 444-8585, Japan
| | - Shuji Shigenobu
- Functional Genomics Facility, National Institute for Basic Biology, Okazaki, Aichi 444-8585, Japan
| | - Shoko Tokumoto
- Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8562, Japan
| | - Yugo Miyata
- Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), Tsukuba, Ibaraki 305-8634, Japan
| | - Richard Cornette
- Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), Tsukuba, Ibaraki 305-8634, Japan
| | - Takahiro G Yamada
- Department of Biosciences and Informatics, Keio University, Yokohama, Kanagawa 223-8522, Japan
| | - Akira Funahashi
- Department of Biosciences and Informatics, Keio University, Yokohama, Kanagawa 223-8522, Japan
| | - Masaru Tomita
- Institute for Advanced Biosciences, Keio University, Tsuruoka, Yamagata 997-0035, Japan
- Graduate School of Media and Governance, Systems Biology Program, Keio University, Fujisawa, Kanagawa 252-0882, Japan
- Faculty of Environment and Information studies, Keio University, Fujisawa, Kanagawa 252-0882, Japan
| | - Oleg Gusev
- Regulatory Genomics Research Center, Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan 420012, Russian Federation
- Center for Integrative Medical Sciences, RIKEN, Yokohama, Kanagawa 230-0045, Japan
- Department of Regulatory Transcriptomics for Medical Genetic Diagnostics, Graduate School of Medicine, Juntendo University, Tokyo 113-8421, Japan
| | - Takahiro Kikawada
- Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8562, Japan
- Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), Tsukuba, Ibaraki 305-8634, Japan
|
15
|
Xiao X, Zhang CY, Zhang Z, Hu Z, Li M, Li T. Revisiting tandem repeats in psychiatric disorders from perspectives of genetics, physiology, and brain evolution. Mol Psychiatry 2022; 27:466-475. [PMID: 34650204 DOI: 10.1038/s41380-021-01329-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/22/2021] [Revised: 09/16/2021] [Accepted: 09/28/2021] [Indexed: 01/28/2023]
Abstract
Genome-wide association studies (GWASs) have revealed substantial genetic components comprised of single nucleotide polymorphisms (SNPs) in the heritable risk of psychiatric disorders. However, genetic risk factors not covered by GWAS also play pivotal roles in these illnesses. Tandem repeats, which are likely functional but frequently overlooked by GWAS, may account for an important proportion in the "missing heritability" of psychiatric disorders. Despite difficulties in characterizing and quantifying tandem repeats in the genome, studies have been carried out in an attempt to describe impact of tandem repeats on gene regulation and human phenotypes. In this review, we have introduced recent research progress regarding the genomic distribution and regulatory mechanisms of tandem repeats. We have also summarized the current knowledge of the genetic architecture and biological underpinnings of psychiatric disorders brought by studies of tandem repeats. These findings suggest that tandem repeats, in candidate psychiatric risk genes or in different levels of linkage disequilibrium (LD) with psychiatric GWAS SNPs and haplotypes, may modulate biological phenotypes related to psychiatric disorders (e.g., cognitive function and brain physiology) through regulating alternative splicing, promoter activity, enhancer activity and so on. In addition, many tandem repeats undergo tight natural selection in the human lineage, and likely exert crucial roles in human brain evolution. Taken together, the putative roles of tandem repeats in the pathogenesis of psychiatric disorders is strongly implicated, and using examples from previous literatures, we wish to call for further attention to tandem repeats in the post-GWAS era of psychiatric disorders.
Affiliation(s)
- Xiao Xiao
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
| | - Chu-Yi Zhang
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, Yunnan, China
| | - Zhuohua Zhang
- Institute of Molecular Precision Medicine and Hunan Key Laboratory of Molecular Precision Medicine, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Center for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China
| | - Zhonghua Hu
- Institute of Molecular Precision Medicine and Hunan Key Laboratory of Molecular Precision Medicine, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Center for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China
- Department of Critical Care Medicine, Xiangya Hospital, Central South University, Changsha, Hunan, China
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Hunan Key Laboratory of Animal Models for Human Diseases, School of Life Sciences, Central South University, Changsha, Hunan, China
- Eye Center of Xiangya Hospital and Hunan Key Laboratory of Ophthalmology, Central South University, Changsha, Hunan, China
- National Clinical Research Center on Mental Disorders, Changsha, Hunan, China
| | - Ming Li
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
- KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
| | - Tao Li
- Affiliated Mental Health Center & Hangzhou Seventh People's Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-Inspired Intelligence, Guangzhou, China
|
16
|
Guerrini MM, Oguchi A, Suzuki A, Murakawa Y. Cap analysis of gene expression (CAGE) and noncoding regulatory elements. Semin Immunopathol 2021; 44:127-136. [PMID: 34468849 DOI: 10.1007/s00281-021-00886-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/01/2021] [Accepted: 08/13/2021] [Indexed: 01/06/2023]
Abstract
Cap analysis of gene expression (CAGE) was developed to detect the 5' end of RNA. Trapping of the RNA 5'-cap structure enables the enrichment and selective sequencing of complete transcripts. Upscaled high-throughput versions of CAGE have enabled the genome-wide identification of transcription start sites, including transcriptionally active promoters and enhancers. CAGE sequencing can be exploited to draw comprehensive maps of active genomic regulatory elements in a cell type- and activation-specific manner. The cells of the immune system are among the best candidates to be analyzed in humans, since they are easily accessible. In this review, we discuss how CAGE data are instrumental for integrative analyses with quantitative trait loci and omics data, and their usefulness in the mechanistic interpretation of the effects of genetic variations over the entire human genome. Integrating CAGE data with the currently available omics information will contribute to better understanding of the genome-wide association study variants that lie outside of annotated genes, deepening our knowledge on human diseases, and enabling the targeted design of more specific therapeutic interventions.
Affiliation(s)
- Matteo Maurizio Guerrini
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan
| | - Akiko Oguchi
- RIKEN-IFOM Joint Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Akari Suzuki
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan
| | - Yasuhiro Murakawa
- RIKEN-IFOM Joint Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- IFOM-the FIRC Institute of Molecular Oncology, Milan, Italy
|