1
|
Patowary A, Zhang P, Jops C, Vuong CK, Ge X, Hou K, Kim M, Gong N, Margolis M, Vo D, Wang X, Liu C, Pasaniuc B, Li JJ, Gandal MJ, de la Torre-Ubieta L. Developmental isoform diversity in the human neocortex informs neuropsychiatric risk mechanisms. Science 2024; 384:eadh7688. [PMID: 38781356 DOI: 10.1126/science.adh7688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Accepted: 03/13/2024] [Indexed: 05/25/2024]
Abstract
RNA splicing is highly prevalent in the brain and has strong links to neuropsychiatric disorders; yet, the role of cell type-specific splicing and transcript-isoform diversity during human brain development has not been systematically investigated. In this work, we leveraged single-molecule long-read sequencing to deeply profile the full-length transcriptome of the germinal zone and cortical plate regions of the developing human neocortex at tissue and single-cell resolution. We identified 214,516 distinct isoforms, of which 72.6% were novel (not previously annotated in Gencode version 33), and uncovered a substantial contribution of transcript-isoform diversity, regulated by RNA-binding proteins, in defining cellular identity in the developing neocortex. We leveraged this comprehensive isoform-centric gene annotation to reprioritize thousands of rare de novo risk variants and elucidate genetic risk mechanisms for neuropsychiatric disorders.
Affiliation(s)
- Ashok Patowary
- Department of Psychiatry and Biobehavioral Sciences, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
- Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA 90095, USA
- Intellectual and Developmental Disabilities Research Center, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
| | - Pan Zhang
- Department of Psychiatry and Biobehavioral Sciences, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
- Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA 90095, USA
- Intellectual and Developmental Disabilities Research Center, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
| | - Connor Jops
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Lifespan Brain Institute at Penn Med and the Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Celine K Vuong
- Department of Psychiatry and Biobehavioral Sciences, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
- Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA 90095, USA
- Intellectual and Developmental Disabilities Research Center, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA 90095, USA
| | - Xinzhou Ge
- Department of Statistics, University of California Los Angeles, Los Angeles, CA 90095, USA
| | - Kangcheng Hou
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA 90095, USA
| | - Minsoo Kim
- Department of Psychiatry and Biobehavioral Sciences, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
- Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
| | - Naihua Gong
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Michael Margolis
- Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
| | - Daniel Vo
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Lifespan Brain Institute at Penn Med and the Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Xusheng Wang
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38103, USA
- Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Chunyu Liu
- Department of Psychiatry, SUNY Upstate Medical University, Syracuse, NY 13210, USA
- Center for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan 410008, China
| | - Bogdan Pasaniuc
- Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA 90095, USA
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
- Institute for Precision Health, University of California Los Angeles, Los Angeles, CA 90095, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
| | - Jingyi Jessica Li
- Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
- Department of Statistics, University of California Los Angeles, Los Angeles, CA 90095, USA
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA 90095, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
- Department of Biostatistics, University of California Los Angeles, Los Angeles, CA 90095, USA
| | - Michael J Gandal
- Department of Psychiatry and Biobehavioral Sciences, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
- Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA 90095, USA
- Intellectual and Developmental Disabilities Research Center, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Lifespan Brain Institute at Penn Med and the Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Luis de la Torre-Ubieta
- Department of Psychiatry and Biobehavioral Sciences, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
- Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA 90095, USA
- Intellectual and Developmental Disabilities Research Center, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA 90095, USA
| |
|
2
|
Fansler MM, Mitschka S, Mayr C. Quantifying 3'UTR length from scRNA-seq data reveals changes independent of gene expression. Nat Commun 2024; 15:4050. [PMID: 38744866 PMCID: PMC11094166 DOI: 10.1038/s41467-024-48254-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Accepted: 04/22/2024] [Indexed: 05/16/2024] Open
Abstract
Although more than half of all genes generate transcripts that differ in 3'UTR length, current analysis pipelines only quantify the amount but not the length of mRNA transcripts. 3'UTR length is determined by 3' end cleavage sites (CS). We map CS in more than 200 primary human and mouse cell types and increase CS annotations relative to the GENCODE database by 40%. Approximately half of all CS are used in few cell types, revealing that most genes only have one or two major 3' ends. We incorporate the CS annotations into a computational pipeline, called scUTRquant, for rapid, accurate, and simultaneous quantification of gene and 3'UTR isoform expression from single-cell RNA sequencing (scRNA-seq) data. When applying scUTRquant to data from 474 cell types and 2134 perturbations, we discover extensive 3'UTR length changes across cell types that are as widespread and coordinately regulated as gene expression changes but affect mostly different genes. Our data indicate that mRNA abundance and mRNA length are two largely independent axes of gene regulation that together determine the amount and spatial organization of protein synthesis.
Affiliation(s)
- Mervin M Fansler
- Tri-Institutional Training Program in Computational Biology and Medicine, Weill Cornell Graduate College, New York, NY, 10021, USA
- Cancer Biology and Genetics Program, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
| | - Sibylle Mitschka
- Cancer Biology and Genetics Program, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
| | - Christine Mayr
- Tri-Institutional Training Program in Computational Biology and Medicine, Weill Cornell Graduate College, New York, NY, 10021, USA
- Cancer Biology and Genetics Program, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
| |
|
3
|
Pardo-Palacios FJ, Arzalluz-Luque A, Kondratova L, Salguero P, Mestre-Tomás J, Amorín R, Estevan-Morió E, Liu T, Nanni A, McIntyre L, Tseng E, Conesa A. SQANTI3: curation of long-read transcriptomes for accurate identification of known and novel isoforms. Nat Methods 2024; 21:793-797. [PMID: 38509328 PMCID: PMC11093726 DOI: 10.1038/s41592-024-02229-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Accepted: 03/01/2024] [Indexed: 03/22/2024]
Abstract
SQANTI3 is a tool designed for the quality control, curation and annotation of long-read transcript models obtained with third-generation sequencing technologies. Leveraging its annotation framework, SQANTI3 calculates quality descriptors of transcript models, junctions and transcript ends. With this information, potential artifacts can be identified and replaced with reliable sequences. Furthermore, the integrated functional annotation feature enables subsequent functional iso-transcriptomics analyses.
Affiliation(s)
- Francisco J Pardo-Palacios
- Institute for Integrative Systems Biology, Spanish National Research Council, Paterna, Valencia, Spain
- Department of Applied Statistics and Operational Research, and Quality, Universitat Politècnica de València, Valencia, Valencia, Spain
| | - Angeles Arzalluz-Luque
- Institute for Integrative Systems Biology, Spanish National Research Council, Paterna, Valencia, Spain
- Department of Applied Statistics and Operational Research, and Quality, Universitat Politècnica de València, Valencia, Valencia, Spain
| | - Liudmyla Kondratova
- Horticultural Sciences Department, University of Florida, Gainesville, FL, USA
- Genetics Institute, University of Florida, Gainesville, FL, USA
| | - Pedro Salguero
- Department of Applied Statistics and Operational Research, and Quality, Universitat Politècnica de València, Valencia, Valencia, Spain
| | - Jorge Mestre-Tomás
- Institute for Integrative Systems Biology, Spanish National Research Council, Paterna, Valencia, Spain
| | - Rocío Amorín
- Genetics Institute, University of Florida, Gainesville, FL, USA
- Department of Microbiology and Cell Science, University of Florida, Gainesville, FL, USA
| | - Eva Estevan-Morió
- Institute for Integrative Systems Biology, Spanish National Research Council, Paterna, Valencia, Spain
| | - Tianyuan Liu
- Institute for Integrative Systems Biology, Spanish National Research Council, Paterna, Valencia, Spain
| | - Adalena Nanni
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, USA
| | - Lauren McIntyre
- Genetics Institute, University of Florida, Gainesville, FL, USA
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, USA
| | | | - Ana Conesa
- Institute for Integrative Systems Biology, Spanish National Research Council, Paterna, Valencia, Spain
| |
|
4
|
Liu X, Chen H, Li Z, Yang X, Jin W, Wang Y, Zheng J, Li L, Xuan C, Yuan J, Yang Y. InPACT: a computational method for accurate characterization of intronic polyadenylation from RNA sequencing data. Nat Commun 2024; 15:2583. [PMID: 38519498 PMCID: PMC10960005 DOI: 10.1038/s41467-024-46875-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Accepted: 03/12/2024] [Indexed: 03/25/2024] Open
Abstract
Alternative polyadenylation that occurs in introns, termed intronic polyadenylation (IPA), has been implicated in diverse biological processes and diseases, as it can produce noncoding transcripts or transcripts with truncated coding regions. However, a reliable method is required to accurately characterize IPA. Here, we propose a computational method called InPACT, which allows for the precise characterization of IPA from conventional RNA-seq data. InPACT successfully identifies numerous previously unannotated IPA transcripts in human cells, many of which are translated, as evidenced by ribosome profiling data. We have demonstrated that InPACT outperforms other methods in terms of IPA identification and quantification. Moreover, InPACT applied to monocyte activation reveals temporally coordinated IPA events. Further application to single-cell RNA-seq data of human fetal bone marrow reveals the expression of several IPA isoforms in a context-specific manner. Therefore, InPACT represents a powerful tool for the accurate characterization of IPA from RNA-seq data.
Affiliation(s)
- Xiaochuan Liu
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Key Laboratory of Inflammatory Biology, The Second Hospital of Tianjin Medical University, Department of Bioinformatics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China
| | - Hao Chen
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China
| | - Zekun Li
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Key Laboratory of Inflammatory Biology, The Second Hospital of Tianjin Medical University, Department of Bioinformatics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China
| | - Xiaoxiao Yang
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Key Laboratory of Inflammatory Biology, The Second Hospital of Tianjin Medical University, Department of Bioinformatics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China
| | - Wen Jin
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Key Laboratory of Inflammatory Biology, The Second Hospital of Tianjin Medical University, Department of Bioinformatics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China
| | - Yuting Wang
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Key Laboratory of Inflammatory Biology, The Second Hospital of Tianjin Medical University, Department of Bioinformatics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China
| | - Jian Zheng
- Department of Immunology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China
| | - Long Li
- Department of Immunology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China
| | - Chenghao Xuan
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China
| | - Jiapei Yuan
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Haihe Laboratory of Cell Ecosystem, Institute of Hematology and Blood Diseases Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin, 300020, China
- Tianjin Institutes of Health Science, Tianjin, 301600, China
| | - Yang Yang
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Key Laboratory of Inflammatory Biology, The Second Hospital of Tianjin Medical University, Department of Bioinformatics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China
| |
|
5
|
Seyres D, Gorka O, Schmidt R, Marone R, Zavolan M, Jeker LT. T helper cells exhibit a dynamic and reversible 3'-UTR landscape. RNA 2024; 30:418-434. [PMID: 38302256 PMCID: PMC10946431 DOI: 10.1261/rna.079897.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 01/16/2024] [Indexed: 02/03/2024]
Abstract
3' untranslated regions (3' UTRs) are critical elements of messenger RNAs, as they contain binding sites for RNA-binding proteins (RBPs) and microRNAs that affect various aspects of the RNA life cycle including transcript stability and cellular localization. In response to T cell receptor activation, T cells undergo massive expansion during the effector phase of the immune response and dynamically modify their 3' UTRs. Whether this serves to directly regulate the abundance of specific mRNAs or is a secondary effect of proliferation remains unclear. To study 3'-UTR dynamics in T helper cells, we investigated division-dependent alternative polyadenylation (APA). In addition, we generated 3' end UTR sequencing data from naive, activated, memory, and regulatory CD4+ T cells. 3'-UTR length changes were estimated using a nonnegative matrix factorization approach and were compared with those inferred from long-read PacBio sequencing. We found that APA events were transient and reverted after effector phase expansion. Using an orthogonal bulk RNA-seq data set, we did not find evidence of APA association with differential gene expression or transcript usage, indicating that APA has only a marginal effect on transcript abundance. 3'-UTR sequence analysis revealed conserved binding sites for T cell-relevant microRNAs and RBPs in the alternative 3' UTRs. These results indicate that poly(A) site usage could play an important role in the control of cell fate decisions and homeostasis.
Affiliation(s)
- Denis Seyres
- Department of Biomedicine, Basel University Hospital and University of Basel, CH-4031 Basel, Switzerland
- Transplantation Immunology and Nephrology, Basel University Hospital, CH-4031 Basel, Switzerland
| | - Oliver Gorka
- Department of Biomedicine, Basel University Hospital and University of Basel, CH-4031 Basel, Switzerland
- Transplantation Immunology and Nephrology, Basel University Hospital, CH-4031 Basel, Switzerland
| | - Ralf Schmidt
- Computational and Systems Biology, Biozentrum, University of Basel, 4056 Basel, Switzerland
| | - Romina Marone
- Department of Biomedicine, Basel University Hospital and University of Basel, CH-4031 Basel, Switzerland
- Transplantation Immunology and Nephrology, Basel University Hospital, CH-4031 Basel, Switzerland
| | - Mihaela Zavolan
- Computational and Systems Biology, Biozentrum, University of Basel, 4056 Basel, Switzerland
- Swiss Institute of Bioinformatics, Biozentrum, University of Basel, 4056 Basel, Switzerland
| | - Lukas T Jeker
- Department of Biomedicine, Basel University Hospital and University of Basel, CH-4031 Basel, Switzerland
- Transplantation Immunology and Nephrology, Basel University Hospital, CH-4031 Basel, Switzerland
| |
|
6
|
Pandini C, Pagani G, Tassinari M, Vitale E, Bezzecchi E, Saadeldin MK, Doldi V, Giannuzzi G, Mantovani R, Chiara M, Ciarrocchi A, Gandellini P. The pancancer overexpressed NFYC Antisense 1 controls cell cycle mitotic progression through in cis and in trans modes of action. Cell Death Dis 2024; 15:206. [PMID: 38467619 PMCID: PMC10928104 DOI: 10.1038/s41419-024-06576-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Revised: 02/20/2024] [Accepted: 02/23/2024] [Indexed: 03/13/2024]
Abstract
Antisense RNAs (asRNAs) represent an underappreciated yet crucial layer of gene expression regulation. Generally thought to modulate their sense genes in cis through sequence complementarity or their act of transcription, asRNAs can also regulate different molecular targets in trans, in the nucleus or in the cytoplasm. Here, we performed an in-depth molecular characterization of NFYC Antisense 1 (NFYC-AS1), the asRNA transcribed head-to-head to NFYC subunit of the proliferation-associated NF-Y transcription factor. Our results show that NFYC-AS1 is a prevalently nuclear asRNA peaking early in the cell cycle. Comparative genomics suggests a narrow phylogenetic distribution, with a probable origin in the common ancestor of mammalian lineages. NFYC-AS1 is overexpressed pancancer, preferentially in association with RB1 mutations. Knockdown of NFYC-AS1 by antisense oligonucleotides impairs cell growth in lung squamous cell carcinoma and small cell lung cancer cells, a phenotype recapitulated by CRISPR/Cas9-deletion of its transcription start site. Surprisingly, expression of the sense gene is affected only when endogenous transcription of NFYC-AS1 is manipulated. This suggests that regulation of cell proliferation is at least in part independent of the in cis transcription-mediated effect on NFYC and is possibly exerted by RNA-dependent in trans effects converging on the regulation of G2/M cell cycle phase genes. Accordingly, NFYC-AS1-depleted cells are stuck in mitosis, indicating defects in mitotic progression. Overall, NFYC-AS1 emerged as a cell cycle-regulating asRNA with dual action, holding therapeutic potential in different cancer types, including the very aggressive RB1-mutated tumors.
Affiliation(s)
- Cecilia Pandini
- Department of Biosciences, University of Milan, Via Celoria 26, 20133, Milan, Italy
| | - Giulia Pagani
- Department of Biosciences, University of Milan, Via Celoria 26, 20133, Milan, Italy
| | - Martina Tassinari
- Department of Biosciences, University of Milan, Via Celoria 26, 20133, Milan, Italy
| | - Emanuele Vitale
- Laboratory of Translational Research, Azienda USL-IRCCS di Reggio Emilia, Viale Risorgimento 80, 42123, Reggio Emilia, Italy
- Clinical and Experimental Medicine PhD Program, University of Modena and Reggio Emilia, Via Università 4, 41121, Modena, Italy
| | - Eugenia Bezzecchi
- Department of Biosciences, University of Milan, Via Celoria 26, 20133, Milan, Italy
| | - Mona Kamal Saadeldin
- Department of Biosciences, University of Milan, Via Celoria 26, 20133, Milan, Italy
- Biology Department, School of Science and Engineering, The American University in Cairo, New Cairo, 11835, Egypt
- Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
| | - Valentina Doldi
- Molecular Pharmacology Unit, Department of Experimental Oncology, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Amadeo 42, 20133, Milan, Italy
| | - Giuliana Giannuzzi
- Department of Biosciences, University of Milan, Via Celoria 26, 20133, Milan, Italy
| | - Roberto Mantovani
- Department of Biosciences, University of Milan, Via Celoria 26, 20133, Milan, Italy
| | - Matteo Chiara
- Department of Biosciences, University of Milan, Via Celoria 26, 20133, Milan, Italy
| | - Alessia Ciarrocchi
- Laboratory of Translational Research, Azienda USL-IRCCS di Reggio Emilia, Viale Risorgimento 80, 42123, Reggio Emilia, Italy
| | - Paolo Gandellini
- Department of Biosciences, University of Milan, Via Celoria 26, 20133, Milan, Italy
| |
|
7
|
Bryce-Smith S, Brown AL, Mehta PR, Mattedi F, Mikheenko A, Barattucci S, Zanovello M, Dattilo D, Yome M, Hill SE, Qi YA, Wilkins OG, Sun K, Ryadnov E, Wan Y, Vargas JNS, Birsa N, Raj T, Humphrey J, Keuss M, Ward M, Secrier M, Fratta P. TDP-43 loss induces extensive cryptic polyadenylation in ALS/FTD. bioRxiv 2024:2024.01.22.576625. [PMID: 38313254 PMCID: PMC10836071 DOI: 10.1101/2024.01.22.576625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/06/2024]
Abstract
Nuclear depletion and cytoplasmic aggregation of the RNA-binding protein TDP-43 is the hallmark of ALS, occurring in over 97% of cases. A key consequence of TDP-43 nuclear loss is the de-repression of cryptic exons. Whilst TDP-43-regulated cryptic splicing is increasingly well catalogued, cryptic alternative polyadenylation (APA) events, which define the 3' end of last exons, have been largely overlooked, especially when not associated with novel upstream splice junctions. We developed a novel bioinformatic approach to reliably identify distinct APA event types: alternative last exons (ALE), 3'UTR extensions (3'Ext) and intronic polyadenylation (IPA) events. We identified novel neuronal cryptic APA sites induced by TDP-43 loss of function by systematically applying our pipeline to a compendium of publicly available and in-house datasets. We find that TDP-43 binding sites and target motifs are enriched at these cryptic events and that TDP-43 can have both repressive and enhancing action on APA. Importantly, all categories of cryptic APA can also be identified in ALS and FTD post mortem brain regions with TDP-43 proteinopathy, underlining their potential disease relevance. RNA-seq and Ribo-seq analyses indicate that distinct cryptic APA categories have different downstream effects on transcript and translation. Intriguingly, cryptic 3'Exts occur in multiple transcription factors, such as ELK1, SIX3, and TLX1, and lead to an increase in wild-type protein levels and function. Finally, we show that an increase in RNA stability leading to a higher cytoplasmic localisation underlies these observations. In summary, we demonstrate that TDP-43 nuclear depletion induces a novel category of cryptic RNA processing events and we expand the palette of TDP-43 loss consequences by showing this can also lead to an increase in normal protein translation.
Affiliation(s)
- Sam Bryce-Smith
- UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
| | - Anna-Leigh Brown
- UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
| | - Puja R. Mehta
- UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
| | - Francesca Mattedi
- UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
| | - Alla Mikheenko
- UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
| | - Simone Barattucci
- UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
| | - Matteo Zanovello
- UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
| | - Dario Dattilo
- UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
| | - Matthew Yome
- UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
| | - Sarah E. Hill
- National Institute of Neurological Disorders and Stroke, NIH, Bethesda, MD, USA
| | - Yue A. Qi
- National Institute of Neurological Disorders and Stroke, NIH, Bethesda, MD, USA
| | - Oscar G. Wilkins
- UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
- The Francis Crick Institute, London, UK
| | - Kai Sun
- UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
| | - Eugeni Ryadnov
- UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
| | - Yixuan Wan
- UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
| | | | - Jose Norberto S. Vargas
- UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
| | - Nicol Birsa
- UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
| | - Towfique Raj
- Nash Family Department of Neuroscience & Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Ronald M. Loeb Center for Alzheimer’s Disease, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences & Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Estelle and Daniel Maggin Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Jack Humphrey
- Nash Family Department of Neuroscience & Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Ronald M. Loeb Center for Alzheimer’s Disease, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences & Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Estelle and Daniel Maggin Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Matthew Keuss
- UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
- Michael Ward
- National Institute of Neurological Disorders and Stroke, NIH, Bethesda, MD, USA
- Maria Secrier
- UCL Genetics Institute, Department of Genetics, Evolution and Environment, University College London, London, UK
- Pietro Fratta
- UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
- The Francis Crick Institute, London, UK
8
Zeng Y, Lovchykova A, Akiyama T, Liu C, Guo C, Jawahar VM, Sianto O, Calliari A, Prudencio M, Dickson DW, Petrucelli L, Gitler AD. TDP-43 nuclear loss in FTD/ALS causes widespread alternative polyadenylation changes. bioRxiv 2024:2024.01.22.575730. [PMID: 38328059 PMCID: PMC10849503 DOI: 10.1101/2024.01.22.575730] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]
Abstract
In frontotemporal dementia and amyotrophic lateral sclerosis, the RNA-binding protein TDP-43 is depleted from the nucleus. TDP-43 loss leads to cryptic exon inclusion but a role in other RNA processing events remains unresolved. Here, we show that loss of TDP-43 causes widespread changes in alternative polyadenylation, impacting expression of disease-relevant genes (e.g., ELP1, NEFL, and TMEM106B) and providing evidence that alternative polyadenylation is a new facet of TDP-43 pathology.
Affiliation(s)
- Yi Zeng
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Tetsuya Akiyama
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Chang Liu
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Caiwei Guo
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Vidhya Maheswari Jawahar
- Department of Neuroscience, Mayo Clinic, Jacksonville, FL, USA
- Neuroscience Graduate Program, Mayo Clinic Graduate School of Biomedical Sciences, Jacksonville, FL, USA
- Odilia Sianto
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Anna Calliari
- Department of Neuroscience, Mayo Clinic, Jacksonville, FL, USA
- Neuroscience Graduate Program, Mayo Clinic Graduate School of Biomedical Sciences, Jacksonville, FL, USA
- Mercedes Prudencio
- Department of Neuroscience, Mayo Clinic, Jacksonville, FL, USA
- Neuroscience Graduate Program, Mayo Clinic Graduate School of Biomedical Sciences, Jacksonville, FL, USA
- Dennis W. Dickson
- Department of Neuroscience, Mayo Clinic, Jacksonville, FL, USA
- Neuroscience Graduate Program, Mayo Clinic Graduate School of Biomedical Sciences, Jacksonville, FL, USA
- Leonard Petrucelli
- Department of Neuroscience, Mayo Clinic, Jacksonville, FL, USA
- Neuroscience Graduate Program, Mayo Clinic Graduate School of Biomedical Sciences, Jacksonville, FL, USA
- Aaron D. Gitler
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Chan Zuckerberg Biohub – San Francisco, San Francisco, CA, USA
9
Calvo-Roitberg E, Carroll CL, Venev SV, Kim G, Mick ST, Dekker J, Fiszbein A, Pai AA. mRNA initiation and termination are spatially coordinated. bioRxiv 2024:2024.01.05.574404. [PMID: 38260419 PMCID: PMC10802295 DOI: 10.1101/2024.01.05.574404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
The expression of a precise mRNA transcriptome is crucial for establishing cell identity and function, with dozens of alternative isoforms produced for a single gene sequence. The regulation of mRNA isoform usage occurs by the coordination of co-transcriptional mRNA processing mechanisms across a gene. Decisions involved in mRNA initiation and termination underlie the largest extent of mRNA isoform diversity, but little is known about any relationships between decisions at both ends of mRNA molecules. Here, we systematically profile the joint usage of mRNA transcription start sites (TSSs) and polyadenylation sites (PASs) across tissues and species. Using both short and long read RNA-seq data, we observe that mRNAs preferentially using upstream TSSs also tend to use upstream PASs, and congruently, the usage of downstream sites is similarly paired. This observation suggests that mRNA 5' end choice may directly influence mRNA 3' ends. Our results suggest a novel "Positional Initiation-Termination Axis" (PITA), in which the usage of alternative terminal sites is coupled based on the order in which they appear in the genome. PITA isoforms are more likely to encode alternative protein domains and use conserved sites. PITA is strongly associated with the length of genomic features, such that PITA is enriched in longer genes with more area devoted to regions that regulate alternative 5' or 3' ends. Strikingly, we found that PITA genes are more likely than non-PITA genes to have multiple, overlapping chromatin structural domains related to pairing of ordinally coupled start and end sites. In turn, PITA coupling is also associated with fast RNA Polymerase II (RNAPII) trafficking across these long gene regions. Our findings indicate that a combination of spatial and kinetic mechanisms couple transcription initiation and mRNA 3' end decisions based on ordinal position to define the expression of mRNA isoforms.
Affiliation(s)
- Sergey V. Venev
- Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA
- GyeungYun Kim
- Department of Biology, Boston University, Boston, MA
- Job Dekker
- Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA
- Howard Hughes Medical Institute, Chevy Chase, MD
- Ana Fiszbein
- Department of Biology, Boston University, Boston, MA
- Center for Computing & Data Sciences, Boston University, Boston, MA
- Athma A. Pai
- RNA Therapeutics Institute, University of Massachusetts Chan Medical School, Worcester, MA
10
Neumann DP, Pillman KA, Dredge BK, Bert AG, Phillips CA, Lumb R, Ramani Y, Bracken CP, Hollier BG, Selth LA, Beilharz TH, Goodall GJ, Gregory PA. The landscape of alternative polyadenylation during EMT and its regulation by the RNA-binding protein Quaking. RNA Biol 2024; 21:1-11. [PMID: 38112323 PMCID: PMC10732628 DOI: 10.1080/15476286.2023.2294222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/05/2023] [Indexed: 12/21/2023] Open
Abstract
Epithelial-mesenchymal transition (EMT) plays important roles in tumour progression and is orchestrated by dynamic changes in gene expression. While it is well established that post-transcriptional regulation plays a significant role in EMT, the extent of alternative polyadenylation (APA) during EMT has not yet been explored. Using 3' end anchored RNA sequencing, we mapped the APA landscape following Transforming Growth Factor (TGF)-β-mediated induction of EMT in human mammary epithelial cells and found APA generally causes 3'UTR lengthening during this cell state transition. Investigation of potential mediators of APA indicated the RNA-binding protein Quaking (QKI), a splicing factor induced during EMT, regulates a subset of events including the length of its own transcript. Analysis of QKI crosslinked immunoprecipitation (CLIP)-sequencing data showed that QKI binding within 3' untranslated regions (UTRs) was enriched near cleavage and polyadenylation sites. Following QKI knockdown, APA of many transcripts is altered to produce predominantly shorter 3'UTRs associated with reduced gene expression. These findings reveal the changes in APA that occur during EMT and identify a potential role for QKI in this process.
Affiliation(s)
- Daniel P. Neumann
- Centre for Cancer Biology, University of South Australia and SA Pathology, Adelaide, SA, Australia
- Katherine A. Pillman
- Centre for Cancer Biology, University of South Australia and SA Pathology, Adelaide, SA, Australia
- B. Kate Dredge
- Centre for Cancer Biology, University of South Australia and SA Pathology, Adelaide, SA, Australia
- Andrew G. Bert
- Centre for Cancer Biology, University of South Australia and SA Pathology, Adelaide, SA, Australia
- Caroline A. Phillips
- Centre for Cancer Biology, University of South Australia and SA Pathology, Adelaide, SA, Australia
- Rachael Lumb
- Centre for Cancer Biology, University of South Australia and SA Pathology, Adelaide, SA, Australia
- Yesha Ramani
- Centre for Cancer Biology, University of South Australia and SA Pathology, Adelaide, SA, Australia
- Cameron P. Bracken
- Centre for Cancer Biology, University of South Australia and SA Pathology, Adelaide, SA, Australia
- Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, SA, Australia
- Brett G. Hollier
- Australian Prostate Cancer Research Centre - Queensland, Centre for Genomics and Personalised Health, Faculty of Health, School of Biomedical Sciences, Queensland University of Technology, Brisbane, QLD, Australia
- Luke A. Selth
- Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, SA, Australia
- Flinders Health and Medical Research Institute, Flinders University, Bedford Park, SA, Australia
- Traude H. Beilharz
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Australia
- Gregory J. Goodall
- Centre for Cancer Biology, University of South Australia and SA Pathology, Adelaide, SA, Australia
- Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, SA, Australia
- Philip A. Gregory
- Centre for Cancer Biology, University of South Australia and SA Pathology, Adelaide, SA, Australia
- Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, SA, Australia
11
Bryce-Smith S, Burri D, Gazzara MR, Herrmann CJ, Danecka W, Fitzsimmons CM, Wan YK, Zhuang F, Fansler MM, Fernández JM, Ferret M, Gonzalez-Uriarte A, Haynes S, Herdman C, Kanitz A, Katsantoni M, Marini F, McDonnel E, Nicolet B, Poon CL, Rot G, Schärfen L, Wu PJ, Yoon Y, Barash Y, Zavolan M. Extensible benchmarking of methods that identify and quantify polyadenylation sites from RNA-seq data. RNA 2023; 29:1839-1855. [PMID: 37816550 PMCID: PMC10653393 DOI: 10.1261/rna.079849.123] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/21/2023] [Accepted: 09/21/2023] [Indexed: 10/12/2023]
Abstract
The tremendous rate with which data is generated and analysis methods emerge makes it increasingly difficult to keep track of their domain of applicability, assumptions, limitations, and consequently, of the efficacy and precision with which they solve specific tasks. Therefore, there is an increasing need for benchmarks, and for the provision of infrastructure for continuous method evaluation. APAeval is an international community effort, organized by the RNA Society in 2021, to benchmark tools for the identification and quantification of the usage of alternative polyadenylation (APA) sites from short-read, bulk RNA-sequencing (RNA-seq) data. Here, we reviewed 17 tools and benchmarked eight on their ability to perform APA identification and quantification, using a comprehensive set of RNA-seq experiments comprising real, synthetic, and matched 3'-end sequencing data. To support continuous benchmarking, we have incorporated the results into the OpenEBench online platform, which allows for continuous extension of the set of methods, metrics, and challenges. We envisage that our analyses will assist researchers in selecting the appropriate tools for their studies, while the containers and reproducible workflows could easily be deployed and extended to evaluate new methods or data sets.
Affiliation(s)
- Sam Bryce-Smith
- Department of Neuromuscular Diseases, UCL Queen Square Motor Neuron Disease Centre, UCL Queen Square Institute of Neurology, UCL, London WC1N 3BG, United Kingdom
- Dominik Burri
- Biozentrum, University of Basel, 4056 Basel, Switzerland
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Matthew R Gazzara
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Christina J Herrmann
- Biozentrum, University of Basel, 4056 Basel, Switzerland
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Weronika Danecka
- Institute for Cell Biology, School of Biological Sciences, The University of Edinburgh, Edinburgh EH9 3FF, United Kingdom
- Christina M Fitzsimmons
- Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Yuk Kei Wan
- Genome Institute of Singapore, Buona Vista, Singapore 138672
- Yong Loo Lin School of Medicine, National University of Singapore, Kent Ridge, Singapore 119228
- Farica Zhuang
- Department of Computer and Information Science, School of Engineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Mervin M Fansler
- Tri-Institutional Program in Computational Biology and Medicine, Weill Cornell Graduate Studies, New York, New York 10065, USA
- Cancer Biology and Genetics, Sloan-Kettering Institute, MSKCC, New York, New York 10065, USA
- José M Fernández
- Life Sciences Department, Barcelona Supercomputing Center, 08034 Barcelona, Spain
- Spanish National Bioinformatics Institute (INB/ELIXIR-ES), 28029 Madrid, Spain
- Meritxell Ferret
- Life Sciences Department, Barcelona Supercomputing Center, 08034 Barcelona, Spain
- Spanish National Bioinformatics Institute (INB/ELIXIR-ES), 28029 Madrid, Spain
- Asier Gonzalez-Uriarte
- Life Sciences Department, Barcelona Supercomputing Center, 08034 Barcelona, Spain
- Spanish National Bioinformatics Institute (INB/ELIXIR-ES), 28029 Madrid, Spain
- Samuel Haynes
- Institute for Cell Biology, School of Biological Sciences, The University of Edinburgh, Edinburgh EH9 3FF, United Kingdom
- Chelsea Herdman
- Department of Neurobiology, University of Utah, Salt Lake City, Utah 84132, USA
- Alexander Kanitz
- Biozentrum, University of Basel, 4056 Basel, Switzerland
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Maria Katsantoni
- Biozentrum, University of Basel, 4056 Basel, Switzerland
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Federico Marini
- Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center of the Johannes Gutenberg-University Mainz, 55118 Mainz, Germany
- Euan McDonnel
- Leeds Institute for Data Analytics, School of Molecular and Cellular Biology, University of Leeds, Leeds LS2 9NL, United Kingdom
- Ben Nicolet
- Department of Hematopoiesis, Sanquin Research, Landsteiner Laboratory, Amsterdam UMC, University of Amsterdam, 1066 CX Amsterdam, The Netherlands
- Oncode Institute, 3521 AL Utrecht, The Netherlands
- Chi-Lam Poon
- Graduate School of Medical Sciences, Weill Cornell Medicine, New York, New York 10065, USA
- Gregor Rot
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Institute of Molecular Life Sciences, University of Zurich, 8057 Zurich, Switzerland
- Leonard Schärfen
- Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, Connecticut 06520, USA
- Pin-Jou Wu
- Center for Plant Molecular Biology (ZMBP), University of Tübingen, 72076 Tübingen, Germany
- Yoseop Yoon
- Department of Microbiology and Molecular Genetics, School of Medicine, University of California Irvine, Irvine, California 92617, USA
- Yoseph Barash
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Department of Computer and Information Science, School of Engineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Mihaela Zavolan
- Biozentrum, University of Basel, 4056 Basel, Switzerland
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
12
Dondi A, Lischetti U, Jacob F, Singer F, Borgsmüller N, Coelho R, Heinzelmann-Schwarz V, Beisel C, Beerenwinkel N. Detection of isoforms and genomic alterations by high-throughput full-length single-cell RNA sequencing in ovarian cancer. Nat Commun 2023; 14:7780. [PMID: 38012143 PMCID: PMC10682465 DOI: 10.1038/s41467-023-43387-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2023] [Accepted: 11/07/2023] [Indexed: 11/29/2023] Open
Abstract
Understanding the complex background of cancer requires genotype-phenotype information at single-cell resolution. Here, we perform long-read single-cell RNA sequencing (scRNA-seq) on clinical samples from three ovarian cancer patients presenting with omental metastasis and increase the PacBio sequencing depth to 12,000 reads per cell. Our approach captures 152,000 isoforms, of which over 52,000 were not previously reported. Isoform-level analysis accounting for non-coding isoforms reveals 20% overestimation of protein-coding gene expression on average. We also detect cell type-specific isoform and poly-adenylation site usage in tumor and mesothelial cells, and find that mesothelial cells transition into cancer-associated fibroblasts in the metastasis, partly through the TGF-β/miR-29/Collagen axis. Furthermore, we identify gene fusions, including an experimentally validated IGF2BP2::TESPA1 fusion, which is misclassified as high TESPA1 expression in matched short-read data, and call mutations confirmed by targeted NGS cancer gene panel results. With these findings, we envision long-read scRNA-seq becoming increasingly relevant in oncology and personalized medicine.
Affiliation(s)
- Arthur Dondi
- ETH Zurich, Department of Biosystems Science and Engineering, Mattenstrasse 26, 4058, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Mattenstrasse 26, 4058, Basel, Switzerland
- Ulrike Lischetti
- ETH Zurich, Department of Biosystems Science and Engineering, Mattenstrasse 26, 4058, Basel, Switzerland
- University Hospital Basel and University of Basel, Ovarian Cancer Research, Department of Biomedicine, Hebelstrasse 20, 4031, Basel, Switzerland
- Francis Jacob
- University Hospital Basel and University of Basel, Ovarian Cancer Research, Department of Biomedicine, Hebelstrasse 20, 4031, Basel, Switzerland
- Franziska Singer
- SIB Swiss Institute of Bioinformatics, Mattenstrasse 26, 4058, Basel, Switzerland
- ETH Zurich, NEXUS Personalized Health Technologies, Wagistrasse 18, 8952, Schlieren, Switzerland
- Nico Borgsmüller
- ETH Zurich, Department of Biosystems Science and Engineering, Mattenstrasse 26, 4058, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Mattenstrasse 26, 4058, Basel, Switzerland
- Ricardo Coelho
- University Hospital Basel and University of Basel, Ovarian Cancer Research, Department of Biomedicine, Hebelstrasse 20, 4031, Basel, Switzerland
- Viola Heinzelmann-Schwarz
- University Hospital Basel and University of Basel, Ovarian Cancer Research, Department of Biomedicine, Hebelstrasse 20, 4031, Basel, Switzerland
- University Hospital Basel, Gynecological Cancer Center, Spitalstrasse 21, 4031, Basel, Switzerland
- Christian Beisel
- ETH Zurich, Department of Biosystems Science and Engineering, Mattenstrasse 26, 4058, Basel, Switzerland
- Niko Beerenwinkel
- ETH Zurich, Department of Biosystems Science and Engineering, Mattenstrasse 26, 4058, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Mattenstrasse 26, 4058, Basel, Switzerland
13
Stroup EK, Ji Z. Deep learning of human polyadenylation sites at nucleotide resolution reveals molecular determinants of site usage and relevance in disease. Nat Commun 2023; 14:7378. [PMID: 37968271 PMCID: PMC10651852 DOI: 10.1038/s41467-023-43266-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/28/2023] [Accepted: 11/05/2023] [Indexed: 11/17/2023] Open
Abstract
The genomic distribution of cleavage and polyadenylation (polyA) sites should be co-evolutionally optimized with the local gene structure. Otherwise, spurious polyadenylation can cause premature transcription termination and generate aberrant proteins. To obtain mechanistic insights into polyA site optimization across the human genome, we develop deep/machine learning models to identify genome-wide putative polyA sites at unprecedented nucleotide-level resolution and calculate their strength and usage in the genomic context. Our models quantitatively measure position-specific motif importance and their crosstalk in polyA site formation and cleavage heterogeneity. The intronic site expression is governed by the surrounding splicing landscape. The usage of alternative polyA sites in terminal exons is modulated by their relative locations and distance to downstream genes. Finally, we apply our models to reveal thousands of disease- and trait-associated genetic variants altering polyadenylation activity. Altogether, our models represent a valuable resource to dissect molecular mechanisms mediating genome-wide polyA site expression and characterize their functional roles in human diseases.
Affiliation(s)
- Emily Kunce Stroup
- Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Zhe Ji
- Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Department of Biomedical Engineering, McCormick School of Engineering, Northwestern University, Evanston, IL, 60628, USA
14
Patowary A, Zhang P, Jops C, Vuong CK, Ge X, Hou K, Kim M, Gong N, Margolis M, Vo D, Wang X, Liu C, Pasaniuc B, Li JJ, Gandal MJ, de la Torre-Ubieta L. Developmental isoform diversity in the human neocortex informs neuropsychiatric risk mechanisms. bioRxiv 2023:2023.03.25.534016. [PMID: 36993726 PMCID: PMC10055310 DOI: 10.1101/2023.03.25.534016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
RNA splicing is highly prevalent in the brain and has strong links to neuropsychiatric disorders, yet the role of cell-type-specific splicing or transcript-isoform diversity during human brain development has not been systematically investigated. Here, we leveraged single-molecule long-read sequencing to deeply profile the full-length transcriptome of the germinal zone (GZ) and cortical plate (CP) regions of the developing human neocortex at tissue and single-cell resolution. We identified 214,516 unique isoforms, of which 72.6% are novel (unannotated in Gencode-v33), and uncovered a substantial contribution of transcript-isoform diversity, regulated by RNA binding proteins, in defining cellular identity in the developing neocortex. We leveraged this comprehensive isoform-centric gene annotation to re-prioritize thousands of rare de novo risk variants and elucidate genetic risk mechanisms for neuropsychiatric disorders.
One-Sentence Summary: A cell-specific atlas of gene isoform expression helps shape our understanding of brain development and disease.
Structured Abstract
INTRODUCTION: The development of the human brain is regulated by precise molecular and genetic mechanisms driving spatio-temporal and cell-type-specific transcript expression programs. Alternative splicing, a major mechanism increasing transcript diversity, is highly prevalent in the human brain, influences many aspects of brain development, and has strong links to neuropsychiatric disorders. Despite this, the cell-type-specific transcript-isoform diversity of the developing human brain has not been systematically investigated.
RATIONALE: Understanding splicing patterns and isoform diversity across the developing neocortex has translational relevance and can elucidate genetic risk mechanisms in neurodevelopmental disorders. However, short-read sequencing, the prevalent technology for transcriptome profiling, is not well suited to capturing alternative splicing and isoform diversity. To address this, we employed third-generation long-read sequencing, which enables capture and sequencing of complete individual RNA molecules, to deeply profile the full-length transcriptome of the germinal zone (GZ) and cortical plate (CP) regions of the developing human neocortex at tissue and single-cell resolution.
RESULTS: We profiled microdissected GZ and CP regions of post-conception week (PCW) 15-17 human neocortex in bulk and at single-cell resolution across six subjects using high-fidelity long-read sequencing (PacBio IsoSeq). We identified 214,516 unique isoforms, of which 72.6% were novel (unannotated in Gencode), and >7,000 novel exons, expanding the proteome by 92,422 putative proteoforms. We uncovered thousands of isoform switches during cortical neurogenesis predicted to impact RNA regulatory domains or protein structure and implicating previously uncharacterized RNA-binding proteins in cellular identity and neuropsychiatric disease. At the single-cell level, early-stage excitatory neurons exhibited the greatest isoform diversity, and isoform-centric single-cell clustering led to the identification of previously uncharacterized cell states. We systematically assessed the contribution of transcriptomic features, and localized cell and spatio-temporal transcript expression signatures across neuropsychiatric disorders, revealing predominant enrichments in dynamic isoform expression and utilization patterns and that the number and complexity of isoforms per gene is strongly predictive of disease. Leveraging this resource, we re-prioritized thousands of rare de novo risk variants associated with autism spectrum disorders (ASD), intellectual disability (ID), and neurodevelopmental disorders (NDDs), more broadly, to potentially more severe consequences and revealed a larger proportion of cryptic splice variants with the expanded transcriptome annotation provided in this study.
CONCLUSION: Our study offers a comprehensive landscape of isoform diversity in the human neocortex during development. This extensive cataloging of novel isoforms and splicing events sheds light on the underlying mechanisms of neurodevelopmental disorders and presents an opportunity to explore rare genetic variants linked to these conditions. The implications of our findings extend beyond fundamental neuroscience, as they provide crucial insights into the molecular basis of developmental brain disorders and pave the way for targeted therapeutic interventions. To facilitate exploration of this dataset we developed an online portal (https://sciso.gandallab.org/).
15
Kang B, Yang Y, Hu K, Ruan X, Liu YL, Lee P, Lee J, Wang J, Zhang X. Infernape uncovers cell type-specific and spatially resolved alternative polyadenylation in the brain. Genome Res 2023; 33:1774-1787. [PMID: 37907328 PMCID: PMC10691540 DOI: 10.1101/gr.277864.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2023] [Accepted: 09/12/2023] [Indexed: 11/02/2023]
Abstract
Differential polyadenylation sites (PAs) critically regulate gene expression, but their cell type-specific usage and spatial distribution in the brain have not been systematically characterized. Here, we present Infernape, which infers and quantifies PA usage from single-cell and spatial transcriptomic data and show its application in the mouse brain. Infernape uncovers alternative intronic PAs and 3'-UTR lengthening during cortical neurogenesis. Progenitor-neuron comparisons in the excitatory and inhibitory neuron lineages show overlapping PA changes in embryonic brains, suggesting that the neural proliferation-differentiation axis plays a prominent role. In the adult mouse brain, we uncover cell type-specific PAs and visualize such events using spatial transcriptomic data. Over two dozen neurodevelopmental disorder-associated genes such as Csnk2a1 and Mecp2 show differential PAs during brain development. This study presents Infernape to identify PAs from scRNA-seq and spatial data, and highlights the role of alternative PAs in neuronal gene regulation.
Affiliation(s)
- Bowei Kang
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- Yalan Yang
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- Kaining Hu
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- Xiangbin Ruan
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- Yi-Lin Liu
- Department of Statistics, The University of Chicago, Chicago, Illinois 60637, USA
- Pinky Lee
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- Jasper Lee
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- Jingshu Wang
- Department of Statistics, The University of Chicago, Chicago, Illinois 60637, USA
- Xiaochang Zhang
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- The Neuroscience Institute, The University of Chicago, Chicago, Illinois 60637, USA
16
Fang L, Velema WA, Lee Y, Xiao L, Mohsen MG, Kietrys AM, Kool ET. Pervasive transcriptome interactions of protein-targeted drugs. Nat Chem 2023; 15:1374-1383. [PMID: 37653232 DOI: 10.1038/s41557-023-01309-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 07/18/2022] [Accepted: 07/27/2023] [Indexed: 09/02/2023]
Abstract
The off-target toxicity of drugs targeted to proteins imparts substantial health and economic costs. Proteome interaction studies can reveal off-target effects with unintended proteins; however, little attention has been paid to intracellular RNAs as potential off-targets that may contribute to toxicity. To begin to assess this, we developed a reactivity-based RNA profiling methodology and applied it to uncover transcriptome interactions of a set of Food and Drug Administration-approved small-molecule drugs in vivo. We show that these protein-targeted drugs pervasively interact with the human transcriptome and can exert unintended biological effects on RNA functions. In addition, we show that many off-target interactions occur at RNA loci associated with protein binding and structural changes, allowing us to generate hypotheses to infer the biological consequences of RNA off-target binding. The results suggest that rigorous characterization of drugs' transcriptome interactions may help assess target specificity and potentially avoid toxicity and clinical failures.
Affiliation(s)
- Linglan Fang
- Department of Chemistry, Stanford University, Stanford, CA, USA
| | - Willem A Velema
- Department of Chemistry, Stanford University, Stanford, CA, USA
| | - Yujeong Lee
- Department of Chemistry, Stanford University, Stanford, CA, USA
| | - Lu Xiao
- Department of Chemistry, Stanford University, Stanford, CA, USA
| | | | - Anna M Kietrys
- Department of Chemistry, Stanford University, Stanford, CA, USA
| | - Eric T Kool
- Department of Chemistry, Stanford University, Stanford, CA, USA
- Sarafan ChEM-H Institute, Stanford University, Stanford, CA, USA
|
17
|
Calvo-Roitberg E, Daniels RF, Pai AA. Challenges in identifying mRNA transcript starts and ends from long-read sequencing data. bioRxiv 2023:2023.07.26.550536. [PMID: 37546743 PMCID: PMC10402045 DOI: 10.1101/2023.07.26.550536] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 08/08/2023]
Abstract
Long-read sequencing (LRS) technologies have the potential to revolutionize scientific discoveries in RNA biology, especially by enabling the comprehensive identification and quantification of full length mRNA isoforms. However, inherently high error rates make the analysis of long-read sequencing data challenging. While these error rates have been characterized for sequence and splice site identification, it is still unclear how accurately LRS reads represent transcript start and end sites. Here, we systematically assess the variability and accuracy of mRNA terminal ends identified by LRS reads across multiple sequencing platforms. We find substantial inconsistencies in both the start and end coordinates of LRS reads spanning a gene, such that LRS reads often fail to accurately recapitulate annotated or empirically derived terminal ends of mRNA molecules. To address this challenge, we introduce an approach to condition reads based on empirically derived terminal ends and identified a subset of reads that are more likely to represent full-length transcripts. Our approach can improve transcriptome analyses by enhancing the fidelity of transcript terminal end identification, but may result in lower power to quantify genes or discover novel isoforms. Thus, it is necessary to be cautious when selecting sequencing approaches and/or interpreting data from long-read RNA sequencing.
Affiliation(s)
| | - Rachel F Daniels
- RNA Therapeutics Institute, University of Massachusetts Chan Medical School, Worcester, MA
| | - Athma A Pai
- RNA Therapeutics Institute, University of Massachusetts Chan Medical School, Worcester, MA
|
18
|
Bryce-Smith S, Burri D, Gazzara MR, Herrmann CJ, Danecka W, Fitzsimmons CM, Wan YK, Zhuang F, Fansler MM, Fernández JM, Ferret M, Gonzalez-Uriarte A, Haynes S, Herdman C, Kanitz A, Katsantoni M, Marini F, McDonnel E, Nicolet B, Poon CL, Rot G, Schärfen L, Wu PJ, Yoon Y, Barash Y, Zavolan M. Extensible benchmarking of methods that identify and quantify polyadenylation sites from RNA-seq data. bioRxiv 2023:2023.06.23.546284. [PMID: 37425672 PMCID: PMC10327023 DOI: 10.1101/2023.06.23.546284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]
Abstract
The tremendous rate with which data is generated and analysis methods emerge makes it increasingly difficult to keep track of their domain of applicability, assumptions, and limitations and consequently, of the efficacy and precision with which they solve specific tasks. Therefore, there is an increasing need for benchmarks, and for the provision of infrastructure for continuous method evaluation. APAeval is an international community effort, organized by the RNA Society in 2021, to benchmark tools for the identification and quantification of the usage of alternative polyadenylation (APA) sites from short-read, bulk RNA-sequencing (RNA-seq) data. Here, we reviewed 17 tools and benchmarked eight on their ability to perform APA identification and quantification, using a comprehensive set of RNA-seq experiments comprising real, synthetic, and matched 3'-end sequencing data. To support continuous benchmarking, we have incorporated the results into the OpenEBench online platform, which allows for seamless extension of the set of methods, metrics, and challenges. We envisage that our analyses will assist researchers in selecting the appropriate tools for their studies. Furthermore, the containers and reproducible workflows generated in the course of this project can be seamlessly deployed and extended in the future to evaluate new methods or datasets.
Affiliation(s)
- Sam Bryce-Smith
- UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
| | - Dominik Burri
- Biozentrum, University of Basel, Basel, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Matthew R. Gazzara
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
| | - Christina J. Herrmann
- Biozentrum, University of Basel, Basel, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Weronika Danecka
- Institute for Cell Biology, School of Biological Sciences, The University of Edinburgh, Edinburgh, United Kingdom
| | - Christina M. Fitzsimmons
- Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, USA
| | - Yuk Kei Wan
- Genome Institute of Singapore, Buona Vista, Singapore
- National University of Singapore, Kent Ridge, Singapore
| | - Farica Zhuang
- Department of Computer and Information Science, School of Engineering, University of Pennsylvania, Philadelphia, USA
| | - Mervin M. Fansler
- Tri-Institutional Program in Computational Biology and Medicine, Weill Cornell Graduate Studies, New York, NY, USA
- Cancer Biology and Genetics, Sloan-Kettering Institute, MSKCC, New York, NY, USA
| | - José M. Fernández
- Barcelona Supercomputing Center, Barcelona, Spain
- Spanish National Bioinformatics Institute (INB/ELIXIR-ES)
| | - Meritxell Ferret
- Barcelona Supercomputing Center, Barcelona, Spain
- Spanish National Bioinformatics Institute (INB/ELIXIR-ES)
| | - Asier Gonzalez-Uriarte
- Barcelona Supercomputing Center, Barcelona, Spain
- Spanish National Bioinformatics Institute (INB/ELIXIR-ES)
| | - Samuel Haynes
- Institute for Cell Biology, School of Biological Sciences, The University of Edinburgh, Edinburgh, United Kingdom
| | | | - Alexander Kanitz
- Biozentrum, University of Basel, Basel, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Maria Katsantoni
- Biozentrum, University of Basel, Basel, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Federico Marini
- Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI) - University Medical Center of the Johannes Gutenberg University Mainz, Germany
| | - Euan McDonnel
- Leeds Institute for Data Analytics, School of Molecular and Cellular Biology, University of Leeds, United Kingdom
| | - Ben Nicolet
- Department of Hematopoiesis, Sanquin Research, Landsteiner Laboratory, Amsterdam UMC, University of Amsterdam, and Oncode Institute, Amsterdam, The Netherlands
| | | | - Gregor Rot
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Institute of Molecular Life Sciences, Zurich, Switzerland
| | - Leonard Schärfen
- Department of Molecular Biophysics & Biochemistry, Yale University, New Haven CT, USA
| | - Pin-Jou Wu
- Center for Plant Molecular Biology (ZMBP), University of Tübingen, Germany
| | - Yoseop Yoon
- Department of Microbiology and Molecular Genetics, School of Medicine, University of California Irvine, Irvine, California, USA
| | - Yoseph Barash
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
- Department of Computer and Information Science, School of Engineering, University of Pennsylvania, Philadelphia, USA
| | - Mihaela Zavolan
- Biozentrum, University of Basel, Basel, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
19
|
Pardo-Palacios FJ, Arzalluz-Luque A, Kondratova L, Salguero P, Mestre-Tomás J, Amorín R, Estevan-Morió E, Liu T, Nanni A, McIntyre L, Tseng E, Conesa A. SQANTI3: curation of long-read transcriptomes for accurate identification of known and novel isoforms. bioRxiv 2023:2023.05.17.541248. [PMID: 37398077 PMCID: PMC10312485 DOI: 10.1101/2023.05.17.541248] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 07/04/2023]
Abstract
The emergence of long-read RNA sequencing (lrRNA-seq) has provided an unprecedented opportunity to analyze transcriptomes at isoform resolution. However, the technology is not free from biases, and transcript models inferred from these data require quality control and curation. In this study, we introduce SQANTI3, a tool specifically designed to perform quality analysis on transcriptomes constructed using lrRNA-seq data. SQANTI3 provides an extensive naming framework to describe transcript model diversity in comparison to the reference transcriptome. Additionally, the tool incorporates a wide range of metrics to characterize various structural properties of transcript models, such as transcription start and end sites, splice junctions, and other structural features. These metrics can be utilized to filter out potential artifacts. Moreover, SQANTI3 includes a Rescue module that prevents the loss of known genes and transcripts exhibiting evidence of expression but displaying low-quality features. Lastly, SQANTI3 incorporates IsoAnnotLite, which enables functional annotation at the isoform level and facilitates functional iso-transcriptomics analyses. We demonstrate the versatility of SQANTI3 in analyzing different data types, isoform reconstruction pipelines, and sequencing platforms, and how it provides novel biological insights into isoform biology. The SQANTI3 software is available at https://github.com/ConesaLab/SQANTI3 .
|
20
|
Vlasenok M, Margasyuk S, Pervouchine DD. Transcriptome sequencing suggests that pre-mRNA splicing counteracts widespread intronic cleavage and polyadenylation. NAR Genom Bioinform 2023; 5:lqad051. [PMID: 37260513 PMCID: PMC10227441 DOI: 10.1093/nargab/lqad051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Revised: 05/09/2023] [Accepted: 05/17/2023] [Indexed: 06/02/2023] Open
Abstract
Alternative splicing (AS) and alternative polyadenylation (APA) are two crucial steps in the post-transcriptional regulation of eukaryotic gene expression. Protocols capturing and sequencing RNA 3'-ends have uncovered widespread intronic polyadenylation (IPA) in normal and disease conditions, where it is currently attributed to stochastic variations in the pre-mRNA processing. Here, we took advantage of the massive amount of RNA-seq data generated by the Genotype Tissue Expression project (GTEx) to simultaneously identify and match tissue-specific expression of intronic polyadenylation sites with tissue-specific splicing. A combination of computational methods including the analysis of short reads with non-templated adenines revealed that APA events are more abundant in introns than in exons. While the rate of IPA in composite terminal exons and skipped terminal exons expectedly correlates with splicing, we observed a considerable fraction of IPA events that lack AS support and attributed them to spliced polyadenylated introns (SPI). We hypothesize that SPIs represent transient byproducts of a dynamic coupling between APA and AS, in which the spliceosome removes the intron while it is being cleaved and polyadenylated. These findings indicate that cotranscriptional pre-mRNA splicing could serve as a rescue mechanism to suppress premature transcription termination at intronic polyadenylation sites.
Affiliation(s)
- Maria Vlasenok
- Center for Molecular and Cellular Biology, Skolkovo Institute of Science and Technology, Bolshoy Bulvar 30, Moscow 121205, Russia
| | - Sergey Margasyuk
- Center for Molecular and Cellular Biology, Skolkovo Institute of Science and Technology, Bolshoy Bulvar 30, Moscow 121205, Russia
| | - Dmitri D Pervouchine
- Center for Molecular and Cellular Biology, Skolkovo Institute of Science and Technology, Bolshoy Bulvar 30, Moscow 121205, Russia
|
21
|
Liu H, Liu H, Wang L, Song L, Jiang G, Lu Q, Yang T, Peng H, Cai R, Zhao X, Zhao T, Wu H. Cochlear transcript diversity and its role in auditory functions implied by an otoferlin short isoform. Nat Commun 2023; 14:3085. [PMID: 37248244 DOI: 10.1038/s41467-023-38621-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2022] [Accepted: 05/10/2023] [Indexed: 05/31/2023] Open
Abstract
Isoforms of a gene may contribute to diverse biological functions. In the cochlea, the repertoire of alternative isoforms remains unexplored. We integrated single-cell short-read and long-read RNA sequencing techniques and identified 236,012 transcripts, 126,612 of which were unannotated in the GENCODE database. Then we analyzed and verified the unannotated transcripts using RNA-seq, RT-PCR, Sanger sequencing, and MS-based proteomics approaches. To illustrate the importance of identifying spliced isoforms, we investigated otoferlin, a key protein involved in synaptic transmission in inner hair cells (IHCs). Upon deletion of the canonical otoferlin isoform, the identified short isoform is able to support normal hearing thresholds but with reduced sustained exocytosis of IHCs, and further revealed otoferlin functions in endocytic membrane retrieval that was not well-addressed previously. Furthermore, we found that otoferlin isoforms are associated with IHC functions and auditory phenotypes. This work expands our mechanistic understanding of auditory functions at the level of isoform resolution.
Affiliation(s)
- Huihui Liu
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Ear Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Shanghai Key Laboratory of Translational Medicine on Ear and Nose Diseases, Shanghai, 200011, China
| | - Hongchao Liu
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Ear Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Shanghai Key Laboratory of Translational Medicine on Ear and Nose Diseases, Shanghai, 200011, China
| | - Longhao Wang
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Ear Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Shanghai Key Laboratory of Translational Medicine on Ear and Nose Diseases, Shanghai, 200011, China
| | - Lei Song
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Ear Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Shanghai Key Laboratory of Translational Medicine on Ear and Nose Diseases, Shanghai, 200011, China
| | - Guixian Jiang
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Ear Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Shanghai Key Laboratory of Translational Medicine on Ear and Nose Diseases, Shanghai, 200011, China
| | - Qing Lu
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Ear Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Shanghai Key Laboratory of Translational Medicine on Ear and Nose Diseases, Shanghai, 200011, China
- Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Bio-X Institutes, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Tao Yang
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Ear Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Shanghai Key Laboratory of Translational Medicine on Ear and Nose Diseases, Shanghai, 200011, China
| | - Hu Peng
- Department of Otolaryngology-Head and Neck Surgery, Changzheng Hospital, Second Military Medical University, Shanghai, 200003, China
| | - Ruijie Cai
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Ear Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Shanghai Key Laboratory of Translational Medicine on Ear and Nose Diseases, Shanghai, 200011, China
| | - Xingle Zhao
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Ear Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Shanghai Key Laboratory of Translational Medicine on Ear and Nose Diseases, Shanghai, 200011, China
| | - Ting Zhao
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Ear Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Shanghai Key Laboratory of Translational Medicine on Ear and Nose Diseases, Shanghai, 200011, China
| | - Hao Wu
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Ear Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Shanghai Key Laboratory of Translational Medicine on Ear and Nose Diseases, Shanghai, 200011, China
|
22
|
Reese F, Williams B, Balderrama-Gutierrez G, Wyman D, Çelik MH, Rebboah E, Rezaie N, Trout D, Razavi-Mohseni M, Jiang Y, Borsari B, Morabito S, Liang HY, McGill CJ, Rahmanian S, Sakr J, Jiang S, Zeng W, Carvalho K, Weimer AK, Dionne LA, McShane A, Bedi K, Elhajjajy SI, Upchurch S, Jou J, Youngworth I, Gabdank I, Sud P, Jolanki O, Strattan JS, Kagda MS, Snyder MP, Hitz BC, Moore JE, Weng Z, Bennett D, Reinholdt L, Ljungman M, Beer MA, Gerstein MB, Pachter L, Guigó R, Wold BJ, Mortazavi A. The ENCODE4 long-read RNA-seq collection reveals distinct classes of transcript structure diversity. bioRxiv 2023:2023.05.15.540865. [PMID: 37292896 PMCID: PMC10245583 DOI: 10.1101/2023.05.15.540865] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 06/10/2023]
Abstract
The majority of mammalian genes encode multiple transcript isoforms that result from differential promoter use, changes in exonic splicing, and alternative 3' end choice. Detecting and quantifying transcript isoforms across tissues, cell types, and species has been extremely challenging because transcripts are much longer than the short reads normally used for RNA-seq. By contrast, long-read RNA-seq (LR-RNA-seq) gives the complete structure of most transcripts. We sequenced 264 LR-RNA-seq PacBio libraries totaling over 1 billion circular consensus reads (CCS) for 81 unique human and mouse samples. We detect at least one full-length transcript from 87.7% of annotated human protein coding genes and a total of 200,000 full-length transcripts, 40% of which have novel exon junction chains. To capture and compute on the three sources of transcript structure diversity, we introduce a gene and transcript annotation framework that uses triplets representing the transcript start site, exon junction chain, and transcript end site of each transcript. Using triplets in a simplex representation demonstrates how promoter selection, splice pattern, and 3' processing are deployed across human tissues, with nearly half of multi-transcript protein coding genes showing a clear bias toward one of the three diversity mechanisms. Evaluated across samples, the predominantly expressed transcript changes for 74% of protein coding genes. In evolution, the human and mouse transcriptomes are globally similar in types of transcript structure diversity, yet among individual orthologous gene pairs, more than half (57.8%) show substantial differences in mechanism of diversification in matching tissues. This initial large-scale survey of human and mouse long-read transcriptomes provides a foundation for further analyses of alternative transcript usage, and is complemented by short-read and microRNA data on the same samples and by epigenome data elsewhere in the ENCODE4 collection.
Affiliation(s)
- Fairlie Reese
- Developmental and Cell Biology, University of California, Irvine, Irvine, USA
- Center for Complex Biological Systems, University of California, Irvine, Irvine, USA
| | - Brian Williams
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, USA
| | - Gabriela Balderrama-Gutierrez
- Developmental and Cell Biology, University of California, Irvine, Irvine, USA
- Center for Complex Biological Systems, University of California, Irvine, Irvine, USA
| | - Dana Wyman
- Center for Complex Biological Systems, University of California, Irvine, Irvine, USA
| | - Muhammed Hasan Çelik
- Center for Complex Biological Systems, University of California, Irvine, Irvine, USA
| | - Elisabeth Rebboah
- Center for Complex Biological Systems, University of California, Irvine, Irvine, USA
| | - Narges Rezaie
- Center for Complex Biological Systems, University of California, Irvine, Irvine, USA
| | - Diane Trout
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, USA
| | - Milad Razavi-Mohseni
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, USA
- McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University, Baltimore, USA
| | - Yunzhe Jiang
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, USA
| | - Beatrice Borsari
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, USA
- Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Samuel Morabito
- Center for Complex Biological Systems, University of California, Irvine, Irvine, USA
| | - Heidi Yahan Liang
- Center for Complex Biological Systems, University of California, Irvine, Irvine, USA
| | - Cassandra J McGill
- Developmental and Cell Biology, University of California, Irvine, Irvine, USA
- Center for Complex Biological Systems, University of California, Irvine, Irvine, USA
| | - Sorena Rahmanian
- Center for Complex Biological Systems, University of California, Irvine, Irvine, USA
| | - Jasmine Sakr
- Center for Complex Biological Systems, University of California, Irvine, Irvine, USA
- Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, USA
| | - Shan Jiang
- Developmental and Cell Biology, University of California, Irvine, Irvine, USA
- Center for Complex Biological Systems, University of California, Irvine, Irvine, USA
| | - Weihua Zeng
- Developmental and Cell Biology, University of California, Irvine, Irvine, USA
- Center for Complex Biological Systems, University of California, Irvine, Irvine, USA
| | - Klebea Carvalho
- Center for Complex Biological Systems, University of California, Irvine, Irvine, USA
| | - Annika K Weimer
- Department of Genetics, Stanford University School of Medicine, Palo Alto, USA
| | - Louise A Dionne
- The Jackson Laboratory, Bar Harbor, USA
| | - Ariel McShane
- Cellular and Molecular Biology Program, University of Michigan, Ann Arbor, USA
- Department of Radiation Oncology, University of Michigan, Ann Arbor, USA
| | - Karan Bedi
- Department of Biostatistics, University of Michigan, Ann Arbor, USA
- Center for RNA Biomedicine and Rogel Cancer Center, University of Michigan, Ann Arbor, USA
| | - Shaimae I Elhajjajy
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, USA
| | - Sean Upchurch
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, USA
| | - Jennifer Jou
- Department of Genetics, Stanford University School of Medicine, Palo Alto, USA
| | - Ingrid Youngworth
- Department of Genetics, Stanford University School of Medicine, Palo Alto, USA
| | - Idan Gabdank
- Department of Genetics, Stanford University School of Medicine, Palo Alto, USA
| | - Paul Sud
- Department of Genetics, Stanford University School of Medicine, Palo Alto, USA
| | - Otto Jolanki
- Department of Genetics, Stanford University School of Medicine, Palo Alto, USA
| | - J Seth Strattan
- Department of Genetics, Stanford University School of Medicine, Palo Alto, USA
| | - Meenakshi S Kagda
- Department of Genetics, Stanford University School of Medicine, Palo Alto, USA
| | - Michael P Snyder
- Department of Genetics, Stanford University School of Medicine, Palo Alto, USA
| | - Ben C Hitz
- Department of Genetics, Stanford University School of Medicine, Palo Alto, USA
| | - Jill E Moore
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, USA
| | - David Bennett
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, USA
- Department of Neurological Sciences, Rush University Medical Center, Chicago, USA
| | - Laura Reinholdt
- The Jackson Laboratory, Bar Harbor, USA
| | - Mats Ljungman
- Center for RNA Biomedicine and Rogel Cancer Center, University of Michigan, Ann Arbor, USA
- Departments of Radiation Oncology and Environmental Health Sciences, University of Michigan, Ann Arbor, USA
| | - Michael A Beer
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, USA
- McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University, Baltimore, USA
| | - Mark B Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, USA
- Section on Biomedical Informatics and Data Science, Yale University, New Haven, USA
- Department of Statistics and Data Science, Yale University, New Haven, USA
- Department of Computer Science, Yale University, New Haven, USA
| | - Lior Pachter
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, USA
- Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, USA
| | - Roderic Guigó
- Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Spain
- Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
| | - Barbara J Wold
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, USA
| | - Ali Mortazavi
- Developmental and Cell Biology, University of California, Irvine, Irvine, USA
- Center for Complex Biological Systems, University of California, Irvine, Irvine, USA
|
23
|
van der Noord VE, van der Stel W, Louwerens G, Verhoeven D, Kuiken HJ, Lieftink C, Grandits M, Ecker GF, Beijersbergen RL, Bouwman P, Le Dévédec SE, van de Water B. Systematic screening identifies ABCG2 as critical factor underlying synergy of kinase inhibitors with transcriptional CDK inhibitors. Breast Cancer Res 2023; 25:51. [PMID: 37147730 PMCID: PMC10161439 DOI: 10.1186/s13058-023-01648-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/16/2023] [Accepted: 04/07/2023] [Indexed: 05/07/2023] Open
Abstract
BACKGROUND: Triple-negative breast cancer (TNBC) is a subtype of breast cancer with limited treatment options and poor clinical prognosis. Inhibitors of transcriptional CDKs are currently under thorough investigation for application in the treatment of multiple cancer types, including breast cancer. These studies have raised interest in combining these inhibitors, including CDK12/13 inhibitor THZ531, with a variety of other anti-cancer agents. However, the full scope of these potential synergistic interactions of transcriptional CDK inhibitors with kinase inhibitors has not been systematically investigated. Moreover, the mechanisms behind these previously described synergistic interactions remain largely elusive. METHODS: Kinase inhibitor combination screenings were performed to identify kinase inhibitors that synergize with CDK7 inhibitor THZ1 and CDK12/13 inhibitor THZ531 in TNBC cell lines. CRISPR-Cas9 knockout screening and transcriptomic evaluation of resistant versus sensitive cell lines were performed to identify genes critical for THZ531 resistance. RNA sequencing analysis after treatment with individual and combined synergistic treatments was performed to gain further insights into the mechanism of this synergy. Kinase inhibitor screening in combination with visualization of ABCG2-substrate pheophorbide A was used to identify kinase inhibitors that inhibit ABCG2. Multiple transcriptional CDK inhibitors were evaluated to extend the significance of the found mechanism to other transcriptional CDK inhibitors. RESULTS: We show that a very high number of tyrosine kinase inhibitors synergize with the CDK12/13 inhibitor THZ531. Yet, we identified the multidrug transporter ABCG2 as key determinant of THZ531 resistance in TNBC cells. Mechanistically, we demonstrate that most synergistic kinase inhibitors block ABCG2 function, thereby sensitizing cells to transcriptional CDK inhibitors, including THZ531. Accordingly, these kinase inhibitors potentiate the effects of THZ531, disrupting gene expression and increasing intronic polyadenylation. CONCLUSION: Overall, this study demonstrates the critical role of ABCG2 in limiting the efficacy of transcriptional CDK inhibitors and identifies multiple kinase inhibitors that disrupt ABCG2 transporter function and thereby synergize with these CDK inhibitors. These findings therefore further facilitate the development of new (combination) therapies targeting transcriptional CDKs and highlight the importance of evaluating the role of ABC transporters in synergistic drug-drug interactions in general.
Affiliation(s)
- Vera E van der Noord
- Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands
- Wanda van der Stel
- Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands
- Gijs Louwerens
- Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands
- Danielle Verhoeven
- Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands
- Hendrik J Kuiken
- Division of Molecular Carcinogenesis, The NKI Robotics and Screening Center, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Cor Lieftink
- Division of Molecular Carcinogenesis, The NKI Robotics and Screening Center, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Melanie Grandits
- Department of Pharmaceutical Sciences, University of Vienna, Vienna, Austria
- Gerhard F Ecker
- Department of Pharmaceutical Sciences, University of Vienna, Vienna, Austria
- Roderick L Beijersbergen
- Division of Molecular Carcinogenesis, The NKI Robotics and Screening Center, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Peter Bouwman
- Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands
- Sylvia E Le Dévédec
- Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands
- Bob van de Water
- Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands.
24
Šimon M, Mikec Š, Morton NM, Atanur SS, Konc J, Horvat S, Kunej T. Genome-wide screening for genetic variants in polyadenylation signal (PAS) sites in mouse selection lines for fatness and leanness. Mamm Genome 2023; 34:12-31. [PMID: 36414820 PMCID: PMC9684942 DOI: 10.1007/s00335-022-09967-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2022] [Accepted: 10/31/2022] [Indexed: 11/23/2022]
Abstract
Alternative polyadenylation (APA) determines mRNA stability, localisation, translation and protein function. Several diseases, including obesity, have been linked to APA. Studies have shown that single nucleotide polymorphisms in polyadenylation signals (PAS-SNPs) can influence APA and affect phenotype and disease susceptibility. However, these studies focussed on associations between single PAS-SNP alleles with very large effects and phenotype. Therefore, we performed a genome-wide screening for PAS-SNPs in the polygenic mouse selection lines for fatness and leanness by whole-genome sequencing. The genetic variants identified in the two lines were overlapped with locations of PAS sites obtained from the PolyASite 2.0 database. Expression data for selected genes were extracted from the microarray expression experiment performed on multiple tissue samples. In total, 682 PAS-SNPs were identified within 583 genes involved in various biological processes, including transport, protein modifications and degradation, cell adhesion and immune response. Moreover, 63 of the 583 orthologous genes in human have been previously associated with human diseases, such as nervous system and physical disorders, and immune, endocrine, and metabolic diseases. In both lines, PAS-SNPs have also been identified in genes broadly involved in APA, such as Polr2c, Eif3e and Ints11. Five PAS-SNPs within 5 genes (Car, Col4a1, Itga7, Lat, Nmnat1) were prioritised as potential functional variants and could contribute to the phenotypic disparity between the two selection lines. The developed PAS-SNPs catalogue presents a key resource for planning functional studies to uncover the role of PAS-SNPs in APA, disease susceptibility and fat deposition.
Affiliation(s)
- Martin Šimon
- Biotechnical Faculty, Department of Animal Science, University of Ljubljana, Domžale, Slovenia (GRID: grid.8954.0; ISNI: 0000 0001 0721 6013)
- Špela Mikec
- Biotechnical Faculty, Department of Animal Science, University of Ljubljana, Domžale, Slovenia (GRID: grid.8954.0; ISNI: 0000 0001 0721 6013)
- Nicholas M. Morton
- University of Edinburgh, The Queen’s Medical Research Institute, Centre for Cardiovascular Science, Edinburgh, UK (GRID: grid.511172.1; ISNI: 0000 0004 0613 128X)
- Santosh S. Atanur
- Faculty of Medicine, Department of Metabolism, Digestion and Reproduction, Imperial College London, London, UK (GRID: grid.7445.2; ISNI: 0000 0001 2113 8111)
- Centre for Genomic and Experimental Medicine, University of Edinburgh, Edinburgh, UK (GRID: grid.4305.2; ISNI: 0000 0004 1936 7988)
- Janez Konc
- Laboratory for Molecular Modeling, National Institute of Chemistry, Ljubljana, Slovenia (GRID: grid.454324.0; ISNI: 0000 0001 0661 0844)
- Simon Horvat
- Biotechnical Faculty, Department of Animal Science, University of Ljubljana, Domžale, Slovenia (GRID: grid.8954.0; ISNI: 0000 0001 0721 6013)
- Tanja Kunej
- Biotechnical Faculty, Department of Animal Science, University of Ljubljana, Domžale, Slovenia (GRID: grid.8954.0; ISNI: 0000 0001 0721 6013)
25
He PC, Wei J, Dou X, Harada BT, Zhang Z, Ge R, Liu C, Zhang LS, Yu X, Wang S, Lyu R, Zou Z, Chen M, He C. Exon architecture controls mRNA m6A suppression and gene expression. Science 2023; 379:677-682. [PMID: 36705538 PMCID: PMC9990141 DOI: 10.1126/science.abj9090] [Citation(s) in RCA: 61] [Impact Index Per Article: 61.0] [Received: 06/09/2021] [Accepted: 01/16/2023] [Indexed: 01/28/2023]
Abstract
N6-methyladenosine (m6A) is the most abundant messenger RNA (mRNA) modification and plays crucial roles in diverse physiological processes. Using a massively parallel assay for m6A (MPm6A), we discover that m6A specificity is globally regulated by suppressors that prevent m6A deposition in unmethylated transcriptome regions. We identify exon junction complexes (EJCs) as m6A suppressors that protect exon junction-proximal RNA within coding sequences from methylation and regulate mRNA stability through m6A suppression. EJC suppression of m6A underlies multiple global characteristics of mRNA m6A specificity, with the local range of EJC protection sufficient to suppress m6A deposition in average-length internal exons but not in long internal and terminal exons. EJC-suppressed methylation sites colocalize with EJC-suppressed splice sites, which suggests that exon architecture broadly determines local mRNA accessibility to regulatory complexes.
Affiliation(s)
- P. Cody He
- Department of Chemistry, Department of Biochemistry and Molecular Biology, Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA
- Committee on Immunology, The University of Chicago, Chicago, IL 60637, USA
- Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA
- Jiangbo Wei
- Department of Chemistry, Department of Biochemistry and Molecular Biology, Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA
- Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA
- Xiaoyang Dou
- Department of Chemistry, Department of Biochemistry and Molecular Biology, Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA
- Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA
- Bryan T. Harada
- Department of Chemistry, Department of Biochemistry and Molecular Biology, Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA
- Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA
- Zijie Zhang
- Department of Chemistry, Department of Biochemistry and Molecular Biology, Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA
- Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA
- State Key Laboratory for Conservation and Utilization of Bio-Resources, School of Life Sciences, Yunnan University, Kunming, Yunnan 650091, China
- Ruiqi Ge
- Department of Chemistry, Department of Biochemistry and Molecular Biology, Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA
- Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA
- Chang Liu
- Department of Chemistry, Department of Biochemistry and Molecular Biology, Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA
- Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA
- Li-Sheng Zhang
- Department of Chemistry, Department of Biochemistry and Molecular Biology, Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA
- Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA
- Xianbin Yu
- Department of Chemistry, Department of Biochemistry and Molecular Biology, Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA
- Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA
- Shuai Wang
- Department of Neurobiology, The University of Chicago, Chicago, IL 60637, USA
- Ruitu Lyu
- Department of Chemistry, Department of Biochemistry and Molecular Biology, Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA
- Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA
- Zhongyu Zou
- Department of Chemistry, Department of Biochemistry and Molecular Biology, Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA
- Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA
- Mengjie Chen
- Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA
- Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, IL 60637, USA
- Chuan He
- Department of Chemistry, Department of Biochemistry and Molecular Biology, Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637, USA
- Committee on Immunology, The University of Chicago, Chicago, IL 60637, USA
- Howard Hughes Medical Institute, The University of Chicago, Chicago, IL 60637, USA
26
Genetic Variation in ATXN3 (Ataxin-3) 3'UTR: Insights into the Downstream Regulatory Elements of the Causative Gene of Machado-Joseph Disease/Spinocerebellar Ataxia Type 3. Cerebellum 2023; 22:37-45. [PMID: 35034258 DOI: 10.1007/s12311-021-01358-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 12/09/2021] [Indexed: 02/01/2023]
Abstract
Untranslated regions are involved in the regulation of transcriptional and post-transcriptional processes. Characterization of these regions remains poorly explored for ATXN3, the causative gene of Machado-Joseph disease (MJD). Although a few genetic modifiers have been identified for MJD age at onset (AO), they only explain a small fraction of the AO variance. Our aim was to analyse variation at the 3'UTR of ATXN3 in MJD patients, analyse its impact on AO and attempt to build haplotypes that might discriminate between normal and expanded alleles. After assessing ATXN3 3'UTR variants in molecularly confirmed MJD patients, an in silico analysis was conducted to predict their functional impact (e.g. their effect on miRNA-binding sites). Alleles in cis with the expanded (CAG)n were inferred from family data, and haplotypes were built. The effect of the alternative alleles on the AO and on SARA and NESSCA ataxia scales was tested. Nine variants, all previously described, were found. For eight variants, in silico analyses predicted (a) deleterious effects (rs10151135; rs55966267); (b) changes on miRNA-binding sites (rs11628764; rs55966267; rs709930) and (c) alterations of RNA-binding protein (RBP)-binding sites (rs1055996; rs910369; rs709930; rs10151135; rs3092822; rs7158733). Patients harbouring the alternative allele at rs10151135 had significantly higher SARA Axial subscores (p = 0.023), comparatively with those homozygous for the reference allele. Ten different haplotypes were obtained, one of which was exclusively found in cis with the expanded and four with the normal allele. These findings, which are relevant for the design of allele-specific therapies, warrant further investigation in independent MJD cohorts.
27
Hong D, Jeong S. 3'UTR Diversity: Expanding Repertoire of RNA Alterations in Human mRNAs. Mol Cells 2023; 46:48-56. [PMID: 36697237 PMCID: PMC9880603 DOI: 10.14348/molcells.2023.0003] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 01/02/2023] [Revised: 01/05/2023] [Accepted: 01/08/2023] [Indexed: 01/27/2023] [Open Access]
Abstract
Genomic information stored in the DNA is transcribed to the mRNA and translated to proteins. The 3' untranslated regions (3'UTRs) of the mRNA serve pivotal roles in posttranscriptional gene expression, regulating mRNA stability, translation, and localization. Similar to DNA mutations producing aberrant proteins, RNA alterations expand the transcriptome landscape and change the cellular proteome. Recent global analyses reveal that many genes express various forms of altered RNAs, including 3'UTR length variants. Alternative polyadenylation and alternative splicing are involved in diversifying 3'UTRs, which could act as a hidden layer of eukaryotic gene expression control. In this review, we summarize the functions and regulations of 3'UTRs and elaborate on the generation and functional consequences of 3'UTR diversity. Given that dynamic 3'UTR length control contributes to phenotypic complexity, dysregulated 3'UTR diversity might be relevant to disease development, including cancers. Thus, 3'UTR diversity in cancer could open exciting new research areas and provide avenues for novel cancer theragnostics.
Affiliation(s)
- Dawon Hong
- Laboratory of RNA Cell Biology, Department of Bioconvergence Engineering, Dankook University Graduate School, Yongin 16892, Korea
- Sunjoo Jeong
- Laboratory of RNA Cell Biology, Department of Bioconvergence Engineering, Dankook University Graduate School, Yongin 16892, Korea
28
Mukherjee S, Graber JH, Moore CL. Macrophage differentiation is marked by increased abundance of the mRNA 3' end processing machinery, altered poly(A) site usage, and sensitivity to the level of CstF64. Front Immunol 2023; 14:1091403. [PMID: 36761770 PMCID: PMC9905730 DOI: 10.3389/fimmu.2023.1091403] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/07/2022] [Accepted: 01/11/2023] [Indexed: 01/26/2023] [Open Access]
Abstract
Regulation of mRNA polyadenylation is important for response to external signals and differentiation in several cell types, and results in mRNA isoforms that vary in the amount of coding sequence or 3' UTR regulatory elements. However, its role in differentiation of monocytes to macrophages has not been investigated. Macrophages are key effectors of the innate immune system that help control infection and promote tissue-repair. However, overactivity of macrophages contributes to pathogenesis of many diseases. In this study, we show that macrophage differentiation is characterized by shortening and lengthening of mRNAs in relevant cellular pathways. The cleavage/polyadenylation (C/P) proteins increase during differentiation, suggesting a possible mechanism for the observed changes in poly(A) site usage. This was surprising since higher C/P protein levels correlate with higher proliferation rates in other systems, but monocytes stop dividing after induction of differentiation. Depletion of CstF64, a C/P protein and known regulator of polyadenylation efficiency, delayed macrophage marker expression, cell cycle exit, attachment, and acquisition of structural complexity, and impeded shortening of mRNAs with functions relevant to macrophage biology. Conversely, CstF64 overexpression increased use of promoter-proximal poly(A) sites and caused the appearance of differentiated phenotypes in the absence of induction. Our findings indicate that regulation of polyadenylation plays an important role in macrophage differentiation.
Affiliation(s)
- Srimoyee Mukherjee
- Department of Developmental, Molecular, and Chemical Biology, Tufts University School of Medicine, Boston, MA, United States
- Joel H. Graber
- Computational Biology and Bioinformatics Core, Mount Desert Island Biological Laboratory, Bar Harbor, ME, United States
- Claire L. Moore
- Department of Developmental, Molecular, and Chemical Biology, Tufts University School of Medicine, Boston, MA, United States
29
Jonnakuti VS, Wagner EJ, Maletić-Savatić M, Liu Z, Yalamanchili HK. PolyAMiner-Bulk: A Machine Learning Based Bioinformatics Algorithm to Infer and Decode Alternative Polyadenylation Dynamics from bulk RNA-seq data. bioRxiv 2023:2023.01.23.523471. [PMID: 36747700 PMCID: PMC9900750 DOI: 10.1101/2023.01.23.523471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
Abstract
More than half of human genes exercise alternative polyadenylation (APA) and generate mRNA transcripts with varying 3' untranslated regions (UTR). However, current computational approaches for identifying cleavage and polyadenylation sites (C/PASs) and quantifying 3'UTR length changes from bulk RNA-seq data fail to unravel tissue- and disease-specific APA dynamics. Here, we developed a next-generation bioinformatics algorithm and application, PolyAMiner-Bulk, that utilizes an attention-based machine learning architecture and an improved vector projection-based engine to infer differential APA dynamics accurately. When applied to earlier studies, PolyAMiner-Bulk accurately identified more than twice the number of APA changes in an RBM17 knockdown bulk RNA-seq dataset compared to current generation tools. Moreover, on a separate dataset, PolyAMiner-Bulk revealed novel APA dynamics and pathways in scleroderma pathology and identified differential APA in a gene that was identified as being involved in scleroderma pathogenesis in an independent study. Lastly, we used PolyAMiner-Bulk to analyze the RNA-seq data of post-mortem prefrontal cortexes from the ROSMAP data consortium and unraveled novel APA dynamics in Alzheimer's Disease. Our method, PolyAMiner-Bulk, creates a paradigm for future alternative polyadenylation analysis from bulk RNA-seq data.
Affiliation(s)
- Venkata Soumith Jonnakuti
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jan and Dan Duncan Neurological Research Institute, Texas Children’s Hospital, Houston, TX, 77030, USA
- Program in Quantitative and Computational Biology, Baylor College of Medicine, Houston, TX, 77030, USA
- Medical Scientist Training Program, Baylor College of Medicine, Houston, TX, 77030, USA
- Eric J. Wagner
- Department of Biochemistry and Biophysics, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA
- Mirjana Maletić-Savatić
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jan and Dan Duncan Neurological Research Institute, Texas Children’s Hospital, Houston, TX, 77030, USA
- Zhandong Liu
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jan and Dan Duncan Neurological Research Institute, Texas Children’s Hospital, Houston, TX, 77030, USA
- Program in Quantitative and Computational Biology, Baylor College of Medicine, Houston, TX, 77030, USA
- Hari Krishna Yalamanchili
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jan and Dan Duncan Neurological Research Institute, Texas Children’s Hospital, Houston, TX, 77030, USA
30
Zhang J, Lin X, Chen Y, Li T, Lee AC, Chow EY, Cho WC, Chan T. LAFITE Reveals the Complexity of Transcript Isoforms in Subcellular Fractions. Adv Sci (Weinh) 2023; 10:e2203480. [PMID: 36461702 PMCID: PMC9875686 DOI: 10.1002/advs.202203480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Revised: 10/28/2022] [Indexed: 06/17/2023]
Abstract
Characterization of the subcellular distribution of RNA is essential for understanding the molecular basis of biological processes. Here, the subcellular nanopore direct RNA-sequencing (DRS) of four lung cancer cell lines (A549, H1975, H358, and HCC4006) is performed, coupled with a computational pipeline, Low-abundance Aware Full-length Isoform clusTEr (LAFITE), to comprehensively analyze the full-length cytoplasmic and nuclear transcriptome. Using additional DRS and orthogonal data sets, it is shown that LAFITE outperforms current methods for detecting full-length transcripts, particularly for low-abundance isoforms that are usually overlooked due to poor read coverage. Experimental validation of six novel isoforms exclusively identified by LAFITE further confirms the reliability of this pipeline. By applying LAFITE to subcellular DRS data, the complexity of the nuclear transcriptome is revealed in terms of isoform diversity, 3'-UTR usage, m6A modification patterns, and intron retention. Overall, LAFITE provides enhanced full-length isoform identification and enables a high-resolution view of the RNA landscape at the isoform level.
Affiliation(s)
- Jizhou Zhang
- School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Xiao Lin
- School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Yuelong Chen
- School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Tsz‐Ho Li
- School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Alan Chun‐Kit Lee
- School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Ting‐Fung Chan
- School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
31
Qian SH, Chen L, Xiong YL, Chen ZX. Evolution and function of developmentally dynamic pseudogenes in mammals. Genome Biol 2022; 23:235. [PMID: 36348461 PMCID: PMC9641868 DOI: 10.1186/s13059-022-02802-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/30/2021] [Accepted: 10/23/2022] [Indexed: 11/11/2022] [Open Access]
Abstract
BACKGROUND Pseudogenes are excellent markers for genome evolution, which are emerging as crucial regulators of development and disease, especially cancer. However, systematic functional characterization and evolution of pseudogenes remain largely unexplored. RESULTS To systematically characterize pseudogenes, we date the origin of human and mouse pseudogenes across vertebrates and observe a burst of pseudogene gain in these two lineages. Based on a hybrid sequencing dataset combining full-length PacBio sequencing, sample-matched Illumina sequencing, and public time-course transcriptome data, we observe that abundant mammalian pseudogenes could be transcribed, which contribute to the establishment of organ identity. Our analyses reveal that developmentally dynamic pseudogenes are evolutionarily conserved and show an increasing weight during development. Besides, they are involved in complex transcriptional and post-transcriptional modulation, exhibiting the signatures of functional enrichment. Coding potential evaluation suggests that 19% of human pseudogenes could be translated, thus serving as a new way for protein innovation. Moreover, pseudogenes carry disease-associated SNPs and conduce to cancer transcriptome perturbation. CONCLUSIONS Our discovery reveals an unexpectedly high abundance of mammalian pseudogenes that can be transcribed and translated, and these pseudogenes represent a novel regulatory layer. Our study also prioritizes developmentally dynamic pseudogenes with signatures of functional enrichment and provides a hybrid sequencing dataset for further unraveling their biological mechanisms in organ development and carcinogenesis in the future.
Affiliation(s)
- Sheng Hu Qian
- Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan, 430070 PR China (GRID: grid.35155.37; ISNI: 0000 0004 1790 4137)
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070 PR China (GRID: grid.35155.37; ISNI: 0000 0004 1790 4137)
- Lu Chen
- Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan, 430070 PR China (GRID: grid.35155.37; ISNI: 0000 0004 1790 4137)
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070 PR China (GRID: grid.35155.37; ISNI: 0000 0004 1790 4137)
- Yu-Li Xiong
- Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan, 430070 PR China (GRID: grid.35155.37; ISNI: 0000 0004 1790 4137)
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070 PR China (GRID: grid.35155.37; ISNI: 0000 0004 1790 4137)
- Zhen-Xia Chen
- Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan, 430070 PR China (GRID: grid.35155.37; ISNI: 0000 0004 1790 4137)
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070 PR China (GRID: grid.35155.37; ISNI: 0000 0004 1790 4137)
- Interdisciplinary Sciences Institute, Huazhong Agricultural University, Wuhan, 430070 PR China (GRID: grid.35155.37; ISNI: 0000 0004 1790 4137)
- Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Shenzhen, 518124 PR China (GRID: grid.35155.37; ISNI: 0000 0004 1790 4137)
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518124 PR China (GRID: grid.488316.0; ISNI: 0000 0004 4912 1102)
32
Navigating the Multiverse of Antisense RNAs: The Transcription- and RNA-Dependent Dimension. Noncoding RNA 2022; 8:ncrna8060074. [PMID: 36412909 PMCID: PMC9680235 DOI: 10.3390/ncrna8060074] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/28/2022] [Revised: 10/21/2022] [Accepted: 10/23/2022] [Indexed: 12/14/2022] [Open Access]
Abstract
Evidence accumulated over the past decades shows that the number of identified antisense transcripts is continuously increasing, promoting them from transcriptional noise to real genes with specific functions. Indeed, recent studies have begun to unravel the complexity of the antisense RNA (asRNA) world, starting from the multidimensional mechanisms that they can exert in physiological and pathological conditions. In this review, we discuss the multiverse of the molecular functions of asRNAs, describing their action through transcription-dependent and RNA-dependent mechanisms. Then, we report the workflow and methodologies to study and functionally characterize single asRNA candidates.
33
Brancato V, Brentari I, Coscujuela Tarrero L, Furlan M, Nicassio F, Denti MA. News from around the RNA world: new avenues in RNA biology, biotechnology and therapeutics from the 2022 SIBBM meeting. Biol Open 2022; 11:277240. [PMID: 36239357 PMCID: PMC9581514 DOI: 10.1242/bio.059597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] [Open Access]
Abstract
Since the formalization of the Central Dogma of molecular biology, the relevance of RNA in modulating the flow of information from DNA to proteins has been clear. More recently, the discovery of a vast set of non-coding transcripts involved in crucial aspects of cellular biology has renewed the enthusiasm of the RNA community. Moreover, the remarkable impact of RNA therapies in facing the COVID19 pandemics has bolstered interest in the translational opportunities provided by this incredible molecule. For all these reasons, the Italian Society of Biophysics and Molecular Biology (SIBBM) decided to dedicate its 17th yearly meeting, held in June 2022 in Rome, to the many fascinating aspects of RNA biology. More than thirty national and international speakers covered the properties, modes of action and applications of RNA, from its role in the control of development and cell differentiation to its involvement in disease. Here, we summarize the scientific content of the conference, highlighting the take-home message of each presentation, and we stress the directions the community is currently exploring to push forward our comprehension of the RNA World 3.0.
Affiliation(s)
- Virginia Brancato
- Center for Genomic Science IIT@SEMM, Italian Institute of Technology, Milan 20139, Italy
- Ilaria Brentari
- Department of Cellular, Computational and Integrative Biology - CIBIO, University of Trento, Trento 38123, Italy
- Mattia Furlan
- Center for Genomic Science IIT@SEMM, Italian Institute of Technology, Milan 20139, Italy
- Francesco Nicassio
- Center for Genomic Science IIT@SEMM, Italian Institute of Technology, Milan 20139, Italy
- Michela A Denti
- Department of Cellular, Computational and Integrative Biology - CIBIO, University of Trento, Trento 38123, Italy
34
Cacioppo R, Lindon C. Regulating the regulator: a survey of mechanisms from transcription to translation controlling expression of mammalian cell cycle kinase Aurora A. Open Biol 2022; 12:220134. [PMID: 36067794 PMCID: PMC9448500 DOI: 10.1098/rsob.220134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] [Open Access]
Abstract
Aurora Kinase A (AURKA) is a positive regulator of mitosis with a strict cell cycle-dependent expression pattern. Recently, novel oncogenic roles of AURKA have been uncovered that are independent of the kinase activity and act within multiple signalling pathways, including cell proliferation, survival and cancer stem cell phenotypes. For this, cellular abundance of AURKA protein is per se crucial and must be tightly fine-tuned. Indeed, AURKA is found overexpressed in different cancers, typically as a result of gene amplification or enhanced transcription. It has however become clear that impaired processing, decay and translation of AURKA mRNA can also offer the basis for altered AURKA levels. Accordingly, the involvement of gene expression mechanisms controlling AURKA expression in human diseases is increasingly recognized and calls for much more research. Here, we explore and create an integrated view of the molecular processes regulating AURKA expression at the level of transcription, post-transcription and translation, intercalating discussion on how impaired regulation underlies disease. Given that targeting AURKA levels might affect more functions compared to inhibiting the kinase activity, deeper understanding of its gene expression may aid the design of alternative and therapeutically more successful ways of suppressing the AURKA oncogene.
Affiliation(s)
- Roberta Cacioppo
- Department of Pharmacology, University of Cambridge, Cambridge CB2 1PD, UK
| | - Catherine Lindon
- Department of Pharmacology, University of Cambridge, Cambridge CB2 1PD, UK
|
35
|
Rouse WB, O'Leary CA, Booher NJ, Moss WN. Expansion of the RNAStructuromeDB to include secondary structural data spanning the human protein-coding transcriptome. Sci Rep 2022; 12:14515. [PMID: 36008510 PMCID: PMC9403969 DOI: 10.1038/s41598-022-18699-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2022] [Accepted: 08/17/2022] [Indexed: 11/22/2022] Open
Abstract
RNA plays vital functional roles in almost every component of biology, and these functional roles are often influenced by its folding into secondary and tertiary structures. An important role of RNA secondary structure is in maintaining proper gene regulation; therefore, making accurate predictions of the structures involved in these processes is important. In this study, we have expanded on our previous work that led to the creation of the RNAStructuromeDB. Unlike this previous study that analyzed the human genome at low resolution, we have now scanned the protein-coding human transcriptome at high (single nt) resolution. This provides more robust structure predictions for over 100,000 isoforms of known protein-coding genes. Notably, we also utilize the motif identification tool, ScanFold, to model structures with high propensity for ordered/evolved stability. All data have been uploaded to the RNAStructuromeDB, allowing for easy searching of transcripts, visualization of data tracks (via the Integrative Genomics Viewer or IGV), and download of ScanFold data—including unique highly-ordered motifs. Herein, we provide an example analysis of MAT2A to demonstrate the utility of ScanFold at finding known and novel secondary structures, highlighting regions of potential functionality, and guiding generation of functional hypotheses through use of the data.
Affiliation(s)
- Warren B Rouse
- Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA, 50011, USA
| | - Collin A O'Leary
- Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA, 50011, USA
| | - Nicholas J Booher
- Infrastructure and Research IT Services, Iowa State University, Ames, IA, 50011, USA
| | - Walter N Moss
- Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA, 50011, USA
|
36
|
Splicing QTL analysis focusing on coding sequences reveals mechanisms for disease susceptibility loci. Nat Commun 2022; 13:4659. [PMID: 36002455 PMCID: PMC9402578 DOI: 10.1038/s41467-022-32358-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 01/06/2022] [Accepted: 07/26/2022] [Indexed: 12/26/2022] Open
Abstract
Splicing quantitative trait loci (sQTLs) are one of the major causal mechanisms in genome-wide association study (GWAS) loci, but their role in disease pathogenesis is poorly understood. One reason is the complexity of alternative splicing events, which produce many unknown isoforms. Here, we propose two approaches, namely integration and selection, to address this complexity by focusing on the protein structure of isoforms. First, we integrate isoforms with the same coding sequence (CDS) and identify 369-601 integrated-isoform ratio QTLs (i2-rQTLs), which alter protein structure, in six immune subsets. Second, we select CDS-incomplete isoforms annotated in GENCODE and identify 175-337 isoform-ratio QTLs (i-rQTLs). By comprehensive long-read capture RNA-sequencing among these incomplete isoforms, we reveal 29 full-length isoforms with unannotated CDSs associated with GWAS traits. Furthermore, we show that disease-causal sQTL genes can be identified by evaluating their trans-eQTL effects. Our approaches highlight the understudied role of protein-altering sQTLs and are broadly applicable to other tissues and diseases. Splicing QTLs (sQTLs), genetic variants regulating alternative splicing, can be biologically important but are complex to detect and interpret. Here, the authors identify sQTLs by focusing on protein-coding sequences, as an alternative to junction-based approaches.
|
37
|
Hu X, Song J, Chyr J, Wan J, Wang X, Du J, Duan J, Zhang H, Zhou X, Wu X. APAview: A web-based platform for alternative polyadenylation analyses in hematological cancers. Front Genet 2022; 13:928862. [PMID: 36035147 PMCID: PMC9411867 DOI: 10.3389/fgene.2022.928862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2022] [Accepted: 07/06/2022] [Indexed: 11/30/2022] Open
Abstract
Background: Hematologic malignancies, such as acute promyelocytic leukemia (APL) and acute myeloid leukemia (AML), are cancers that start in blood-forming tissues and can affect the blood, bone marrow, and lymph nodes. They are often caused by genetic and molecular alterations such as mutations and gene expression changes. Alternative polyadenylation (APA) is a post-transcriptional process that regulates gene expression, and dysregulation of APA contributes to hematological malignancies. RNA-sequencing-based bioinformatic methods can identify APA sites and quantify APA usages as molecular indexes to study APA roles in disease development, diagnosis, and treatment. Unfortunately, APA data pre-processing, analysis, and visualization are time-consuming, inconsistent, and laborious. A comprehensive, user-friendly tool will greatly simplify processes for APA feature screening and mining. Results: Here, we present APAview, a web-based platform to explore APA features in hematological cancers and perform APA statistical analysis. APAview server runs on Python3 with a Flask framework and a Jinja2 templating engine. For visualization, APAview client is built on Bootstrap and Plotly. Multimodal data, such as APA quantified by QAPA/DaPars, gene expression data, and clinical information, can be uploaded to APAview and analyzed interactively. Correlation, survival, and differential analyses among user-defined groups can be performed via the web interface. Using APAview, we explored APA features in two hematological cancers, APL and AML. APAview can also be applied to other diseases by uploading different experimental data.
Affiliation(s)
- Xi Hu
- The Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, China
| | - Jialin Song
- The Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, China
| | - Jacqueline Chyr
- Center for Computational Systems Medicine, School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, United States
| | - Jinping Wan
- The Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, China
| | - Xiaoyan Wang
- The Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, China
| | - Jianqiang Du
- The Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, China
| | - Junbo Duan
- The Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, China
| | - Huqin Zhang
- The Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, China
| | - Xiaobo Zhou
- Center for Computational Systems Medicine, School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, United States
| | - Xiaoming Wu
- The Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, China
- *Correspondence: Xiaoming Wu,
|
38
|
Chakraborty A, Cadix M, Relier S, Taricco N, Alaeitabar T, Devaux A, Labbé CM, Martineau S, Heneman-Masurel A, Gestraud P, Inga A, Servant N, Vagner S, Dutertre M. Compartment-specific and ELAVL1-coordinated regulation of intronic polyadenylation isoforms by doxorubicin. Genome Res 2022; 32:gr.276192.121. [PMID: 35858751 PMCID: PMC9341504 DOI: 10.1101/gr.276192.121] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/10/2021] [Accepted: 06/16/2022] [Indexed: 01/03/2023]
Abstract
Intronic polyadenylation (IPA) isoforms, which contain alternative last exons, are widely regulated in various biological processes and by many factors. However, little is known about their cytoplasmic regulation and translational status. In this study, we provide the first evidence that the genome-wide patterns of IPA isoform regulation during a biological process can be very distinct between the transcriptome and translatome, and between the nucleus and cytosol. Indeed, by 3'-seq analyses on breast cancer cells, we show that the genotoxic anticancer drug, doxorubicin, preferentially down-regulates the IPA to the last-exon (IPA:LE) isoform ratio in whole cells (as previously reported) but preferentially up-regulates it in polysomes. We further show that in nuclei, doxorubicin almost exclusively down-regulates the IPA:LE ratio, whereas in the cytosol, it preferentially up-regulates the isoform ratio, as in polysomes. Then, focusing on IPA isoforms that are up-regulated by doxorubicin in the cytosol and highly translated (up-regulated and/or abundant in polysomes), we identify several IPA isoforms that promote cell survival to doxorubicin. Mechanistically, by using an original approach of condition- and compartment-specific CLIP-seq (CCS-iCLIP) to analyze ELAVL1-RNA interactions in the nucleus and cytosol in the presence and absence of doxorubicin, as well as 3'-seq analyses upon ELAVL1 depletion, we show that the RNA-binding protein ELAVL1 mediates both nuclear down-regulation and cytosolic up-regulation of the IPA:LE isoform ratio in distinct sets of genes in response to doxorubicin. Altogether, these findings reveal differential regulation of the IPA:LE isoform ratio across subcellular compartments during drug response and its coordination by an RNA-binding protein.
Affiliation(s)
- Alina Chakraborty
- Institut Curie, Université PSL, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Université Paris-Saclay, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Equipe Labellisée Ligue Nationale Contre le Cancer, 91400 Orsay, France
| | - Mandy Cadix
- Institut Curie, Université PSL, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Université Paris-Saclay, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Equipe Labellisée Ligue Nationale Contre le Cancer, 91400 Orsay, France
- INSERM U900, Mines Paris Tech, Institut Curie, 75000 Paris, France
| | - Sébastien Relier
- Institut Curie, Université PSL, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Université Paris-Saclay, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Equipe Labellisée Ligue Nationale Contre le Cancer, 91400 Orsay, France
| | - Nicolò Taricco
- Institut Curie, Université PSL, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Université Paris-Saclay, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Equipe Labellisée Ligue Nationale Contre le Cancer, 91400 Orsay, France
| | - Tina Alaeitabar
- Institut Curie, Université PSL, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Université Paris-Saclay, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Equipe Labellisée Ligue Nationale Contre le Cancer, 91400 Orsay, France
- INSERM U900, Mines Paris Tech, Institut Curie, 75000 Paris, France
| | - Alexandre Devaux
- Institut Curie, Université PSL, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Université Paris-Saclay, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Equipe Labellisée Ligue Nationale Contre le Cancer, 91400 Orsay, France
| | - Céline M Labbé
- Institut Curie, Université PSL, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Université Paris-Saclay, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Equipe Labellisée Ligue Nationale Contre le Cancer, 91400 Orsay, France
| | - Sylvain Martineau
- Institut Curie, Université PSL, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Université Paris-Saclay, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Equipe Labellisée Ligue Nationale Contre le Cancer, 91400 Orsay, France
| | - Amélie Heneman-Masurel
- Institut Curie, Université PSL, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Université Paris-Saclay, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Equipe Labellisée Ligue Nationale Contre le Cancer, 91400 Orsay, France
| | - Pierre Gestraud
- INSERM U900, Mines Paris Tech, Institut Curie, 75000 Paris, France
| | - Alberto Inga
- Laboratory of Transcriptional Networks, Department CIBIO, University of Trento, 38123 Trento, Italy
| | - Nicolas Servant
- INSERM U900, Mines Paris Tech, Institut Curie, 75000 Paris, France
| | - Stéphan Vagner
- Institut Curie, Université PSL, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Université Paris-Saclay, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Equipe Labellisée Ligue Nationale Contre le Cancer, 91400 Orsay, France
| | - Martin Dutertre
- Institut Curie, Université PSL, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Université Paris-Saclay, CNRS UMR3348, INSERM U1278, 91400 Orsay, France
- Equipe Labellisée Ligue Nationale Contre le Cancer, 91400 Orsay, France
|
39
|
Leveraging omic features with F3UTER enables identification of unannotated 3'UTRs for synaptic genes. Nat Commun 2022; 13:2270. [PMID: 35477703 PMCID: PMC9046390 DOI: 10.1038/s41467-022-30017-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2021] [Accepted: 03/18/2022] [Indexed: 11/08/2022] Open
Abstract
There is growing evidence for the importance of 3' untranslated region (3'UTR) dependent regulatory processes. However, our current human 3'UTR catalogue is incomplete. Here, we develop a machine learning-based framework, leveraging both genomic and tissue-specific transcriptomic features to predict previously unannotated 3'UTRs. We identify unannotated 3'UTRs associated with 1,563 genes across 39 human tissues, with the greatest abundance found in the brain. These unannotated 3'UTRs are significantly enriched for RNA binding protein (RBP) motifs and exhibit high human lineage-specificity. We find that brain-specific unannotated 3'UTRs are enriched for the binding motifs of important neuronal RBPs such as TARDBP and RBFOX1, and their associated genes are involved in synaptic function. Our data is shared through an online resource, F3UTER (https://astx.shinyapps.io/F3UTER/). Overall, our data improves 3'UTR annotation and provides additional insights into the mRNA-RBP interactome in the human brain, with implications for our understanding of neurological and neurodevelopmental diseases.
|
40
|
Mikheenko A, Prjibelski AD, Joglekar A, Tilgner HU. Sequencing of individual barcoded cDNAs using Pacific Biosciences and Oxford Nanopore technologies reveals platform-specific error patterns. Genome Res 2022; 32:726-737. [PMID: 35301264 PMCID: PMC8997348 DOI: 10.1101/gr.276405.121] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 11/18/2021] [Accepted: 03/05/2022] [Indexed: 12/04/2022]
Abstract
Long-read transcriptomics require understanding error sources inherent to technologies. Current approaches cannot compare methods for an individual RNA molecule. Here, we present a novel platform-comparison method that combines barcoding strategies and long-read sequencing to sequence cDNA copies representing an individual RNA molecule on both Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT). We compare these long-read pairs in terms of sequence content and isoform patterns. Although individual read pairs show high similarity, we find differences in (1) aligned length, (2) transcription start site (TSS), (3) polyadenylation site (poly(A)-site) assignment, and (4) exon–intron structures. Overall, 25% of read pairs disagree on either TSS, poly(A)-site, or splice site. Intron-chain disagreement typically arises from alignment errors of microexons and complicated splice sites. Our single-molecule technology comparison reveals that inconsistencies are often caused by sequencing error–induced inaccurate ONT alignments, especially to downstream GUNNGU donor motifs. However, annotation-disagreeing upstream shifts in NAGNAG acceptors in ONT are often confirmed by PacBio and are thus likely real. In both barcoded and nonbarcoded ONT reads, we find that intron number and proximity of GU/AGs better predict inconsistencies with the annotation than read quality alone. We summarize these findings in an annotation-based algorithm for spliced alignment correction that improves subsequent transcript construction with ONT reads.
|
41
|
Biswas B, Guemiri R, Cadix M, Labbé CM, Chakraborty A, Dutertre M, Robert C, Vagner S. Differential Effects on the Translation of Immune-Related Alternatively Polyadenylated mRNAs in Melanoma and T Cells by eIF4A Inhibition. Cancers (Basel) 2022; 14:cancers14051177. [PMID: 35267483 PMCID: PMC8909304 DOI: 10.3390/cancers14051177] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/20/2022] [Revised: 02/20/2022] [Accepted: 02/21/2022] [Indexed: 02/05/2023] Open
Abstract
Targeting the translation initiation complex eIF4F, which binds the 5' cap of mRNAs, is a promising anti-cancer approach. Silvestrol, a small molecule inhibitor of eIF4A, the RNA helicase component of eIF4F, inhibits the translation of the mRNA encoding the signal transducer and activator of transcription 1 (STAT1) transcription factor, which, in turn, reduces the transcription of the gene encoding one of the major immune checkpoint proteins, i.e., programmed death ligand-1 (PD-L1) in melanoma cells. A large proportion of human genes produce multiple mRNAs differing in their 3'-ends through the use of alternative polyadenylation (APA) sites, which, when located in alternative last exons, can generate protein isoforms, as in the STAT1 gene. Here, we provide evidence that the STAT1α, but not STAT1β protein isoform generated by APA, is required for silvestrol-dependent inhibition of PD-L1 expression in interferon-γ-treated melanoma cells. Using polysome profiling in activated T cells we find that, beyond STAT1, eIF4A inhibition downregulates the translation of some important immune-related mRNAs, such as the ones encoding TIM-3, LAG-3, IDO1, CD27 or CD137, but with little effect on the ones for BTLA and ADAR-1 and no effect on the ones encoding CTLA-4, PD-1 and CD40-L. We next apply RT-qPCR and 3'-seq (RNA-seq focused on mRNA 3' ends) on polysomal RNAs to analyze in a high throughput manner the effect of eIF4A inhibition on the translation of APA isoforms. We identify about 150 genes, including TIM-3, LAG-3, AHNAK and SEMA4D, for which silvestrol differentially inhibits the translation of APA isoforms in T cells. It is therefore crucial to consider 3'-end mRNA heterogeneity in the understanding of the anti-tumor activities of eIF4A inhibitors.
Affiliation(s)
- Biswendu Biswas
- Institut Curie, PSL Research University, CNRS UMR 3348, INSERM U1278, 91401 Orsay, France; (B.B.); (M.C.); (C.M.L.); (A.C.); (M.D.)
- Biologie de l’ARN, Signalisation et Cancer, Université Paris Sud, Université Paris-Saclay, CNRS UMR 3348, 91401 Orsay, France
- Équipe Labellisée Ligue Contre le Cancer, 91401 Orsay, France
- INSERM U981, Gustave Roussy Cancer Campus, 94805 Villejuif, France;
- Faculté de Médecine, Université Paris Sud, Université Paris-Saclay, 94270 Kremlin-Bicêtre, France
| | - Ramdane Guemiri
- INSERM U981, Gustave Roussy Cancer Campus, 94805 Villejuif, France;
- Faculté de Médecine, Université Paris Sud, Université Paris-Saclay, 94270 Kremlin-Bicêtre, France
| | - Mandy Cadix
- Institut Curie, PSL Research University, CNRS UMR 3348, INSERM U1278, 91401 Orsay, France; (B.B.); (M.C.); (C.M.L.); (A.C.); (M.D.)
- Biologie de l’ARN, Signalisation et Cancer, Université Paris Sud, Université Paris-Saclay, CNRS UMR 3348, 91401 Orsay, France
- Équipe Labellisée Ligue Contre le Cancer, 91401 Orsay, France
| | - Céline M. Labbé
- Institut Curie, PSL Research University, CNRS UMR 3348, INSERM U1278, 91401 Orsay, France; (B.B.); (M.C.); (C.M.L.); (A.C.); (M.D.)
- Biologie de l’ARN, Signalisation et Cancer, Université Paris Sud, Université Paris-Saclay, CNRS UMR 3348, 91401 Orsay, France
- Équipe Labellisée Ligue Contre le Cancer, 91401 Orsay, France
| | - Alina Chakraborty
- Institut Curie, PSL Research University, CNRS UMR 3348, INSERM U1278, 91401 Orsay, France; (B.B.); (M.C.); (C.M.L.); (A.C.); (M.D.)
- Biologie de l’ARN, Signalisation et Cancer, Université Paris Sud, Université Paris-Saclay, CNRS UMR 3348, 91401 Orsay, France
- Équipe Labellisée Ligue Contre le Cancer, 91401 Orsay, France
| | - Martin Dutertre
- Institut Curie, PSL Research University, CNRS UMR 3348, INSERM U1278, 91401 Orsay, France; (B.B.); (M.C.); (C.M.L.); (A.C.); (M.D.)
- Biologie de l’ARN, Signalisation et Cancer, Université Paris Sud, Université Paris-Saclay, CNRS UMR 3348, 91401 Orsay, France
- Équipe Labellisée Ligue Contre le Cancer, 91401 Orsay, France
| | - Caroline Robert
- INSERM U981, Gustave Roussy Cancer Campus, 94805 Villejuif, France;
- Faculté de Médecine, Université Paris Sud, Université Paris-Saclay, 94270 Kremlin-Bicêtre, France
- Correspondence: (C.R.); (S.V.)
| | - Stéphan Vagner
- Institut Curie, PSL Research University, CNRS UMR 3348, INSERM U1278, 91401 Orsay, France; (B.B.); (M.C.); (C.M.L.); (A.C.); (M.D.)
- Biologie de l’ARN, Signalisation et Cancer, Université Paris Sud, Université Paris-Saclay, CNRS UMR 3348, 91401 Orsay, France
- Équipe Labellisée Ligue Contre le Cancer, 91401 Orsay, France
- Correspondence: (C.R.); (S.V.)
|
42
|
Implications of Poly(A) Tail Processing in Repeat Expansion Diseases. Cells 2022; 11:cells11040677. [PMID: 35203324 PMCID: PMC8870147 DOI: 10.3390/cells11040677] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2022] [Revised: 02/11/2022] [Accepted: 02/13/2022] [Indexed: 11/21/2022] Open
Abstract
Repeat expansion diseases are a group of more than 40 disorders that affect mainly the nervous and/or muscular system and include myotonic dystrophies, Huntington’s disease, and fragile X syndrome. The mutation-driven expanded repeat tract occurs in specific genes and is composed of tri- to dodeca-nucleotide-long units. Mutant mRNA is a pathogenic factor or important contributor to the disease and has great potential as a therapeutic target. Although repeat expansion diseases are quite well known, there are limited studies concerning polyadenylation events for implicated transcripts that could have profound effects on transcript stability, localization, and translation efficiency. In this review, we briefly present polyadenylation and alternative polyadenylation (APA) mechanisms and discuss their role in the pathogenesis of selected diseases. We also discuss several methods for poly(A) tail measurement (both transcript-specific and transcriptome-wide analyses) and APA site identification—the further development and use of which may contribute to a better understanding of the correlation between APA events and repeat expansion diseases. Finally, we point out some future perspectives on the research into repeat expansion diseases, as well as APA studies.
|
43
|
Mulroney L, Wulf MG, Schildkraut I, Tzertzinis G, Buswell J, Jain M, Olsen H, Diekhans M, Corrêa IR, Akeson M, Ettwiller L. Identification of high-confidence human poly(A) RNA isoform scaffolds using nanopore sequencing. RNA (NEW YORK, N.Y.) 2022; 28:162-176. [PMID: 34728536 PMCID: PMC8906549 DOI: 10.1261/rna.078703.121] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/03/2021] [Accepted: 10/13/2021] [Indexed: 06/13/2023]
Abstract
Nanopore sequencing devices read individual RNA strands directly. This facilitates identification of exon linkages and nucleotide modifications; however, using conventional direct RNA nanopore sequencing, the 5' and 3' ends of poly(A) RNA cannot be identified unambiguously. This is due in part to RNA degradation in vivo and in vitro that can obscure transcription start and end sites. In this study, we aimed to identify individual full-length human RNA isoforms among ∼4 million nanopore poly(A)-selected RNA reads. First, to identify RNA strands bearing 5' m7G caps, we exchanged the biological cap for a modified cap attached to a 45-nt oligomer. This oligomer adaptation method improved 5' end sequencing and ensured correct identification of the 5' m7G capped ends. Second, among these 5'-capped nanopore reads, we screened for features consistent with a 3' polyadenylation site. Combining these two steps, we identified 294,107 individual high-confidence full-length RNA scaffolds from human GM12878 cells, most of which (257,721) aligned to protein-coding genes. Of these, 4876 scaffolds indicated unannotated isoforms that were often internal to longer, previously identified RNA isoforms. Orthogonal data for m7G caps and open chromatin, such as CAGE and DNase-HS seq, confirmed the validity of these high-confidence RNA scaffolds.
Affiliation(s)
- Logan Mulroney
- Biomolecular Engineering Department, UC Santa Cruz, California 95064, USA
| | | | | | | | - John Buswell
- New England Biolabs, Ipswich, Massachusetts 01938, USA
| | - Miten Jain
- Biomolecular Engineering Department, UC Santa Cruz, California 95064, USA
| | - Hugh Olsen
- Biomolecular Engineering Department, UC Santa Cruz, California 95064, USA
| | - Mark Diekhans
- Genomics Institute, UC Santa Cruz, California 95064, USA
| | - Ivan R Corrêa
- New England Biolabs, Ipswich, Massachusetts 01938, USA
| | - Mark Akeson
- Biomolecular Engineering Department, UC Santa Cruz, California 95064, USA
|
44
|
Rouse WB, Andrews RJ, Booher NJ, Wang J, Woodman M, Dow E, Jessop TC, Moss WN. Prediction and analysis of functional RNA structures within the integrative genomics viewer. NAR Genom Bioinform 2022; 4:lqab127. [PMID: 35047817 PMCID: PMC8759568 DOI: 10.1093/nargab/lqab127] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/22/2021] [Revised: 12/03/2021] [Accepted: 12/22/2021] [Indexed: 12/14/2022] Open
Abstract
In recent years, interest in RNA secondary structure has exploded due to its implications in almost all biological functions and its newly appreciated capacity as a therapeutic agent/target. This surge of interest has driven the development and adaptation of many computational and biochemical methods to discover novel, functional structures across the genome/transcriptome. To further enhance efforts to study RNA secondary structure, we have integrated the functional secondary structure prediction tool ScanFold into IGV. This allows users to directly perform structure predictions and visualize results (in conjunction with probing data and other annotations) in one program. We illustrate the utility of this new tool by mapping the secondary structural landscape of the human MYC precursor mRNA. We leverage the power of vast ‘omics’ resources by comparing individually predicted structures with published data, including biochemical structure probing, RNA binding proteins, microRNA binding sites, RNA modifications, single nucleotide polymorphisms, and others, that allow functional inferences to be made and aid in the discovery of potential drug targets. This new tool offers the RNA community an easy-to-use way to find, analyze, and characterize RNA secondary structures in the context of all available data, in order to find those worthy of further analyses.
Affiliation(s)
| | | | | | | | | | | | | | - Walter N Moss
- To whom correspondence should be addressed. Tel: +1 515 294 6116;
|
45
|
Fiszbein A, McGurk M, Calvo-Roitberg E, Kim G, Burge CB, Pai AA. Widespread occurrence of hybrid internal-terminal exons in human transcriptomes. SCIENCE ADVANCES 2022; 8:eabk1752. [PMID: 35044812 PMCID: PMC8769537 DOI: 10.1126/sciadv.abk1752] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/10/2021] [Accepted: 11/23/2021] [Indexed: 06/12/2023]
Abstract
Messenger RNA isoform differences are predominantly driven by alternative first, internal, and last exons. Despite the importance of classifying exons to understand isoform structure, few tools examine isoform-specific exon usage. We recently observed that alternative transcription start sites often arise near internal exons, often creating “hybrid” first/internal exons. To systematically detect hybrid exons, we built the hybrid-internal-terminal (HIT) pipeline to classify exons depending on their isoform-specific usage. On the basis of splice junction reads in RNA sequencing data and probabilistic modeling, the HIT index identified thousands of previously misclassified hybrid first-internal and internal-last exons. Hybrid exons are enriched in long genes and genes involved in RNA splicing and have longer flanking introns and strong splice sites. Their usage varies considerably across human tissues. By developing the first method to classify exons according to isoform contexts, our findings document the occurrence of hybrid exons, a common quirk of the human transcriptome.
Affiliation(s)
- Ana Fiszbein
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Biology, Boston University, Boston, MA, USA
| | - Michael McGurk
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | | | - GyeungYun Kim
- Department of Biology, Boston University, Boston, MA, USA
| | - Christopher B. Burge
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Athma A. Pai
- RNA Therapeutics Institute, University of Massachusetts Medical School, Worcester, MA, USA
|
46
|
Zhang Y, Song J, Zhang M, Deng Z. Analysis Polyadenylation Signal Usage in Sus scrofa. Animals (Basel) 2022; 12:ani12020194. [PMID: 35049816 PMCID: PMC8773104 DOI: 10.3390/ani12020194] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/07/2021] [Revised: 01/01/2022] [Accepted: 01/10/2022] [Indexed: 12/12/2022] Open
Abstract
RNA polyadenylation is an important step in messenger RNA (mRNA) maturation, and its first step is recognition of the polyadenylation signal (PAS). PAS type and distribution are key determinants of post-transcriptional mRNA modification and gene expression. However, little is known about PAS usage and alternative polyadenylation (APA) regulation in livestock species. Recently, sequencing technology has enabled the generation of a large amount of sequencing data revealing variation in poly(A) signals and APA regulation in Sus scrofa. We identified 62,491 polyadenylation signals in Sus scrofa using expressed sequence tag (EST) sequences combined with RNA-seq analysis. The composition and usage frequency of polyadenylation signals in Sus scrofa are similar to those of human and mouse. The most highly conserved polyadenylation signals are AAUAAA and AUUAAA, used for over 63.35% of genes. In addition, we analyzed the U/GU-rich downstream sequence element (DSE), located downstream of the cleavage site. Our results indicate that APA regulation occurs widely in Sus scrofa, as in other organisms. These results are useful for the accurate annotation of RNA 3' ends in Sus scrofa, and the analysis of polyadenylation signal usage gives new insights into the mechanisms of transcriptional regulation.
Affiliation(s)
- Yuting Zhang
- School of Agricultural Sciences, Zhengzhou University, Zhengzhou 450001, China
- Jingwen Song
- School of Life Sciences, Zhengzhou University, Zhengzhou 450001, China
- Min Zhang
- School of Agricultural Sciences, Zhengzhou University, Zhengzhou 450001, China
- School of Life Sciences, Zhengzhou University, Zhengzhou 450001, China
- Zhongyuan Deng
- School of Agricultural Sciences, Zhengzhou University, Zhengzhou 450001, China
- School of Life Sciences, Zhengzhou University, Zhengzhou 450001, China
|
47
|
Salmen F, De Jonghe J, Kaminski TS, Alemany A, Parada GE, Verity-Legg J, Yanagida A, Kohler TN, Battich N, van den Brekel F, Ellermann AL, Arias AM, Nichols J, Hemberg M, Hollfelder F, van Oudenaarden A. High-throughput total RNA sequencing in single cells using VASA-seq. Nat Biotechnol 2022; 40:1780-1793. [PMID: 35760914 PMCID: PMC9750877 DOI: 10.1038/s41587-022-01361-8] [Citation(s) in RCA: 56] [Impact Index Per Article: 28.0] [Received: 11/20/2021] [Accepted: 05/13/2022] [Indexed: 01/14/2023]
Abstract
Most methods for single-cell transcriptome sequencing amplify the termini of polyadenylated transcripts, capturing only a small fraction of the total cellular transcriptome. This precludes the detection of many long non-coding, short non-coding and non-polyadenylated protein-coding transcripts and hinders alternative splicing analysis. We, therefore, developed VASA-seq to detect the total transcriptome in single cells, which is enabled by fragmenting and tailing all RNA molecules subsequent to cell lysis. The method is compatible with both plate-based formats and droplet microfluidics. We applied VASA-seq to more than 30,000 single cells in the developing mouse embryo during gastrulation and early organogenesis. Analyzing the dynamics of the total single-cell transcriptome, we discovered cell type markers, many based on non-coding RNA, and performed in vivo cell cycle analysis via detection of non-polyadenylated histone genes. RNA velocity characterization was improved, accurately retracing blood maturation trajectories. Moreover, our VASA-seq data provide a comprehensive analysis of alternative splicing during mammalian development, which highlighted substantial rearrangements during blood development and heart morphogenesis.
Affiliation(s)
- Fredrik Salmen
- Hubrecht Institute-KNAW (Royal Netherlands Academy of Arts and Sciences) and University Medical Center, Utrecht, Netherlands; Oncode Institute, Utrecht, Netherlands
- Joachim De Jonghe
- Department of Biochemistry, University of Cambridge, Cambridge, UK; Present Address: Francis Crick Institute, London, UK
- Tomasz S. Kaminski
- Department of Biochemistry, University of Cambridge, Cambridge, UK; Present Address: Department of Environmental Microbiology and Biotechnology, Institute of Microbiology, Faculty of Biology, University of Warsaw, Warsaw, Poland
- Anna Alemany
- Hubrecht Institute-KNAW (Royal Netherlands Academy of Arts and Sciences) and University Medical Center, Utrecht, Netherlands; Oncode Institute, Utrecht, Netherlands
- Guillermo E. Parada
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Joe Verity-Legg
- Hubrecht Institute-KNAW (Royal Netherlands Academy of Arts and Sciences) and University Medical Center, Utrecht, Netherlands; Oncode Institute, Utrecht, Netherlands
- Ayaka Yanagida
- Division of Stem Cell Therapy, Center for Stem Cell Biology and Regenerative Medicine, Institute of Medical Science, University of Tokyo, Tokyo, Japan
- Timo N. Kohler
- Department of Biochemistry, University of Cambridge, Cambridge, UK; Wellcome Trust – Medical Research Council Stem Cell Institute, University of Cambridge, Jeffrey Cheah Biomedical Centre, Cambridge, UK
- Nicholas Battich
- Hubrecht Institute-KNAW (Royal Netherlands Academy of Arts and Sciences) and University Medical Center, Utrecht, Netherlands; Oncode Institute, Utrecht, Netherlands
- Floris van den Brekel
- Hubrecht Institute-KNAW (Royal Netherlands Academy of Arts and Sciences) and University Medical Center, Utrecht, Netherlands; Oncode Institute, Utrecht, Netherlands
- Anna L. Ellermann
- Department of Biochemistry, University of Cambridge, Cambridge, UK
- Alfonso Martinez Arias
- Systems Bioengineering, DCEXS, Universidad Pompeu Fabra, Doctor Aiguader 88 ICREA (Institució Catalana de Recerca i Estudis Avançats), Barcelona, Spain
- Jennifer Nichols
- Wellcome Trust – Medical Research Council Stem Cell Institute, University of Cambridge, Jeffrey Cheah Biomedical Centre, Cambridge, UK; Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
- Martin Hemberg
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK; Evergrande Center for Immunologic Diseases, Harvard Medical School and Brigham and Women’s Hospital, Boston, MA, USA
- Florian Hollfelder
- Department of Biochemistry, University of Cambridge, Cambridge, UK
- Alexander van Oudenaarden
- Hubrecht Institute-KNAW (Royal Netherlands Academy of Arts and Sciences) and University Medical Center, Utrecht, Netherlands; Oncode Institute, Utrecht, Netherlands
|
48
|
Shah A, Mittleman BE, Gilad Y, Li YI. Benchmarking sequencing methods and tools that facilitate the study of alternative polyadenylation. Genome Biol 2021; 22:291. [PMID: 34649612 PMCID: PMC8518154 DOI: 10.1186/s13059-021-02502-z] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 02/05/2021] [Accepted: 09/16/2021] [Indexed: 12/02/2022]
Abstract
BACKGROUND: Alternative cleavage and polyadenylation (APA), an RNA processing event, occurs in over 70% of human protein-coding genes. APA results in mRNA transcripts with distinct 3' ends. Most APA occurs within 3' UTRs, which harbor regulatory elements that can impact mRNA stability, translation, and localization. RESULTS: APA can be profiled using a number of established computational tools that infer polyadenylation sites from standard, short-read RNA-seq datasets. Here, we benchmarked a number of such tools (TAPAS, QAPA, DaPars2, GETUTR, and APATrap) against 3'-Seq, a specialized RNA-seq protocol that enriches for reads at the 3' ends of genes, and Iso-Seq, a Pacific Biosciences (PacBio) single-molecule full-length RNA-seq method, in their ability to identify polyadenylation sites and quantify polyadenylation site usage. We demonstrate that 3'-Seq and Iso-Seq are able to identify and quantify the usage of polyadenylation sites more reliably than computational tools that take short-read RNA-seq as input. However, we find that running one such tool, QAPA, with a set of polyadenylation site annotations derived from small quantities of 3'-Seq or Iso-Seq can reliably quantify variation in APA across conditions, such as across genotypes, as demonstrated by the successful mapping of alternative polyadenylation quantitative trait loci (apaQTL). CONCLUSIONS: We envisage that our analyses will shed light on the advantages of studying APA with more specialized sequencing protocols, such as 3'-Seq or Iso-Seq, and the limitations of studying APA with short-read RNA-seq. We provide a computational pipeline to aid in the identification of polyadenylation sites and quantification of polyadenylation site usage using Iso-Seq data as input.
Affiliation(s)
- Ankeeta Shah
- Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL, USA
- Briana E Mittleman
- Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL, USA
- Yoav Gilad
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Yang I Li
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
|
49
|
Zhao Z, Xu Q, Wei R, Huang L, Wang W, Wei G, Ni T. Comprehensive characterization of somatic variants associated with intronic polyadenylation in human cancers. Nucleic Acids Res 2021; 49:10369-10381. [PMID: 34508351 PMCID: PMC8501991 DOI: 10.1093/nar/gkab772] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/10/2020] [Revised: 08/16/2021] [Accepted: 08/26/2021] [Indexed: 11/29/2022]
Abstract
Somatic single nucleotide variants (SNVs) in cancer genomes affect gene expression through various mechanisms depending on their genomic location. While somatic SNVs near canonical splice sites have been reported to cause abnormal splicing of cancer-related genes, whether these SNVs can affect gene expression through other mechanisms remains an open question. Here, we analyzed RNA sequencing and exome data from 4,998 cancer patients covering ten cancer types and identified 152 somatic SNVs near splice sites that were associated with abnormal intronic polyadenylation (IPA). IPA-associated somatic variants were preferentially located near donor splice sites rather than acceptor splice sites. A proportion of SNV-associated IPA events overlapped with premature cleavage and polyadenylation events triggered by U1 small nuclear ribonucleoprotein (snRNP) inhibition. GC content, intron length and polyadenylation signal were three genomic features that differentiated between SNV-associated IPA and intron retention. Notably, IPA-associated SNVs were enriched in tumor suppressor genes (TSGs), including well-known TSGs such as PTEN and CDH1 with recurrent SNV-associated IPA events. Minigene assays confirmed that SNVs from PTEN, CDH1, VEGFA, GRHL2, CUL3 and WWC2 could lead to IPA. This work reveals that IPA acts as a novel mechanism explaining the functional consequence of somatic SNVs in human cancer.
Affiliation(s)
- Zhaozhao Zhao
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, School of Life Sciences and Huashan Hospital, Fudan University, Shanghai 200438, P.R. China
- Qiushi Xu
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, School of Life Sciences and Huashan Hospital, Fudan University, Shanghai 200438, P.R. China
- Ran Wei
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, School of Life Sciences and Huashan Hospital, Fudan University, Shanghai 200438, P.R. China; Department of Pathology, Fudan University Shanghai Cancer Center, Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, P.R. China
- Leihuan Huang
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, School of Life Sciences and Huashan Hospital, Fudan University, Shanghai 200438, P.R. China
- Weixu Wang
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, School of Life Sciences and Huashan Hospital, Fudan University, Shanghai 200438, P.R. China
- Gang Wei
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, School of Life Sciences and Huashan Hospital, Fudan University, Shanghai 200438, P.R. China; MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai 200438, P.R. China
- Ting Ni
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, School of Life Sciences and Huashan Hospital, Fudan University, Shanghai 200438, P.R. China; Shanghai Engineering Research Center of Industrial Microorganisms, School of Life Sciences, Fudan University, Shanghai 200438, P.R. China
|
50
|
Zhao Z, Xu Q, Wei R, Wang W, Ding D, Yang Y, Yao J, Zhang L, Hu YQ, Wei G, Ni T. Cancer-associated dynamics and potential regulators of intronic polyadenylation revealed by IPAFinder using standard RNA-seq data. Genome Res 2021; 31:2095-2106. [PMID: 34475268 PMCID: PMC8559711 DOI: 10.1101/gr.271627.120] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 09/23/2020] [Accepted: 08/31/2021] [Indexed: 12/19/2022]
Abstract
Intronic polyadenylation (IpA) usually leads to changes in the coding region of an mRNA, and its involvement in disease has been recognized, although this understanding is still at a very early stage. Identifying IpA conveniently and accurately is of great importance for further evaluating its biological significance. Here, we developed IPAFinder, a bioinformatic method for the de novo identification of intronic poly(A) sites and their dynamic changes from standard RNA-seq data. Applying IPAFinder to 256 pan-cancer tumor/normal pairs across six tumor types, we discovered 490 recurrent dynamically changed IpA events, some of which are novel and derived from cancer-associated genes such as TSC1, SPERD2, and CCND2. Furthermore, IPAFinder revealed that IpA could be regulated by factors related to splicing and m6A modification. In summary, IPAFinder enables the global discovery and characterization of biologically regulated IpA with standard RNA-seq data and should reveal the biological significance of IpA in various processes.
Affiliation(s)
- Zhaozhao Zhao
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, School of Life Sciences and Huashan Hospital, Fudan University, Shanghai 200438, P.R. China
- Qiushi Xu
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, School of Life Sciences and Huashan Hospital, Fudan University, Shanghai 200438, P.R. China
- Ran Wei
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, School of Life Sciences and Huashan Hospital, Fudan University, Shanghai 200438, P.R. China
- Weixu Wang
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, School of Life Sciences and Huashan Hospital, Fudan University, Shanghai 200438, P.R. China
- Dong Ding
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, School of Life Sciences and Huashan Hospital, Fudan University, Shanghai 200438, P.R. China
- Yu Yang
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, School of Life Sciences and Huashan Hospital, Fudan University, Shanghai 200438, P.R. China
- Jun Yao
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, School of Life Sciences and Huashan Hospital, Fudan University, Shanghai 200438, P.R. China
- Liye Zhang
- School of Life Science and Technology, ShanghaiTech University, Shanghai 200438, P.R. China
- Yue-Qing Hu
- State Key Laboratory of Genetic Engineering, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai 200438, P.R. China
- Gang Wei
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, School of Life Sciences and Huashan Hospital, Fudan University, Shanghai 200438, P.R. China
- Ting Ni
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, School of Life Sciences and Huashan Hospital, Fudan University, Shanghai 200438, P.R. China
|