1
Timing is everything: advances in quantifying splicing kinetics. Trends Cell Biol 2024:S0962-8924(24)00070-9. [PMID: 38777664 DOI: 10.1016/j.tcb.2024.03.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 03/26/2024] [Accepted: 03/27/2024] [Indexed: 05/25/2024]
Abstract
Splicing is a highly regulated process critical for proper pre-mRNA maturation and the maintenance of a healthy cellular environment. Splicing events are impacted by ongoing transcription, neighboring splicing events, and cis and trans regulatory factors on the respective pre-mRNA transcript. Within this complex regulatory environment, splicing kinetics have the potential to influence splicing outcomes but have historically been challenging to study in vivo. In this review, we highlight recent technological advancements that have enabled measurements of global splicing kinetics and of the variability of splicing kinetics at single introns. We demonstrate how identifying features that are correlated with splicing kinetics has increased our ability to form potential models for how splicing kinetics may be regulated in vivo.
2
Co-transcriptional gene regulation in eukaryotes and prokaryotes. Nat Rev Mol Cell Biol 2024:10.1038/s41580-024-00706-2. [PMID: 38509203 DOI: 10.1038/s41580-024-00706-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/19/2024] [Indexed: 03/22/2024]
Abstract
Many steps of RNA processing occur during transcription by RNA polymerases. Co-transcriptional activities are deemed commonplace in prokaryotes, in which the lack of membrane barriers allows mixing of all gene expression steps, from transcription to translation. In the past decade, an extraordinary level of coordination between transcription and RNA processing has emerged in eukaryotes. In this Review, we discuss recent developments in our understanding of co-transcriptional gene regulation in both eukaryotes and prokaryotes, comparing methodologies and mechanisms, and highlight striking parallels in how RNA polymerases interact with the machineries that act on nascent RNA. The development of RNA sequencing and imaging techniques that detect transient transcription and RNA processing intermediates has facilitated discoveries of transcription coordination with splicing, 3'-end cleavage and dynamic RNA folding and revealed physical contacts between processing machineries and RNA polymerases. Such studies indicate that intron retention in a given nascent transcript can prevent 3'-end cleavage and cause transcriptional readthrough, which is a hallmark of eukaryotic cellular stress responses. We also discuss how coordination between nascent RNA biogenesis and transcription drives fundamental aspects of gene expression in both prokaryotes and eukaryotes.
3
Genome-wide kinetic profiling of pre-mRNA 3' end cleavage. RNA 2024; 30:256-270. [PMID: 38164598 PMCID: PMC10870368 DOI: 10.1261/rna.079783.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2023] [Accepted: 12/13/2023] [Indexed: 01/03/2024]
Abstract
Cleavage and polyadenylation is necessary for the formation of mature mRNA molecules. The rate at which this process occurs can determine the temporal availability of mRNA for subsequent function throughout the cell and is likely tightly regulated. Despite advances in high-throughput approaches for global kinetic profiling of RNA maturation, genome-wide 3' end cleavage rates have never been measured. Here, we describe a novel approach to estimate the rates of cleavage, using metabolic labeling of nascent RNA, high-throughput sequencing, and mathematical modeling. Using in silico simulations of nascent RNA-seq data, we show that our approach can accurately and precisely estimate cleavage half-lives for both constitutive and alternative sites. We find that 3' end cleavage is fast on average, with half-lives under a minute, but highly variable across individual sites. Rapid cleavage is promoted by the presence of canonical sequence elements and an increased density of polyadenylation signals near a cleavage site. Finally, we find that cleavage rates are associated with the localization of RNA polymerase II at the end of a gene, and faster cleavage leads to quicker degradation of downstream readthrough RNA. Our findings shed light on the features important for efficient 3' end cleavage and the regulation of transcription termination.
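As a rough illustration of what a cleavage half-life implies, the sketch below converts a half-life into a rate constant and a remaining uncleaved fraction. It assumes simple first-order (exponential) kinetics, which is an assumption made here for illustration and not necessarily the model used in the paper; the function names are hypothetical.

```python
import math


def rate_from_half_life(half_life_min: float) -> float:
    """Return the first-order rate constant (per minute) for a half-life in minutes."""
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2) / half_life_min


def fraction_uncleaved(half_life_min: float, elapsed_min: float) -> float:
    """Fraction of transcripts still uncleaved after elapsed_min minutes.

    Assumes exponential decay of the uncleaved pool (illustrative assumption).
    """
    return math.exp(-rate_from_half_life(half_life_min) * elapsed_min)


# Example: with a 30-second half-life, half of the transcripts remain
# uncleaved after 30 seconds and a quarter after one minute.
half_after_30s = fraction_uncleaved(0.5, 0.5)
quarter_after_60s = fraction_uncleaved(0.5, 1.0)
```

Under this assumption, a half-life below one minute corresponds to a rate constant above roughly 0.69 per minute.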
4
Transcription readthrough is prevalent in healthy human tissues and associated with inherent genomic features. Commun Biol 2024; 7:100. [PMID: 38225287 PMCID: PMC10789751 DOI: 10.1038/s42003-024-05779-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Accepted: 01/04/2024] [Indexed: 01/17/2024]
Abstract
Transcription termination is a crucial step in the production of conforming mRNAs and functional proteins. Under cellular stress conditions, the transcription machinery fails to identify the termination site and continues transcribing beyond gene boundaries, a phenomenon designated as transcription readthrough. However, the prevalence and impact of this phenomenon in healthy human tissues remain unexplored. Here, we assessed transcription readthrough in almost 3000 transcriptome profiles representing 23 human tissues and found that 34% of the expressed protein-coding genes produced readthrough transcripts. The production of readthrough transcripts was restricted in genomic regions with high transcriptional activity and was associated with inefficient splicing and increased chromatin accessibility in terminal regions. In addition, we showed that these transcripts contained several binding sites for the same miRNA, unravelling a potential role as miRNA sponges. Overall, this work provides evidence that transcription readthrough is pervasive and non-stochastic, not only in abnormal conditions but also in healthy tissues. This suggests a potential role for such transcripts in modulating normal cellular functions.
5
Splicing quality control mediated by DHX15 and its G-patch activator SUGP1. Cell Rep 2023; 42:113223. [PMID: 37805921 PMCID: PMC10842378 DOI: 10.1016/j.celrep.2023.113223] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/14/2022] [Revised: 07/27/2023] [Accepted: 09/20/2023] [Indexed: 10/10/2023]
Abstract
Pre-mRNA splicing is surveilled at different stages by quality control (QC) mechanisms. The leukemia-associated DExH-box family helicase hDHX15/scPrp43 is known to disassemble spliceosomes after splicing. Here, using rapid protein depletion and analysis of nascent and mature RNA to enrich for direct effects, we identify a widespread splicing QC function for DHX15 in human cells, consistent with recent in vitro studies. We find that suboptimal introns with weak splice sites, multiple branch points, and cryptic introns are repressed by DHX15, suggesting a general role in promoting splicing fidelity. We identify SUGP1 as a G-patch factor that activates DHX15's splicing QC function. This interaction is dependent on both DHX15's ATPase activity and on SUGP1's U2AF ligand motif (ULM) domain. Together, our results support a model in which DHX15 plays a major role in splicing QC when recruited and activated by SUGP1.
6
Genome-wide probing of eukaryotic nascent RNA structure elucidates cotranscriptional folding and its antimutagenic effect. Nat Commun 2023; 14:5853. [PMID: 37730811 PMCID: PMC10511511 DOI: 10.1038/s41467-023-41550-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/23/2022] [Accepted: 09/08/2023] [Indexed: 09/22/2023]
Abstract
The transcriptional intermediates of RNAs fold into secondary structures with multiple regulatory roles, yet the details of such cotranscriptional RNA folding are largely unresolved in eukaryotes. Here, we present eSPET-seq (Structural Probing of Elongating Transcripts in eukaryotes), a method to assess the cotranscriptional RNA folding in Saccharomyces cerevisiae. Our study reveals pervasive structural transitions during cotranscriptional folding and overall structural similarities between nascent and mature RNAs. Furthermore, a combined analysis with genome-wide R-loop and mutation rate approximations provides quantitative evidence for the antimutator effect of nascent RNA folding through competitive inhibition of the R-loops, known to facilitate transcription-associated mutagenesis. Taken together, we present an experimental evaluation of cotranscriptional folding in eukaryotes and demonstrate the antimutator effect of nascent RNA folding. These results suggest genome-wide coupling between the processing and transmission of genetic information through RNA folding.
7
Coupling of co-transcriptional splicing and 3' end Pol II pausing during termination in Arabidopsis. Genome Biol 2023; 24:206. [PMID: 37697420 PMCID: PMC10496290 DOI: 10.1186/s13059-023-03050-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2022] [Accepted: 09/04/2023] [Indexed: 09/13/2023]
Abstract
BACKGROUND: In Arabidopsis, RNA Polymerase II (Pol II) often pauses within a few hundred base pairs downstream of the polyadenylation site, reflecting efficient transcriptional termination, but how such pausing is regulated remains largely elusive. RESULTS: Here, we analyze Pol II dynamics at 3' ends by combining comprehensive experiments with mathematical modelling. We generate high-resolution serine 2 phosphorylated (Ser2P) Pol II positioning data specifically enriched at 3' ends and define a 3' end pause index (3'PI). The position but not the extent of the 3' end pause correlates with the termination window size. The 3'PI is not decreased but is even mildly increased in the termination-deficient mutant xrn3, indicating that the 3' end pause is a regulatory step that occurs early in termination, before the XRN3-mediated RNA decay that releases Pol II. Unexpectedly, 3'PI is closely associated with gene exon number and co-transcriptional splicing efficiency. Multi-exon genes often display stronger 3' end pauses and more efficient on-chromatin splicing than genes with fewer exons. Chemical inhibition of splicing strongly reduces the 3'PI and disrupts its correlation with exon number but does not globally impact 3' end readthrough levels. These results are further confirmed by fitting Pol II positioning data with a mathematical model, which enables the estimation of parameters that define Pol II dynamics. CONCLUSION: Our work highlights that the number of exons, acting via co-transcriptional splicing, is a major determinant of Pol II pausing levels at the 3' end of genes in plants.
8
Coordination of alternative splicing and alternative polyadenylation revealed by targeted long read sequencing. Nat Commun 2023; 14:5506. [PMID: 37679364 PMCID: PMC10484994 DOI: 10.1038/s41467-023-41207-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/09/2021] [Accepted: 08/25/2023] [Indexed: 09/09/2023]
Abstract
Nervous system development is associated with extensive regulation of alternative splicing (AS) and alternative polyadenylation (APA). AS and APA have been extensively studied in isolation, but little is known about how these processes are coordinated. Here, the coordination of cassette exon (CE) splicing and APA in Drosophila was investigated using a targeted long-read sequencing approach we call Pull-a-Long-Seq (PL-Seq). This cost-effective method uses cDNA pulldown and Nanopore sequencing combined with an analysis pipeline to quantify inclusion of alternative exons in connection with alternative 3' ends. Using PL-Seq, we identified genes that exhibit significant differences in CE splicing depending on connectivity to short versus long 3'UTRs. Genomic long 3'UTR deletion was found to alter upstream CE splicing in short 3'UTR isoforms and ELAV loss differentially affected CE splicing depending on connectivity to alternative 3'UTRs. This work highlights the importance of considering connectivity to alternative 3'UTRs when monitoring AS events.
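The idea of quantifying cassette exon inclusion separately for each 3' end can be sketched as follows. This is a minimal illustration on made-up reads, not the PL-Seq analysis pipeline; the function name, the read representation, and the "short"/"long" labels are all hypothetical.

```python
from collections import defaultdict


def inclusion_by_three_prime_end(reads):
    """Compute the fraction of reads that include a cassette exon, per 3' end.

    reads: iterable of (utr_label, exon_included) pairs, one pair per long read
    (a hypothetical representation chosen for this sketch).
    Returns a dict mapping each utr_label to its exon inclusion fraction.
    """
    counts = defaultdict(lambda: [0, 0])  # utr_label -> [included, total]
    for utr_label, exon_included in reads:
        counts[utr_label][0] += int(bool(exon_included))
        counts[utr_label][1] += 1
    return {label: inc / total for label, (inc, total) in counts.items()}


# Toy data (invented for illustration): each read reports its 3'UTR isoform
# and whether the cassette exon was present on that same molecule.
toy_reads = [
    ("short", True), ("short", False), ("short", False), ("short", False),
    ("long", True), ("long", True), ("long", True), ("long", False),
]
psi = inclusion_by_three_prime_end(toy_reads)
```

In this toy example the exon is included in a quarter of short-3'UTR reads and three quarters of long-3'UTR reads, the kind of connectivity-dependent difference that short reads, which rarely span both the exon and the 3' end, cannot resolve.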
9
Pre-mRNA splicing and its cotranscriptional connections. Trends Genet 2023; 39:672-685. [PMID: 37236814 PMCID: PMC10524715 DOI: 10.1016/j.tig.2023.04.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/18/2023] [Revised: 04/23/2023] [Accepted: 04/26/2023] [Indexed: 05/28/2023]
Abstract
Transcription of eukaryotic genes by RNA polymerase II (Pol II) yields RNA precursors containing introns that must be spliced out and the flanking exons ligated together. Splicing is catalyzed by a dynamic ribonucleoprotein complex called the spliceosome. Recent evidence has shown that a large fraction of splicing occurs cotranscriptionally as the RNA chain is extruded from Pol II at speeds of up to 5 kb/minute. Splicing is more efficient when it is tethered to the transcription elongation complex, and this linkage permits functional coupling of splicing with transcription. We discuss recent progress that has uncovered a network of connections that link splicing to transcript elongation and other cotranscriptional RNA processing events.
10
Abstract
Formation of the 3' end of a eukaryotic mRNA is a key step in the production of a mature transcript. This process is mediated by a number of protein factors that cleave the pre-mRNA, add a poly(A) tail, and regulate transcription by protein dephosphorylation. Cleavage and polyadenylation specificity factor (CPSF) in humans, or cleavage and polyadenylation factor (CPF) in yeast, coordinates these enzymatic activities with each other, with RNA recognition, and with transcription. The site of pre-mRNA cleavage can strongly influence the translation, stability, and localization of the mRNA. Hence, cleavage site selection is highly regulated. The length of the poly(A) tail is also controlled to ensure that every transcript has a similar tail when it is exported from the nucleus. In this review, we summarize new mechanistic insights into mRNA 3'-end processing obtained through structural studies and biochemical reconstitution and outline outstanding questions in the field.
11
U1 snRNP increases RNA Pol II elongation rate to enable synthesis of long genes. Mol Cell 2023; 83:1264-1279.e10. [PMID: 36965480 PMCID: PMC10135401 DOI: 10.1016/j.molcel.2023.03.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 11/28/2022] [Revised: 02/06/2023] [Accepted: 02/28/2023] [Indexed: 03/27/2023]
Abstract
The expansion of introns within mammalian genomes poses a challenge for the production of full-length messenger RNAs (mRNAs), with increasing evidence that these long AT-rich sequences present obstacles to transcription. Here, we investigate RNA polymerase II (RNAPII) elongation at high resolution in mammalian cells and demonstrate that RNAPII transcribes faster across introns. Moreover, we find that this acceleration requires the association of U1 snRNP (U1) with the elongation complex at 5' splice sites. The role of U1 to stimulate elongation rate through introns reduces the frequency of both premature termination and transcriptional arrest, thereby dramatically increasing RNA production. We further show that changes in RNAPII elongation rate due to AT content and U1 binding explain previous reports of pausing or termination at splice junctions and the edge of CpG islands. We propose that U1-mediated acceleration of elongation has evolved to mitigate the risks that long AT-rich introns pose to transcript completion.
12
Ageing-associated changes in transcriptional elongation influence longevity. Nature 2023; 616:814-821. [PMID: 37046086 PMCID: PMC10132977 DOI: 10.1038/s41586-023-05922-y] [Citation(s) in RCA: 33] [Impact Index Per Article: 33.0] [Received: 09/10/2019] [Accepted: 03/07/2023] [Indexed: 04/14/2023]
Abstract
Physiological homeostasis becomes compromised during ageing, as a result of impairment of cellular processes, including transcription and RNA splicing (refs. 1-4). However, the molecular mechanisms leading to the loss of transcriptional fidelity are so far elusive, as are ways of preventing it. Here we profiled and analysed genome-wide, ageing-related changes in transcriptional processes across different organisms: nematodes, fruitflies, mice, rats and humans. The average transcriptional elongation speed (RNA polymerase II speed) increased with age in all five species. Along with these changes in elongation speed, we observed changes in splicing, including a reduction of unspliced transcripts and the formation of more circular RNAs. Two lifespan-extending interventions, dietary restriction and lowered insulin-IGF signalling, both reversed most of these ageing-related changes. Genetic variants in RNA polymerase II that reduced its speed in worms (ref. 5) and flies (ref. 6) increased their lifespan. Similarly, reducing the speed of RNA polymerase II by overexpressing histone components, to counter age-associated changes in nucleosome positioning, also extended lifespan in flies and the division potential of human cells. Our findings uncover fundamental molecular mechanisms underlying animal ageing and lifespan-extending interventions, and point to possible preventive measures.
13
Coordination of Alternative Splicing and Alternative Polyadenylation revealed by Targeted Long-Read Sequencing. bioRxiv 2023:2023.03.23.533999. [PMID: 36993601 PMCID: PMC10055423 DOI: 10.1101/2023.03.23.533999] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Nervous system development is associated with extensive regulation of alternative splicing (AS) and alternative polyadenylation (APA). AS and APA have been extensively studied in isolation, but little is known about how these processes are coordinated. Here, the coordination of cassette exon (CE) splicing and APA in Drosophila was investigated using a targeted long-read sequencing approach we call Pull-a-Long-Seq (PL-Seq). This cost-effective method uses cDNA pulldown and Nanopore sequencing combined with an analysis pipeline to resolve the connectivity of alternative exons to alternative 3' ends. Using PL-Seq, we identified genes that exhibit significant differences in CE splicing depending on connectivity to short versus long 3'UTRs. Genomic long 3'UTR deletion was found to alter upstream CE splicing in short 3'UTR isoforms and ELAV loss differentially affected CE splicing depending on connectivity to alternative 3'UTRs. This work highlights the importance of considering connectivity to alternative 3'UTRs when monitoring AS events.
14
Profiling lariat intermediates reveals genetic determinants of early and late co-transcriptional splicing. Mol Cell 2022; 82:4681-4699.e8. [PMID: 36435176 PMCID: PMC10448999 DOI: 10.1016/j.molcel.2022.11.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 12/09/2021] [Revised: 09/10/2022] [Accepted: 11/02/2022] [Indexed: 11/27/2022]
Abstract
Long introns with short exons in vertebrate genes are thought to require spliceosome assembly across exons (exon definition), rather than introns, thereby requiring transcription of an exon to splice an upstream intron. Here, we developed CoLa-seq (co-transcriptional lariat sequencing) to investigate the timing and determinants of co-transcriptional splicing genome wide. Unexpectedly, 90% of all introns, including long introns, can splice before transcription of a downstream exon, indicating that exon definition is not obligatory for most human introns. Still, splicing timing varies dramatically across introns, and various genetic elements determine this variation. Strong U2AF2 binding to the polypyrimidine tract predicts early splicing, explaining exon definition-independent splicing. Together, our findings question the essentiality of exon definition and reveal features beyond intron and exon length that are determinative for splicing timing.
15
Genome-wide characterization of nascent RNA processing in plants. Curr Opin Plant Biol 2022; 69:102294. [PMID: 36063636 DOI: 10.1016/j.pbi.2022.102294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2022] [Revised: 07/29/2022] [Accepted: 07/29/2022] [Indexed: 06/15/2023]
Abstract
Following transcription initiation, RNA polymerase II (Pol II) elongates through the genic region and terminates after the polyadenylation signal. This process is accompanied by splicing, 3' cleavage, and polyadenylation, to eventually form a mature mRNA. Recent advances in short-read and long-read high-throughput sequencing methods have shed light on the global landscape of these co-transcriptional events at nucleotide resolution. In this mini review, we summarize recent developments in genome-wide approaches that broadened our understanding of nascent RNA processing in plants.
16
Abstract
Transcription elongation by RNA polymerase II (Pol II) has emerged as a regulatory hub in gene expression. A key control point occurs during early transcription elongation when Pol II pauses in the promoter-proximal region at the majority of genes in mammalian cells and at a large set of genes in Drosophila. An increasing number of trans-acting factors have been linked to promoter-proximal pausing. Some factors help to establish the pause, whereas others are required for the release of Pol II into productive elongation. A dysfunction of this elongation control point leads to aberrant gene expression and can contribute to disease development. The BET bromodomain protein BRD4 has been implicated in elongation control. However, only recently direct BRD4-specific functions in Pol II transcription elongation have been uncovered. This mainly became possible with technological advances that allow selective and rapid ablation of BRD4 in cells along with the availability of approaches that capture the immediate consequences on nascent transcription. This review sheds light on the experimental breakthroughs that led to the emerging view of BRD4 as a general regulator of transcription elongation.
17
Altiratinib blocks Toxoplasma gondii and Plasmodium falciparum development by selectively targeting a spliceosome kinase. Sci Transl Med 2022; 14:eabn3231. [PMID: 35921477 DOI: 10.1126/scitranslmed.abn3231] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 11/02/2022]
Abstract
The Apicomplexa comprise a large phylum of single-celled, obligate intracellular protozoa that include Toxoplasma gondii, Plasmodium, and Cryptosporidium spp., which infect humans and animals and cause severe parasitic diseases. Available therapeutics against these diseases are limited by suboptimal efficacy and frequent side effects, as well as the emergence and spread of resistance. We use a drug repurposing strategy and identify altiratinib, a compound originally developed to treat glioblastoma, as a promising drug candidate with broad spectrum activity against apicomplexans. Altiratinib is parasiticidal and blocks the development of intracellular zoites in the nanomolar range and with a high selectivity index when used against T. gondii. We have identified TgPRP4K of T. gondii as the primary target of altiratinib using genetic target deconvolution, which highlighted key residues within the kinase catalytic site that conferred drug resistance when mutated. We have further elucidated the molecular basis of the inhibitory mechanism and species selectivity of altiratinib for TgPRP4K and for its Plasmodium falciparum counterpart, PfCLK3. Our data identified structural features critical for binding of the other PfCLK3 inhibitor, TCMDC-135051. Consistent with the splicing control activity of this kinase family, we have shown that altiratinib can cause global disruption of splicing, primarily through intron retention in both T. gondii and P. falciparum. Thus, our data establish parasitic PRP4K/CLK3 as a potential pan-apicomplexan target whose repertoire of inhibitors can be expanded by the addition of altiratinib.
18
Comprehensive analysis of the circadian nuclear and cytoplasmic transcriptome in mouse liver. PLoS Genet 2022; 18:e1009903. [PMID: 35921362 PMCID: PMC9377612 DOI: 10.1371/journal.pgen.1009903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2021] [Revised: 08/15/2022] [Accepted: 07/06/2022] [Indexed: 11/19/2022]
Abstract
In eukaryotes, RNA is synthesised in the nucleus, spliced, and exported to the cytoplasm where it is translated and finally degraded. Any of these steps could be subject to temporal regulation during the circadian cycle, resulting in daily fluctuations of RNA accumulation and affecting the distribution of transcripts in different subcellular compartments. Our study analysed the nuclear and cytoplasmic, poly(A) and total transcriptomes of mouse livers collected over the course of a day. These data provide a genome-wide temporal inventory of enrichment in subcellular RNA, and revealed specific signatures of splicing, nuclear export and cytoplasmic mRNA stability related to transcript and gene lengths. Combined with a mathematical model describing rhythmic RNA profiles, we could test the rhythmicity of export rates and cytoplasmic degradation rates of approximately 1400 genes. With nuclear export times usually much shorter than cytoplasmic half-lives, we found that nuclear export contributes to the modulation and generation of rhythmic profiles of 10% of the cycling nuclear mRNAs. This study contributes to a better understanding of the dynamic regulation of the transcriptome during the day-night cycle.
19
Transcription and genome integrity. DNA Repair (Amst) 2022; 118:103373. [PMID: 35914488 DOI: 10.1016/j.dnarep.2022.103373] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/22/2022] [Revised: 07/16/2022] [Accepted: 07/17/2022] [Indexed: 11/03/2022]
Abstract
Transcription can cause genome instability by promoting R-loop formation, but it can also act as a mutation-suppressing machinery by sensing DNA lesions, leading to the activation of DNA damage signaling and transcription-coupled repair. Recovery of RNA synthesis following the repair of transcription-blocking lesions is critical to avoid apoptosis, and several new factors involved in this process have recently been identified. Some DNA repair proteins are recruited to initiating RNA polymerases, and this may expedite the recruitment of other factors that participate in the repair of transcription-blocking DNA lesions. Recent studies have shown that transcription of protein-coding genes does not always give rise to spliced transcripts, opening the possibility that cells may use the transcription machinery in a splicing-uncoupled manner for other purposes, including surveillance of the transcribed genome.
20
Single-nuclei isoform RNA sequencing unlocks barcoded exon connectivity in frozen brain tissue. Nat Biotechnol 2022; 40:1082-1092. [PMID: 35256815 PMCID: PMC9287170 DOI: 10.1038/s41587-022-01231-3] [Citation(s) in RCA: 42] [Impact Index Per Article: 21.0] [Received: 06/29/2021] [Accepted: 01/20/2022] [Indexed: 12/11/2022]
Abstract
Single-nuclei RNA sequencing characterizes cell types at the gene level. However, compared to single-cell approaches, many single-nuclei cDNAs are purely intronic, lack barcodes and hinder the study of isoforms. Here we present single-nuclei isoform RNA sequencing (SnISOr-Seq). Using microfluidics, PCR-based artifact removal, target enrichment and long-read sequencing, SnISOr-Seq increased barcoded, exon-spanning long reads 7.5-fold compared to naive long-read single-nuclei sequencing. We applied SnISOr-Seq to adult human frontal cortex and found that exons associated with autism exhibit coordinated and highly cell-type-specific inclusion. We found two distinct combination patterns: those distinguishing neural cell types, enriched in TSS-exon, exon-polyadenylation-site and non-adjacent exon pairs, and those with multiple configurations within one cell type, enriched in adjacent exon pairs. Finally, we observed that human-specific exons are almost as tightly coordinated as conserved exons, implying that coordination can be rapidly established during evolution. SnISOr-Seq enables cell-type-specific long-read isoform analysis in human brain and in any frozen or hard-to-dissociate sample.
21
It's a DoG-eat-DoG world - altered transcriptional mechanisms drive downstream-of-gene (DoG) transcript production. Mol Cell 2022; 82:1981-1991. [PMID: 35487209 PMCID: PMC9208299 DOI: 10.1016/j.molcel.2022.04.008] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 01/10/2022] [Revised: 02/24/2022] [Accepted: 04/04/2022] [Indexed: 10/18/2022]
Abstract
The past decade has revolutionized our understanding of regulatory noncoding RNAs (ncRNAs). Among the most recently identified ncRNAs are downstream-of-gene (DoG)-containing transcripts that are produced by widespread transcriptional readthrough. The discovery of DoGs has set the stage for future studies to address many unanswered questions regarding the mechanisms that promote readthrough transcription, RNA processing, and the cellular functions of the unique transcripts. In this review, we summarize current findings regarding the biogenesis, function, and mechanisms regulating this exciting new class of RNA molecules.
|
22
|
Ubiquitous mRNA decay fragments in E. coli redefine the functional transcriptome. Nucleic Acids Res 2022; 50:5029-5046. [PMID: 35524564 PMCID: PMC9122600 DOI: 10.1093/nar/gkac295] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 02/22/2022] [Revised: 04/11/2022] [Accepted: 04/13/2022] [Indexed: 01/01/2023]
Abstract
Bacterial mRNAs have short life cycles, in which transcription is rapidly followed by translation and degradation within seconds to minutes. The resulting diversity of mRNA molecules across different life-cycle stages impacts their functionality but has remained unresolved. Here we quantitatively map the 3’ status of cellular RNAs in Escherichia coli during steady-state growth and report a large fraction of molecules (median >60%) that are fragments of canonical full-length mRNAs. The majority of RNA fragments are decay intermediates, whereas nascent RNAs contribute a smaller fraction. Despite the prevalence of decay intermediates in total cellular RNA, these intermediates are underrepresented in the pool of ribosome-associated transcripts and can thus distort quantifications and differential expression analyses for the abundance of full-length, functional mRNAs. The large heterogeneity within mRNA molecules in vivo highlights the importance of discerning functional transcripts and provides a lens for studying the dynamic life cycle of mRNAs.
|
23
|
Who let the DoGs out? - biogenesis of stress-induced readthrough transcripts. Trends Biochem Sci 2022; 47:206-217. [PMID: 34489151 PMCID: PMC8840951 DOI: 10.1016/j.tibs.2021.08.003] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 05/21/2021] [Revised: 07/27/2021] [Accepted: 08/10/2021] [Indexed: 01/22/2023]
Abstract
Readthrough transcription caused by inefficient 3'-end cleavage of nascent mRNAs has emerged as a hallmark of the mammalian cellular stress response and results in the production of long noncoding RNAs known as downstream-of-gene (DoG)-containing transcripts. DoGs arise from around 10% of human protein-coding genes and are retained in the nucleus. They are produced minutes after cell exposure to stress and can be detected hours after stress removal. However, their biogenesis and the role(s) that DoGs or their production play in the cellular stress response are incompletely understood. We discuss findings that implicate host and viral proteins in the mechanisms underlying DoG production, as well as the transcriptional landscapes that accompany DoG induction under different stress conditions.
|
24
|
Spatial organization of transcribed eukaryotic genes. Nat Cell Biol 2022; 24:327-339. [PMID: 35177821 DOI: 10.1038/s41556-022-00847-6] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 04/09/2021] [Accepted: 01/10/2022] [Indexed: 12/19/2022]
Abstract
Despite the well-established role of nuclear organization in the regulation of gene expression, little is known about the reverse: how transcription shapes the spatial organization of the genome. Owing to the small sizes of most previously studied genes and the limited resolution of microscopy, the structure and spatial arrangement of a single transcribed gene are still poorly understood. Here we study several long highly expressed genes and demonstrate that they form open-ended transcription loops with polymerases moving along the loops and carrying nascent RNAs. Transcription loops can span across micrometres, resembling lampbrush loops and polytene puffs. The extension and shape of transcription loops suggest their intrinsic stiffness, which we attribute to decoration with multiple voluminous nascent ribonucleoproteins. Our data contradict the model of transcription factories and suggest that although microscopically resolvable transcription loops are specific for long highly expressed genes, the mechanisms underlying their formation could represent a general aspect of eukaryotic transcription.
|
25
|
Transcription and splicing dynamics during early Drosophila development. RNA (NEW YORK, N.Y.) 2022; 28:139-161. [PMID: 34667107 PMCID: PMC8906543 DOI: 10.1261/rna.078933.121] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/26/2021] [Accepted: 09/23/2021] [Indexed: 05/03/2023]
Abstract
Widespread cotranscriptional splicing has been demonstrated from yeast to human. However, most studies to date addressing the kinetics of splicing relative to transcription used either Saccharomyces cerevisiae or metazoan cultured cell lines. Here, we adapted native elongating transcript sequencing technology (NET-seq) to measure cotranscriptional splicing dynamics during the early developmental stages of Drosophila melanogaster embryos. Our results reveal the position of RNA polymerase II (Pol II) when both canonical and recursive splicing occur. We found heterogeneity in splicing dynamics, with some RNAs spliced immediately after intron transcription, whereas for other transcripts no splicing was observed over the first 100 nt of the downstream exon. Introns that show splicing completion before Pol II has reached the end of the downstream exon are necessarily intron-defined. We studied the splicing dynamics of both nascent pre-mRNAs transcribed in the early embryo, which have few and short introns, and pre-mRNAs transcribed later in embryonic development, which contain multiple long introns. As expected, we found a relationship between the proportion of spliced reads and intron size. However, intron definition was observed at all intron sizes. We further observed that genes transcribed in the early embryo tend to be isolated in the genome whereas genes transcribed later are often overlapped by a neighboring convergent gene. In isolated genes, transcription termination occurred soon after the polyadenylation site, while in overlapped genes, Pol II remained associated with the DNA template after cleavage and polyadenylation of the nascent transcript. Taken together, our data unravel novel dynamic features of Pol II transcription and splicing in the developing Drosophila embryo.
|
26
|
A genetic screen in C. elegans reveals roles for KIN17 and PRCC in maintaining 5' splice site identity. PLoS Genet 2022; 18:e1010028. [PMID: 35143478 PMCID: PMC8865678 DOI: 10.1371/journal.pgen.1010028] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/01/2021] [Revised: 02/23/2022] [Accepted: 01/10/2022] [Indexed: 01/11/2023]
Abstract
Pre-mRNA splicing is an essential step of eukaryotic gene expression carried out by a series of dynamic macromolecular protein/RNA complexes, known collectively and individually as the spliceosome. This series of spliceosomal complexes define, assemble on, and catalyze the removal of introns. Molecular model snapshots of intermediates in the process have been created from cryo-EM data; however, many aspects of the dynamic changes that occur in the spliceosome are not fully understood. Caenorhabditis elegans follows the GU-AG rule of splicing, with almost all introns beginning with 5’ GU and ending with 3’ AG. These splice sites are identified early in the splicing cycle, but as the cycle progresses and “custody” of the pre-mRNA splice sites is passed from factor to factor as the catalytic site is built, the mechanism by which splice site identity is maintained or re-established through these dynamic changes is unclear. We performed a genetic screen in C. elegans for factors that are capable of changing 5’ splice site choice. We report that KIN17 and PRCC are involved in splice site choice, the first functional splicing role proposed for either of these proteins. Previously identified suppressors of cryptic 5’ splicing promote distal cryptic GU splice sites; however, mutations in KIN17 and PRCC instead promote usage of an unusual proximal 5’ splice site which defines an intron beginning with UU, separated by 1 nt from a GU donor. We performed high-throughput mRNA sequencing analysis and found that mutations in PRCC, and to a lesser extent KIN17, changed alternative 5’ splice site usage at native sites genome-wide, often promoting usage of nearby non-consensus sites. Our work has uncovered both fine and coarse mechanisms by which the spliceosome maintains splice site identity during the complex assembly process.
Pre-messenger RNA splicing is an important regulator of eukaryotic gene expression, changing the content, frame, and functionality of both coding and non-coding transcripts. Our understanding of how the spliceosome chooses where to cut has focused on the initial identification of splice sites. However, our results suggest that the spliceosome also relies on other components in later steps to maintain the identity of the splice donor sites. We are currently in the midst of a “resolution revolution”, with ever-clearer cryo-EM snapshots of stalled complexes, allowing researchers to visualize moments in time in the splicing cycle. These models are illuminating, but do not always elucidate the mechanistic functioning of a highly dynamic ribonucleoprotein complex. Therefore, our lab takes a complementary approach, using the power of genetics in a multicellular animal to gain functional insights into the spliceosome. Using a C. elegans genetic screen, we have found novel functional splicing roles for two proteins, KIN17 and PRCC. Mutations in PRCC in particular promote nearby alternative 5’ splice sites at native loci. This work improves our understanding of how the spliceosome maintains the identity of where to cut the pre-mRNA, and thus how genes are expressed and used in multicellular animals.
|
27
|
Identification of Alternative Polyadenylation in Cyanidioschyzon merolae Through Long-Read Sequencing of mRNA. Front Genet 2022; 12:818697. [PMID: 35154260 PMCID: PMC8831791 DOI: 10.3389/fgene.2021.818697] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Accepted: 12/22/2021] [Indexed: 12/04/2022]
Abstract
Alternative polyadenylation (APA) is widespread among metazoans and has been shown to have important impacts on mRNA stability and protein expression. Beyond a handful of well-studied organisms, however, its existence and consequences have not been well investigated. We therefore turned to the deep-branching red alga, Cyanidioschyzon merolae, to study the biology of polyadenylation in an organism highly diverged from humans and yeast. C. merolae is an acidothermophilic alga that lives in volcanic hot springs. It has a highly reduced genome (16.5 Mbp) and has lost all but 27 of its introns and much of its splicing machinery, suggesting that it has been under substantial pressure to simplify its RNA processing pathways. We used long-read sequencing to assess the key features of C. merolae mRNAs, including splicing status and polyadenylation cleavage site (PAS) usage. Splicing appears to be less efficient in C. merolae compared with yeast, flies, and mammalian cells. A high proportion of transcripts (63%) have at least two distinct PASs, and 34% appear to utilize three or more sites. The apparent polyadenylation signal UAAA is used in more than 90% of cases, in cells grown in both rich media and limiting nitrogen. Our documentation of APA for the first time in this non-model organism highlights the conservation and likely biological importance of this regulatory step in gene expression.
|
28
|
Mechanisms of lncRNA biogenesis as revealed by nascent transcriptomics. Nat Rev Mol Cell Biol 2022; 23:389-406. [DOI: 10.1038/s41580-021-00447-6] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Accepted: 12/14/2021] [Indexed: 12/14/2022]
|
29
|
Analysis of eukaryotic lincRNA sequences indicates signatures of hindered translation linked to selection pressure. Mol Biol Evol 2021; 39:6460347. [PMID: 34897509 PMCID: PMC8826458 DOI: 10.1093/molbev/msab356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Long intergenic noncoding RNAs (lincRNAs) represent a large fraction of transcribed loci in eukaryotic genomes. Although classified as noncoding, most lincRNAs contain open reading frames (ORFs), and it remains unclear why cytoplasmic lincRNAs are not or very inefficiently translated. Here, we analyzed signatures of hindered translation in lincRNA sequences from five eukaryotes, covering a range of natural selection pressures. In fission yeast and Caenorhabditis elegans, that is, species under strong selection, we detected significantly shorter ORFs, a suboptimal sequence context around start codons for translation initiation, and trinucleotides (“codons”) corresponding to less abundant tRNAs than for neutrally evolving control sequences, likely impeding translation elongation. For human, we detected signatures for cell-type-specific hindrance of lincRNA translation, in particular codons in abundant cytoplasmic lincRNAs corresponding to lower expressed tRNAs than control codons, in three out of five human cell lines. We verified that varying tRNA expression levels between cell lines are reflected in the amount of ribosomes bound to cytoplasmic lincRNAs in each cell line. We further propose that codons at ORF starts are particularly important for reducing ribosome-binding to cytoplasmic lincRNA ORFs. Altogether, our analyses indicate that in species under stronger selection lincRNAs evolved sequence features generally hindering translation and support cell-type-specific hindrance of translation efficiency in human lincRNAs. The sequence signatures we have identified may improve predicting peptide-coding and genuine noncoding lincRNAs in a cell type.
|
30
|
Abstract
Within the nucleus, messenger RNA is generated and processed in a highly organized and regulated manner. Messenger RNA processing begins during transcription initiation and continues until the RNA is translated and degraded. Processes such as 5' capping, alternative splicing, and 3' end processing have been studied extensively with biochemical methods and more recently with single-molecule imaging approaches. In this review, we highlight how imaging has helped understand the highly dynamic process of RNA processing. We conclude with open questions and new technological developments that may further our understanding of RNA processing.
|
31
|
Landscape of transcription termination in Arabidopsis revealed by single-molecule nascent RNA sequencing. Genome Biol 2021; 22:322. [PMID: 34823554 PMCID: PMC8613925 DOI: 10.1186/s13059-021-02543-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/24/2021] [Accepted: 11/01/2021] [Indexed: 12/15/2022]
Abstract
BACKGROUND: The dynamic process of transcription termination produces transient RNA intermediates that are difficult to distinguish from each other via short-read sequencing methods. RESULTS: Here, we use single-molecule nascent RNA sequencing to characterize the various forms of transient RNAs during termination at genome-wide scale in wildtype Arabidopsis and in atxrn3, fpa, and met1 mutants. Our data reveal a wide range of termination windows among genes, ranging from ~50 nt to over 1000 nt. We also observe efficient termination before downstream tRNA genes, suggesting that chromatin structure around the promoter region of tRNA genes may block pol II elongation. 5' Cleaved readthrough transcription in atxrn3 with delayed termination can run into downstream genes to produce normally spliced and polyadenylated mRNAs in the absence of their own transcription initiation. Consistent with previous reports, we also observe long chimeric transcripts with cryptic splicing in the fpa mutant, but loss of CG DNA methylation has no obvious impact on termination in the met1 mutant. CONCLUSIONS: Our method is applicable to establish a comprehensive termination landscape in a broad range of species.
|
32
|
FLEP-seq: simultaneous detection of RNA polymerase II position, splicing status, polyadenylation site and poly(A) tail length at genome-wide scale by single-molecule nascent RNA sequencing. Nat Protoc 2021; 16:4355-4381. [PMID: 34331052 DOI: 10.1038/s41596-021-00581-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/20/2020] [Accepted: 06/03/2021] [Indexed: 01/23/2023]
Abstract
Elongation, splicing and polyadenylation are fundamental steps of transcription, and studying their coordination requires simultaneous monitoring of these dynamic processes on one transcript. We recently developed a full-length nascent RNA sequencing method in the model plant Arabidopsis that simultaneously detects RNA polymerase II position, splicing status, polyadenylation site and poly(A) tail length at genome-wide scale. This method allows calculation of the kinetics of cotranscriptional splicing and detects polyadenylated transcripts with unspliced introns retained at specific positions posttranscriptionally. Here we describe a detailed protocol for this method called FLEP-seq (full-length elongating and polyadenylated RNA sequencing) that is applicable to plants. Library production requires as little as one nanogram of nascent RNA (after rRNA/tRNA removal), and either Nanopore or PacBio platforms can be used for sequencing. We also provide a complete bioinformatic pipeline from raw data processing to downstream analysis. The minimum time required for FLEP-seq, including RNA extraction and library preparation, is 36 h. The subsequent long-read sequencing and initial data analysis take between 31 and 40 h, depending on the sequencing platform.
|
33
|
The upstream 5' splice site remains associated to the transcription machinery during intron synthesis. Nat Commun 2021; 12:4545. [PMID: 34315864 PMCID: PMC8316553 DOI: 10.1038/s41467-021-24774-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/09/2021] [Accepted: 07/02/2021] [Indexed: 12/28/2022]
Abstract
In the earliest step of spliceosome assembly, the two splice sites flanking an intron are brought into proximity by U1 snRNP and U2AF along with other proteins. The mechanism that facilitates this intron looping is poorly understood. Using a CRISPR interference-based approach to halt RNA polymerase II transcription in the middle of introns in human cells, we discovered that the nascent 5′ splice site base pairs with a U1 snRNA that is tethered to RNA polymerase II during intron synthesis. This association functionally corresponds with splicing outcome, involves bona fide 5′ splice sites and cryptic intronic sites, and occurs transcriptome-wide. Overall, our findings reveal that the upstream 5′ splice sites remain attached to the transcriptional machinery during intron synthesis and are thus brought into proximity of the 3′ splice sites, potentially mediating the rapid splicing of long introns.
Most splicing reactions take place co-transcriptionally, but how the transcription machinery facilitates splicing of introns is unknown. Here the authors show that the 5′ splice site remains associated with the transcription machinery during intron synthesis through U1 snRNP, providing a basis for the rapid splicing reaction of introns.
|
34
|
Co-transcriptional splicing efficiencies differ within genes and between cell types. RNA (NEW YORK, N.Y.) 2021; 27:rna.078662.120. [PMID: 33975916 PMCID: PMC8208053 DOI: 10.1261/rna.078662.120] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/24/2020] [Accepted: 05/05/2021] [Indexed: 06/01/2023]
Abstract
Pre-mRNA splicing is carried out by the spliceosome and involves splice site recognition, removal of introns, and ligation of exons. Components of the spliceosome have been shown to interact with the elongating RNA polymerase II (RNAPII), which is thought to allow splicing to occur concurrently with transcription. However, little is known about the regulation and efficiency of co-transcriptional splicing in human cells. In this study, we used Bru-seq and BruChase-seq to determine the co-transcriptional splicing efficiencies of 17,000 introns expressed across 6 human cell lines. We found that less than half of all introns across these 6 cell lines were co-transcriptionally spliced. Splicing efficiencies for individual introns showed variations across cell lines, suggesting that splicing may be regulated in a cell-type specific manner. Moreover, the splicing efficiency of introns varied within genes. The efficiency of co-transcriptional splicing did not correlate with gene length, intron position, splice site strengths, or the intron/neighboring exons GC content. However, we identified binding signals from multiple RNA binding proteins (RBPs) that correlated with splicing efficiency, including core spliceosomal machinery components such as SF3B4, U2AF1 and U2AF2, which showed higher binding signals in poorly spliced introns. In addition, multiple RBPs, such as BUD13, PUM1 and SND1, showed preferential binding in exons that flank introns with high splicing efficiencies. The nascent RNA splicing patterns presented here across multiple cell types add to our understanding of the complexity in RNA splicing, wherein RNA-binding proteins may play important roles in determining splicing outcomes in a cell type- and intron-specific manner.
|
35
|
Dynamic imaging of nascent RNA reveals general principles of transcription dynamics and stochastic splice site selection. Cell 2021; 184:2878-2895.e20. [PMID: 33979654 DOI: 10.1016/j.cell.2021.04.012] [Citation(s) in RCA: 62] [Impact Index Per Article: 20.7] [Received: 09/16/2019] [Revised: 11/12/2020] [Accepted: 04/08/2021] [Indexed: 01/06/2023]
Abstract
The activities of RNA polymerase and the spliceosome are responsible for the heterogeneity in the abundance and isoform composition of mRNA in human cells. However, the dynamics of these megadalton enzymatic complexes working in concert on endogenous genes have not been described. Here, we establish a quasi-genome-scale platform for observing synthesis and processing kinetics of single nascent RNA molecules in real time. We find that all observed genes show transcriptional bursting. We also observe large kinetic variation in intron removal for single introns in single cells, which is inconsistent with deterministic splice site selection. Transcriptome-wide footprinting of the U2AF complex, nascent RNA profiling, long-read sequencing, and lariat sequencing further reveal widespread stochastic recursive splicing within introns. We propose and validate a unified theoretical model to explain the general features of transcription and pervasive stochastic splice site selection.
|
36
|
Nuclear mechanisms of gene expression control: pre-mRNA splicing as a life or death decision. Curr Opin Genet Dev 2021; 67:67-76. [PMID: 33291060 PMCID: PMC8084925 DOI: 10.1016/j.gde.2020.11.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/10/2020] [Revised: 10/26/2020] [Accepted: 11/03/2020] [Indexed: 02/06/2023]
Abstract
Thousands of genes produce polyadenylated mRNAs that still contain one or more introns. These transcripts are known as retained intron RNAs (RI-RNAs). In the past 10 years, RI-RNAs have been linked to post-transcriptional alternative splicing in a variety of developmental contexts, but they can also be dead-end products fated for RNA decay. Here we discuss the role of intron retention in shaping gene expression programs, as well as recent evidence suggesting that the biogenesis and fate of RI-RNAs is regulated by nuclear organization. We discuss the possibility that proximity of RNA to nuclear speckles - biomolecular condensates that are highly enriched in splicing factors and other RNA binding proteins - is associated with choices ranging from efficient co-transcriptional splicing, export and stability to regulated post-transcriptional splicing and possible vulnerability to decay.
|
37
|
Anything but Ordinary – Emerging Splicing Mechanisms in Eukaryotic Gene Regulation. Trends Genet 2021; 37:355-372. [DOI: 10.1016/j.tig.2020.10.008] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 08/14/2020] [Revised: 10/14/2020] [Accepted: 10/19/2020] [Indexed: 12/11/2022]
|
38
|
Intron exon boundary junctions in human genome have in-built unique structural and energetic signals. Nucleic Acids Res 2021; 49:2674-2683. [PMID: 33621338 PMCID: PMC7969029 DOI: 10.1093/nar/gkab098] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/06/2019] [Revised: 01/21/2021] [Accepted: 02/22/2021] [Indexed: 11/13/2022]
Abstract
Precise identification of correct exon–intron boundaries is a prerequisite to analyze the location and structure of genes. The existing framework for genomic signals, delineating exons and introns in a genomic segment, seems insufficient, predominantly due to poor sequence consensus as well as limitations of training on available experimental data sets. We present here a novel concept for characterizing exon–intron boundaries in genomic segments on the basis of structural and energetic properties. We analyzed boundary junctions on both sides of all the exons (328,368) of protein-coding genes from the human genome (GENCODE database) using 28 structural and three energy parameters. Study of sequence conservation at these sites shows very poor consensus. It is observed that DNA adopts a unique structural and energy state at the boundary junctions. Also, signals are somewhat different for housekeeping and tissue-specific genes. Clustering of 31 parameters into four derived vectors gives some additional insights into the physical mechanisms involved in this biological process. Sites of structural and energy signals correlate well with the positions playing important roles in pre-mRNA splicing.
|
39
|
POINT technology illuminates the processing of polymerase-associated intact nascent transcripts. Mol Cell 2021; 81:1935-1950.e6. [PMID: 33735606 PMCID: PMC8122139 DOI: 10.1016/j.molcel.2021.02.034] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 11/03/2020] [Revised: 12/21/2020] [Accepted: 02/24/2021] [Indexed: 12/29/2022]
Abstract
Mammalian chromatin is the site of both RNA polymerase II (Pol II) transcription and coupled RNA processing. However, molecular details of such co-transcriptional mechanisms remain obscure, partly because of technical limitations in purifying authentic nascent transcripts. We present a new approach to characterize nascent RNA, called polymerase intact nascent transcript (POINT) technology. This three-pronged methodology maps nascent RNA 5′ ends (POINT-5), establishes the kinetics of co-transcriptional splicing patterns (POINT-nano), and profiles whole transcription units (POINT-seq). In particular, we show by depletion of the nuclear exonuclease Xrn2 that this activity acts selectively on cleaved 5′ P-RNA at polyadenylation sites. Furthermore, POINT-nano reveals that co-transcriptional splicing either occurs immediately after splice site transcription or is delayed until Pol II transcribes downstream sequences. Finally, we connect RNA cleavage and splicing with either premature or full-length transcript termination. We anticipate that POINT technology will afford full dissection of the complexity of co-transcriptional RNA processing.
Highlights: POINT methodology dissects intact nascent RNA processing. Specificity of Xrn2 exonuclease in co-transcriptional RNA degradation. Splicing suppresses Xrn2-dependent premature termination. Different kinetic classes of co-transcriptional splicing in human genes.
|
40
|
Revealing nascent RNA processing dynamics with nano-COP. Nat Protoc 2021; 16:1343-1375. [PMID: 33514943 PMCID: PMC8713461 DOI: 10.1038/s41596-020-00469-y] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 07/03/2020] [Accepted: 11/20/2020] [Indexed: 01/30/2023]
Abstract
During maturation, eukaryotic precursor RNAs undergo processing events including intron splicing, 3'-end cleavage, and polyadenylation. Here we describe nanopore analysis of co-transcriptional processing (nano-COP), a method for probing the timing and patterns of RNA processing. An extension of native elongating transcript sequencing, which quantifies transcription genome-wide through short-read sequencing of nascent RNA 3' ends, nano-COP uses long-read nascent RNA sequencing to observe global patterns of RNA processing. First, nascent RNA is stringently purified through a combination of 4-thiouridine metabolic labeling and cellular fractionation. In contrast to cDNA or short-read-based approaches relying on reverse transcription or amplification, the sample is sequenced directly through nanopores to reveal the native context of nascent RNA. nano-COP identifies both active transcription sites and splice isoforms of single RNA molecules during synthesis, providing insight into patterns of intron removal and the physical coupling between transcription and splicing. The nano-COP protocol yields data within 3 d.
|
41
|
Alternative RNA structures formed during transcription depend on elongation rate and modify RNA processing. Mol Cell 2021; 81:1789-1801.e5. [PMID: 33631106 DOI: 10.1016/j.molcel.2021.01.040] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Received: 08/25/2020] [Revised: 01/26/2021] [Accepted: 01/27/2021] [Indexed: 12/24/2022]
Abstract
Most RNA processing occurs co-transcriptionally. We interrogated nascent pol II transcripts by chemical and enzymatic probing and determined how the "nascent RNA structureome" relates to splicing, A-I editing and transcription speed. RNA folding within introns and steep structural transitions at splice sites are associated with efficient co-transcriptional splicing. A slow pol II mutant elicits extensive remodeling into more folded conformations with increased A-I editing. Introns that become more structured at their 3' splice sites get co-transcriptionally excised more efficiently. Slow pol II altered folding of intronic Alu elements where cryptic splicing and intron retention are stimulated, an outcome mimicked by UV, which decelerates transcription. Slow transcription also remodeled RNA folding around alternative exons in distinct ways that predict whether skipping or inclusion is favored, even though it occurs post-transcriptionally. Hence, co-transcriptional RNA folding modulates post-transcriptional alternative splicing. In summary, the plasticity of nascent transcripts has widespread effects on RNA processing.
|
42
|
Co-transcriptional splicing regulates 3' end cleavage during mammalian erythropoiesis. Mol Cell 2021; 81:998-1012.e7. [PMID: 33440169 DOI: 10.1016/j.molcel.2020.12.018] [Citation(s) in RCA: 85] [Impact Index Per Article: 28.3] [Received: 07/16/2020] [Revised: 12/07/2020] [Accepted: 12/10/2020] [Indexed: 12/11/2022]
Abstract
Pre-mRNA processing steps are tightly coordinated with transcription in many organisms. To determine how co-transcriptional splicing is integrated with transcription elongation and 3' end formation in mammalian cells, we performed long-read sequencing of individual nascent RNAs and precision run-on sequencing (PRO-seq) during mouse erythropoiesis. Splicing was not accompanied by transcriptional pausing and was detected when RNA polymerase II (Pol II) was within 75-300 nucleotides of 3' splice sites (3'SSs), often during transcription of the downstream exon. Interestingly, several hundred introns displayed abundant splicing intermediates, suggesting that splicing delays can take place between the two catalytic steps. Overall, splicing efficiencies were correlated among introns within the same transcript, and intron retention was associated with inefficient 3' end cleavage. Remarkably, a thalassemia patient-derived mutation introducing a cryptic 3'SS improved both splicing and 3' end cleavage of individual β-globin transcripts, demonstrating functional coupling between the two co-transcriptional processes as a determinant of productive gene output.
|
43
|
Processing of coding and non-coding RNAs in plant development and environmental responses. Essays Biochem 2020; 64:931-945. [DOI: 10.1042/ebc20200029] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/01/2020] [Revised: 10/21/2020] [Accepted: 10/23/2020] [Indexed: 12/14/2022]
Abstract
Precursor RNAs undergo extensive processing to become mature RNAs. RNA transcripts are subjected to 5′ capping, 3′-end processing, splicing, and modification; they also form dynamic secondary structures during co-transcriptional and post-transcriptional processing. Like coding RNAs, non-coding RNAs (ncRNAs) undergo extensive processing. For example, secondary small interfering RNA (siRNA) transcripts undergo RNA processing, followed by further cleavage to become mature siRNAs. Transcriptome studies have revealed roles for co-transcriptional and post-transcriptional RNA processing in the regulation of gene expression and the coordination of plant development and plant–environment interactions. In this review, we present the latest progress on RNA processing in gene expression and discuss phased siRNAs (phasiRNAs), a kind of germ cell-specific secondary small RNA (sRNA), focusing on their functions in plant development and environmental responses.
|
44
|
Calculating the most likely intron splicing orders in S. pombe, fruit fly, Arabidopsis thaliana, and humans. BMC Bioinformatics 2020; 21:478. [PMID: 33099301 PMCID: PMC7585206 DOI: 10.1186/s12859-020-03818-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/01/2020] [Accepted: 10/15/2020] [Indexed: 12/01/2022] [Open Access]
Abstract
Background: Introns have been shown to be spliced in a defined order, and this order influences both alternative splicing regulation and splicing fidelity, but previous studies have only considered neighbouring introns. The detailed intron splicing order remains unknown.
Results: In this work, a method was developed that can calculate the splicing order of all introns in each transcript. A simulation study showed that this method can accurately calculate intron splicing orders. I further applied this method to real S. pombe, fruit fly, Arabidopsis thaliana, and human sequencing datasets and found that intron splicing orders change from gene to gene and that humans contain more transcripts that are not spliced in order than S. pombe, fruit fly, and Arabidopsis thaliana. In addition, I reconfirmed that, genome-wide, the first introns in humans are spliced more slowly than those in S. pombe, fruit fly, and Arabidopsis thaliana. Both the calculated most likely orders and the method developed here are available on the web.
Conclusions: A novel computational method was developed to calculate intron splicing orders and was applied to real sequencing datasets. I obtained intron splicing orders for hundreds or thousands of genes in four organisms. I found that humans contain a greater number of transcripts that are not spliced in order.
|
45
|
Preparation of Mammalian Nascent RNA for Long Read Sequencing. Curr Protoc Mol Biol 2020; 133:e128. [PMID: 33085989 PMCID: PMC7586757 DOI: 10.1002/cpmb.128] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/12/2022]
Abstract
Long read sequencing technologies now allow high-quality sequencing of RNAs (or their cDNAs) that are hundreds to thousands of nucleotides long. Long read sequences of nascent RNA provide single-nucleotide-resolution information about co-transcriptional RNA processing events, e.g., splicing, folding, and base modifications. Here, we describe how to isolate nascent RNA from mammalian cells through subcellular fractionation of chromatin-associated RNA, as well as how to deplete poly(A)+ RNA and rRNA, and, finally, how to generate a full-length cDNA library for use on long read sequencing platforms. This approach allows for an understanding of coordinated splicing status across multi-intron transcripts by revealing patterns of splicing or other RNA processing events that cannot be gained from traditional short read RNA sequencing. © 2020 Wiley Periodicals LLC. Basic Protocol 1: Subcellular fractionation. Basic Protocol 2: Nascent RNA isolation and adapter ligation. Basic Protocol 3: cDNA amplicon preparation.
|
46
|
Elements at the 5' end of Xist harbor SPEN-independent transcriptional antiterminator activity. Nucleic Acids Res 2020; 48:10500-10517. [PMID: 32986830 PMCID: PMC7544216 DOI: 10.1093/nar/gkaa789] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/23/2020] [Revised: 08/20/2020] [Accepted: 09/12/2020] [Indexed: 12/22/2022] [Open Access]
Abstract
The Xist lncRNA requires Repeat A, a conserved RNA element located in its 5' end, to induce gene silencing during X-chromosome inactivation. Intriguingly, Repeat A is also required for production of Xist. While silencing by Repeat A requires the protein SPEN, how Repeat A promotes Xist production remains unclear. We report that in mouse embryonic stem cells, expression of a transgene comprising the first two kilobases of Xist (Xist-2kb) causes transcriptional readthrough of downstream polyadenylation sequences. Readthrough required Repeat A and the ∼750 nucleotides downstream, did not require SPEN, and was attenuated by splicing. Despite associating with SPEN and chromatin, Xist-2kb did not robustly silence transcription, whereas a 5.5-kb Xist transgene robustly silenced transcription and read through its polyadenylation sequence. Longer, spliced Xist transgenes also induced robust silencing yet terminated efficiently. Thus, in contexts examined here, Xist requires sequence elements beyond its first two kilobases to robustly silence transcription, and the 5' end of Xist harbors SPEN-independent transcriptional antiterminator activity that can repress proximal cleavage and polyadenylation. In endogenous contexts, this antiterminator activity may help produce full-length Xist RNA while rendering the Xist locus resistant to silencing by the same repressive complexes that the lncRNA recruits to other genes.
|
47
|
Widespread Transcriptional Readthrough Caused by Nab2 Depletion Leads to Chimeric Transcripts with Retained Introns. Cell Rep 2020; 33:108324. [PMID: 33113357 DOI: 10.1016/j.celrep.2020.108324] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 03/05/2020] [Revised: 09/15/2020] [Accepted: 10/07/2020] [Indexed: 01/26/2023] [Open Access]
Abstract
Nascent RNA sequencing has revealed that pre-mRNA splicing can occur shortly after introns emerge from RNA polymerase II (RNA Pol II). Differences in co-transcriptional splicing profiles suggest regulation by cis- and/or trans-acting factors. Here, we use single-molecule intron tracking (SMIT) to identify a cohort of regulators by machine learning in budding yeast. Of these, Nab2 displays reduced co-transcriptional splicing when depleted. Unexpectedly, these splicing defects are attributable to aberrant "intrusive" transcriptional readthrough from upstream genes, as revealed by long-read sequencing. Transcripts that originate from the intron-containing gene's own transcription start site (TSS) are efficiently spliced, indicating no direct role of Nab2 in splicing per se. This work highlights the coupling between transcription, splicing, and 3' end formation in the context of gene organization along chromosomes. We conclude that Nab2 is required for proper 3' end processing, which ensures gene-specific control of co-transcriptional RNA processing.
|
48
|
Macrophage development and activation involve coordinated intron retention in key inflammatory regulators. Nucleic Acids Res 2020; 48:6513-6529. [PMID: 32449925 PMCID: PMC7337907 DOI: 10.1093/nar/gkaa435] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 03/18/2020] [Revised: 05/04/2020] [Accepted: 05/11/2020] [Indexed: 12/31/2022] [Open Access]
Abstract
Monocytes and macrophages are essential components of the innate immune system. Herein, we report that intron retention (IR) plays an important role in the development and function of these cells. Using Illumina mRNA sequencing, Nanopore direct cDNA sequencing and proteomics analysis, we identify IR events that affect the expression of key genes/proteins involved in macrophage development and function. We demonstrate that decreased IR in nuclear-detained mRNA is coupled with increased expression of genes encoding regulators of macrophage transcription, phagocytosis and inflammatory signalling, including ID2, IRF7, ENG and LAT. We further show that this dynamic IR program persists during the polarisation of resting macrophages into activated macrophages. In the presence of proinflammatory stimuli, intron-retaining CXCL2 and NFKBIZ transcripts are rapidly spliced, enabling timely expression of these key inflammatory regulators by macrophages. Our study provides novel insights into the molecular factors controlling vital regulators of the innate immune response.
|
49
|
Circular RNAs: The Brain Transcriptome Comes Full Circle. Trends Neurosci 2020; 43:752-766. [PMID: 32829926 DOI: 10.1016/j.tins.2020.07.007] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 03/23/2020] [Revised: 07/02/2020] [Accepted: 07/22/2020] [Indexed: 12/13/2022]
Abstract
Circular RNAs (circRNAs) are a class of RNA molecules with a covalently closed loop structure formed by back-splicing of exon-exon junctions. The detection of circRNAs across many eukaryotic species, often with cell-type- and tissue-type-specific expression, has catalyzed a growing interest in understanding circRNA biogenesis and their potential functions. circRNAs are enriched in the brain, and accumulate upon neuronal differentiation and depolarization, suggesting that these RNAs are an integral component of the brain transcriptome, and may play functional roles. Here, we give an overview of the current understanding of circRNA biogenesis and function, discuss how circRNAs contribute to transcriptome complexity in the brain, and discuss recent data on the functional roles of circRNAs in the brain. We also discuss emerging data on the role of circRNAs in brain disorders and address common challenges of circRNA quantification in postmortem human brain.
|
50
|
Extending rnaSPAdes functionality for hybrid transcriptome assembly. BMC Bioinformatics 2020; 21:302. [PMID: 32703149 PMCID: PMC7379828 DOI: 10.1186/s12859-020-03614-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 06/14/2020] [Accepted: 06/18/2020] [Indexed: 11/29/2022] [Open Access]
Abstract
BACKGROUND: De novo RNA-Seq assembly is a powerful method for analysing transcriptomes when the reference genome is unavailable or poorly annotated. However, because Illumina reads are short, it is usually impossible to reconstruct complete sequences of complex genes and alternative isoforms. The recently emerged possibility of generating long RNA reads with technologies such as PacBio and Oxford Nanopore may dramatically improve assembly quality, and thus the subsequent analysis. While reference-based tools for analysing long RNA reads have recently been developed, there is no established pipeline for de novo assembly of such data.
RESULTS: In this work we present a novel method that enables high-quality de novo transcriptome assembly by combining the accuracy and reliability of short reads with exon structure information derived from long error-prone reads. The algorithm is designed by incorporating the existing hybridSPAdes approach into the rnaSPAdes pipeline and adapting it for transcriptomic data.
CONCLUSION: To evaluate the benefit of using long RNA reads, we selected several datasets containing both Illumina and Iso-seq or Oxford Nanopore Technologies (ONT) reads. Using existing quality assessment software, we show that hybrid assemblies performed with rnaSPAdes contain more full-length genes and alternative isoforms than assemblies that use only short-read data.
|