1
Capitanchik C, Wilkins OG, Wagner N, Gagneur J, Ule J. From computational models of the splicing code to regulatory mechanisms and therapeutic implications. Nat Rev Genet 2025; 26:171-190. [PMID: 39358547 DOI: 10.1038/s41576-024-00774-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/27/2024] [Indexed: 10/04/2024]
Abstract
Since the discovery of RNA splicing and its role in gene expression, researchers have sought a set of rules, an algorithm or a computational model that could predict the splice isoforms, and their frequencies, produced from any transcribed gene in a specific cellular context. Over the past 30 years, these models have evolved from simple position weight matrices to deep-learning models capable of integrating sequence data across vast genomic distances. Most recently, new model architectures are moving the field closer to context-specific alternative splicing predictions, and advances in sequencing technologies are expanding the type of data that can be used to inform and interpret such models. Together, these developments are driving improved understanding of splicing regulatory mechanisms and emerging applications of the splicing code to the rational design of RNA- and splicing-based therapeutics.
Affiliation(s)
- Charlotte Capitanchik
  - The Francis Crick Institute, London, UK
  - UK Dementia Research Institute at King's College London, London, UK
  - Department of Basic and Clinical Neuroscience, Institute of Psychiatry Psychology & Neuroscience, King's College London, London, UK
- Oscar G Wilkins
  - The Francis Crick Institute, London, UK
  - UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
- Nils Wagner
  - School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
  - Helmholtz Association - Munich School for Data Science (MUDS), Munich, Germany
- Julien Gagneur
  - School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
  - Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany
  - Computational Health Center, Helmholtz Center Munich, Neuherberg, Germany
- Jernej Ule
  - The Francis Crick Institute, London, UK
  - UK Dementia Research Institute at King's College London, London, UK
  - Department of Basic and Clinical Neuroscience, Institute of Psychiatry Psychology & Neuroscience, King's College London, London, UK
  - National Institute of Chemistry, Ljubljana, Slovenia
2
La Fleur A, Shi Y, Seelig G. Decoding biology with massively parallel reporter assays and machine learning. Genes Dev 2024; 38:843-865. [PMID: 39362779 PMCID: PMC11535156 DOI: 10.1101/gad.351800.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/05/2024]
Abstract
Massively parallel reporter assays (MPRAs) are powerful tools for quantifying the impacts of sequence variation on gene expression. Reading out molecular phenotypes with sequencing enables interrogating the impact of sequence variation beyond genome scale. Machine learning models integrate and codify information learned from MPRAs and enable generalization by predicting sequences outside the training data set. Models can provide a quantitative understanding of cis-regulatory codes controlling gene expression, enable variant stratification, and guide the design of synthetic regulatory elements for applications from synthetic biology to mRNA and gene therapy. This review focuses on cis-regulatory MPRAs, particularly those that interrogate cotranscriptional and post-transcriptional processes: alternative splicing, cleavage and polyadenylation, translation, and mRNA decay.
Affiliation(s)
- Alyssa La Fleur
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, USA
- Yongsheng Shi
  - Department of Microbiology and Molecular Genetics, School of Medicine, University of California, Irvine, Irvine, California 92697, USA
- Georg Seelig
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, USA
  - Department of Electrical & Computer Engineering, University of Washington, Seattle, Washington 98195, USA
3
Wesp V, Theißen G, Schuster S. Statistical analysis of synonymous and stop codons in pseudo-random and real sequences as a function of GC content. Sci Rep 2023; 13:22996. [PMID: 38151539 PMCID: PMC10752896 DOI: 10.1038/s41598-023-49626-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Accepted: 12/10/2023] [Indexed: 12/29/2023] Open
Abstract
Knowledge of the frequencies of synonymous triplets in protein-coding and non-coding DNA stretches can be used in gene finding. These frequencies depend on the GC content of the genome or parts of it. An example of interest is provided by stop codons. This is relevant for the definition of Open Reading Frames. A generic case is provided by pseudo-random sequences, especially when they code for complex proteins or when they are non-coding and not subject to selection pressure. Here, we calculate, for such sequences and for all 25 known genetic codes, the frequency of each amino acid and stop codon based on their set of codons and as a function of GC content. The amino acids can be classified into five groups according to the GC content where their expected frequency reaches its maximum. We determine the overall Shannon information based on groups of synonymous codons and show that it becomes maximum at a percent GC of 43.3% (for the standard code). This is in line with the observation that in most fungi, plants, and animals, this genomic parameter is in the range from 35 to 50%. By analysing natural sequences, we show that there is a clear bias for triplets corresponding to stop codons near the 5'- and 3'-splice sites in the introns of various clades.
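As a rough illustration of how a stop-codon frequency can depend on GC content, the sketch below computes the expected fraction of stop triplets in a pseudo-random sequence. It is not the authors' method: it assumes independently drawn bases with P(G) = P(C) = GC/2 and P(A) = P(T) = (1 - GC)/2, covers only the standard genetic code, and the function names are hypothetical.

```python
from itertools import product

# Stop codons of the standard genetic code only (an assumption of this sketch;
# the abstract above covers 25 genetic codes).
STOP_CODONS = ("TAA", "TAG", "TGA")


def base_probs(gc):
    """Per-base probabilities for a given GC fraction (0..1).

    Assumes G and C are equally likely, and A and T are equally likely.
    """
    return {"G": gc / 2, "C": gc / 2, "A": (1 - gc) / 2, "T": (1 - gc) / 2}


def codon_prob(codon, gc):
    """Probability of one triplet when bases are drawn independently."""
    probs = base_probs(gc)
    result = 1.0
    for base in codon:
        result *= probs[base]
    return result


def stop_codon_frequency(gc):
    """Expected fraction of triplets that are stop codons."""
    return sum(codon_prob(codon, gc) for codon in STOP_CODONS)


def all_codon_probs(gc):
    """Probabilities of all 64 triplets, useful as a sanity check."""
    return {"".join(c): codon_prob(c, gc) for c in product("ACGT", repeat=3)}


if __name__ == "__main__":
    for gc in (0.3, 0.5, 0.7):
        print(f"GC={gc:.1f}  expected stop fraction={stop_codon_frequency(gc):.4f}")
```

Under these assumptions the three stop codons are AT-rich, so their expected frequency falls as GC content rises; at 50% GC it equals 3/64.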
Affiliation(s)
- Valentin Wesp
  - Department of Bioinformatics, Matthias Schleiden Institute, Friedrich Schiller University Jena, Ernst-Abbe-Platz 2, 07743, Jena, Germany
- Günter Theißen
  - Department of Genetics, Matthias Schleiden Institute, Friedrich Schiller University Jena, Philosophenweg 12, 07743, Jena, Germany
- Stefan Schuster
  - Department of Bioinformatics, Matthias Schleiden Institute, Friedrich Schiller University Jena, Ernst-Abbe-Platz 2, 07743, Jena, Germany
4
Kleinschmidt H, Xu C, Bai L. Using Synthetic DNA Libraries to Investigate Chromatin and Gene Regulation. Chromosoma 2023; 132:167-189. [PMID: 37184694 PMCID: PMC10542970 DOI: 10.1007/s00412-023-00796-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Revised: 04/25/2023] [Accepted: 04/26/2023] [Indexed: 05/16/2023]
Abstract
Despite the recent explosion in genome-wide studies in chromatin and gene regulation, we are still far from extracting a set of genetic rules that can predict the function of the regulatory genome. One major reason for this deficiency is that gene regulation is a multi-layered process that involves an enormous variable space, which cannot be fully explored using native genomes. This problem can be partially solved by introducing synthetic DNA libraries into cells, a method that can test the regulatory roles of thousands to millions of sequences with limited variables. Here, we review recent applications of this method to study transcription factor (TF) binding, nucleosome positioning, and transcriptional activity. We discuss the design principles, experimental procedures, and major findings from these studies and compare the pros and cons of different approaches.
Affiliation(s)
- Holly Kleinschmidt
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
- Cheng Xu
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
- Lu Bai
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
  - Department of Physics, The Pennsylvania State University, University Park, PA, 16802, USA
5
Rogalska ME, Vivori C, Valcárcel J. Regulation of pre-mRNA splicing: roles in physiology and disease, and therapeutic prospects. Nat Rev Genet 2023; 24:251-269. [PMID: 36526860 DOI: 10.1038/s41576-022-00556-8] [Citation(s) in RCA: 106] [Impact Index Per Article: 53.0] [Accepted: 11/10/2022] [Indexed: 12/23/2022]
Abstract
The removal of introns from mRNA precursors and its regulation by alternative splicing are key for eukaryotic gene expression and cellular function, as evidenced by the numerous pathologies induced or modified by splicing alterations. Major recent advances have been made in understanding the structures and functions of the splicing machinery, in the description and classification of physiological and pathological isoforms and in the development of the first therapies for genetic diseases based on modulation of splicing. Here, we review this progress and discuss important remaining challenges, including predicting splice sites from genomic sequences, understanding the variety of molecular mechanisms and logic of splicing regulation, and harnessing this knowledge for probing gene function and disease aetiology and for the design of novel therapeutic approaches.
Affiliation(s)
- Malgorzata Ewa Rogalska
  - Genome Biology Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Claudia Vivori
  - Genome Biology Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Department of Medicine and Life Sciences, Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - The Francis Crick Institute, London, UK
- Juan Valcárcel
  - Genome Biology Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Department of Medicine and Life Sciences, Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
6
Horn T, Gosliga A, Li C, Enculescu M, Legewie S. Position-dependent effects of RNA-binding proteins in the context of co-transcriptional splicing. NPJ Syst Biol Appl 2023; 9:1. [PMID: 36653378 PMCID: PMC9849329 DOI: 10.1038/s41540-022-00264-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2021] [Accepted: 12/08/2022] [Indexed: 01/19/2023] Open
Abstract
Alternative splicing is an important step in eukaryotic mRNA pre-processing which increases the complexity of gene expression programs, but is frequently altered in disease. Previous work on the regulation of alternative splicing has demonstrated that splicing is controlled by RNA-binding proteins (RBPs) and by epigenetic DNA/histone modifications which affect splicing by changing the speed of polymerase-mediated pre-mRNA transcription. The interplay of these different layers of splicing regulation is poorly understood. In this paper, we derived mathematical models describing how splicing decisions in a three-exon gene are made by combinatorial spliceosome binding to splice sites during ongoing transcription. We additionally take into account the effect of a regulatory RBP and find that the RBP binding position within the sequence is a key determinant of how RNA polymerase velocity affects splicing. Based on these results, we explain paradoxical observations in the experimental literature and further derive rules explaining why the same RBP can act as inhibitor or activator of cassette exon inclusion depending on its binding position. Finally, we derive a stochastic description of co-transcriptional splicing regulation at the single-cell level and show that splicing outcomes show little noise and follow a binomial distribution despite complex regulation by a multitude of factors. Taken together, our simulations demonstrate the robustness of splicing outcomes and reveal that quantitative insights into kinetic competition of co-transcriptional events are required to fully understand this important mechanism of gene expression diversity.
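To make the binomial statement concrete, here is a minimal sketch of what it implies when each transcript independently includes the cassette exon with the same probability. It does not reproduce the kinetic model described in the abstract; `n` (transcripts per cell), `p` (inclusion probability) and the function names are hypothetical.

```python
from math import comb


def inclusion_pmf(n, p):
    """Binomial probabilities of observing k = 0..n inclusion isoforms."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def psi_mean_and_variance(n, p):
    """Mean and variance of the per-cell inclusion fraction k / n."""
    pmf = inclusion_pmf(n, p)
    mean = sum((k / n) * q for k, q in enumerate(pmf))
    variance = sum(((k / n) - mean) ** 2 * q for k, q in enumerate(pmf))
    return mean, variance


if __name__ == "__main__":
    n, p = 20, 0.3  # illustrative values only
    mean, variance = psi_mean_and_variance(n, p)
    print(f"mean inclusion fraction = {mean:.3f}")
    print(f"variance = {variance:.4f} (closed form p(1-p)/n = {p * (1 - p) / n:.4f})")
```

In this simplified picture the cell-to-cell variance of the inclusion fraction is p(1 - p)/n, so it shrinks as the number of transcripts per cell grows.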
Affiliation(s)
- Timur Horn
  - Institute of Molecular Biology (IMB), Ackermannweg 4, 55128, Mainz, Germany
- Alison Gosliga
  - Institute of Molecular Biology (IMB), Ackermannweg 4, 55128, Mainz, Germany
  - University of Stuttgart, Department of Systems Biology and Stuttgart Research Center Systems Biology (SRCSB), Allmandring 31, 70569, Stuttgart, Germany
- Congxin Li
  - University of Stuttgart, Department of Systems Biology and Stuttgart Research Center Systems Biology (SRCSB), Allmandring 31, 70569, Stuttgart, Germany
- Mihaela Enculescu
  - Institute of Molecular Biology (IMB), Ackermannweg 4, 55128, Mainz, Germany
- Stefan Legewie
  - Institute of Molecular Biology (IMB), Ackermannweg 4, 55128, Mainz, Germany
  - University of Stuttgart, Department of Systems Biology and Stuttgart Research Center Systems Biology (SRCSB), Allmandring 31, 70569, Stuttgart, Germany
7
Application of massive parallel reporter analysis in biotechnology and medicine. КЛИНИЧЕСКАЯ ПРАКТИКА 2023. [DOI: 10.17816/clinpract115063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2023] Open
Abstract
The development and functioning of an organism relies on tissue-specific gene programs. Genome regulatory elements play a key role in the regulation of such programs, and disruptions in their function can lead to the development of various pathologies, including cancers, malformations and autoimmune diseases. The emergence of high-throughput genomic studies has led to massively parallel reporter analysis (MPRA) methods, which allow the functional verification and identification of regulatory elements on a genome-wide scale. Initially MPRA was used as a tool to investigate fundamental aspects of epigenetics, but the approach also has great potential for clinical and practical biotechnology. Currently, MPRA is used for validation of clinically significant mutations, identification of tissue-specific regulatory elements, search for the most promising loci for transgene integration, and is an indispensable tool for creating highly efficient expression systems, the range of application of which extends from approaches for protein development and design of next-generation therapeutic antibody superproducers to gene therapy. In this review, the main principles and areas of practical application of high-throughput reporter assays will be discussed.
8
Ding L, Li X, Zhu H, Luo H. Single-Cell Sequencing in Rheumatic Diseases: New Insights from the Perspective of the Cell Type. Aging Dis 2022; 13:1633-1651. [PMID: 36465169 PMCID: PMC9662270 DOI: 10.14336/ad.2022.0323] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/29/2021] [Accepted: 03/23/2022] [Indexed: 11/02/2023] Open
Abstract
Rheumatic diseases are a group of highly heterogeneous autoimmune and inflammatory disorders involving multiple systems. Dysfunction of immune and non-immune cells participates in the complex pathogenesis of rheumatic diseases. Studies on the abnormal activation of cell subtypes have therefore provided a specific basis for understanding this pathogenesis, improving the accuracy of disease diagnosis and the effectiveness of various treatments. However, individualized precision medicine remains a long way off because of heterogeneity among cell subtypes. To obtain biological information on cell subtypes, single-cell sequencing, a cutting-edge technology, is used to analyze their genomes, transcriptomes, epigenetics, and proteomics. Recent single-cell sequencing studies have identified multiple cell subtypes in tissues of patients with rheumatic diseases. Here, we provide an overview of recent applications of single-cell sequencing in rheumatic diseases and across tissues to understand these cell subtypes and their functions.
Affiliation(s)
- Liqing Ding
  - The Department of Rheumatology and Immunology, Xiangya Hospital of Central South University, Changsha, Hunan, China
- Xiaojing Li
  - The Department of Rheumatology and Immunology, Xiangya Hospital of Central South University, Changsha, Hunan, China
- Honglin Zhu
  - The Department of Rheumatology and Immunology, Xiangya Hospital of Central South University, Changsha, Hunan, China
  - Provincial Clinical Research Center for Rheumatic and Immunologic Diseases, Xiangya Hospital, Changsha, Hunan, China
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Changsha, Hunan, China
- Hui Luo
  - The Department of Rheumatology and Immunology, Xiangya Hospital of Central South University, Changsha, Hunan, China
  - Provincial Clinical Research Center for Rheumatic and Immunologic Diseases, Xiangya Hospital, Changsha, Hunan, China
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Changsha, Hunan, China
9
Mechanism and modeling of human disease-associated near-exon intronic variants that perturb RNA splicing. Nat Struct Mol Biol 2022; 29:1043-1055. [PMID: 36303034 DOI: 10.1038/s41594-022-00844-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/06/2021] [Accepted: 08/23/2022] [Indexed: 12/24/2022]
Abstract
It is estimated that 10%-30% of disease-associated genetic variants affect splicing. Splicing variants may generate deleteriously altered gene product and are potential therapeutic targets. However, systematic diagnosis or prediction of splicing variants is yet to be established, especially for the near-exon intronic splice region. The major challenge lies in the redundant and ill-defined branch sites and other splicing motifs therein. Here, we carried out unbiased massively parallel splicing assays on 5,307 disease-associated variants that overlapped with branch sites and collected 5,884 variants across the 5' splice region. We found that strong splice sites and exonic features preserve splicing from intronic sequence variation. Whereas the splice-altering mechanism of the 3' intronic variants is complex, that of the 5' is mainly splice-site destruction. Statistical learning combined with these molecular features allows precise prediction of altered splicing from an intronic variant. This statistical model provides the identity and ranking of biological features that determine splicing, which serves as transferable knowledge and out-performs the benchmarking predictive tool. Moreover, we demonstrated that intronic splicing variants may associate with disease risks in the human population. Our study elucidates the mechanism of splicing response of intronic variants, which classify disease-associated splicing variants for the promise of precision medicine.
10
Arora A, Castro-Gutierrez R, Moffatt C, Eletto D, Becker R, Brown M, Moor A, Russ HA, Taliaferro JM. High-throughput identification of RNA localization elements in neuronal cells. Nucleic Acids Res 2022; 50:10626-10642. [PMID: 36107770 PMCID: PMC9561290 DOI: 10.1093/nar/gkac763] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 01/01/2022] [Revised: 08/18/2022] [Accepted: 08/25/2022] [Indexed: 12/15/2022] Open
Abstract
Hundreds of RNAs are enriched in the projections of neuronal cells. For the vast majority of them, though, the sequence elements that regulate their localization are unknown. To identify RNA elements capable of directing transcripts to neurites, we deployed a massively parallel reporter assay that tested the localization regulatory ability of thousands of sequence fragments drawn from endogenous mouse 3' UTRs. We identified peaks of regulatory activity within several 3' UTRs and found that sequences derived from these peaks were both necessary and sufficient for RNA localization to neurites in mouse and human neuronal cells. The localization elements were enriched in adenosine and guanosine residues. They were at least tens to hundreds of nucleotides long as shortening of two identified elements led to significantly reduced activity. Using RNA affinity purification and mass spectrometry, we found that the RNA-binding protein Unk was associated with the localization elements. Depletion of Unk in cells reduced the ability of the elements to drive RNAs to neurites, indicating a functional requirement for Unk in their trafficking. These results provide a framework for the unbiased, high-throughput identification of RNA elements and mechanisms that govern transcript localization in neurons.
Affiliation(s)
- Ankita Arora
  - Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, USA
- Charlie Moffatt
  - Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, USA
- Davide Eletto
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- Raquel Becker
  - Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, USA
- Maya Brown
  - RNA Bioscience Initiative, University of Colorado Anschutz Medical Campus, USA
- Andreas E Moor
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- Holger A Russ
  - Barbara Davis Center for Diabetes, University of Colorado Anschutz Medical Campus, USA
- J Matthew Taliaferro
  - Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, USA
  - RNA Bioscience Initiative, University of Colorado Anschutz Medical Campus, USA
11
Mikl M, Eletto D, Nijim M, Lee M, Lafzi A, Mhamedi F, David O, Sain SB, Handler K, Moor A. A massively parallel reporter assay reveals focused and broadly encoded RNA localization signals in neurons. Nucleic Acids Res 2022; 50:10643-10664. [PMID: 36156153 PMCID: PMC9561380 DOI: 10.1093/nar/gkac806] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2021] [Revised: 08/24/2022] [Accepted: 09/08/2022] [Indexed: 11/14/2022] Open
Abstract
Asymmetric subcellular mRNA localization allows spatial regulation of gene expression and functional compartmentalization. In neurons, localization of specific mRNAs to neurites is essential for cellular functioning. However, it is largely unknown how transcript sorting works in a sequence-specific manner. Here, we combined subcellular transcriptomics and massively parallel reporter assays and tested ∼50 000 sequences for their ability to localize to neurites. Mapping the localization potential of >300 genes revealed two ways neurite targeting can be achieved: focused localization motifs and broadly encoded localization potential. We characterized the interplay between RNA stability and localization and identified motifs able to bias localization towards neurite or soma as well as the trans-acting factors required for their action. Based on our data, we devised machine learning models that were able to predict the localization behavior of novel reporter sequences. Testing this predictor on native mRNA sequencing data showed good agreement between predicted and observed localization potential, suggesting that the rules uncovered by our MPRA also apply to the localization of native full-length transcripts.
Affiliation(s)
- Martin Mikl
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Department of Human Biology, University of Haifa, Haifa, Israel
- Davide Eletto
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Malak Nijim
  - Department of Human Biology, University of Haifa, Haifa, Israel
- Minkyoung Lee
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Atefeh Lafzi
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Farah Mhamedi
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Orit David
  - Department of Human Biology, University of Haifa, Haifa, Israel
- Simona Baghai Sain
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Kristina Handler
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Andreas E Moor
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
12
Cortés-López M, Schulz L, Enculescu M, Paret C, Spiekermann B, Quesnel-Vallières M, Torres-Diz M, Unic S, Busch A, Orekhova A, Kuban M, Mesitov M, Mulorz MM, Shraim R, Kielisch F, Faber J, Barash Y, Thomas-Tikhonenko A, Zarnack K, Legewie S, König J. High-throughput mutagenesis identifies mutations and RNA-binding proteins controlling CD19 splicing and CART-19 therapy resistance. Nat Commun 2022; 13:5570. [PMID: 36138008 PMCID: PMC9500061 DOI: 10.1038/s41467-022-31818-y] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 10/07/2021] [Accepted: 07/05/2022] [Indexed: 11/29/2022] Open
Abstract
Following CART-19 immunotherapy for B-cell acute lymphoblastic leukaemia (B-ALL), many patients relapse due to loss of the cognate CD19 epitope. Since epitope loss can be caused by aberrant CD19 exon 2 processing, we herein investigate the regulatory code that controls CD19 splicing. We combine high-throughput mutagenesis with mathematical modelling to quantitatively disentangle the effects of all mutations in the region comprising CD19 exons 1-3. Thereupon, we identify ~200 single point mutations that alter CD19 splicing and thus could predispose B-ALL patients to developing CART-19 resistance. Furthermore, we report almost 100 previously unknown splice isoforms that emerge from cryptic splice sites and likely encode non-functional CD19 proteins. We further identify cis-regulatory elements and trans-acting RNA-binding proteins that control CD19 splicing (e.g., PTBP1 and SF3B4) and validate that loss of these factors leads to pervasive CD19 mis-splicing. Our dataset represents a comprehensive resource for identifying predictive biomarkers for CART-19 therapy. Multiple alternative splicing events in CD19 mRNA have been associated with resistance/relapse to CD19 CAR-T therapy in patients with B cell malignancies. Here, by combining patient data and a high-throughput mutagenesis screen, the authors identify single point mutations and RNA-binding proteins that can control CD19 splicing and be associated with CD19 CAR-T therapy resistance.
Collapse
Affiliation(s)
| | - Laura Schulz
- Institute of Molecular Biology (IMB), Ackermannweg 4, 55128, Mainz, Germany
| | - Mihaela Enculescu
- Institute of Molecular Biology (IMB), Ackermannweg 4, 55128, Mainz, Germany
| | - Claudia Paret
- Department of Pediatric Hematology/Oncology, Center for Pediatric and Adolescent Medicine, University Medical Center of the Johannes Gutenberg University Mainz, 55131, Mainz, Germany.,University Cancer Center (UCT), University Medical Center of the Johannes Gutenberg University Mainz, 55131, Mainz, Germany.,German Cancer Consortium (DKTK), site Frankfurt/Mainz, Germany, German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany
| | - Bea Spiekermann
- Institute of Molecular Biology (IMB), Ackermannweg 4, 55128, Mainz, Germany
| | - Mathieu Quesnel-Vallières
- Department of Genetics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, 19104, USA.,Department of Biochemistry and Biophysics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Manuel Torres-Diz
- Division of Cancer Pathobiology, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
| | - Sebastian Unic
- Department of Systems Biology, Institute for Biomedical Genetics (IBMG), University of Stuttgart, Allmandring 30E, 70569, Stuttgart, Germany
| | - Anke Busch
- Institute of Molecular Biology (IMB), Ackermannweg 4, 55128, Mainz, Germany
| | - Anna Orekhova
- Institute of Molecular Biology (IMB), Ackermannweg 4, 55128, Mainz, Germany
| | - Monika Kuban
- Department of Systems Biology, Institute for Biomedical Genetics (IBMG), University of Stuttgart, Allmandring 30E, 70569, Stuttgart, Germany
| | - Mikhail Mesitov
- Institute of Molecular Biology (IMB), Ackermannweg 4, 55128, Mainz, Germany
| | - Miriam M Mulorz
- Institute of Molecular Biology (IMB), Ackermannweg 4, 55128, Mainz, Germany
| | - Rawan Shraim
- Division of Cancer Pathobiology, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA.,Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, USA
| | - Fridolin Kielisch
- Institute of Molecular Biology (IMB), Ackermannweg 4, 55128, Mainz, Germany
| | - Jörg Faber
- Department of Pediatric Hematology/Oncology, Center for Pediatric and Adolescent Medicine, University Medical Center of the Johannes Gutenberg University Mainz, 55131, Mainz, Germany.,University Cancer Center (UCT), University Medical Center of the Johannes Gutenberg University Mainz, 55131, Mainz, Germany.,German Cancer Consortium (DKTK), site Frankfurt/Mainz, Germany, German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany
| | - Yoseph Barash
- Department of Genetics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Andrei Thomas-Tikhonenko
- Division of Cancer Pathobiology, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA.,Department of Pathology & Laboratory Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Kathi Zarnack
- Buchmann Institute for Molecular Life Sciences (BMLS), Max-von-Laue-Str. 15, 60438, Frankfurt, Germany. .,Faculty Biological Sciences, Goethe University Frankfurt, Max-von-Laue-Str. 15, 60438, Frankfurt, Germany.
| | - Stefan Legewie
- Institute of Molecular Biology (IMB), Ackermannweg 4, 55128, Mainz, Germany.,Department of Systems Biology, Institute for Biomedical Genetics (IBMG), University of Stuttgart, Allmandring 30E, 70569, Stuttgart, Germany.,Stuttgart Research Center for Systems Biology (SRCSB), University of Stuttgart, Stuttgart, Germany.
| | - Julian König
- Institute of Molecular Biology (IMB), Ackermannweg 4, 55128, Mainz, Germany.
| |
|
13
|
Vaknin I, Amit R. Molecular and experimental tools to design synthetic enhancers. Curr Opin Biotechnol 2022; 76:102728. [PMID: 35525178 DOI: 10.1016/j.copbio.2022.102728] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/16/2021] [Revised: 03/16/2022] [Accepted: 04/03/2022] [Indexed: 11/03/2022]
Abstract
Understanding the grammar of enhancers and how they regulate gene expression is key for both basic research and for the pharma and biotech industries. The design and characterization of synthetic enhancers can expand the known regulatory space. This is achieved using DNA Oligo Libraries (OLs), which allow up to millions of synthetic enhancer variants to be screened simultaneously. This review covers the latest commercial DNA OL synthesis technology and its capabilities, and offers a general 'know-how' guide for the design, construction, and analysis of OL-based synthetic enhancer characterization experiments. Specifically, we focus on synthetic-enhancer-based massively parallel reporter assays and Sort-seq methodologies (e.g. flow cytometry, deep sequencing), and briefly describe machine learning-based attempts at OL analysis and follow-up validation experiments.
Affiliation(s)
- Inbal Vaknin
- Department of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, Haifa 3200000, Israel
| | - Roee Amit
- Department of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, Haifa 3200000, Israel; The Russell Berrie Nanotechnology Institute, Technion - Israel Institute of Technology, Haifa 3200000, Israel.
| |
|
14
|
A broad analysis of splicing regulation in yeast using a large library of synthetic introns. PLoS Genet 2021; 17:e1009805. [PMID: 34570750 PMCID: PMC8496845 DOI: 10.1371/journal.pgen.1009805] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 06/02/2021] [Revised: 10/07/2021] [Accepted: 09/03/2021] [Indexed: 11/19/2022] Open
Abstract
RNA splicing is a key process in eukaryotic gene expression, in which an intron is spliced out of a pre-mRNA molecule to eventually produce a mature mRNA. Most intron-containing genes are constitutively spliced, hence efficient splicing of an intron is crucial for efficient regulation of gene expression. Here we use a large synthetic oligo library of ~20,000 variants to explore how different intronic sequence features affect splicing efficiency and mRNA expression levels in S. cerevisiae. Introns are defined by three functional sites, the 5’ donor site, the branch site, and the 3’ acceptor site. Using a combinatorial design of synthetic introns, we demonstrate how non-consensus splice site sequences in each of these sites affect splicing efficiency. We then show that S. cerevisiae splicing machinery tends to select alternative 3’ splice sites downstream of the original site, and we suggest that this tendency created a selective pressure, leading to the avoidance of cryptic splice site motifs near introns’ 3’ ends. We further use natural intronic sequences from other yeast species, whose splicing machineries have diverged to various extents, to show how intron architectures in the various species have been adapted to the organism’s splicing machinery. We suggest that the observed tendency for cryptic splicing is a result of a loss of a specific splicing factor, U2AF1. Lastly, we show that synthetic sequences containing two introns give rise to alternative RNA isoforms in S. cerevisiae, demonstrating that merely a synthetic fusion of two introns might suffice to facilitate alternative splicing in yeast. Our study reveals novel mechanisms by which introns are shaped in evolution to allow cells to regulate their transcriptome. In addition, it provides a valuable resource to study the regulation of constitutive and alternative splicing in a model organism.
RNA splicing is a process in which parts of a new pre-mRNA are spliced out of the mRNA molecule to eventually produce a mature mRNA. Those RNA segments that are spliced out are termed introns, and they are found in most genes in eukaryotic organisms. Hence regulation of this process has a major role in the control of gene expression. The budding yeast S. cerevisiae is a popular model organism for eukaryotic cell biology, but in terms of splicing it differs, as it has only a few intron-containing genes. Nevertheless, this species has been used to study basic principles of splicing regulation based on its ~300 introns. Here we used the technology of a large synthetic genetic library to introduce many new intron-containing genes to the yeast genome, to explore splicing regulation at a wider scope than was possible so far. Reassuringly, our results confirm known regulatory mechanisms, and further expand our understanding of splicing regulation, specifically how the yeast splicing machinery interacts with the end of introns, and how through evolution introns have evolved to avoid unwanted misidentifications of this end. We further demonstrate the potential of the yeast splicing machinery to alternatively splice a two-intron gene, which is common in other eukaryotes but rare in yeast. Our work presents a first-of-its-kind resource for the systematic study of splicing in live cells.
|
15
|
An extended catalogue of tandem alternative splice sites in human tissue transcriptomes. PLoS Comput Biol 2021; 17:e1008329. [PMID: 33826604 PMCID: PMC8055015 DOI: 10.1371/journal.pcbi.1008329] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/07/2020] [Revised: 04/19/2021] [Accepted: 03/22/2021] [Indexed: 12/18/2022] Open
Abstract
Tandem alternative splice sites (TASS) are a special class of alternative splicing events that are characterized by a close tandem arrangement of splice sites. Most TASS lack functional characterization and are believed to arise from splicing noise. Based on the RNA-seq data from the Genotype Tissue Expression project, we present an extended catalogue of TASS in healthy human tissues and analyze their tissue-specific expression. The expression of TASS is usually dominated by one major splice site (maSS), while the expression of minor splice sites (miSS) is at least an order of magnitude lower. Among 46k miSS with sufficient read support, 9k (20%) are significantly expressed above the expected noise level, and among them 2.5k are expressed tissue-specifically. We found significant correlations between tissue-specific expression of RNA-binding proteins (RBP), tissue-specific expression of miSS, and miSS response to RBP inactivation by shRNA. In combination with RBP profiling by eCLIP, this allowed prediction of novel cases of tissue-specific splicing regulation including a miSS in QKI mRNA that is likely regulated by PTBP1. The analysis of human primary cell transcriptomes suggested that both tissue-specific and cell-type-specific factors contribute to the regulation of miSS expression. More than 20% of tissue-specific miSS affect structured protein regions and may adjust protein-protein interactions or modify the stability of the protein core. The significantly expressed miSS evolve under the same selection pressure as maSS, while other miSS lack signatures of evolutionary selection and conservation. Using mixture models, we estimated that not more than 15% of maSS and not more than 54% of tissue-specific miSS are noisy, while the proportion of noisy splice sites among non-significantly expressed miSS is above 63%.
Pre-mRNA splicing is an important step in the processing of the genomic information during gene expression. During splicing, introns are excised from a gene transcript, and the remaining exons are ligated. Our work concerns one particular subtype of splicing, which involves the so-called tandem alternative splice sites, a group of closely located exon borders that are used alternatively. We analyzed RNA-seq measurements of gene expression provided by the Genotype-Tissue Expression (GTEx) project, the largest collection to date of such measurements in healthy human tissues, and constructed a detailed catalogue of tandem alternative splice sites. Within this catalogue, we characterized patterns of tissue-specific expression, regulation, impact on protein structure, and evolutionary selection acting on tandem alternative splice sites. In a number of genes, we predicted regulatory mechanisms that could be responsible for choosing one of many tandem alternative splice sites. The results of this study provide an invaluable resource for molecular biologists studying alternative splicing.
|
16
|
Cheng J, Çelik MH, Kundaje A, Gagneur J. MTSplice predicts effects of genetic variants on tissue-specific splicing. Genome Biol 2021; 22:94. [PMID: 33789710 PMCID: PMC8011109 DOI: 10.1186/s13059-021-02273-7] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 06/06/2020] [Accepted: 01/14/2021] [Indexed: 12/20/2022] Open
Abstract
We develop the free and open-source model Multi-tissue Splicing (MTSplice) to predict the effects of genetic variants on splicing of cassette exons in 56 human tissues. MTSplice combines MMSplice, which models constitutive regulatory sequences, with a new neural network that models tissue-specific regulatory sequences. MTSplice outperforms MMSplice on predicting tissue-specific variations associated with genetic variants in most tissues of the GTEx dataset, with largest improvements on brain tissues. Furthermore, MTSplice predicts that autism-associated de novo mutations are enriched for variants affecting splicing specifically in the brain. We foresee that MTSplice will aid interpreting variants associated with tissue-specific disorders.
Affiliation(s)
- Jun Cheng
- Department of Informatics, Technical University of Munich, Boltzmannstraße, Garching, 85748, Germany.
| | - Muhammed Hasan Çelik
- Department of Informatics, Technical University of Munich, Boltzmannstraße, Garching, 85748, Germany
| | - Anshul Kundaje
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Julien Gagneur
- Department of Informatics, Technical University of Munich, Boltzmannstraße, Garching, 85748, Germany.
- Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany.
- Institute of Human Genetics, Klinikum rechts der Isar, Technical University of Munich, Munich, Germany.
| |
|
17
|
Sciarrillo R, Wojtuszkiewicz A, Assaraf YG, Jansen G, Kaspers GJL, Giovannetti E, Cloos J. The role of alternative splicing in cancer: From oncogenesis to drug resistance. Drug Resist Updat 2020; 53:100728. [PMID: 33070093 DOI: 10.1016/j.drup.2020.100728] [Citation(s) in RCA: 153] [Impact Index Per Article: 30.6] [Received: 08/28/2020] [Revised: 09/17/2020] [Accepted: 09/21/2020] [Indexed: 12/15/2022]
Abstract
Alternative splicing is a tightly regulated process whereby non-coding sequences of pre-mRNA are removed and protein-coding segments are assembled in diverse combinations, ultimately giving rise to proteins with distinct or even opposing functions. In the past decade, whole genome/transcriptome sequencing studies revealed the high complexity of splicing regulation, which occurs co-transcriptionally and is influenced by chromatin status and mRNA modifications. Consequently, splicing profiles of both healthy and malignant cells display high diversity and alternative splicing was shown to be widely deregulated in multiple cancer types. In particular, mutations in pre-mRNA regulatory sequences, splicing regulators and chromatin modifiers, as well as differential expression of splicing factors are important contributors to cancer pathogenesis. It has become clear that these aberrations contribute to many facets of cancer, including oncogenic transformation, cancer progression, response to anticancer drug treatment as well as resistance to therapy. In this respect, alternative splicing was shown to perturb the expression of a broad spectrum of relevant genes involved in drug uptake/metabolism (e.g. SLC29A1, dCK, FPGS, and TP), activation of nuclear receptor pathways (e.g. GR, AR), regulation of apoptosis (e.g. MCL1, BCL-X, and FAS) and modulation of response to immunotherapy (CD19). Furthermore, aberrant splicing constitutes an important source of novel cancer biomarkers and the spliceosome machinery represents an attractive target for a novel and rapidly expanding class of therapeutic agents. Small molecule inhibitors targeting SF3B1 or splice factor kinases were highly cytotoxic against a wide range of cancer models, including drug-resistant cells. Importantly, these effects are enhanced in specific cancer subsets, such as splicing factor-mutated and c-MYC-driven tumors.
Furthermore, pre-clinical studies report synergistic effects of spliceosome modulators in combination with conventional antitumor agents. These strategies based on the use of low dose splicing modulators could shift the therapeutic window towards decreased toxicity in healthy tissues. Here we provide an extensive overview of the latest findings in the field of regulation of splicing in cancer, including molecular mechanisms by which cancer cells harness alternative splicing to drive oncogenesis and evade anticancer drug treatment as well as splicing-based vulnerabilities that can provide novel treatment opportunities. Furthermore, we discuss current challenges arising from genome-wide detection and prediction methods of aberrant splicing, as well as unravelling the functional relevance of the plethora of cancer-related splicing alterations.
Affiliation(s)
- Rocco Sciarrillo
- Department of Hematology, Amsterdam UMC, VU University Medical Center, Cancer Center Amsterdam, Amsterdam, Netherlands; Department of Pediatric Oncology, Emma's Children's Hospital, Amsterdam UMC, VU University Medical Center, Cancer Center Amsterdam, Amsterdam, Netherlands; Department of Medical Oncology, Amsterdam UMC, VU University Medical Center, Cancer Center Amsterdam, Amsterdam, Netherlands
| | - Anna Wojtuszkiewicz
- Department of Hematology, Amsterdam UMC, VU University Medical Center, Cancer Center Amsterdam, Amsterdam, Netherlands
| | - Yehuda G Assaraf
- The Fred Wyszkowski Cancer Research Laboratory, Department of Biology, Technion-Israel Institute of Technology, Haifa 3200003, Israel
| | - Gerrit Jansen
- Amsterdam Immunology and Rheumatology Center, Amsterdam UMC, VU University Medical Center, Cancer Center Amsterdam, Amsterdam, Netherlands
| | - Gertjan J L Kaspers
- Department of Pediatric Oncology, Emma's Children's Hospital, Amsterdam UMC, VU University Medical Center, Cancer Center Amsterdam, Amsterdam, Netherlands; Princess Máxima Center for Pediatric Oncology, Utrecht, Netherlands
| | - Elisa Giovannetti
- Department of Medical Oncology, Amsterdam UMC, VU University Medical Center, Cancer Center Amsterdam, Amsterdam, Netherlands; Fondazione Pisana per la Scienza, Pisa, Italy
| | - Jacqueline Cloos
- Department of Hematology, Amsterdam UMC, VU University Medical Center, Cancer Center Amsterdam, Amsterdam, Netherlands.
| |
|
18
|
Backers L, Parton B, De Bruyne M, Tavernier SJ, Van Den Bogaert K, Lambrecht BN, Haerynck F, Claes KBM. Missing heritability in Bloom syndrome: First report of a deep intronic variant leading to pseudo-exon activation in the BLM gene. Clin Genet 2020; 99:292-297. [PMID: 33073370 DOI: 10.1111/cge.13859] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/20/2020] [Revised: 09/17/2020] [Accepted: 09/30/2020] [Indexed: 12/16/2022]
Abstract
Pathogenic biallelic variants in the BLM/RECQL3 gene cause a rare autosomal recessive disorder called Bloom syndrome (BS). This syndrome is characterized by severe growth delay, immunodeficiency, dermatological manifestations and a predisposition to a wide variety of cancers, often multiple and very early in life. The literature shows that the main mode of BLM inactivation is protein translation termination. We expanded the molecular spectrum of BS by reporting the first deep intronic variant causing intron exonisation. We describe a patient with a clinical phenotype of BS and a strong increase in sister chromatid exchanges (SCE), who was found to be compound heterozygous for a novel nonsense variant c.3379C>T, p.(Gln1127Ter) in exon 18 and a deep intronic variant c.3020-258A>G in intron 15 of the BLM gene. The deep intronic variant creates a high-quality de novo donor splice site, which leads to retention of two intron segments. Both pseudo-exons introduce a premature stop codon into the reading frame and abolish BLM protein expression, as confirmed by Western Blot analysis. These findings illustrate the role of non-coding variation in Mendelian disorders and highlight an unmet need in routine testing of Mendelian disorders: RNA-based approaches add value by providing a complete molecular diagnosis.
Affiliation(s)
- Lynn Backers
- Center for Medical Genetics, Department of Biomolecular Medicine, Ghent University and Ghent University Hospital, Ghent, Belgium.,Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
| | - Bram Parton
- Center for Medical Genetics, Department of Biomolecular Medicine, Ghent University and Ghent University Hospital, Ghent, Belgium.,Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
| | - Marieke De Bruyne
- Center for Medical Genetics, Department of Biomolecular Medicine, Ghent University and Ghent University Hospital, Ghent, Belgium
| | - Simon J Tavernier
- Unit of Molecular Signal Transduction in Inflammation, VIB-UGent Center for Inflammation Research, Ghent, Belgium.,Department of Biomedical Molecular Biology, Ghent University, Ghent, Belgium
| | - Kris Van Den Bogaert
- Center for Human Genetics, University Hospitals Leuven - Catholic University Leuven, Leuven, Belgium
| | - Bart N Lambrecht
- Unit of Immunoregulation and Mucosal Immunology, VIB-UGent Center for Inflammation Research, Ghent, Belgium.,Department of Internal Medicine and Pediatrics, Ghent University, Ghent, Belgium
| | - Filomeen Haerynck
- Department of Internal Medicine and Pediatrics, Ghent University, Ghent, Belgium
| | - Kathleen B M Claes
- Center for Medical Genetics, Department of Biomolecular Medicine, Ghent University and Ghent University Hospital, Ghent, Belgium.,Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
| |
|
19
|
Mikl M, Pilpel Y, Segal E. High-throughput interrogation of programmed ribosomal frameshifting in human cells. Nat Commun 2020; 11:3061. [PMID: 32546731 PMCID: PMC7297798 DOI: 10.1038/s41467-020-16961-8] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 03/05/2020] [Accepted: 05/28/2020] [Indexed: 12/30/2022] Open
Abstract
Programmed ribosomal frameshifting (PRF) is the controlled slippage of the translating ribosome to an alternative frame. This process is widely employed by human viruses such as HIV and SARS coronavirus and is critical for their replication. Here, we developed a high-throughput approach to assess the frameshifting potential of a sequence. We designed and tested >12,000 sequences based on 15 viral and human PRF events, allowing us to systematically dissect the rules governing ribosomal frameshifting and discover novel regulatory inputs based on amino acid properties and tRNA availability. We assessed the natural variation in HIV gag-pol frameshifting rates by testing >500 clinical isolates and identified subtype-specific differences and associations between viral load in patients and the optimality of PRF rates. We devised computational models that accurately predict frameshifting potential and frameshifting rates, including subtle differences between HIV isolates. This approach can contribute to the development of antiviral agents targeting PRF.
Affiliation(s)
- Martin Mikl
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, 7610001, Israel.
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, 7610001, Israel.
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, 7610001, Israel.
- Department of Human Biology, Faculty of Natural Sciences, University of Haifa, Mount Carmel, Haifa, 31905, Israel.
| | - Yitzhak Pilpel
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, 7610001, Israel
| | - Eran Segal
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, 7610001, Israel.
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, 7610001, Israel.
| |
|
20
|
Gene Architecture and Sequence Composition Underpin Selective Dependency of Nuclear Export of Long RNAs on NXF1 and the TREX Complex. Mol Cell 2020; 79:251-267.e6. [PMID: 32504555 DOI: 10.1016/j.molcel.2020.05.013] [Citation(s) in RCA: 90] [Impact Index Per Article: 18.0] [Received: 11/04/2019] [Revised: 03/23/2020] [Accepted: 05/11/2020] [Indexed: 12/14/2022]
Abstract
The core components of the nuclear RNA export pathway are thought to be required for export of virtually all polyadenylated RNAs. Here, we depleted different proteins that act in nuclear export in human cells and quantified the transcriptome-wide consequences on RNA localization. Different genes exhibited substantially variable sensitivities, with depletion of NXF1 and TREX components causing some transcripts to become strongly retained in the nucleus while others were not affected. Specifically, NXF1 is preferentially required for export of single- or few-exon transcripts with long exons or high A/U content, whereas depletion of TREX complex components preferentially affects spliced and G/C-rich transcripts. Using massively parallel reporter assays, we identified short sequence elements that render transcripts dependent on NXF1 for their export and identified synergistic effects of splicing and NXF1. These results revise the current model of how nuclear export shapes the distribution of RNA within human cells.
|