1
|
Li J, Tan Y, Lu R, Liang P, Liu H, Yao X. Artificial intelligence for RNA-ligand interaction prediction: advances and prospects. Drug Discov Today 2025; 30:104366. [PMID: 40286982 DOI: 10.1016/j.drudis.2025.104366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2025] [Revised: 04/17/2025] [Accepted: 04/22/2025] [Indexed: 04/29/2025]
Abstract
Accurate prediction of RNA-ligand interactions is vital for understanding biological processes and advancing RNA-targeted drug discovery. Given their complexity, artificial intelligence (AI) is revolutionizing the study of RNA-ligand interactions, offering insights into the complex dynamics and therapeutic potential of RNA. In this review, we highlight advances in AI-driven RNA-ligand binding site identification, structure modeling, binding mode and binding affinity prediction, and virtual screening (VS). We also discuss key challenges, such as data set scarcity and modeling RNA flexibility. Future directions emphasize integrating cutting-edge AI techniques with physics-based models and expanding experimental data sets to enhance RNA-ligand interaction predictions.
Affiliation(s)
- Jing Li
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Science, Macao Polytechnic University, 999078 Macao, China
- Yi Tan
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Science, Macao Polytechnic University, 999078 Macao, China
- Ruiqiang Lu
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Science, Macao Polytechnic University, 999078 Macao, China
- Pengyu Liang
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Science, Macao Polytechnic University, 999078 Macao, China
- Huanxiang Liu
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Science, Macao Polytechnic University, 999078 Macao, China
- Xiaojun Yao
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Science, Macao Polytechnic University, 999078 Macao, China
|
2
|
Gaither J, Lin YH, Bundschuh R. RBPBind: Quantitative prediction of Protein-RNA interactions. J Mol Biol 2022; 434:167515. [DOI: 10.1016/j.jmb.2022.167515] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/29/2021] [Revised: 02/21/2022] [Accepted: 02/22/2022] [Indexed: 10/19/2022]
|
3
|
Shatoff E, Bundschuh R. dsRBPBind: modeling the effect of RNA secondary structure on double-stranded RNA-protein binding. Bioinformatics 2022; 38:687-693. [PMID: 34668517 DOI: 10.1093/bioinformatics/btab724] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/17/2021] [Revised: 09/15/2021] [Accepted: 10/15/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: RNA-binding proteins are fundamental to many cellular processes. Double-stranded RNA-binding proteins (dsRBPs) in particular are crucial for RNA interference, mRNA elongation, A-to-I editing, host defense, splicing and a multitude of other important mechanisms. Since dsRBPs require double-stranded RNA to bind, their binding affinity depends on the competition among all possible secondary structures of the target RNA molecule. Here, we introduce a quantitative model that allows calculation of the effective affinity of dsRBPs to any RNA given a principal affinity and the sequence of the RNA, while fully taking into account the entire secondary structure ensemble of the RNA. RESULTS: We implement our model within the ViennaRNA folding package while maintaining its O(N^3) time complexity. We validate our quantitative model by comparing with experimentally determined binding affinities and stoichiometries for transactivation response element RNA-binding protein (TRBP). We also find that the change in dsRBP binding affinity purely due to the presence of alternative RNA structures can be many orders of magnitude and that the predicted affinity of TRBP for pre-miRNA-like constructs correlates with experimentally measured processing rates. AVAILABILITY AND IMPLEMENTATION: Our modified version of the ViennaRNA package is available for download at http://bioserv.mps.ohio-state.edu/dsRBPBind, is free to use for research and educational purposes, and utilizes simple get/set methods for footprint size, concentration, cooperativity, principal dissociation constant and overlap.
Affiliation(s)
- Elan Shatoff
  - Department of Physics, The Ohio State University, Columbus, OH 43210, USA
  - Center for RNA Biology, The Ohio State University, Columbus, OH 43210, USA
- Ralf Bundschuh
  - Department of Physics, The Ohio State University, Columbus, OH 43210, USA
  - Center for RNA Biology, The Ohio State University, Columbus, OH 43210, USA
  - Department of Chemistry and Biochemistry, The Ohio State University, Columbus, OH 43210, USA
  - Division of Hematology, Department of Internal Medicine, The Ohio State University, Columbus, OH 43210, USA
|
4
|
Intrinsic disorder and phase transitions: Pieces in the puzzling role of the prion protein in health and disease. PROGRESS IN MOLECULAR BIOLOGY AND TRANSLATIONAL SCIENCE 2021; 183:1-43. [PMID: 34656326 DOI: 10.1016/bs.pmbts.2021.06.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/29/2022]
Abstract
After four decades of prion protein research, the pressing questions in the literature remain similar to the common existential dilemmas. Who am I? Some structural characteristics of the cellular prion protein (PrPC) and scrapie PrP (PrPSc) remain unknown: there are no high-resolution atomic structures for either full-length endogenous human PrPC or isolated infectious PrPSc particles. Why am I here? It is not known why PrPC and PrPSc are found in specific cellular compartments such as the nucleus; while the physiological functions of PrPC are still being uncovered, the misfolding site remains obscure. Where am I going? The subcellular distribution of PrPC and PrPSc is wide (reported in 10 different locations in the cell). This complexity is further exacerbated by the eight different PrP fragments yielded from conserved proteolytic cleavages and by reversible post-translational modifications, such as glycosylation, phosphorylation, and ubiquitination. Moreover, about 55 pathological mutations and 16 polymorphisms on the PrP gene (PRNP) have been described. Prion diseases also share unique, challenging features: strain phenomenon (associated with the heterogeneity of PrPSc conformations) and the possible transmissibility between species, factors which contribute to PrP undruggability. However, two recent concepts in biochemistry (intrinsically disordered proteins and phase transitions) may shed light on the molecular basis of PrP's role in physiology and disease.
|
5
|
Yan Z, Hamilton WL, Blanchette M. Graph neural representational learning of RNA secondary structures for predicting RNA-protein interactions. Bioinformatics 2021; 36:i276-i284. [PMID: 32657407 PMCID: PMC7355240 DOI: 10.1093/bioinformatics/btaa456] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Indexed: 12/14/2022] Open
Abstract
Motivation: RNA-protein interactions are key effectors of post-transcriptional regulation. Significant experimental and bioinformatics efforts have been expended on characterizing protein binding mechanisms on the molecular level, and on highlighting the sequence and structural traits of RNA that impact the binding specificity for different proteins. Yet our ability to predict these interactions in silico remains relatively poor. Results: In this study, we introduce RPI-Net, a graph neural network approach for RNA-protein interaction prediction. RPI-Net learns and exploits a graph representation of RNA molecules, yielding significant performance gains over existing state-of-the-art approaches. We also introduce an approach to rectify an important type of sequence bias caused by the RNase T1 enzyme used in many CLIP-Seq experiments, and we show that correcting this bias is essential in order to learn meaningful predictors and properly evaluate their accuracy. Finally, we provide new approaches to interpret the trained models and extract simple, biologically interpretable representations of the learned sequence and structural motifs. Availability and implementation: Source code can be accessed at https://www.github.com/HarveyYan/RNAonGraph. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zichao Yan
  - School of Computer Science, McGill University, Montreal, QC H3A 2B2, Canada
  - MILA, Quebec AI Institute, Montreal, QC H2S 3H1, Canada
- William L Hamilton
  - School of Computer Science, McGill University, Montreal, QC H3A 2B2, Canada
  - MILA, Quebec AI Institute, Montreal, QC H2S 3H1, Canada
- Mathieu Blanchette
  - School of Computer Science, McGill University, Montreal, QC H3A 2B2, Canada
|
6
|
Shatoff E, Bundschuh R. Single nucleotide polymorphisms affect RNA-protein interactions at a distance through modulation of RNA secondary structures. PLoS Comput Biol 2020; 16:e1007852. [PMID: 32379750 PMCID: PMC7237046 DOI: 10.1371/journal.pcbi.1007852] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 10/23/2019] [Revised: 05/19/2020] [Accepted: 04/06/2020] [Indexed: 11/19/2022] Open
Abstract
Single nucleotide polymorphisms are widely associated with disease, but the ways in which they cause altered phenotypes are often unclear, especially when they appear in non-coding regions. One way in which non-coding polymorphisms could cause disease is by affecting crucial RNA-protein interactions. While it is clear that changing a protein binding motif will alter protein binding, it has been shown that single nucleotide polymorphisms can affect RNA secondary structure, and here we show that single nucleotide polymorphisms can affect RNA-protein interactions from outside binding motifs through altered RNA secondary structure. By using a modified version of the Vienna Package and PAR-CLIP data for HuR (ELAVL1) in humans we characterize the genome-wide effect of single nucleotide polymorphisms on HuR binding and show that they can have a many-fold effect on the affinity of HuR binding to RNA transcripts from tens of bases away. We also find some evidence that the effect of single nucleotide polymorphisms on protein binding might be under selection, with the non-reference alleles tending to make it harder for a protein to bind.
Affiliation(s)
- Elan Shatoff
  - Department of Physics, The Ohio State University, Columbus, Ohio, United States of America
  - Center for RNA Biology, The Ohio State University, Columbus, Ohio, United States of America
- Ralf Bundschuh
  - Department of Physics, The Ohio State University, Columbus, Ohio, United States of America
  - Center for RNA Biology, The Ohio State University, Columbus, Ohio, United States of America
  - Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio, United States of America
  - Division of Hematology, Department of Internal Medicine, The Ohio State University, Columbus, Ohio, United States of America
|
7
|
Alaidi O, Aboul‐ela F. Statistical mechanical prediction of ligand perturbation to RNA secondary structure and application to riboswitches. J Comput Chem 2020; 41:1521-1537. [DOI: 10.1002/jcc.26195] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/26/2019] [Revised: 01/03/2020] [Accepted: 03/09/2020] [Indexed: 02/04/2023]
Affiliation(s)
- Osama Alaidi
  - Biocomplexity for Research and Consulting, Cairo, Egypt
- Fareed Aboul‐ela
  - Center for X‐Ray Determination of the Structure of Matter, Zewail City of Science and Technology, Giza, Egypt
|
8
|
Sanchez de Groot N, Armaos A, Graña-Montes R, Alriquet M, Calloni G, Vabulas RM, Tartaglia GG. RNA structure drives interaction with proteins. Nat Commun 2019; 10:3246. [PMID: 31324771 PMCID: PMC6642211 DOI: 10.1038/s41467-019-10923-5] [Citation(s) in RCA: 108] [Impact Index Per Article: 18.0] [Received: 08/13/2018] [Accepted: 06/10/2019] [Indexed: 12/12/2022] Open
Abstract
The combination of high-throughput sequencing and in vivo crosslinking approaches leads to the progressive uncovering of the complex interdependence between cellular transcriptome and proteome. Yet, the molecular determinants governing interactions in protein-RNA networks are not well understood. Here we investigated the relationship between the structure of an RNA and its ability to interact with proteins. Analysing in silico, in vitro and in vivo experiments, we find that the amount of double-stranded regions in an RNA correlates with the number of protein contacts. This relationship, which we call structure-driven protein interactivity, allows classification of RNA types, plays a role in gene regulation and could have implications for the formation of phase-separated ribonucleoprotein assemblies. We validate our hypothesis by showing that a highly structured RNA can rearrange the composition of a protein aggregate. We report that the tendency of proteins to phase-separate is reduced by interactions with specific RNAs.
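As a rough illustration of the quantity this abstract refers to, the sketch below computes the fraction of paired nucleotides in a dot-bracket secondary structure string. The function name `paired_fraction` and the choice of dot-bracket notation are assumptions made here for illustration; the abstract does not state how the authors quantify double-stranded content.

```python
def paired_fraction(dot_bracket: str) -> float:
    """Return the fraction of positions that are base-paired.

    Assumes plain dot-bracket notation: '(' and ')' mark paired
    nucleotides, '.' marks unpaired ones (no pseudoknot symbols).
    """
    if not dot_bracket:
        raise ValueError("structure string must not be empty")
    depth = 0   # number of currently open base pairs
    paired = 0  # count of paired positions seen so far
    for ch in dot_bracket:
        if ch == "(":
            depth += 1
            paired += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced structure: extra ')'")
            paired += 1
        elif ch != ".":
            raise ValueError(f"unexpected symbol: {ch!r}")
    if depth != 0:
        raise ValueError("unbalanced structure: unclosed '('")
    return paired / len(dot_bracket)


if __name__ == "__main__":
    # A 6-nt hairpin with two base pairs and a 2-nt loop.
    print(paired_fraction("((..))"))
```

A score like this could be compared against the number of protein contacts per RNA, which is the kind of correlation the abstract describes; any such comparison would need the authors' actual data and definitions.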
Affiliation(s)
- Natalia Sanchez de Groot
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, 08003, Barcelona, Spain
- Alexandros Armaos
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, 08003, Barcelona, Spain
- Ricardo Graña-Montes
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, 08003, Barcelona, Spain
  - Department of Biochemistry, University of Zürich, Winterthurerstrasse 190, 8057, Zürich, Switzerland
- Marion Alriquet
  - Buchmann Institute for Molecular Life Sciences, Goethe University Frankfurt, 60438, Frankfurt am Main, Germany
  - Institute of Biophysical Chemistry, Goethe University Frankfurt, 60438, Frankfurt am Main, Germany
- Giulia Calloni
  - Buchmann Institute for Molecular Life Sciences, Goethe University Frankfurt, 60438, Frankfurt am Main, Germany
  - Institute of Biophysical Chemistry, Goethe University Frankfurt, 60438, Frankfurt am Main, Germany
- R Martin Vabulas
  - Buchmann Institute for Molecular Life Sciences, Goethe University Frankfurt, 60438, Frankfurt am Main, Germany
  - Institute of Biophysical Chemistry, Goethe University Frankfurt, 60438, Frankfurt am Main, Germany
- Gian Gaetano Tartaglia
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, 08003, Barcelona, Spain
  - ICREA 23 Passeig Lluis Companys 08010 and Universitat Pompeu Fabra (UPF), 08003, Barcelona, Spain
  - Department of Biology 'Charles Darwin', Sapienza University of Rome, P.le A. Moro 5, Rome, 00185, Italy
  - Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Via Morego 30, 16163, Genoa, Italy
|
9
|
Jarmoskaite I, Denny SK, Vaidyanathan PP, Becker WR, Andreasson JOL, Layton CJ, Kappel K, Shivashankar V, Sreenivasan R, Das R, Greenleaf WJ, Herschlag D. A Quantitative and Predictive Model for RNA Binding by Human Pumilio Proteins. Mol Cell 2019; 74:966-981.e18. [PMID: 31078383 DOI: 10.1016/j.molcel.2019.04.012] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 09/13/2018] [Revised: 01/31/2019] [Accepted: 04/05/2019] [Indexed: 01/09/2023]
Abstract
High-throughput methodologies have enabled routine generation of RNA target sets and sequence motifs for RNA-binding proteins (RBPs). Nevertheless, quantitative approaches are needed to capture the landscape of RNA-RBP interactions responsible for cellular regulation. We have used the RNA-MaP platform to directly measure equilibrium binding for thousands of designed RNAs and to construct a predictive model for RNA recognition by the human Pumilio proteins PUM1 and PUM2. Despite prior findings of linear sequence motifs, our measurements revealed widespread residue flipping and instances of positional coupling. Application of our thermodynamic model to published in vivo crosslinking data reveals quantitative agreement between predicted affinities and in vivo occupancies. Our analyses suggest a thermodynamically driven, continuous Pumilio-binding landscape that is negligibly affected by RNA structure or kinetic factors, such as displacement by ribosomes. This work provides a quantitative foundation for dissecting the cellular behavior of RBPs and cellular features that impact their occupancies.
Affiliation(s)
- Inga Jarmoskaite
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305, USA
- Sarah K Denny
  - Biophysics Program, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Scribe Therapeutics, Berkeley, CA, 94704, USA
- Winston R Becker
  - Biophysics Program, Stanford University School of Medicine, Stanford, CA 94305, USA
- Johan O L Andreasson
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Curtis J Layton
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Kalli Kappel
  - Biophysics Program, Stanford University School of Medicine, Stanford, CA 94305, USA
- Raashi Sreenivasan
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305, USA
- Rhiju Das
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305, USA
- William J Greenleaf
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Department of Applied Physics, Stanford University, Stanford, CA 94305, USA
  - Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
- Daniel Herschlag
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Department of Chemistry, Stanford University, Stanford, CA 94305, USA
  - Department of Chemical Engineering, Stanford University, Stanford, CA 94305, USA
  - ChEM-H Institute, Stanford University, Stanford, CA 94305, USA
|
10
|
Pan X, Rijnbeek P, Yan J, Shen HB. Prediction of RNA-protein sequence and structure binding preferences using deep convolutional and recurrent neural networks. BMC Genomics 2018; 19:511. [PMID: 29970003 PMCID: PMC6029131 DOI: 10.1186/s12864-018-4889-1] [Citation(s) in RCA: 151] [Impact Index Per Article: 21.6] [Received: 09/27/2017] [Accepted: 06/19/2018] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND: RNA regulation is significantly dependent on its binding protein partners, known as RNA-binding proteins (RBPs). Unfortunately, the binding preferences for most RBPs are still not well characterized. Interdependencies between sequence and secondary structure specificities are challenging for both predicting RBP binding sites and accurately detecting sequence and structure motifs. RESULTS: In this study, we propose a deep learning-based method, iDeepS, to simultaneously identify the binding sequence and structure motifs from RNA sequences using convolutional neural networks (CNNs) and a bidirectional long short-term memory network (BLSTM). We first perform one-hot encoding for both the sequence and predicted secondary structure, to enable subsequent convolution operations. To reveal the hidden binding knowledge from the observed sequences, the CNNs are applied to learn the abstract features. Considering the close relationship between sequence and predicted structures, we use the BLSTM to capture possible long range dependencies between binding sequence and structure motifs identified by the CNNs. Finally, the learned weighted representations are fed into a classification layer to predict the RBP binding sites. We evaluated iDeepS on verified RBP binding sites derived from large-scale representative CLIP-seq datasets. The results demonstrate that iDeepS can reliably predict the RBP binding sites on RNAs, and outperforms the state-of-the-art methods. An important advantage compared to other methods is that iDeepS can automatically extract both binding sequence and structure motifs, which will improve our understanding of the mechanisms of binding specificities of RBPs. CONCLUSION: Our study shows that the iDeepS method identifies the sequence and structure motifs to accurately predict RBP binding sites. iDeepS is available at https://github.com/xypan1232/iDeepS.
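The one-hot encoding step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the iDeepS implementation: the alphabets, the function names `one_hot` and `encode_rna`, and the use of dot-bracket symbols for the structure track are all assumptions made here, and the actual tool may use a different structure annotation.

```python
from typing import List, Tuple

SEQ_ALPHABET = "ACGU"     # assumed nucleotide alphabet
STRUCT_ALPHABET = "()."   # assumed dot-bracket structure alphabet


def one_hot(symbols: str, alphabet: str) -> List[List[int]]:
    """Encode each symbol as a 0/1 row.

    Symbols outside the alphabet (for example 'N') give an all-zero row.
    """
    index = {ch: i for i, ch in enumerate(alphabet)}
    rows = []
    for ch in symbols:
        row = [0] * len(alphabet)
        if ch in index:
            row[index[ch]] = 1
        rows.append(row)
    return rows


def encode_rna(sequence: str, structure: str) -> Tuple[List[List[int]], List[List[int]]]:
    """Return separate one-hot matrices for the sequence and the structure."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    # Normalise case and treat DNA-style 'T' as 'U'.
    seq = sequence.upper().replace("T", "U")
    return one_hot(seq, SEQ_ALPHABET), one_hot(structure, STRUCT_ALPHABET)


if __name__ == "__main__":
    seq_rows, struct_rows = encode_rna("GCAUN", "(..).")
    print(seq_rows)
    print(struct_rows)
```

Keeping the two matrices separate leaves it open whether a downstream model convolves over each track independently or over their concatenation; the abstract does not say which is used.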
Affiliation(s)
- Xiaoyong Pan
  - Department of Medical Informatics, Erasmus Medical Center, Rotterdam, The Netherlands
- Peter Rijnbeek
  - Department of Medical Informatics, Erasmus Medical Center, Rotterdam, The Netherlands
- Junchi Yan
  - Institute of Software Engineering, East China Normal University, Shanghai, China
- Hong-Bin Shen
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China
|
11
|
Findeiß S, Etzel M, Will S, Mörl M, Stadler PF. Design of Artificial Riboswitches as Biosensors. SENSORS 2017; 17:s17091990. [PMID: 28867802 PMCID: PMC5621056 DOI: 10.3390/s17091990] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 08/08/2017] [Revised: 08/23/2017] [Accepted: 08/25/2017] [Indexed: 12/11/2022]
Abstract
RNA aptamers readily recognize small organic molecules, polypeptides, as well as other nucleic acids in a highly specific manner. Many such aptamers have evolved as parts of regulatory systems in nature. Experimental selection techniques such as SELEX have been very successful in finding artificial aptamers for a wide variety of natural and synthetic ligands. Changes in structure and/or stability of aptamers upon ligand binding can propagate through larger RNA constructs and cause specific structural changes at distal positions. In turn, these may affect transcription, translation, splicing, or binding events. The RNA secondary structure model realistically describes both thermodynamic and kinetic aspects of RNA structure formation and refolding at a single, consistent level of modelling. Thus, this framework allows studying the function of natural riboswitches in silico. Moreover, it enables rationally designing artificial switches, combining essentially arbitrary sensors with a broad choice of read-out systems. Eventually, this approach sets the stage for constructing versatile biosensors.
Affiliation(s)
- Sven Findeiß
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University Leipzig, Härtelstraße 16-18, 04107 Leipzig, Germany
  - Faculty of Computer Science, Research Group Bioinformatics and Computational Biology, University of Vienna, Währingerstraße 29, A-1090 Vienna, Austria
  - Faculty of Chemistry, Department of Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Vienna, Austria
- Maja Etzel
  - Institute for Biochemistry, Leipzig University, Brüderstraße 34, 04103 Leipzig, Germany
- Sebastian Will
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University Leipzig, Härtelstraße 16-18, 04107 Leipzig, Germany
  - Faculty of Chemistry, Department of Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Vienna, Austria
  - Institute for Biochemistry, Leipzig University, Brüderstraße 34, 04103 Leipzig, Germany
- Mario Mörl
  - Institute for Biochemistry, Leipzig University, Brüderstraße 34, 04103 Leipzig, Germany
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University Leipzig, Härtelstraße 16-18, 04107 Leipzig, Germany
  - Faculty of Chemistry, Department of Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Vienna, Austria
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, 04103 Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, 04103 Leipzig, Germany
  - Fraunhofer Institute for Cell Therapy and Immunology, Perlickstrasse 1, 04103 Leipzig, Germany
  - Center for RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg, Denmark
  - Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA
|
12
|
Pietrosanto M, Mattei E, Helmer-Citterich M, Ferrè F. A novel method for the identification of conserved structural patterns in RNA: From small scale to high-throughput applications. Nucleic Acids Res 2016; 44:8600-8609. [PMID: 27580722 PMCID: PMC5062999 DOI: 10.1093/nar/gkw750] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 05/23/2016] [Accepted: 08/17/2016] [Indexed: 12/21/2022] Open
Abstract
Functional RNA regions are often related to recurrent secondary structure patterns (or motifs), which can exert their role in several different ways, particularly in dictating the interaction with RNA-binding proteins, and acting in the regulation of a large number of cellular processes. Among the available motif-finding tools, the majority focus on sequence patterns, sometimes including secondary structure as additional constraints to improve their performance. Nonetheless, secondary structure motifs may be concurrent with their sequence counterparts or even encode a stronger functional signal. Current methods for searching structural motifs generally require long pipelines and/or high computational efforts or previously aligned sequences. Here, we present BEAM (BEAr Motif finder), a novel method for structural motif discovery from a set of unaligned RNAs, taking advantage of a recently developed encoding for RNA secondary structure named BEAR (Brand nEw Alphabet for RNAs) and of evolutionary substitution rates of secondary structure elements. Tested in a varied set of scenarios, from small- to large-scale, BEAM is successful in retrieving structural motifs even in highly noisy data sets, such as those that can arise in CLIP-Seq or other high-throughput experiments.
Affiliation(s)
- Marco Pietrosanto
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Eugenio Mattei
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Manuela Helmer-Citterich
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Fabrizio Ferrè
  - Department of Pharmacy and Biotechnology (FaBiT), University of Bologna Alma Mater, Via Belmeloro 8/2, 40126 Bologna, Italy
|
13
|
Abstract
Long non-coding RNAs (lncRNAs) are associated to a plethora of cellular functions, most of which require the interaction with one or more RNA-binding proteins (RBPs); similarly, RBPs are often able to bind a large number of different RNAs. The currently available knowledge is already drawing an intricate network of interactions, whose deregulation is frequently associated to pathological states. Several different techniques were developed in the past years to obtain protein–RNA binding data in a high-throughput fashion. In parallel, in silico inference methods were developed for the accurate computational prediction of the interaction of RBP–lncRNA pairs. The field is growing rapidly, and it is foreseeable that in the near future, the protein–lncRNA interaction network will rise, offering essential clues for a better understanding of lncRNA cellular mechanisms and their disease-associated perturbations.
|
14
|
Reyes-Herrera PH, Ficarra E. Computational Methods for CLIP-seq Data Processing. Bioinform Biol Insights 2014; 8:199-207. [PMID: 25336930 PMCID: PMC4196881 DOI: 10.4137/bbi.s16803] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 05/11/2014] [Revised: 07/29/2014] [Accepted: 08/01/2014] [Indexed: 12/25/2022] Open
Abstract
RNA-binding proteins (RBPs) are at the core of post-transcriptional regulation and thus of gene expression control at the RNA level. One of the principal challenges in the field of gene expression regulation is to understand RBPs' mechanism of action. As a result of the recent evolution of experimental techniques, it is now possible to obtain the RNA regions recognized by RBPs on a transcriptome-wide scale. In fact, CLIP-seq protocols use the joint action of CLIP, crosslinking immunoprecipitation, and high-throughput sequencing to recover the transcriptome-wide set of interaction regions for a particular protein. Nevertheless, computational methods are necessary to process CLIP-seq experimental data and are a key to advancement in the understanding of gene regulatory mechanisms. Considering the importance of computational methods in this area, we present a review of the current status of computational approaches used and proposed for CLIP-seq data.
Affiliation(s)
- Paula H Reyes-Herrera
  - Facultad de Ingeniería Electrónica y Biomédica, Universidad Antonio Nariño, Bogotá, Colombia
- Elisa Ficarra
  - Department of Control and Computer Engineering, Politecnico di Torino, TO, Italy
|
15
|
MicroRNA-519a demonstrates significant tumour suppressive activity in laryngeal squamous cells by targeting anti-carcinoma HuR gene. The Journal of Laryngology & Otology 2013; 127:1194-202. [DOI: 10.1017/s0022215113003174] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 01/05/2023]
Abstract
Objective: This study investigated the expression and functional effects, and related molecular mechanisms, of microRNA-519a in laryngeal squamous cell carcinoma. Methods: MicroRNA-519a and HuR messenger RNA in laryngeal squamous cell carcinoma were measured using reverse transcription polymerase chain reaction. MicroRNA-519a effects on the growth of human epithelial type 2 cells were tested using an MTT assay. The influence of microRNA-519a on the expression levels of HuR and other related genes in protein was tested by Western blotting. Cell cycle analyses were performed using flow cytometry. Associations between expression levels and patients' clinical parameters were analysed with Pearson correlation analysis. Results: Expression of microRNA-519a in laryngeal squamous cell carcinoma tissues was significantly lower than in adjacent non-cancerous tissues. The expression of microRNA-519a was negatively associated with histological differentiation, tumour–node–metastasis stage, lymphatic metastasis and disease-free survival time. After increasing the level of microRNA-519a in laryngeal squamous cell carcinoma human epithelial type 2 cells, cell growth was inhibited and cell cycle was arrested in the G2/M phase. MicroRNA-519a down-regulated HuR gene expression in protein levels without affecting messenger RNA levels. Conclusion: MicroRNA-519a may function as a tumour suppressor by inhibiting HuR expression, and may serve as a therapeutic target for laryngeal squamous cell carcinoma.
|
16
|
Li X, Kazan H, Lipshitz HD, Morris QD. Finding the target sites of RNA-binding proteins. WILEY INTERDISCIPLINARY REVIEWS-RNA 2013; 5:111-30. [PMID: 24217996 PMCID: PMC4253089 DOI: 10.1002/wrna.1201] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/15/2013] [Revised: 09/27/2013] [Accepted: 10/01/2013] [Indexed: 12/15/2022]
Abstract
RNA–protein interactions differ from DNA–protein interactions because of the central role of RNA secondary structure. Some RNA-binding domains (RBDs) recognize their target sites mainly by their shape and geometry and others are sequence-specific but are sensitive to secondary structure context. A number of small- and large-scale experimental approaches have been developed to measure RNAs associated in vitro and in vivo with RNA-binding proteins (RBPs). Generalizing outside of the experimental conditions tested by these assays requires computational motif finding. Often RBP motif finding is done by adapting DNA motif finding methods; but modeling secondary structure context leads to better recovery of RBP-binding preferences. Genome-wide assessment of mRNA secondary structure has recently become possible, but these data must be combined with computational predictions of secondary structure before they add value in predicting in vivo binding. There are two main approaches to incorporating structural information into motif models: supplementing primary sequence motif models with preferred secondary structure contexts (e.g., MEMERIS and RNAcontext) and directly modeling secondary structure recognized by the RBP using stochastic context-free grammars (e.g., CMfinder and RNApromo). The former better reconstruct known binding preferences for sequence-specific RBPs but are not suitable for modeling RBPs that recognize shape and geometry of RNAs. Future work in RBP motif finding should incorporate interactions between multiple RBDs and multiple RBPs in binding to RNA. WIREs RNA 2014, 5:111–130. doi: 10.1002/wrna.1201
Affiliation(s)
- Xiao Li
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
|
17
|
Dieterich C, Stadler PF. Computational biology of RNA interactions. WILEY INTERDISCIPLINARY REVIEWS-RNA 2012; 4:107-20. [PMID: 23139167 DOI: 10.1002/wrna.1147] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]
Abstract
The biodiversity of the RNA world has been underestimated for decades. RNA molecules are key building blocks, sensors, and regulators of modern cells. The biological function of RNA molecules cannot be separated from their ability to bind to and interact with a wide space of chemical species, including small molecules, nucleic acids, and proteins. Computational chemists, physicists, and biologists have developed a rich tool set for modeling and predicting RNA interactions. These interactions are to some extent determined by the binding conformation of the RNA molecule. RNA binding conformations are approximated with often acceptable accuracy by sequence and secondary structure motifs. Secondary structure ensembles of a given RNA molecule can be efficiently computed in many relevant situations by employing a standard energy model for base pair interactions and dynamic programming techniques. The case of bi-molecular RNA-RNA interactions can be seen as an extension of this approach. However, unbiased transcriptome-wide scans for local RNA-RNA interactions are computationally challenging yet become efficient if the binding motif/mode is known and other external information can be used to confine the search space. Computational methods are less developed for proteins and small molecules, which bind to RNA with very high specificity. Binding descriptors of proteins are usually determined by in vitro high-throughput assays (e.g., microarrays or sequencing). Intriguingly, recent experimental advances, which are mostly based on light-induced cross-linking of binding partners, render in vivo binding patterns accessible yet require new computational methods for careful data interpretation. The grand challenge is to model the in vivo situation where a complex interplay of RNA binders competes for the same target RNA molecule. Evidently, bioinformaticians are just catching up with the impressive pace of these developments.
Affiliation(s)
- Christoph Dieterich
- Berlin Institute for Medical Systems Biology, Max Delbrück Centre for Molecular Medicine, Robert-Rössle-Straße 10, Berlin, Germany.
|
18
|
Bompfünewerer AF, Flamm C, Fried C, Fritzsch G, Hofacker IL, Lehmann J, Missal K, Mosig A, Müller B, Prohaska SJ, Stadler BMR, Stadler PF, Tanzer A, Washietl S, Witwer C. Evolutionary patterns of non-coding RNAs. Theory Biosci 2012; 123:301-69. [PMID: 18202870 DOI: 10.1016/j.thbio.2005.01.002] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2004] [Accepted: 01/24/2005] [Indexed: 01/04/2023]
Abstract
A plethora of new functions of non-coding RNAs (ncRNAs) have been discovered in the past few years. In fact, RNA is emerging as the central player in cellular regulation, taking on active roles in multiple regulatory layers from transcription, RNA maturation, and RNA modification to translational regulation. Nevertheless, very little is known about the evolution of this "Modern RNA World" and its components. In this contribution, we attempt to provide at least a cursory overview of the diversity of ncRNAs and functional RNA motifs in non-translated regions of regular messenger RNAs (mRNAs) with an emphasis on evolutionary questions. This survey is complemented by an in-depth analysis of examples from different classes of RNAs focusing mostly on their evolution in the vertebrate lineage. We present a survey of Y RNA genes in vertebrates and study the molecular evolution of the U7 snRNA, the snoRNAs E1/U17, E2, and E3, the Y RNA family, the let-7 microRNA (miRNA) family, and the mRNA-like evf-1 gene. We furthermore discuss the statistical distribution of miRNAs in metazoans, which suggests an explosive increase in the miRNA repertoire in vertebrates. The analysis of the transcription of ncRNAs suggests that small RNAs in general are genetically mobile in the sense that their association with a host gene (e.g. when transcribed from introns of an mRNA) can change on evolutionary time scales. The let-7 family demonstrates that even the mode of transcription (as intron or as exon) can change among paralogous ncRNAs.
|
19
|
Andersen JE, Huang FW, Penner RC, Reidys CM. Topology of RNA-RNA Interaction Structures. J Comput Biol 2012; 19:928-43. [DOI: 10.1089/cmb.2011.0308] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022] Open
Affiliation(s)
- Jørgen E. Andersen
- Center for Quantum Geometry of Moduli Spaces, Aarhus University, Århus, Denmark
| | - Fenix W.D. Huang
- Institut for Matematik og Datalogi, University of Southern Denmark, Odense, Denmark
| | - Robert C. Penner
- Center for Quantum Geometry of Moduli Spaces, Aarhus University, Århus, Denmark
- Math and Physics Departments, California Institute of Technology, Pasadena, California
| | - Christian M. Reidys
- Institut for Matematik og Datalogi, University of Southern Denmark, Odense, Denmark
|
20
|
Bernhart SH, Mückstein U, Hofacker IL. RNA Accessibility in cubic time. Algorithms Mol Biol 2011; 6:3. [PMID: 21388531 PMCID: PMC3063221 DOI: 10.1186/1748-7188-6-3] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2010] [Accepted: 03/09/2011] [Indexed: 01/25/2023] Open
Abstract
BACKGROUND: The accessibility of RNA binding motifs controls the efficacy of many biological processes. Examples are the binding of miRNA, siRNA or bacterial sRNA to their respective targets. Similarly, the accessibility of the Shine-Dalgarno sequence is essential for translation to start in prokaryotes. Furthermore, many classes of RNA binding proteins require the binding site to be single-stranded. RESULTS: We introduce a way to compute the accessibility of all intervals within an RNA sequence in O(n^3) time. This improves on previous implementations where only intervals of one defined length were computed in the same time. While the algorithm is in the same efficiency class as sampling approaches, the results, especially if the probabilities get small, are much more exact. CONCLUSIONS: Our algorithm significantly speeds up methods for the prediction of RNA-RNA interactions and other applications that require the accessibility of RNA molecules. The algorithm is already available in the program RNAplfold of the ViennaRNA package.
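The abstract above contrasts the exact computation with sampling approaches. As a rough illustration of the sampling side only, the sketch below estimates the accessibility of every interval as the fraction of sampled structures in which that interval is fully unpaired. It is not the cubic-time algorithm of the paper and does not use RNAplfold; the function name `interval_accessibility` and the three toy dot-bracket structures are assumptions made for illustration, and a real analysis would draw the structures from the Boltzmann ensemble.

```python
from typing import Dict, List, Tuple


def interval_accessibility(structures: List[str]) -> Dict[Tuple[int, int], float]:
    """Estimate P(interval [i, j] is fully unpaired) from sampled structures.

    Each structure is a dot-bracket string in which '.' marks an unpaired base.
    Intervals are 0-based and inclusive.
    """
    if not structures:
        raise ValueError("need at least one sampled structure")
    n = len(structures[0])
    if any(len(s) != n for s in structures):
        raise ValueError("all structures must have the same length")

    counts: Dict[Tuple[int, int], int] = {
        (i, j): 0 for i in range(n) for j in range(i, n)
    }
    for s in structures:
        i = 0
        while i < n:
            if s[i] != ".":
                i += 1
                continue
            # Find the maximal run of unpaired bases [i, end).
            end = i
            while end < n and s[end] == ".":
                end += 1
            # Every sub-interval of the run is fully unpaired in this sample.
            for a in range(i, end):
                for b in range(a, end):
                    counts[(a, b)] += 1
            i = end
    total = len(structures)
    return {interval: c / total for interval, c in counts.items()}


# Hypothetical sample of three structures for a 10-nt RNA (made up).
sample = ["((....))..", "(((..)))..", ".........."]
acc = interval_accessibility(sample)
print(acc[(8, 9)])  # the two trailing bases are unpaired in every sample
print(acc[(2, 5)])  # unpaired in the first and third samples only
```

With few samples, intervals that are rarely open get an estimate of zero or a very coarse value, which is the weakness of sampling that the abstract points to for small probabilities.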
|
21
|
Abstract
Noncoding RNAs form an indispensable component of the cellular information processing networks, a role that crucially depends on the specificity of their interactions among each other as well as with DNA and protein. Patterns of intramolecular and intermolecular base pairs govern most RNA interactions. Specific base pairs dominate the structure formation of nucleic acids. Only minor details distinguish intramolecular secondary structures from those of cofolding molecules. RNA-protein interactions, on the other hand, are strongly dependent on the RNA structure as well since the sequence content of helical regions is largely unreadable, so that sequence specificity is mostly restricted to unpaired loop regions. Conservation of both sequence and structure can thus give indications of the functioning of the diversity of ncRNAs.
Affiliation(s)
- Manja Marz
- Department of Computer Science, University of Leipzig, Leipzig, Germany.
|
22
|
Kishore S, Luber S, Zavolan M. Deciphering the role of RNA-binding proteins in the post-transcriptional control of gene expression. Brief Funct Genomics 2010; 9:391-404. [PMID: 21127008 DOI: 10.1093/bfgp/elq028] [Citation(s) in RCA: 128] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022] Open
Abstract
Eukaryotic cells express a large variety of ribonucleic acid-(RNA)-binding proteins (RBPs) with diverse affinity and specificity towards target RNAs that play a crucial role in almost every aspect of RNA metabolism. In addition, specific domains in RBPs impart catalytic activity or mediate protein-protein interactions, making RBPs versatile regulators of gene expression. In this review, we elaborate on recent experimental and computational approaches that have increased our understanding of RNA-protein interactions and their role in cellular function. We review aspects of gene expression that are modulated post-transcriptionally by RBPs, namely the stability of polymerase II-derived mRNA transcripts and their rate of translation into proteins. We further highlight the extensive regulatory networks of RBPs that implement a combinatorial control of gene expression. Taking cues from the recent development in the field, we argue that understanding spatio-temporal RNA-protein association on a transcriptome level will provide invaluable and unexpected insights into the regulatory codes that define growth, differentiation and disease.
|
23
|
Gruber AR, Fallmann J, Kratochvill F, Kovarik P, Hofacker IL. AREsite: a database for the comprehensive investigation of AU-rich elements. Nucleic Acids Res 2010; 39:D66-9. [PMID: 21071424 PMCID: PMC3013810 DOI: 10.1093/nar/gkq990] [Citation(s) in RCA: 134] [Impact Index Per Article: 8.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022] Open
Abstract
AREsite is an online resource for the detailed investigation of AU-rich elements (ARE) in vertebrate mRNA 3'-untranslated regions (UTRs). AREs are one of the most prominent cis-acting regulatory elements found in 3'-UTRs of mRNAs. Various ARE-binding proteins that possess RNA stabilizing or destabilizing functions are recruited by sequence-specific motifs. Recent findings suggest an essential role of the structural mRNA context in which these sequence motifs are embedded. AREsite is the first database that allows users to quantify the structuredness of ARE motif sites in terms of opening energies and accessibility probabilities. Moreover, we also provide a detailed phylogenetic analysis of ARE motifs and incorporate information about experimentally validated targets of the ARE-binding proteins TTP, HuR and Auf1. The database is publicly available at: http://rna.tbi.univie.ac.at/AREsite.
Affiliation(s)
- Andreas R Gruber
- Department of Microbiology and Immunobiology, Institute for Theoretical Chemistry, Max F Perutz Laboratories, University of Vienna, Vienna, Austria.
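The sequence-specific motifs mentioned in the AREsite abstract can be located with a simple scan. The sketch below reports every occurrence of the AUUUA pentamer, the core sequence commonly used to define AREs, in a 3'-UTR. It is a minimal illustration, not the AREsite pipeline (which additionally reports opening energies and accessibility probabilities); the function name `find_are_pentamers` and the example sequence are made up.

```python
import re
from typing import List

# Core pentamer commonly used to define AU-rich elements.
ARE_PENTAMER = "AUUUA"


def find_are_pentamers(utr: str) -> List[int]:
    """Return 0-based start positions of AUUUA, including overlapping hits."""
    rna = utr.upper().replace("T", "U")  # accept DNA-style input as well
    # A lookahead matches at every position without consuming characters,
    # so overlapping pentamers such as AUUUAUUUA are both reported.
    return [m.start() for m in re.finditer(f"(?={ARE_PENTAMER})", rna)]


# Hypothetical 3'-UTR fragment, made up for illustration.
utr = "GCAUUUAUUUAGGCCAUUUAC"
print(find_are_pentamers(utr))
```

Overlapping hits are kept on purpose, since clustered or overlapping pentamers are a typical feature of AU-rich regions.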
|
24
|
Khabar KSA. Post-transcriptional control during chronic inflammation and cancer: a focus on AU-rich elements. Cell Mol Life Sci 2010; 67:2937-55. [PMID: 20495997 PMCID: PMC2921490 DOI: 10.1007/s00018-010-0383-x] [Citation(s) in RCA: 127] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2010] [Revised: 04/01/2010] [Accepted: 04/21/2010] [Indexed: 12/21/2022]
Abstract
A considerable number of genes that code for AU-rich mRNAs including cytokines, growth factors, transcriptional factors, and certain receptors are involved in both chronic inflammation and cancer. Overexpression of these genes is affected by aberrations or by prolonged activation of several signaling pathways. AU-rich elements (ARE) are important cis-acting short sequences in the 3'UTR that mediate recognition of an array of RNA-binding proteins and affect mRNA stability and translation. This review addresses the cellular and molecular mechanisms that are common between inflammation and cancer and that also govern ARE-mediated post-transcriptional control. The first part examines the role of the ARE-genes in inflammation and cancer and sequence characteristics of AU-rich elements. The second part addresses the common signaling pathways in inflammation and cancer that regulate the ARE-mediated pathways and how their deregulations affect ARE-gene regulation and disease outcome.
Affiliation(s)
- Khalid S A Khabar
- Program in BioMolecular Research, King Faisal Specialist Hospital and Research Center, Riyadh, 11211, Saudi Arabia.
|
25
|
Kazan H, Ray D, Chan ET, Hughes TR, Morris Q. RNAcontext: a new method for learning the sequence and structure binding preferences of RNA-binding proteins. PLoS Comput Biol 2010; 6:e1000832. [PMID: 20617199 PMCID: PMC2895634 DOI: 10.1371/journal.pcbi.1000832] [Citation(s) in RCA: 172] [Impact Index Per Article: 11.5] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2009] [Accepted: 05/25/2010] [Indexed: 12/31/2022] Open
Abstract
Metazoan genomes encode hundreds of RNA-binding proteins (RBPs). These proteins regulate post-transcriptional gene expression and have critical roles in numerous cellular processes including mRNA splicing, export, stability and translation. Despite their ubiquity and importance, the binding preferences for most RBPs are not well characterized. In vitro and in vivo studies, using affinity selection-based approaches, have successfully identified RNA sequence associated with specific RBPs; however, it is difficult to infer RBP sequence and structural preferences without specifically designed motif finding methods. In this study, we introduce a new motif-finding method, RNAcontext, designed to elucidate RBP-specific sequence and structural preferences with greater accuracy than existing approaches. We evaluated RNAcontext on recently published in vitro and in vivo RNA affinity selected data and demonstrate that RNAcontext identifies known binding preferences for several control proteins including HuR, PTB, and Vts1p and predicts new RNA structure preferences for SF2/ASF, RBM4, FUSIP1 and SLM2. The predicted preferences for SF2/ASF are consistent with its recently reported in vivo binding sites. RNAcontext is an accurate and efficient motif finding method ideally suited for using large-scale RNA-binding affinity datasets to determine the relative binding preferences of RBPs for a wide range of RNA sequences and structures.
Affiliation(s)
- Hilal Kazan
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
|
26
|
Li X, Quon G, Lipshitz HD, Morris Q. Predicting in vivo binding sites of RNA-binding proteins using mRNA secondary structure. RNA (NEW YORK, N.Y.) 2010; 16:1096-107. [PMID: 20418358 PMCID: PMC2874161 DOI: 10.1261/rna.2017210] [Citation(s) in RCA: 136] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/08/2023]
Abstract
While many RNA-binding proteins (RBPs) bind RNA in a sequence-specific manner, their sequence preferences alone do not distinguish known target RNAs from other potential targets that are coexpressed and contain the same sequence motifs. Recently, the mRNA targets of dozens of RNA-binding proteins have been identified, facilitating a systematic study of the features of target transcripts. Using these data, we demonstrate that calculating the predicted structural accessibility of a putative RBP binding site allows one to significantly improve the accuracy of predicting in vivo binding for the majority of sequence-specific RBPs. In our new in silico approach, accessibility is predicted based solely on the mRNA sequence without consideration of the locations of bound trans-factors; as such, our results suggest a greater than previously anticipated role for intrinsic mRNA secondary structure in determining RBP binding target preference. Target site accessibility aids in predicting target transcripts and the binding sites for RBPs with a range of RNA-binding domains and subcellular functions. Based on this work, we introduce a new motif-finding algorithm that identifies accessible sequence-specific RBP motifs from in vivo binding data.
Affiliation(s)
- Xiao Li
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1E3, Canada
|
27
|
Salari R, Backofen R, Sahinalp SC. Fast prediction of RNA-RNA interaction. Algorithms Mol Biol 2010; 5:5. [PMID: 20047661 PMCID: PMC2828455 DOI: 10.1186/1748-7188-5-5] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2009] [Accepted: 01/04/2010] [Indexed: 11/29/2022] Open
Abstract
Background: Regulatory antisense RNAs are a class of ncRNAs that regulate gene expression by prohibiting the translation of an mRNA by establishing stable interactions with a target sequence. There is great demand for efficient computational methods to predict the specific interaction between an ncRNA and its target mRNA(s). There are a number of algorithms in the literature which can predict a variety of such interactions, unfortunately at a very high computational cost. Although some existing target prediction approaches are much faster, they are specialized for interactions with a single binding site. Methods: In this paper we present a novel algorithm to accurately predict the minimum free energy structure of RNA-RNA interaction under the most general type of interactions studied in the literature. Moreover, we introduce a fast heuristic method to predict the specific (multiple) binding sites of two interacting RNAs. Results: We verify the performance of our algorithms for joint structure and binding site prediction on a set of known interacting RNA pairs. Experimental results show our algorithms are highly accurate and outperform all competitive approaches.
|
28
|
Kierzek E. Binding of short oligonucleotides to RNA: studies of the binding of common RNA structural motifs to isoenergetic microarrays. Biochemistry 2009; 48:11344-56. [PMID: 19835418 DOI: 10.1021/bi901264v] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
Binding of short oligonucleotides to RNA is important for many biological processes. On the basis of RNAi phenomena, antisense, and ribozyme approaches, it is useful in the inhibition of biological functions. To be considered as potential therapeutics, oligonucleotides must bind strongly and selectively to a complementary fragment of target RNA. Microarray technologies also involve the binding of oligonucleotide probes to DNA or RNA. Herein, the hybridization of common structural motifs of RNA, i.e., hairpins, internal loops, bulges, 3'- and 5'-dangling ends, and pseudoknots to isoenergetic microarray probes is presented. The analysis demonstrates that microarray probes bind to bulges, internal loops, and dangling ends as expected. Probes may also bind to terminal helixes, however, possibly due to the rearrangement of base pairs. These results suggest that isoenergetic microarray mapping can provide data to facilitate and improve RNA secondary structure prediction. However, optimal results require combination with chemical and/or enzymatic mapping.
Affiliation(s)
- Elzbieta Kierzek
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, 60-704 Poznan, Noskowskiego 12/14, Poland.
|
29
|
Huang FWD, Qin J, Reidys CM, Stadler PF. Partition function and base pairing probabilities for RNA-RNA interaction prediction. Bioinformatics 2009; 25:2646-54. [PMID: 19671692 DOI: 10.1093/bioinformatics/btp481] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]
Abstract
MOTIVATION: The RNA-RNA interaction problem (RIP) consists in finding the energetically optimal structure of two RNA molecules that bind to each other. The standard model allows secondary structures in both partners as well as additional base pairs between the two RNAs subject to certain restrictions that ensure that RIP is solvable by a polynomial time dynamic programming algorithm. RNA-RNA binding, like RNA folding, is typically not dominated by the ground state structure. Instead, a large ensemble of alternative structures contributes to the interaction thermodynamics. RESULTS: We present here an O(N^6) time and O(N^4) space dynamic programming algorithm for computing the full partition function for RIP which is based on the combinatorial notion of 'tight structures'. Albeit equivalent to recent work by H. Chitsaz and collaborators, our approach in addition provides a full-fledged computation of the base pairing probabilities, which relies on the notion of a decomposition tree for joint structures. In practise, our implementation is efficient enough to investigate, for instance, the interactions of small bacterial RNAs and their target mRNAs. AVAILABILITY: The program rip is implemented in C. The source code is available for download from http://www.combinatorics.cn/cbpc/rip.html and http://www.bioinf.uni-leipzig.de/Software/rip.html.
Affiliation(s)
- Fenix W D Huang
- Center for Combinatorics, LPMC-TJKLC, Nankai University Tianjin 300071, P.R. China
|
30
|
|
31
|
Mückstein U, Tafer H, Bernhart SH, Hernandez-Rosales M, Vogel J, Stadler PF, Hofacker IL. Translational Control by RNA-RNA Interaction: Improved Computation of RNA-RNA Binding Thermodynamics. COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE 2008. [DOI: 10.1007/978-3-540-70600-7_9] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]
|
32
|
Stich M, Briones C, Manrubia SC. On the structural repertoire of pools of short, random RNA sequences. J Theor Biol 2008; 252:750-63. [PMID: 18374951 DOI: 10.1016/j.jtbi.2008.02.018] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2007] [Revised: 01/14/2008] [Accepted: 02/13/2008] [Indexed: 01/21/2023]
Abstract
A detailed knowledge of the mapping between sequence and structure spaces in populations of RNA molecules is essential to better understand their present-day functional properties, to envisage a plausible early evolution of RNA in a prebiotic chemical environment and to improve the design of in vitro evolution experiments, among others. Analysis of natural RNAs, as well as in vitro and computational studies, show that certain RNA structural motifs are much more abundant than others, pointing out a complex relation between sequence and structure. Within this framework, we have investigated computationally the structural properties of a large pool (10^8 molecules) of single-stranded, 35 nt-long, random RNA sequences. The secondary structures obtained are ranked and classified into structure families. The number of structures in main families is analytically calculated and compared with the numerical results. This permits a quantification of the fraction of structure space covered by a large pool of sequences. We further show that the number of structural motifs and their frequency is highly unbalanced with respect to the nucleotide composition: simple structures such as stem-loops and hairpins arise from sequences depleted in G, while more complex structures require an enrichment of G. In general, we observe a strong correlation between subfamilies (characterized by a fixed number of paired nucleotides) and nucleotide composition. Our results are compared to the structural repertoire obtained in a second pool where isolated base pairs are prohibited.
Affiliation(s)
- Michael Stich
- Centro de Astrobiología (CSIC-INTA), Instituto Nacional de Técnica Aeroespacial Ctra. de Ajalvir km. 4 28850 Torrejón de Ardoz, Madrid, Spain
|
33
|
Abstract
AU-rich elements (AREs) in the 3'-untranslated region (UTR) of unstable mRNAs dictate their degradation or mediate translational repression. Cell signaling through p38alpha MAPK is necessary for post-transcriptional regulation of many pro-inflammatory cytokines. Here, the cis-acting elements of interleukin-6 (IL-6) 3'-UTR mRNA that required p38alpha signaling for mRNA stability and translation were identified. Using mouse embryonic fibroblasts (MEFs) derived from p38alpha(+/+) and p38alpha(-/-) mice, we observed that p38alpha is obligatory for IL-1-induced IL-6 biosynthesis. IL-6 mRNA stability is promoted by p38alpha via the 3'-UTR. To understand the mechanism of cis-elements regulated by p38alpha at the post-transcriptional level, truncations of the 3'-UTR and the full-length 3'-UTR with individual AUUUA motif mutations were placed in a gene reporter system. A mutation-based screen performed in p38alpha(+/+) and p38alpha(-/-) mouse embryonic fibroblast cells revealed that ARE1, ARE2, and ARE5 in the IL-6 3'-UTR were targeted by p38alpha, and a truncation-based screen showed that IL-6 3'-UTR-(56-173) was targeted by p38alpha to stabilize mRNA. RNA secondary structure analysis indicated that modulated reporter gene expression was consistent with predicted secondary structure changes.
Affiliation(s)
- Wenpu Zhao
- Department of Periodontics and Oral Medicine, University of Michigan, Ann Arbor, Michigan 48109-1078
| | - Min Liu
- Department of Periodontics and Oral Medicine, University of Michigan, Ann Arbor, Michigan 48109-1078
| | - Keith L. Kirkwood
- Department of Periodontics and Oral Medicine, University of Michigan, Ann Arbor, Michigan 48109-1078
|
34
|
Backofen R, Bernhart SH, Flamm C, Fried C, Fritzsch G, Hackermüller J, Hertel J, Hofacker IL, Missal K, Mosig A, Prohaska SJ, Rose D, Stadler PF, Tanzer A, Washietl S, Will S. RNAs everywhere: genome-wide annotation of structured RNAs. JOURNAL OF EXPERIMENTAL ZOOLOGY PART B-MOLECULAR AND DEVELOPMENTAL EVOLUTION 2007; 308:1-25. [PMID: 17171697 DOI: 10.1002/jez.b.21130] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
Abstract
Starting with the discovery of microRNAs and the advent of genome-wide transcriptomics, non-protein-coding transcripts have moved from a fringe topic to a central field of research in molecular biology. In this contribution we review the state of the art of "computational RNomics", i.e., the bioinformatics approaches to genome-wide RNA annotation. Instead of rehashing results from recently published surveys in detail, we focus here on the open problem in the field, namely (functional) annotation of the plethora of putative RNAs. A series of exploratory studies are used to provide non-trivial examples for the discussion of some of the difficulties.
|
35
|
Hiller M, Pudimat R, Busch A, Backofen R. Using RNA secondary structures to guide sequence motif finding towards single-stranded regions. Nucleic Acids Res 2006; 34:e117. [PMID: 16987907 PMCID: PMC1903381 DOI: 10.1093/nar/gkl544] [Citation(s) in RCA: 120] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2006] [Revised: 07/07/2006] [Accepted: 07/13/2006] [Indexed: 01/05/2023] Open
Abstract
RNA binding proteins recognize RNA targets in a sequence specific manner. Apart from the sequence, the secondary structure context of the binding site also affects the binding affinity. Binding sites are often located in single-stranded RNA regions and it was shown that the sequestration of a binding motif in a double-strand abolishes protein binding. Thus, it is desirable to include knowledge about RNA secondary structures when searching for the binding motif of a protein. We present the approach MEMERIS for searching sequence motifs in a set of RNA sequences and simultaneously integrating information about secondary structures. To abstract from specific structural elements, we precompute position-specific values measuring the single-strandedness of all substrings of an RNA sequence. These values are used as prior knowledge about the motif starts to guide the motif search. Extensive tests with artificial and biological data demonstrate that MEMERIS is able to identify motifs in single-stranded regions even if a stronger motif located in double-strand parts exists. The discovered motif occurrences in biological datasets mostly coincide with known protein-binding sites. This algorithm can be used for finding the binding motif of single-stranded RNA-binding proteins in SELEX or other biological sequence data.
Affiliation(s)
- Michael Hiller
- Institute of Computer Science, Chair for Bioinformatics, Albert-Ludwigs-University Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Rainer Pudimat
- Institute of Computer Science, Chair for Bioinformatics, Albert-Ludwigs-University Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Anke Busch
- Institute of Computer Science, Chair for Bioinformatics, Albert-Ludwigs-University Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Rolf Backofen
- Institute of Computer Science, Chair for Bioinformatics, Albert-Ludwigs-University Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| |
Collapse
|
36
|
Schoemaker RJW, Gultyaev AP. Computer simulation of chaperone effects of Archaeal C/D box sRNA binding on rRNA folding. Nucleic Acids Res 2006; 34:2015-26. [PMID: 16614451 PMCID: PMC1435978 DOI: 10.1093/nar/gkl154] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 12/04/2005] [Revised: 03/12/2006] [Accepted: 03/20/2006] [Indexed: 12/04/2022]
Abstract
Archaeal C/D box small RNAs (sRNAs) are homologues of eukaryotic C/D box small nucleolar RNAs (snoRNAs). Their main function is guiding 2'-O-ribose methylation of nucleotides in rRNAs. The methylation requires the pairing of an sRNA antisense element to an rRNA target site with formation of an RNA-RNA duplex. The temporary formation of such a duplex during rRNA maturation is expected to influence rRNA folding in a chaperone-like way, in particular in thermophilic Archaea, where multiple sRNAs with two binding sites are found. Here we investigate possible mechanisms of chaperone function of Archaeoglobus fulgidus and Pyrococcus abyssi C/D box sRNAs using computer simulations of rRNA secondary structure formation by genetic algorithm. The effects of sRNA binding on rRNA structure are introduced as temporary structural constraints during co-transcriptional folding. Comparisons of the final predictions with simulations without sRNA binding and with phylogenetic structures show that sRNAs with two antisense elements may significantly facilitate the correct formation of long-range interactions in rRNAs, in particular at elevated temperatures. The simulations suggest that the main mechanism of this effect is a transient restriction of folding in rRNA domains where the termini are brought together by binding to double-guide sRNAs.
MESH Headings
- Archaeoglobus fulgidus/genetics
- Base Sequence
- Binding Sites
- Computer Simulation
- Molecular Chaperones/chemistry
- Molecular Chaperones/metabolism
- Molecular Sequence Data
- Nucleic Acid Conformation
- Pyrococcus abyssi/genetics
- RNA, Antisense/chemistry
- RNA, Archaeal/chemistry
- RNA, Archaeal/metabolism
- RNA, Ribosomal, 16S/chemistry
- RNA, Ribosomal, 16S/metabolism
- RNA, Small Nucleolar/chemistry
- RNA, Small Nucleolar/metabolism
- Temperature
- RNA, Small Untranslated
Affiliation(s)
- Ruud J. W. Schoemaker
- Section Theoretical Biology, Leiden Institute of Biology, Leiden University, Kaiserstraat 63, 2311 GP Leiden, The Netherlands
- Alexander P. Gultyaev
- Section Theoretical Biology, Leiden Institute of Biology, Leiden University, Kaiserstraat 63, 2311 GP Leiden, The Netherlands
|
37
|
Bernhart SH, Tafer H, Mückstein U, Flamm C, Stadler PF, Hofacker IL. Partition function and base pairing probabilities of RNA heterodimers. Algorithms Mol Biol 2006; 1:3. [PMID: 16722605 PMCID: PMC1459172 DOI: 10.1186/1748-7188-1-3] [Citation(s) in RCA: 200] [Impact Index Per Article: 10.5] [Received: 02/16/2006] [Accepted: 03/16/2006] [Indexed: 12/22/2022]
Abstract
BACKGROUND: RNA has been recognized as a key player in cellular regulation in recent years. In many cases, non-coding RNAs exert their function by binding to other nucleic acids, as in the case of microRNAs and snoRNAs. The specificity of these interactions derives from the stability of inter-molecular base pairing. The accurate computational treatment of RNA-RNA binding therefore lies at the heart of target prediction algorithms. METHODS: The standard dynamic programming algorithms for computing secondary structures of linear single-stranded RNA molecules are extended to the co-folding of two interacting RNAs. RESULTS: We present a program, RNAcofold, that computes the hybridization energy and base pairing pattern of a pair of interacting RNA molecules. In contrast to earlier approaches, complex internal structures in both RNAs are fully taken into account. RNAcofold supports the calculation of the minimum energy structure and of a complete set of suboptimal structures in an energy band above the ground state. Furthermore, it provides an extension of McCaskill's partition function algorithm to compute base pairing probabilities, realistic interaction energies, and equilibrium concentrations of duplex structures.
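How an interaction free energy translates into an equilibrium duplex concentration can be sketched for the simplest case. The example below assumes a single heterodimer equilibrium A + B ⇌ AB with a 1 M standard state and ignores the homodimers AA and BB, which a full treatment would also have to consider. The function name `heterodimer_concentration` and its parameters are hypothetical; this is not the RNAcofold code or its interface.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal/(mol*K)


def heterodimer_concentration(
    delta_g: float, total_a: float, total_b: float, temperature_k: float = 310.15
) -> float:
    """Equilibrium concentration [AB] in mol/L for A + B <-> AB.

    delta_g is the binding free energy in kcal/mol; total_a and total_b are
    the total strand concentrations in mol/L. Homodimers are ignored
    (simplifying assumption).
    """
    if total_a < 0 or total_b < 0:
        raise ValueError("concentrations must be non-negative")
    # Association constant from the free energy of binding.
    k = math.exp(-delta_g / (GAS_CONSTANT * temperature_k))
    # Mass action with conservation: k * (a0 - x) * (b0 - x) = x, i.e.
    # k*x^2 - (k*(a0 + b0) + 1)*x + k*a0*b0 = 0. The smaller root is the
    # physical one; it is written in a form that avoids cancellation.
    b_coeff = k * (total_a + total_b) + 1.0
    disc = (k * (total_a - total_b)) ** 2 + 2.0 * k * (total_a + total_b) + 1.0
    return 2.0 * k * total_a * total_b / (b_coeff + math.sqrt(disc))


# Strong binding (-20 kcal/mol) at 1 micromolar of each strand.
strong = heterodimer_concentration(-20.0, 1e-6, 1e-6)
# No net stabilization (0 kcal/mol) at the same concentrations.
weak = heterodimer_concentration(0.0, 1e-6, 1e-6)
```

With strong binding nearly all strands end up in the duplex, while at 0 kcal/mol the duplex concentration is negligible at micromolar strand concentrations.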
Affiliation(s)
- Stephan H Bernhart
- Theoretical Biochemistry Group, Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, Vienna, Austria
- Hakim Tafer
- Theoretical Biochemistry Group, Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, Vienna, Austria
- Ulrike Mückstein
- Theoretical Biochemistry Group, Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, Vienna, Austria
- Christoph Flamm
- Theoretical Biochemistry Group, Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, Vienna, Austria
- Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16–18, D-04170 Leipzig, Germany
- Peter F Stadler
- Theoretical Biochemistry Group, Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, Vienna, Austria
- Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16–18, D-04170 Leipzig, Germany
- The Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, New Mexico
- Ivo L Hofacker
- Theoretical Biochemistry Group, Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, Vienna, Austria
|
38
|
Mückstein U, Tafer H, Hackermüller J, Bernhart SH, Stadler PF, Hofacker IL. Thermodynamics of RNA-RNA binding. Bioinformatics 2006; 22:1177-82. [PMID: 16446276 DOI: 10.1093/bioinformatics/btl024] [Citation(s) in RCA: 245] [Impact Index Per Article: 12.9] [Indexed: 12/23/2022]
Abstract
BACKGROUND: Reliable prediction of RNA-RNA binding energies is crucial, e.g. for the understanding of RNAi, microRNA-mRNA binding and antisense interactions. The thermodynamics of such RNA-RNA interactions can be understood as the sum of two energy contributions: (1) the energy necessary to 'open' the binding site and (2) the energy gained from hybridization. METHODS: We present an extension of the standard partition function approach to RNA secondary structures that computes the probabilities Pu[i, j] that a sequence interval [i, j] is unpaired. RESULTS: Comparison with experimental data shows that Pu[i, j] can be applied as a significant determinant of local target site accessibility for RNA interference (RNAi). Furthermore, these quantities can be used to rigorously determine binding free energies of short oligomers to large mRNA targets. The resource consumption is comparable with a single partition function computation for the large target molecule. We can show that RNAi efficiency correlates well with the binding energies of siRNAs to their respective mRNA target. AVAILABILITY: RNAup will be distributed as part of the Vienna RNA Package, www.tbi.univie.ac.at/~ivo/RNA/
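The two-term energy decomposition described in this abstract can be made concrete with a small sketch. It assumes the standard statistical-mechanics relation between an unpaired probability and a free energy, ΔG_open = -RT ln Pu[i, j], and takes the hybridization energy as a given input. The function names and the numeric values are illustrative assumptions, not part of RNAup.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal/(mol*K)


def opening_energy(p_unpaired: float, temperature_k: float = 310.15) -> float:
    """Free energy (kcal/mol) needed to make an interval unpaired.

    Assumes delta_G_open = -R * T * ln(Pu), where Pu is the probability
    that the interval is unpaired in the target's structure ensemble.
    """
    if not 0.0 < p_unpaired <= 1.0:
        raise ValueError("p_unpaired must be in (0, 1]")
    return -GAS_CONSTANT * temperature_k * math.log(p_unpaired)


def total_binding_energy(
    p_unpaired: float, hybridization_energy: float, temperature_k: float = 310.15
) -> float:
    """Sum of the opening cost and the (negative) hybridization energy."""
    return opening_energy(p_unpaired, temperature_k) + hybridization_energy


# Hypothetical site: 1% chance of being unpaired, -10 kcal/mol hybridization.
cost = opening_energy(0.01)
total = total_binding_energy(0.01, -10.0)
```

A poorly accessible site pays a positive opening cost, so its total binding energy is less favorable than the hybridization energy alone; a fully accessible site (Pu = 1) pays no opening cost.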
Affiliation(s)
- Ulrike Mückstein
- Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, A-1090 Vienna, Austria
|
39
|
Steigele S, Nieselt K. Open reading frames provide a rich pool of potential natural antisense transcripts in fungal genomes. Nucleic Acids Res 2005; 33:5034-44. [PMID: 16147987 PMCID: PMC1201330 DOI: 10.1093/nar/gki804] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Received: 07/29/2005] [Revised: 08/15/2005] [Accepted: 08/15/2005] [Indexed: 12/28/2022]
Abstract
Natural antisense transcripts are reported from all kingdoms of life and several recent reports of genomewide screens indicate that they are widely distributed. These transcripts seem to be involved in various biological functions and may govern the expression of their respective sense partner. Very little, however, is known about the degree of evolutionary conservation of antisense transcripts. Furthermore, none of the earlier analyses has studied whether antisense relationships are solely dual or involved in more complex relationships. Here we present a systematic screen for cis- and trans-located antisense transcripts based on open reading frames (ORFs) from five fungal species. The relative number of ORFs involved in antisense relationships varies greatly between the five species. In addition, other significant differences are found between the species, such as the mean length of the antisense region. The majority of trans-located antisense transcripts is found to be involved in complex relationships, resulting in highly connected networks. The analysis of the degree of evolutionary conservation of antisense transcripts shows that most antisense transcripts have no ortholog in any other species. An annotation of antisense transcripts based on Gene Ontology directs to common terms and shows that proteins of genes involved in antisense relationships preferentially localize to the nucleus with common functions in the regulation or maintenance of nucleic acids.
MESH Headings
- Evolution, Molecular
- Genome, Fungal
- Genomics
- Models, Genetic
- Open Reading Frames
- RNA, Antisense/chemistry
- RNA, Antisense/classification
- RNA, Antisense/genetics
- RNA, Fungal/chemistry
- RNA, Fungal/classification
- RNA, Fungal/genetics
- Transcription, Genetic
Affiliation(s)
- Stephan Steigele
- Wilhelm-Schickard-Institut f. Informatik, ZBIT–Center for Bioinformatics, Tübingen, University of Tübingen, Germany
- Kay Nieselt
- Wilhelm-Schickard-Institut f. Informatik, ZBIT–Center for Bioinformatics, Tübingen, University of Tübingen, Germany
|