1
Quadrini M, Tesei L, Merelli E. Automatic generation of pseudoknotted RNAs taxonomy. BMC Bioinformatics 2023; 23:575. [PMID: 37322429 DOI: 10.1186/s12859-023-05362-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Accepted: 05/25/2023] [Indexed: 06/17/2023]
Abstract
BACKGROUND: The ability to compare RNA secondary structures is important for understanding their biological function and for grouping similar organisms into families by looking at evolutionarily conserved sequences such as 16S rRNA. Most comparison methods and benchmarks in the literature focus on pseudoknot-free structures because pseudoknots are difficult to map onto classical tree representations. Some approaches can cluster pseudoknotted RNAs, but there is no general framework for evaluating their performance. RESULTS: We introduce an evaluation framework based on a similarity/dissimilarity measure obtained by a comparison method and agglomerative clustering. Their combination automatically partitions a set of molecules into groups. To illustrate the framework, we define and make available a benchmark of pseudoknotted (16S and 23S) and pseudoknot-free (5S) rRNA secondary structures belonging to Archaea, Bacteria and Eukaryota. We also consider five different comparison methods from the literature that are able to manage pseudoknots. For each method we cluster the molecules in the benchmark to obtain the taxa at the rank of phylum according to the European Nucleotide Archive curated taxonomy. We compute appropriate metrics for each method and compare their suitability for reconstructing the taxa.
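The combination of a pairwise dissimilarity measure with agglomerative clustering can be illustrated with a minimal sketch. The abstract does not state which linkage the authors use, so average linkage is an assumption here, and the function name `agglomerative` and the toy matrix are hypothetical.

```python
from itertools import combinations


def agglomerative(dist, n_clusters):
    """Average-linkage agglomerative clustering on a symmetric dissimilarity matrix.

    The linkage choice is an assumption made for illustration only.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Average pairwise dissimilarity between the two candidate clusters.
            total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
            d = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        # Merge the closest pair; a < b, so deleting b leaves index a valid.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy dissimilarities between four hypothetical molecules.
dist = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.85, 0.9],
    [0.9, 0.85, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
groups = agglomerative(dist, 2)
```

Stopping at a fixed number of clusters mirrors the idea of cutting the hierarchy at the number of expected taxa; other stopping rules are equally possible.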
Affiliation(s)
- Michela Quadrini
  - School of Sciences and Technology, University of Camerino, Via Madonna delle Carceri 7, 62032, Camerino, MC, Italy
- Luca Tesei
  - School of Sciences and Technology, University of Camerino, Via Madonna delle Carceri 7, 62032, Camerino, MC, Italy
- Emanuela Merelli
  - School of Sciences and Technology, University of Camerino, Via Madonna delle Carceri 7, 62032, Camerino, MC, Italy

2
Lasher B, Hendrix DA. bpRNA-align: improved RNA secondary structure global alignment for comparing and clustering RNA structures. RNA (New York, N.Y.) 2023; 29:584-595. [PMID: 36759128 PMCID: PMC10159002 DOI: 10.1261/rna.079211.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2022] [Accepted: 01/14/2023] [Indexed: 05/06/2023]
Abstract
Ribonucleic acid (RNA) is a polymeric molecule that is fundamental to biological processes, with structure being more highly conserved than primary sequence and often key to its function. Advances in RNA structure characterization have resulted in an increase in the number of accurate secondary structures. The task of uncovering common RNA structural motifs with a collective function through structural comparison, providing a level of similarity, remains challenging and could be used to improve RNA secondary structure databases and discover new RNA families. In this work, we present a novel secondary structure alignment method, bpRNA-align. bpRNA-align is a customized global structural alignment method, utilizing an inverted (gap extend costs more than gap open) and context-specific affine gap penalty along with a structural, feature-specific substitution matrix to provide similarity scores. We evaluate our similarity scores in comparison to other methods, using affinity propagation clustering, applied to a benchmarking data set of known structure types. bpRNA-align shows improvement in clustering performance over a broad range of structure types.
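As an illustration of global alignment with an affine gap penalty in which extending a gap costs more than opening it, here is a minimal score-only sketch over dot-bracket strings. It is not the bpRNA-align implementation: the match/mismatch scores stand in for the feature-specific substitution matrix, the penalties are not context-specific, and all parameter values and the function name are assumptions.

```python
NEG = float("-inf")


def affine_global_score(a, b, match=1.0, mismatch=-1.0,
                        gap_open=-1.0, gap_extend=-2.0):
    """Global alignment score with affine gaps (three-state dynamic program).

    A gap of length k scores gap_open + (k - 1) * gap_extend. The defaults
    make extension more costly than opening (the 'inverted' setting); the
    values themselves are illustrative assumptions.
    """
    n, m = len(a), len(b)
    # M: a[i-1] aligned to b[j-1]; X: a[i-1] against a gap; Y: b[j-1] against a gap.
    M = [[NEG] * (m + 1) for _ in range(n + 1)]
    X = [[NEG] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + (j - 1) * gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            X[i][j] = max(M[i - 1][j] + gap_open,
                          Y[i - 1][j] + gap_open,
                          X[i - 1][j] + gap_extend)
            Y[i][j] = max(M[i][j - 1] + gap_open,
                          X[i][j - 1] + gap_open,
                          Y[i][j - 1] + gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])


identical = affine_global_score("((..))", "((..))")
one_gap = affine_global_score("(.)", "()")
```

With these settings the second call aligns both brackets and pays a single gap opening for the unpaired position.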
Affiliation(s)
- Brittany Lasher
  - Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331, USA
- David A Hendrix
  - Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331, USA
  - School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, Oregon 97331, USA

3
Qiu X. Sequence similarity governs generalizability of de novo deep learning models for RNA secondary structure prediction. PLoS Comput Biol 2023; 19:e1011047. [PMID: 37068100 PMCID: PMC10138783 DOI: 10.1371/journal.pcbi.1011047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2023] [Revised: 04/27/2023] [Accepted: 03/25/2023] [Indexed: 04/18/2023]
Abstract
Making no use of physical laws or co-evolutionary information, de novo deep learning (DL) models for RNA secondary structure prediction have achieved far better performance than traditional algorithms. However, their statistical underpinning raises the crucial question of generalizability. We present a quantitative study of the performance and generalizability of a series of de novo DL models, with a minimal two-module architecture and no post-processing, under varied similarities between seen and unseen sequences. Our models demonstrate excellent expressive capacities and outperform existing methods on common benchmark datasets. However, model generalizability, i.e., the performance gap between the seen and unseen sets, degrades rapidly as the sequence similarity decreases. The same trends are observed for several recent DL and machine learning models. An inverse correlation between performance and generalizability is revealed collectively across all learning-based models with wide-ranging architectures and sizes. We further quantitate how generalizability depends on sequence and structure identity scores via pairwise alignment, providing unique quantitative insights into the limitations of statistical learning. Generalizability thus poses a major hurdle for deploying de novo DL models in practice, and various pathways for future advances are discussed.
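The notion of generalizability as a performance gap between seen and unseen sets can be made concrete with a small sketch. It assumes performance is measured as the F1 score over predicted base pairs, which is a common choice but is not stated in the abstract; the function names and numbers are hypothetical.

```python
def f1_score(predicted, reference):
    """F1 score between two collections of base pairs given as (i, j) tuples."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


def generalization_gap(seen_scores, unseen_scores):
    """Mean score on seen sequences minus mean score on unseen sequences."""
    mean_seen = sum(seen_scores) / len(seen_scores)
    mean_unseen = sum(unseen_scores) / len(unseen_scores)
    return mean_seen - mean_unseen


# Toy example: two of three reference pairs are recovered, none are spurious.
reference = [(0, 5), (1, 4), (2, 3)]
predicted = [(0, 5), (1, 4)]
score = f1_score(predicted, reference)

# Hypothetical per-sequence scores on seen and unseen sets.
gap = generalization_gap([0.9, 0.8], [0.5, 0.6])
```

A larger gap means the model performs much worse on sequences unlike those it was trained on.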
Affiliation(s)
- Xiangyun Qiu
  - Department of Physics, George Washington University, Washington DC, United States of America

4
Metrics for RNA Secondary Structure Comparison. Methods Mol Biol 2023; 2586:79-88. [PMID: 36705899 DOI: 10.1007/978-1-0716-2768-6_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/28/2023]
Abstract
RNA secondary structure comparison is one of the important analyses for elucidating individual functions of RNAs, since it is widely accepted that their functions and structures are strongly correlated. However, although RNA secondary structures with pseudoknots play important roles in vivo, it is difficult to deal with such structures in silico due to their structural complexity, which is a major obstacle to the analysis of RNA functions. Here, we introduce an algorithm and a metric for comparing pseudoknotted RNA secondary structures based on topological centroid identification and tree edit distance, and describe the usage protocol of software that runs the comparison. This software is publicly available and works on both Microsoft Windows and Apple macOS.
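To show why pseudoknots complicate tree-based representations, the following sketch parses a structure written in extended dot-bracket notation (extra bracket types mark additional helices) and checks for crossing base pairs. It is only an illustration of the structural property and is unrelated to the software described above; the function names are hypothetical.

```python
# Closing bracket -> matching opening bracket, for extended dot-bracket notation.
PAIRS = {")": "(", "]": "[", "}": "{", ">": "<"}


def base_pairs(structure):
    """Return the sorted list of (i, j) base pairs encoded in a dot-bracket string."""
    stacks = {opener: [] for opener in PAIRS.values()}
    pairs = []
    for index, char in enumerate(structure):
        if char in stacks:
            stacks[char].append(index)
        elif char in PAIRS:
            # Pair the closing bracket with the most recent unmatched opener of its type.
            pairs.append((stacks[PAIRS[char]].pop(), index))
    return sorted(pairs)


def has_pseudoknot(pairs):
    """True if any two base pairs cross, i.e. i < k < j < l."""
    return any(i < k < j < l for i, j in pairs for k, l in pairs)


nested = base_pairs("((..))")
knotted = base_pairs("(([[))]]")
```

Nested pairs can be drawn as a tree, whereas a single crossing pair breaks that nesting, which is what the comparison methods in this list have to work around.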

5
Marchand B, Ponty Y, Bulteau L. Tree diet: reducing the treewidth to unlock FPT algorithms in RNA bioinformatics. Algorithms Mol Biol 2022; 17:8. [PMID: 35366923 PMCID: PMC8976393 DOI: 10.1186/s13015-022-00213-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2021] [Accepted: 03/01/2022] [Indexed: 11/25/2022]
Abstract
Hard graph problems are ubiquitous in Bioinformatics, inspiring the design of specialized Fixed-Parameter Tractable algorithms, many of which rely on a combination of tree-decomposition and dynamic programming. The time/space complexities of such approaches hinge critically on low values for the treewidth $tw$ of the input graph. In order to extend their scope of applicability, we introduce the Tree-Diet problem, i.e. the removal of a minimal set of edges such that a given tree-decomposition can be slimmed down to a prescribed treewidth $tw'$. Our rationale is that the time gained thanks to a smaller treewidth in a parameterized algorithm compensates the extra post-processing needed to take deleted edges into account. Our core result is an FPT dynamic programming algorithm for Tree-Diet, using $2^{O(tw)}n$ time and space. We complement this result with parameterized complexity lower-bounds for stronger variants (e.g., NP-hardness when $tw'$ or $tw-tw'$ is constant). We propose a prototype implementation for our approach which we apply on difficult instances of selected RNA-based problems: RNA design, sequence-structure alignment, and search of pseudoknotted RNAs in genomes, revealing very encouraging results. This work paves the way for a wider adoption of tree-decomposition-based algorithms in Bioinformatics.

6
Antunes D, Santos LHS, Caffarena ER, Guimarães ACR. Bacterial 2'-Deoxyguanosine Riboswitch Classes as Potential Targets for Antibiotics: A Structure and Dynamics Study. Int J Mol Sci 2022; 23:ijms23041925. [PMID: 35216040 PMCID: PMC8872408 DOI: 10.3390/ijms23041925] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/10/2021] [Revised: 12/20/2021] [Accepted: 12/21/2021] [Indexed: 01/18/2023]
Abstract
The spread of antibiotic-resistant bacteria represents a substantial health threat. Current antibiotics act on a few metabolic pathways, facilitating resistance. Consequently, novel regulatory inhibition mechanisms are necessary. Riboswitches represent promising targets for antibacterial drugs. Purine riboswitches are interesting, since they play essential roles in the genetic regulation of bacterial metabolism. Among these, class I (2′-dG-I) and class II (2′-dG-II) are two different 2′-deoxyguanosine (2′-dG) riboswitches involved in the control of deoxyguanosine metabolism. However, high affinity for nucleosides involves local or distal modifications around the ligand-binding pocket, depending on the class. Therefore, it is crucial to understand these riboswitches’ recognition mechanisms as antibiotic targets. In this work, we used a combination of computational biophysics approaches to investigate the structure, dynamics, and energy landscape of both 2′-dG classes bound to the nucleoside ligands, 2′-deoxyguanosine, and riboguanosine. Our results suggest that the stability and increased interactions in the three-way junction of 2′-dG riboswitches were associated with a higher nucleoside ligand affinity. Also, structural changes in the 2′-dG-II aptamers enable enhanced intramolecular communication. Overall, the 2′-dG-II riboswitch might be a promising drug design target due to its ability to recognize both cognate and noncognate ligands.
Affiliation(s)
- Deborah Antunes
  - Laboratório de Genômica Funcional e Bioinformática, Instituto Oswaldo Cruz, Fundação Oswaldo Cruz, Rio de Janeiro 21040-900, Brazil
  - Correspondence:
- Lucianna H. S. Santos
  - Laboratório de Modelagem Molecular e Planejamento de Fármacos, Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte 31270-901, Brazil
- Ernesto Raul Caffarena
  - Grupo de Biofísica Computacional e Modelagem Molecular, Programa de Computação Científica, Fiocruz, Rio de Janeiro 21040-360, Brazil
- Ana Carolina Ramos Guimarães
  - Laboratório de Genômica Funcional e Bioinformática, Instituto Oswaldo Cruz, Fundação Oswaldo Cruz, Rio de Janeiro 21040-900, Brazil

7
Schmidt M, Hamacher K, Reinhardt F, Lotz TS, Groher F, Suess B, Jager S. SICOR: Subgraph Isomorphism Comparison of RNA Secondary Structures. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:2189-2195. [PMID: 31295116 DOI: 10.1109/tcbb.2019.2926711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
RNA aptamer selection during SELEX experiments builds on secondary structural diversity. Advanced structural comparison methods can focus this diversity. We develop SICOR, which uses probabilistic subgraph isomorphisms for graph distances between RNA secondary structure graphs. SICOR outperforms other comparison methods and is applicable to many structural comparisons in experimental design.

8
Wang F, Akutsu T, Mori T. Comparison of Pseudoknotted RNA Secondary Structures by Topological Centroid Identification and Tree Edit Distance. J Comput Biol 2020; 27:1443-1451. [PMID: 32058802 DOI: 10.1089/cmb.2019.0512] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022]
Abstract
Comparison of RNA structures is one of the most crucial analyses for elucidating their individual functions and promoting medical applications. Because it is widely accepted that their functions and structures are strongly correlated, various methods for RNA secondary structure analysis have been proposed, owing to the difficulty of predicting RNA three-dimensional structure directly from sequence. However, few methods deal with RNA secondary structures containing a specific and complex partial structure called a pseudoknot, despite its significance to biological processes, which is a major obstacle to analyzing their functions. In this study, we propose a novel tree representation of pseudoknotted RNA secondary structures by topological centroid identification, and comparison methods for them based on the tree edit distance. In the proposed method, a given graph representing an RNA secondary structure is transformed into a tree rooted at one of the vertices constituting the topological centroid, which is identified by removing cycles with a peeling process on the graph. When tree-represented RNA secondary structures collected from a public database were compared using the tree edit distance and functional gene groups defined by Gene Ontology (GO), the proposed method showed better clustering results according to their GO terms than canonical RNA sequence-based comparison. In addition, we report a case in which the combination of the tree edit distance and the sequence edit distance gives a better classification of the pseudoknotted RNA secondary structures.
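The idea of locating a central vertex by peeling can be sketched on an acyclic graph, where repeatedly removing leaves leaves one or two central vertices. This is a simplification under stated assumptions: the cycle-removal step of the published method is not reproduced, and the function name and adjacency-list input format are hypothetical.

```python
def peel_to_center(adj):
    """Repeatedly remove leaves from a tree and return the surviving vertices.

    `adj` maps each vertex to a list of its neighbours. On a tree this yields
    the one or two central vertices. Graphs with cycles are not handled the
    way the paper describes; peeling simply stops when no leaves remain.
    """
    degree = {v: len(neighbours) for v, neighbours in adj.items()}
    remaining = set(adj)
    leaves = [v for v in adj if degree[v] <= 1]
    while len(remaining) > 2 and leaves:
        next_leaves = []
        for leaf in leaves:
            remaining.discard(leaf)
            for neighbour in adj[leaf]:
                if neighbour in remaining:
                    degree[neighbour] -= 1
                    if degree[neighbour] == 1:
                        next_leaves.append(neighbour)
        leaves = next_leaves
    return sorted(remaining)


# A path 0-1-2-3-4 has a single central vertex; a path 0-1-2-3 has two.
path5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
center5 = peel_to_center(path5)
center4 = peel_to_center(path4)
```

Rooting a tree at such a central vertex keeps its depth small, which is one plausible motivation for centroid-based rooting before computing a tree edit distance.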
Affiliation(s)
- Feiqi Wang
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto, Japan
- Tatsuya Akutsu
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto, Japan
- Tomoya Mori
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto, Japan

9
Unraveling RNA dynamical behavior of TPP riboswitches: a comparison between Escherichia coli and Arabidopsis thaliana. Sci Rep 2019; 9:4197. [PMID: 30862893 PMCID: PMC6414600 DOI: 10.1038/s41598-019-40875-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 09/21/2018] [Accepted: 02/19/2019] [Indexed: 01/03/2023]
Abstract
Riboswitches are RNA sensors that affect post-transcriptional processes through their ability to bind to small molecules. The thiamine pyrophosphate (TPP) riboswitch class is the most widespread riboswitch class, occurring in all three domains of life. Even though it controls different genes involved in the synthesis or transport of thiamine and its phosphorylated derivatives in bacteria, archaea, fungi, and plants, the TPP aptamer has a conserved structure. In this study, we aimed at understanding differences in the structural dynamics of TPP riboswitches from Escherichia coli and Arabidopsis thaliana, based on their crystallographic structures (TPPswec and TPPswat, respectively) and dynamics in aqueous solution, both in apo and holo states. A combination of Molecular Dynamics Simulations and Network Analysis revealed slight differences in the dynamical behavior of TPP riboswitches that are nevertheless relevant for their dynamics in bacterial and plant species. Our results suggest that distinct interactions in the microenvironment surrounding nucleotide U36 of TPPswec (and U35 in TPPswat) are related to different responses to TPP. The network analysis showed that minor structural differences in the aptamer enable enhanced intramolecular communication in the presence of TPP in TPPswec, but not in TPPswat. TPP riboswitches of plants present subtler and slower regulation mechanisms than those of bacteria.

10
Du S, Niu G, Nyman T, Wei M. Characterization of the mitochondrial genome of Arge bella Wei & Du sp. nov. (Hymenoptera: Argidae). PeerJ 2018; 6:e6131. [PMID: 30595984 PMCID: PMC6305119 DOI: 10.7717/peerj.6131] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 10/02/2018] [Accepted: 11/17/2018] [Indexed: 01/27/2023]
Abstract
We describe Arge bella Wei & Du sp. nov., a large and beautiful species of Argidae from south China, and report its mitochondrial genome based on high-throughput sequencing data. We present the gene order, nucleotide composition of protein-coding genes (PCGs), and the secondary structures of RNA genes. The nearly complete mitochondrial genome of A. bella has a length of 15,576 bp and a typical set of 37 genes (22 tRNAs, 13 PCGs, and 2 rRNAs). Three tRNAs are rearranged in the A. bella mitochondrial genome as compared to the ancestral type in insects: trnM and trnQ are shuffled, while trnW is translocated from the trnW-trnC-trnY cluster to a location downstream of trnI. All PCGs are initiated by ATN codons, and terminated with TAA, TA or T as stop codons. All tRNAs have a typical cloverleaf secondary structure, except for trnS1. H821 of rrnS and H976 of rrnL are redundant. A phylogenetic analysis based on mitochondrial genome sequences of A. bella, 21 other symphytan species, two apocritan representatives, and four outgroup taxa supports the placement of Argidae as sister to the Pergidae within the symphytan superfamily Tenthredinoidea.
Affiliation(s)
- Shiyu Du
  - Central South University of Forestry and Technology, Key Laboratory of Cultivation and Protection for Non-Wood Forest Trees (Central South University of Forestry and Technology), Ministry of Education, Changsha, Hunan, China
- Gengyun Niu
  - Jiangxi Normal University, Life Science College, Nanchang, Jiangxi, China
- Tommi Nyman
  - Norwegian Institute of Bioeconomy Research, Department of Ecosystems in the Barents Region, Svanhovd Research Station, Svanvik, Norway
- Meicai Wei
  - Central South University of Forestry and Technology, Key Laboratory of Cultivation and Protection for Non-Wood Forest Trees (Central South University of Forestry and Technology), Ministry of Education, Changsha, Hunan, China

11
Modular cell-internalizing aptamer nanostructure enables targeted delivery of large functional RNAs in cancer cell lines. Nat Commun 2018; 9:2283. [PMID: 29891903 PMCID: PMC5995956 DOI: 10.1038/s41467-018-04691-x] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 12/07/2017] [Accepted: 05/09/2018] [Indexed: 02/07/2023]
Abstract
Large RNAs and ribonucleoprotein complexes have powerful therapeutic potential, but effective cell-targeted delivery tools are limited. Aptamers that internalize into target cells can deliver siRNAs (<15 kDa, 19–21 nt/strand). We demonstrate a modular nanostructure for cellular delivery of large, functional RNA payloads (50–80 kDa, 175–250 nt) by aptamers that recognize multiple human B cell cancer lines and transferrin receptor-expressing cells. Fluorogenic RNA reporter payloads enable accelerated testing of platform designs and rapid evaluation of assembly and internalization. Modularity is demonstrated by swapping in different targeting and payload aptamers. Both modules internalize into leukemic B cell lines and remained colocalized within endosomes. Fluorescence from internalized RNA persists for ≥2 h, suggesting a sizable window for aptamer payloads to exert influence upon targeted cells. This demonstration of aptamer-mediated, cell-internalizing delivery of large RNAs with retention of functional structure raises the possibility of manipulating endosomes and cells by delivering large aptamers and regulatory RNAs. Large RNAs and ribonucleoprotein complexes have shown potential as novel therapeutic agents, but their targeted delivery to cells is still challenging. Here the authors present a modular aptamer nanostructure for intracellular delivery of RNAs up to 250 nucleotides to cancer cells.

12
Jasiński M, Kulik M, Wojciechowska M, Stolarski R, Trylska J. Interactions of 2'-O-methyl oligoribonucleotides with the RNA models of the 30S subunit A-site. PLoS One 2018; 13:e0191138. [PMID: 29351348 PMCID: PMC5774723 DOI: 10.1371/journal.pone.0191138] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 09/09/2017] [Accepted: 12/28/2017] [Indexed: 12/15/2022]
Abstract
Synthetic oligonucleotides targeting functional regions of the prokaryotic rRNA could be promising antimicrobial agents. Indeed, such oligonucleotides were proven to inhibit bacterial growth. 2’-O-methylated (2’-O-Me) oligoribonucleotides with a sequence complementary to the decoding site in 16S rRNA were reported as inhibitors of bacterial translation. However, the binding mode and structures of the formed complexes, as well as the level of selectivity of the oligonucleotides between the prokaryotic and eukaryotic target, were not determined. We have analyzed three 2’-O-Me oligoribonucleotides designed to hybridize with the models of the prokaryotic rRNA containing two neighboring aminoglycoside binding pockets. One pocket is the paromomycin/kanamycin binding site corresponding to the decoding site in the small ribosomal subunit and the other one is the close-by hygromycin B binding site whose dynamics has not been previously reported. Molecular dynamics (MD) simulations, as well as isothermal titration calorimetry, gel electrophoresis and spectroscopic studies have shown that the eukaryotic rRNA model is less conformationally stable (in terms of hydrogen bonds and stacking interactions) than the corresponding prokaryotic one. In MD simulations of the eukaryotic construct, the nucleotide U1498, which plays an important role in correct positioning of mRNA during translation, is flexible and spontaneously flips out into the solvent. In solution studies, the 2’-O-Me oligoribonucleotides did not interact with the double stranded rRNA models but all formed stable complexes with the single-stranded prokaryotic target. 2’-O-Me oligoribonucleotides with one and two mismatches bound less tightly to the eukaryotic target. This shows that at least three mismatches between the 2’-O-Me oligoribonucleotide and eukaryotic rRNA are required to ensure target selectivity. 
The results also suggest that, in the ribosome environment, the strand invasion is the preferred binding mode of 2’-O-Me oligoribonucleotides targeting the aminoglycoside binding sites in 16S rRNA.
Affiliation(s)
- Maciej Jasiński
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland
  - College of Inter-Faculty Individual Studies in Mathematics and Natural Sciences, University of Warsaw, Warsaw, Poland
- Marta Kulik
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland
- Ryszard Stolarski
  - Department of Biophysics, Institute of Experimental Physics, Faculty of Physics, University of Warsaw, Warsaw, Poland
- Joanna Trylska
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland
  - * E-mail:

13
Glouzon JPS, Perreault JP, Wang S. The super-n-motifs model: a novel alignment-free approach for representing and comparing RNA secondary structures. Bioinformatics 2017; 33:1169-1178. [PMID: 28088762 DOI: 10.1093/bioinformatics/btw773] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 12/11/2015] [Indexed: 12/13/2022]
Abstract
Motivation: Comparing ribonucleic acid (RNA) secondary structures of arbitrary size uncovers structural patterns that can provide a better understanding of RNA functions. However, performing fast and accurate secondary structure comparisons is challenging when we take into account the RNA configuration (i.e. linear or circular), the presence of pseudoknot and G-quadruplex (G4) motifs and the increasing number of secondary structures generated by high-throughput probing techniques. To address this challenge, we propose the super-n-motifs model based on a latent analysis of enhanced motifs comprising not only basic motifs but also adjacency relations. The super-n-motifs model computes a vector representation of secondary structures as linear combinations of these motifs. Results: We demonstrate the accuracy of our model for comparison of secondary structures from linear and circular RNA while also considering pseudoknot and G4 motifs. We show that the super-n-motifs representation effectively captures the most important structural features of secondary structures, as compared to other representations such as ordered tree, arc-annotated and string representations. Finally, we demonstrate the time efficiency of our model, which is alignment-free and capable of performing large-scale comparisons of 10 000 secondary structures up to 4 orders of magnitude faster than existing approaches. Availability and Implementation: The super-n-motifs model was implemented in C++. Source code and Linux binary are freely available at http://jpsglouzon.github.io/supernmotifs/. Contact: Shengrui.Wang@Usherbrooke.ca. Supplementary information: Supplementary data are available at Bioinformatics online.
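The general idea of alignment-free comparison through vector representations can be illustrated with a much simpler stand-in. The sketch below counts n-grams of a dot-bracket string and compares two structures by cosine similarity. It is not the super-n-motifs model: n-grams replace the structural motifs, there is no latent analysis step, and the function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def ngram_vector(structure, n=3):
    """Count overlapping n-grams of a dot-bracket string (a stand-in for motifs)."""
    return Counter(structure[i:i + n] for i in range(len(structure) - n + 1))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items())  # missing keys count as 0
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


same = cosine(ngram_vector("((..))"), ngram_vector("((..))"))
disjoint = cosine(ngram_vector("(((("), ngram_vector("...."))
```

Because each structure is reduced to a fixed vector once, pairwise comparison costs a dot product rather than an alignment, which is where the speed advantage of alignment-free methods comes from.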
Affiliation(s)
- Jean-Pierre Séhi Glouzon
  - Department of Computer Science, Faculty of Science, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
  - RNA Group, Department of Biochemistry, Faculty of Medicine and Health Sciences, Applied Cancer Research Pavilion, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Jean-Pierre Perreault
  - RNA Group, Department of Biochemistry, Faculty of Medicine and Health Sciences, Applied Cancer Research Pavilion, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Shengrui Wang
  - Department of Computer Science, Faculty of Science, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada

14
Chiu JKH, Chen YPP. A comprehensive study of RNA secondary structure alignment algorithms. Brief Bioinform 2017; 18:291-305. [PMID: 26984617 DOI: 10.1093/bib/bbw009] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 09/15/2015] [Indexed: 01/04/2023]
Abstract
RNA secondary structure alignment has received more attention since the discovery of the structure-function relationships in some non-protein-encoding RNAs. However, unlike the pure sequence alignment problem, which has been solved in polynomial time, secondary structure alignment incorporates the base pairings as another information dimension in addition to the base sequence. This problem therefore becomes more challenging. In this study, we classify the selected approaches, and algorithmically illustrate how these methods address the alignment problems with different structure types. Other features such as the types of base pair edit operations supported and the time complexity are also compared.
Affiliation(s)
- Jimmy Ka Ho Chiu
  - Department of Computer Science and Information Technology, La Trobe University, Melbourne, Victoria, Australia
- Yi-Ping Phoebe Chen
  - Department of Computer Science and Information Technology, La Trobe University, Melbourne, Victoria, Australia

15
Li Y, Shi X, Liang Y, Xie J, Zhang Y, Ma Q. RNA-TVcurve: a Web server for RNA secondary structure comparison based on a multi-scale similarity of its triple vector curve representation. BMC Bioinformatics 2017; 18:51. [PMID: 28109252 PMCID: PMC5251234 DOI: 10.1186/s12859-017-1481-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 08/25/2016] [Accepted: 01/10/2017] [Indexed: 01/10/2023]
Abstract
Background RNAs have been found to carry diverse functionalities in nature. Inferring the similarity between two given RNAs is a fundamental step to understand and interpret their functional relationship. The majority of functional RNAs show conserved secondary structures, rather than sequence conservation. Algorithms relying on sequence-based features usually have limited prediction performance. Hence, integrating RNA structure features is critical for RNA analysis. Existing algorithms mainly fall into two categories: alignment-based and alignment-free. Alignment-free RNA comparison algorithms usually have lower time complexity than alignment-based ones. Results An alignment-free RNA comparison algorithm was proposed, in which a novel numerical representation, RNA-TVcurve (triple vector curve representation), of an RNA sequence and its corresponding secondary structure features is provided. A multi-scale similarity score of two given RNAs was then designed based on the wavelet decomposition of their numerical representations. In support of RNA mutation and phylogenetic analysis, a web server (RNA-TVcurve) was designed based on this alignment-free RNA comparison algorithm. It provides three functional modules: 1) visualization of the numerical representation of an RNA secondary structure; 2) detection of single-point mutations based on secondary structure; and 3) comparison of pairwise and multiple RNA secondary structures. The web server requires RNA primary sequences as input, while the corresponding secondary structures are optional. When only primary sequences are given, the web server computes the secondary structures by free energy minimization using the RNAfold tool from the Vienna RNA package. Conclusion RNA-TVcurve is the first integrated web server, based on an alignment-free method, to deliver a suite of RNA analysis functions, including visualization, mutation analysis and multiple RNA structure comparison. Comparison with two popular RNA comparison tools, RNApdist and RNAdistance, showed that RNA-TVcurve can efficiently capture subtle relationships among RNAs for mutation detection and non-coding RNA classification. All the relevant results are shown in an intuitive graphical manner and can be freely downloaded from the server. RNA-TVcurve, along with test examples and detailed documentation, is available at: http://ml.jlu.edu.cn/tvcurve/.
Affiliation(s)
- Ying Li: College of Computer Science and Technology, Jilin University, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun, 130012, China
- Xiaohu Shi: College of Computer Science and Technology, Jilin University, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun, 130012, China
- Yanchun Liang: College of Computer Science and Technology, Jilin University, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun, 130012, China; Zhuhai Laboratory of Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Zhuhai College of Jilin University, Zhuhai, 519041, China
- Juan Xie: Department of Mathematics and Statistics, South Dakota State University, Brookings, SD, 57007, USA; Bioinformatics and Mathematical Biosciences Lab, Department of Agronomy, Horticulture and Plant Science, South Dakota State University, Brookings, SD, 57007, USA; BioSNTR, Brookings, SD, USA
- Yu Zhang: College of Computer Science and Technology, Jilin University, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun, 130012, China
- Qin Ma: Department of Mathematics and Statistics, South Dakota State University, Brookings, SD, 57007, USA; Bioinformatics and Mathematical Biosciences Lab, Department of Agronomy, Horticulture and Plant Science, South Dakota State University, Brookings, SD, 57007, USA; BioSNTR, Brookings, SD, USA
|
16
|
Abstract
The secondary structure of an RNA molecule represents the base-pairing interactions within the molecule and fundamentally determines its overall structure. In this chapter, we give an overview of the main approaches and existing tools for predicting RNA secondary structures, as well as methods for identifying noncoding RNAs from genomic sequences or RNA sequencing data. We then focus on the identification of a well-known class of small noncoding RNAs, namely microRNAs, which play very important roles in many biological processes by regulating gene expression post-transcriptionally and whose dysregulation has been shown to be involved in several human diseases.
Affiliation(s)
- Fariza Tahi: IBISC, UEVE/Genopole, 23 bv. de France, 91000, Evry, France; IPS2, University of Paris-Saclay, 91190, Gif-sur-Yvette, France
- Van Du T Tran: Vital-IT group, SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
- Anouar Boucheham: IBISC, UEVE/Genopole, 23 bv. de France, 91000, Evry, France; College of NTIC, Constantine University 2, Constantine, Algeria
|
17
|
Chiu JKH, Chen YPP. Pairwise RNA secondary structure alignment with conserved stem pattern. Bioinformatics 2015; 31:3914-21. [PMID: 26275897 DOI: 10.1093/bioinformatics/btv471] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 09/08/2014] [Accepted: 08/07/2015] [Indexed: 12/23/2022] Open
Abstract
MOTIVATION The regulatory functions performed by non-coding RNAs are related to their 3D structures, which are, in turn, determined by their secondary structures. Pairwise secondary structure alignment gives insight into the functional similarity between a pair of RNA sequences. Numerous exact or heuristic approaches have been proposed for computational alignment. However, the alignment becomes intractable when arbitrary pseudoknots are allowed. Also, since non-coding RNAs are, in general, more conserved in structures than sequences, it is more effective to perform alignment based on the common structural motifs discovered. RESULTS We devised a method to approximate the true conserved stem pattern for a secondary structure pair, and constructed the alignment from it. Experimental results suggest that our method identified similar RNA secondary structures better than the existing tools, especially for large structures. It also successfully indicated the conservation of some pseudoknot features with biological significance. More importantly, even for large structures with arbitrary pseudoknots, the alignment can usually be obtained efficiently. AVAILABILITY AND IMPLEMENTATION Our algorithm has been implemented in a tool called PSMAlign. The source code of PSMAlign is freely available at http://homepage.cs.latrobe.edu.au/ypchen/psmalign/.
Affiliation(s)
- Jimmy Ka Ho Chiu: Department of Computer Science and Information Technology, La Trobe University, Melbourne, Victoria 3086, Australia
- Yi-Ping Phoebe Chen: Department of Computer Science and Information Technology, La Trobe University, Melbourne, Victoria 3086, Australia
|
18
|
Mattei E, Pietrosanto M, Ferrè F, Helmer-Citterich M. Web-Beagle: a web server for the alignment of RNA secondary structures. Nucleic Acids Res 2015; 43:W493-7. [PMID: 25977293 PMCID: PMC4489221 DOI: 10.1093/nar/gkv489] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 02/14/2015] [Accepted: 05/02/2015] [Indexed: 12/18/2022] Open
Abstract
Web-Beagle (http://beagle.bio.uniroma2.it) is a web server for the pairwise global or local alignment of RNA secondary structures. The server exploits a new encoding for RNA secondary structure and a substitution matrix of RNA structural elements to perform RNA structural alignments. The web server allows the user to compute up to 10 000 alignments in a single run, taking as input sets of RNA sequences and structures or primary sequences alone. In the latter case, the server predicts the secondary structures of the RNAs on the fly using RNAfold (free energy minimization). The user can also compare a set of input RNAs to one of five pre-compiled RNA datasets including lncRNAs and 3′ UTRs. All types of comparison output the pairwise alignments along with structural similarity and statistical significance measures for each resulting alignment. A graphical color-coded representation of the alignments allows the user to easily identify structural similarities between RNAs. Web-Beagle can be used for finding structurally related regions in two or more RNAs, for the identification of homologous regions or for functional annotation. Benchmark tests show that Web-Beagle has lower computational complexity and running time, and better performance, than other available methods.
Affiliation(s)
- Eugenio Mattei: Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Marco Pietrosanto: Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Fabrizio Ferrè: Department of Pharmacy and Biotechnology (FaBiT), University of Bologna Alma Mater, Via Belmeloro 6, 40126 Bologna, Italy
- Manuela Helmer-Citterich: Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
|
19
|
Bourgeade L, Chauve C, Allali J. Chaining sequence/structure seeds for computing RNA similarity. J Comput Biol 2015; 22:205-17. [PMID: 25768236 DOI: 10.1089/cmb.2014.0283] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/13/2022] Open
Abstract
We describe a new method to compare a query RNA with a static set of target RNAs. Our method is based on (i) a static indexing of the sequence/structure seeds of the target RNAs; (ii) searching the target RNAs by detecting seeds of the query present in the target and chaining these seeds into promising candidate homologs; and then (iii) completing the alignment using an anchor-based exact alignment algorithm. We apply our method to the benchmark Bralibase2.1 and compare its accuracy and efficiency with the exact method LocARNA and its recent seeds-based speed-up ExpLoc-P. Our pipeline RNA-unchained greatly improves on the computation time of LocARNA and is comparable to that of ExpLoc-P, while improving the overall accuracy of the final alignments.
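The chaining step (ii) can be illustrated with a generic collinear chaining routine. This is a minimal sketch and not the RNA-unchained implementation: it assumes each seed is a hypothetical triple `(query_start, target_start, length)` with `length >= 1`, ignores the structural part of the seeds, and scores a chain by the total seed length using a simple quadratic dynamic program.

```python
def best_chain(seeds: list[tuple[int, int, int]]) -> list[tuple[int, int, int]]:
    """Return a maximum-total-length chain of collinear, non-overlapping seeds.

    Each seed is (query_start, target_start, length) with length >= 1.
    Seed b may follow seed a only if a ends before b starts in BOTH the
    query and the target.
    """
    order = sorted(seeds)                  # sort by query start
    score = [s[2] for s in order]          # best chain score ending at each seed
    prev = [-1] * len(order)               # back-pointers for reconstruction
    for b, (qb, tb, lb) in enumerate(order):
        for a in range(b):
            qa, ta, la = order[a]
            if qa + la <= qb and ta + la <= tb and score[a] + lb > score[b]:
                score[b] = score[a] + lb
                prev[b] = a
    if not order:
        return []
    end = max(range(len(order)), key=score.__getitem__)
    chain = []
    while end != -1:                       # walk the back-pointers
        chain.append(order[end])
        end = prev[end]
    return chain[::-1]


seeds = [(0, 0, 3), (2, 10, 4), (4, 4, 2), (7, 7, 3)]
chain = best_chain(seeds)
```

In this toy input the seed `(2, 10, 4)` is longer than any other single seed but cannot be combined with the rest, so the chain of three shorter collinear seeds wins.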
|
20
|
Toll-like receptor 3 activation is required for normal skin barrier repair following UV damage. J Invest Dermatol 2014; 135:569-578. [PMID: 25118157 PMCID: PMC4289479 DOI: 10.1038/jid.2014.354] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Received: 05/27/2014] [Revised: 07/30/2014] [Accepted: 08/05/2014] [Indexed: 12/30/2022]
Abstract
Ultraviolet (UV) damage to the skin leads to the release of noncoding RNA (ncRNA) from necrotic keratinocytes that activates toll-like receptor 3 (TLR3). This release of ncRNA triggers inflammation in the skin following UV damage. Recently, TLR3 activation was also shown to aid wound repair and increase expression of genes associated with permeability barrier repair. Here, we sought to test if skin barrier repair after UVB damage is dependent on the activation of TLR3. We observed that multiple ncRNAs induced expression of skin barrier repair genes, that the TLR3 ligand Poly (I:C) also induced expression and function of tight junctions, and that the ncRNA U1 acts in a TLR3-dependent manner to induce expression of skin barrier repair genes. These observations were shown to have functional relevance as Tlr3−/− mice displayed a delay in skin barrier repair following UVB damage. Combined, these data further validate the conclusion that recognition of endogenous RNA by TLR3 is an important step in the program of skin barrier repair.
|
21
|
Mattei E, Ausiello G, Ferrè F, Helmer-Citterich M. A novel approach to represent and compare RNA secondary structures. Nucleic Acids Res 2014; 42:6146-57. [PMID: 24753415 PMCID: PMC4041456 DOI: 10.1093/nar/gku283] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 08/04/2013] [Revised: 03/25/2014] [Accepted: 03/26/2014] [Indexed: 12/18/2022] Open
Abstract
Structural information is crucial in ribonucleic acid (RNA) analysis and functional annotation; nevertheless, how to include such structural data is still a debated problem. Dot-bracket notation is the most common and simple representation for RNA secondary structures, but its simplicity also leads to ambiguity that requires further processing steps to resolve. Here we present BEAR (Brand nEw Alphabet for RNA), a new context-aware structural encoding represented by a string of characters. Each character in BEAR encodes a specific secondary structure element (loop, stem, bulge and internal loop) with a specific length. Furthermore, exploiting this informative and yet simple encoding in multiple alignments of related RNAs, we captured how much structural variation is tolerated in RNA families and converted it into transition rates among secondary structure elements. This allowed us to compute a substitution matrix for secondary structure elements called MBR (Matrix of BEAR-encoded RNA secondary structures), whose ability to align RNA secondary structures we tested. We propose BEAR and the MBR as powerful resources for RNA secondary structure analysis, comparison and classification, motif finding and phylogeny.
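To illustrate the kind of processing that plain dot-bracket strings require before structural elements become explicit, the sketch below extracts stems (maximal runs of directly stacked base pairs) together with their lengths. It is not the BEAR encoding, whose alphabet is not reproduced here; the function name `stems` and the output format `(open_index, close_index, stem_length)` are assumptions made for this example, and only pseudoknot-free strings over `(`, `)` and `.` are handled.

```python
def stems(db: str) -> list[tuple[int, int, int]]:
    """Return (open_index, close_index, stem_length) for each stem in a
    pseudoknot-free dot-bracket string, ordered by opening position."""
    partner: dict[int, int] = {}           # opening index -> closing index
    stack: list[int] = []
    for i, ch in enumerate(db):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {i}")
            partner[stack.pop()] = i
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError("unbalanced '(' in structure")

    found = []
    i = 0
    while i < len(db):
        if i not in partner:               # unpaired or closing position
            i += 1
            continue
        j = partner[i]
        length = 1
        # Extend while the next opening base pairs with the previous closing one.
        while partner.get(i + length) == j - length:
            length += 1
        found.append((i, j, length))
        i += length
    return found


hairpin = stems("((..))")
nested = stems("((.(..).))")
```

In the second structure the unpaired positions 2 and 7 interrupt the stacking, so the outer and inner helices are reported as two separate stems; the dots themselves carry no label saying which kind of loop they belong to, which is the ambiguity the abstract refers to.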
Affiliation(s)
- Eugenio Mattei: Centre for Molecular Bioinformatics, Department of Biology, University of Rome 'Tor Vergata', Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Gabriele Ausiello: Centre for Molecular Bioinformatics, Department of Biology, University of Rome 'Tor Vergata', Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Fabrizio Ferrè: Centre for Molecular Bioinformatics, Department of Biology, University of Rome 'Tor Vergata', Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Manuela Helmer-Citterich: Centre for Molecular Bioinformatics, Department of Biology, University of Rome 'Tor Vergata', Via della Ricerca Scientifica snc, 00133 Rome, Italy
|
22
|
Optimisation Problems for Pairwise RNA Sequence and Structure Comparison: A Brief Survey. 2014. [DOI: 10.1007/978-3-642-54455-2_3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 04/15/2023]
|
23
|
Abstract
Many methods have been proposed for RNA secondary structure comparison, and new ones are still being developed. In this chapter, we first consider structure representations and discuss their suitability for structure comparison. Then, we take a look at the more commonly used methods, restricting ourselves to structures without pseudo-knots. For comparing structures of the same sequence, we study base pair distances. For structures of different sequences (and of different length), we study variants of the tree edit model. We name some of the available tools and give pointers to the literature. We end with a short review on comparing structures with pseudo-knots as an unsolved problem and topic of active research.
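For two structures of the same sequence, the base pair distance mentioned above is commonly taken to be the number of base pairs present in exactly one of the two structures. The following is a minimal sketch under that definition, assuming pseudoknot-free dot-bracket input; the helper names `pairs` and `base_pair_distance` are chosen for this example only.

```python
def pairs(db: str) -> set[tuple[int, int]]:
    """Parse a pseudoknot-free dot-bracket string into a set of (i, j) pairs."""
    stack: list[int] = []
    out: set[tuple[int, int]] = set()
    for i, ch in enumerate(db):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {i}")
            out.add((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return out


def base_pair_distance(a: str, b: str) -> int:
    """Size of the symmetric difference of the two base pair sets."""
    if len(a) != len(b):
        raise ValueError("structures must belong to the same sequence length")
    return len(pairs(a) ^ pairs(b))


d_close = base_pair_distance("((..))", "(....)")   # one pair differs
d_far = base_pair_distance("(()).." , "..(())")    # no pair is shared
```

Because the measure compares pairs by position, it is only meaningful when both structures are defined on the same sequence, which is why structures of different lengths call for tree edit models instead.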
|
24
|
Milo N, Zakov S, Katzenelson E, Bachmat E, Dinitz Y, Ziv-Ukelson M. Unrooted unordered homeomorphic subtree alignment of RNA trees. Algorithms Mol Biol 2013; 8:13. [PMID: 23590940 PMCID: PMC3765143 DOI: 10.1186/1748-7188-8-13] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 12/20/2012] [Accepted: 02/05/2013] [Indexed: 11/17/2022] Open
Abstract
We generalize some current approaches for RNA tree alignment, which are traditionally confined to ordered rooted mappings, to also consider unordered unrooted mappings. We define the Homeomorphic Subtree Alignment problem (HSA), and present a new algorithm which applies to several modes, combining global or local, ordered or unordered, and rooted or unrooted tree alignments. Our algorithm generalizes previous algorithms that either solved the problem in an asymmetric manner, or were restricted to the rooted and/or ordered cases. Focusing here on the most general unrooted unordered case, we show that for input trees $T$ and $S$, our algorithm has an $O(n_T n_S + \min(d_T, d_S) L_T L_S)$ time complexity, where $n_T$, $L_T$ and $d_T$ are the number of nodes, the number of leaves, and the maximum node degree in $T$, respectively (satisfying $d_T \le L_T \le n_T$), and similarly for $n_S$, $L_S$ and $d_S$ with respect to the tree $S$. This improves the time complexity of previous algorithms for less general variants of the problem. In order to obtain this time bound for HSA, we developed new algorithms for a generalized variant of the Min-Cost Bipartite Matching problem (MCM), as well as for two derivatives of this problem, entitled All-Cavity-MCM and All-Pairs-Cavity-MCM. For two input sets of size $n$ and $m$, where $n \le m$, MCM and both its cavity derivatives are solved in $O(n^3 + nm)$ time, without the usage of priority queues (e.g. Fibonacci heaps) or other complex data structures. This gives the first cubic time algorithm for All-Pairs-Cavity-MCM, and improves the running times of MCM and All-Cavity-MCM problems in the unbalanced case where $n \ll m$.
We implemented the algorithm (in all modes mentioned above) as a graphical software tool which computes and displays similarities between secondary structures of RNA given as input, and used it in a preliminary experiment in which we ran all-against-all inter-family pairwise alignments of RNAse P and Hammerhead RNA family members, exposing new similarities which could not be detected by the traditional rooted ordered alignment approaches. The results demonstrate that our approach can be used to expose structural similarity between some RNAs with higher sensitivity than the traditional rooted ordered alignment approaches. Source code and web-interface for our tool can be found at http://www.cs.bgu.ac.il/~negevcb/FRUUT.
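As a reference point for what the Min-Cost Bipartite Matching problem asks in the unbalanced case, here is a deliberately naive sketch. It is an exhaustive search with exponential running time, not the cubic-time algorithm of the paper, and it covers only plain MCM, not the cavity variants. The cost-matrix layout (`cost[i][j]` is the cost of matching element `i` of the smaller set to element `j` of the larger set) is an assumption for this example.

```python
from itertools import permutations


def min_cost_matching(cost: list[list[float]]) -> tuple[float, list[int]]:
    """Match every row to a distinct column at minimum total cost.

    cost has n rows and m columns with n <= m. Returns (total_cost, cols),
    where cols[i] is the column assigned to row i. Brute force: tries every
    injective assignment, so it is only usable for very small inputs.
    """
    n = len(cost)
    m = len(cost[0]) if n else 0
    if n > m:
        raise ValueError("expected at most as many rows as columns")
    best_cost = float("inf")
    best_cols: tuple[int, ...] = ()
    for cols in permutations(range(m), n):
        total = sum(cost[i][cols[i]] for i in range(n))
        if total < best_cost:
            best_cost, best_cols = total, cols
    return best_cost, list(best_cols)


cost = [
    [4, 1, 3],
    [2, 0, 5],
]
total, assignment = min_cost_matching(cost)
```

A checker like this is mainly useful for validating a faster implementation on small random instances.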
|
25
|
BRASERO: A Resource for Benchmarking RNA Secondary Structure Comparison Algorithms. Adv Bioinformatics 2012; 2012:893048. [PMID: 22675348 PMCID: PMC3366197 DOI: 10.1155/2012/893048] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 09/28/2011] [Accepted: 02/22/2012] [Indexed: 11/23/2022] Open
Abstract
The pairwise comparison of RNA secondary structures is a fundamental problem, with direct application in mining databases for annotating putative noncoding RNA candidates in newly sequenced genomes. An increasing number of software tools are available for comparing RNA secondary structures, based on different models (such as ordered trees or forests, arc annotated sequences, and multilevel trees) and computational principles (edit distance, alignment). We describe here the website BRASERO that offers tools for evaluating such software tools on real and synthetic datasets.
|
26
|
Clote P, Lou F, Lorenz WA. Maximum expected accuracy structural neighbors of an RNA secondary structure. BMC Bioinformatics 2012; 13 Suppl 5:S6. [PMID: 22537010 PMCID: PMC3358666 DOI: 10.1186/1471-2105-13-s5-s6] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 01/07/2023] Open
Abstract
Background Since RNA molecules regulate genes and control alternative splicing by allostery, it is important to develop algorithms to predict RNA conformational switches. Some tools, such as paRNAss, RNAshapes and RNAbor, can be used to predict potential conformational switches; nevertheless, no existent tool can detect general (i.e., not family specific) entire riboswitches (both aptamer and expression platform) with accuracy. Thus, the development of additional algorithms to detect conformational switches seems important, especially since the difference in free energy between the two metastable secondary structures may be as large as 15-20 kcal/mol. It has recently emerged that RNA secondary structure can be more accurately predicted by computing the maximum expected accuracy (MEA) structure, rather than the minimum free energy (MFE) structure. Results Given an arbitrary RNA secondary structure $S_0$ for an RNA nucleotide sequence $a = a_1, \ldots, a_n$, we say that another secondary structure $S$ of $a$ is a $k$-neighbor of $S_0$, if the base pair distance between $S_0$ and $S$ is $k$. In this paper, we prove that the Boltzmann probability of all $k$-neighbors of the minimum free energy structure $S_0$ can be approximated with accuracy $\varepsilon$ and confidence $1 - p$, simultaneously for all $0 \le k < K$, by a relative frequency count over $N$ sampled structures, provided that $N > N(\varepsilon, p, K) = \frac{\Phi^{-1}\left(\frac{p}{2K}\right)^2}{4\varepsilon^2}$, where $\Phi(z)$ is the cumulative distribution function (CDF) for the standard normal distribution. We go on to describe the algorithm RNAborMEA, which for an arbitrary initial structure $S_0$ and for all values $0 \le k < K$, computes the secondary structure MEA($k$), having maximum expected accuracy over all $k$-neighbors of $S_0$. Computation time is $O(n^3 \cdot K^2)$, and memory requirements are $O(n^2 \cdot K)$. We analyze a sample TPP riboswitch, and apply our algorithm to the class of purine riboswitches.
Conclusions The approximation of RNAbor by sampling, with a rigorous bound on accuracy, together with the computation of maximum expected accuracy k-neighbors by RNAborMEA, provides additional tools toward conformational switch detection. Results from RNAborMEA are quite distinct from those of other tools, such as RNAbor, RNAshapes and paRNAss, and hence may provide orthogonal information when looking for suboptimal structures or conformational switches. Source code for RNAborMEA can be downloaded from http://sourceforge.net/projects/rnabormea/ or http://bioinformatics.bc.edu/clotelab/RNAborMEA/.
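To get a feel for the sample sizes the stated bound implies, the sketch below evaluates it numerically. It assumes the bound reads N > Φ⁻¹(p/(2K))² / (4ε²), with Φ⁻¹ the standard normal quantile function; the function name `min_samples` and the parameter checks are additions for this example and are not taken from RNAborMEA.

```python
import math
from statistics import NormalDist


def min_samples(eps: float, p: float, K: int) -> int:
    """Smallest integer N with N > Phi^{-1}(p / (2K))^2 / (4 * eps^2)."""
    if not (0 < eps < 1):
        raise ValueError("eps must lie in (0, 1)")
    if not (0 < p < 1):
        raise ValueError("p must lie in (0, 1)")
    if K < 1:
        raise ValueError("K must be at least 1")
    z = NormalDist().inv_cdf(p / (2 * K))   # negative quantile; it is squared below
    bound = z * z / (4 * eps * eps)
    return math.floor(bound) + 1            # strict inequality N > bound


n_single = min_samples(eps=0.01, p=0.05, K=1)
n_many = min_samples(eps=0.01, p=0.05, K=50)
```

The required sample size grows only slowly with K, because K enters through the normal quantile, while halving ε quadruples it.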
Affiliation(s)
- Peter Clote: Department of Biology, Boston College, Chestnut Hill, MA 02467, USA
|
27
|
Rinaudo P, Ponty Y, Barth D, Denise A. Tree Decomposition and Parameterized Algorithms for RNA Structure-Sequence Alignment Including Tertiary Interactions and Pseudoknots. 2012. [DOI: 10.1007/978-3-642-33122-0_12] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 03/14/2023]
|