1
Binet T, Avalle B, Dávila Felipe M, Maffucci I. AptaMat: a matrix-based algorithm to compare single-stranded oligonucleotides secondary structures. Bioinformatics 2022; 39:6849515. [PMID: 36440922 PMCID: PMC9805580 DOI: 10.1093/bioinformatics/btac752] [Received: 05/05/2022] [Revised: 11/14/2022] [Accepted: 11/24/2022]
Abstract
MOTIVATION Comparing the secondary structures of single-stranded nucleic acids (ssNAs) is fundamental when investigating their function and evolution and predicting the effect of mutations on their structures. Many comparison metrics exist, although they are either too elaborate or not sensitive enough to distinguish close ssNA structures. RESULTS In this context, we developed AptaMat, a simple and sensitive algorithm for comparing ssNA secondary structures, based on matrices representing the secondary structures and a metric built upon the Manhattan distance in the plane. We applied AptaMat to several examples and compared the results to those obtained by the most frequently used metrics, namely the Hamming distance and the RNAdistance, and by a recently developed image-based approach. We showed that AptaMat is able to discriminate between similar sequences, outperforming all the other metrics considered here. In addition, we showed that AptaMat was able to correctly classify 14 RFAM families within a clustering procedure. AVAILABILITY AND IMPLEMENTATION The Python code for AptaMat is available at https://github.com/GEC-git/AptaMat.git. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
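To make the comparison concrete, the following minimal Python sketch contrasts the Hamming distance on dot-bracket strings with a toy score built on the Manhattan distance between base-pair coordinates. It is an illustration only and is not the AptaMat formula, which is defined in the article and its repository; the function names and the example structures are hypothetical.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.add((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def hamming_distance(a, b):
    """Count the positions at which two equal-length dot-bracket strings differ."""
    if len(a) != len(b):
        raise ValueError("structures must have the same length")
    return sum(x != y for x, y in zip(a, b))


def manhattan(p, q):
    """Manhattan distance between two base pairs seen as points (i, j)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def nearest_pair_score(a, b):
    """Toy score (an assumption, not AptaMat): for every base pair, add the
    Manhattan distance to the nearest base pair of the other structure."""
    pairs_a, pairs_b = base_pairs(a), base_pairs(b)
    if not pairs_a or not pairs_b:
        return None  # left undefined when one structure has no base pair
    forward = sum(min(manhattan(p, q) for q in pairs_b) for p in pairs_a)
    backward = sum(min(manhattan(q, p) for p in pairs_a) for q in pairs_b)
    return forward + backward


if __name__ == "__main__":
    reference = "((....))...."
    shifted_by_one = ".((....))..."
    shifted_by_four = "....((....))"
    for other in (shifted_by_one, shifted_by_four):
        print(other, hamming_distance(reference, other),
              nearest_pair_score(reference, other))
```

With these toy inputs both scores rank the one-position shift as closer than the four-position shift; the coordinate-based score additionally grows with how far the helix has moved in the plane rather than only with the number of changed characters.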
Affiliation(s)
- Thomas Binet
- Université de technologie de Compiègne, UPJV, CNRS, Enzyme and Cell Engineering, Centre de recherche Royallieu, CS 60 319 - 60 203, Compiègne Cedex, France
- Bérangère Avalle
- Université de technologie de Compiègne, UPJV, CNRS, Enzyme and Cell Engineering, Centre de recherche Royallieu, CS 60 319 - 60 203, Compiègne Cedex, France
2
Abstract
Identifying the secondary structure of an RNA is crucial for understanding its diverse regulatory functions. This paper focuses on how to enhance target identification in a Boltzmann ensemble of structures via chemical probing data. We take an information-theoretic approach to the problem by considering a variant of the Rényi-Ulam game. Our framework is centered around the ensemble tree, a hierarchical bi-partition of the input ensemble that is constructed by recursively querying whether or not a base pair of maximum information entropy is contained in the target. These queries are answered by relating local to global probing data, exploiting the modularity of RNA secondary structures. We show that the leaves of the tree comprise sub-samples exhibiting a distinguished structure with high probability. In particular, for a Boltzmann ensemble incorporating probing data, which is well established in the literature, the probability of our framework correctly identifying the target in the leaf is greater than 90%.
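The query-selection step can be sketched as follows, assuming the ensemble is given as a plain list of sampled dot-bracket strings. The sketch picks the base pair whose sample frequency has maximal binary entropy and splits the sample on it. How a query is actually answered from probing data is not modeled here, and all function names are hypothetical.

```python
import math
from collections import Counter


def base_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            pairs.add((stack.pop(), j))
    return pairs


def binary_entropy(p):
    """Entropy in bits of a yes/no question answered 'yes' with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def max_entropy_pair(sample):
    """Base pair whose presence is most uncertain in the sample, or None
    when no structure in the sample contains a base pair."""
    counts = Counter(pair for s in sample for pair in base_pairs(s))
    if not counts:
        return None
    n = len(sample)
    # sorted() makes tie-breaking deterministic (smallest pair wins).
    return max(sorted(counts), key=lambda pair: binary_entropy(counts[pair] / n))


def split_on_pair(sample, pair):
    """Bi-partition the sample into structures with and without the pair."""
    with_pair = [s for s in sample if pair in base_pairs(s)]
    without_pair = [s for s in sample if pair not in base_pairs(s)]
    return with_pair, without_pair


if __name__ == "__main__":
    sample = ["((....))", "((....))", ".(....).", "........"]
    query = max_entropy_pair(sample)
    print(query, split_on_pair(sample, query))
```

Applying `split_on_pair` recursively to each half would yield a tree of nested sub-samples; a pair present in exactly half of the structures is the most informative query under this criterion.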
3
Abstract
Aptamers have a spectrum of applications in biotechnology and drug design, because of the relative simplicity of experimental protocols and advantages of stability and specificity associated with their structural properties. However, to understand the structure-function relationships of aptamers, robust structure modeling tools are necessary. Several such tools have been developed and extensively tested, although most of them target various forms of biological RNA. In this study, we tested the performance of three tools in application to DNA aptamers, since DNA aptamers are the focus of many studies, particularly in drug discovery. We demonstrated that in most cases, the secondary structure of DNA can be reconstructed with acceptable accuracy by at least one of the three tools tested (Mfold, RNAfold, and CentroidFold), although the G-quadruplex motif found in many of the DNA aptamer structures complicates the prediction, as well as the pseudoknot interaction. This problem should be addressed more carefully to improve prediction accuracy.
Affiliation(s)
- Arina Afanasyeva
- Artificial Intelligence Center for Health and Biomedical Research (ArCHER), National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka 567-0085, Japan
- Chioko Nagao
- Laboratory of In-silico Drug Design, Center for Drug Design Research (CDDR), National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka 567-0085, Japan
- Kenji Mizuguchi
- Artificial Intelligence Center for Health and Biomedical Research (ArCHER), National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka 567-0085, Japan; Laboratory of In-silico Drug Design, Center for Drug Design Research (CDDR), National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka 567-0085, Japan
4
Abstract
An RNA switch triggers biological functions by toggling between two conformations. RNA switches include bacterial riboswitches, where ligand binding can stabilize a bound structure. For RNAs with only one stable structure, structural prediction usually just requires a straightforward free energy minimization, but for an RNA switch, the prediction of a less stable alternative structure is often computationally costly and even problematic. The current sampling-clustering method predicts stable and alternative structures by partitioning structures sampled from the energy landscape into two clusters, but it is very time-consuming. Instead, we predict the alternative structure of an RNA switch from conditional probability calculations within the energy landscape. First, our method excludes base pairs related to the most stable structure in the energy landscape. Then, it detects stable stems (“seeds”) in the remaining landscape. Finally, it folds an alternative structure prediction around a seed. While having comparable riboswitch classification performance, the conditional-probability computations had fewer adjustable parameters, offered greater predictive flexibility, and were more than one thousand times faster than the sampling step alone in sampling-clustering predictions, the competing standard. Overall, the described approach helps traverse thermodynamically improbable energy landscapes to find biologically significant substructures and structures rapidly and effectively.
Affiliation(s)
- Amirhossein Manzourolajdad
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- John L. Spouge
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
5
Glouzon JPS, Perreault JP, Wang S. Structurexplor: a platform for the exploration of structural features of RNA secondary structures. Bioinformatics 2018; 33:3117-3120. [PMID: 28575203 DOI: 10.1093/bioinformatics/btx323] [Received: 09/30/2016] [Accepted: 05/26/2017]
Abstract
Summary Discovering function-related structural features, such as the cloverleaf shape of transfer RNA secondary structures, is essential to understand RNA function. With this aim, we have developed a platform, named Structurexplor, to facilitate the exploration of structural features in populations of RNA secondary structures. It has been designed and developed to help biologists interactively search for, evaluate and select interesting structural features that can potentially explain RNA functions. Availability and implementation Structurexplor is a web application available at http://structurexplor.dinf.usherbrooke.ca. The source code can be found at http://jpsglouzon.github.io/structurexplor/. Contact shengrui.wang@usherbrooke.ca. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jean-Pierre Séhi Glouzon
- Department of Computer Science, Faculty of Science, Université de Sherbrooke, Sherbrooke, QC, J1K 2R1, Canada; RNA Group, Department of Biochemistry, Faculty of Medicine and Health Sciences, Applied Cancer Research Pavilion, Université de Sherbrooke, Sherbrooke, QC, J1K 2R1, Canada
- Jean-Pierre Perreault
- RNA Group, Department of Biochemistry, Faculty of Medicine and Health Sciences, Applied Cancer Research Pavilion, Université de Sherbrooke, Sherbrooke, QC, J1K 2R1, Canada
- Shengrui Wang
- Department of Computer Science, Faculty of Science, Université de Sherbrooke, Sherbrooke, QC, J1K 2R1 Canada
6
Kato Y, Gorodkin J, Havgaard JH. Alignment-free comparative genomic screen for structured RNAs using coarse-grained secondary structure dot plots. BMC Genomics 2017; 18:935. [PMID: 29197323 PMCID: PMC5712110 DOI: 10.1186/s12864-017-4309-y] [Received: 03/30/2017] [Accepted: 11/15/2017]
Abstract
Background Structured non-coding RNAs play many different roles in the cells, but the annotation of these RNAs is lacking even within the human genome. The currently available computational tools are either too computationally heavy for use in full genomic screens or rely on pre-aligned sequences. Methods Here we present a fast and efficient method, DotcodeR, for detecting structurally similar RNAs in genomic sequences by comparing their corresponding coarse-grained secondary structure dot plots at string level. This allows us to perform an all-against-all scan of all window pairs from two genomes without alignment. Results Our computational experiments with simulated data and real chromosomes demonstrate that the presented method has good sensitivity. Conclusions DotcodeR can be useful as a pre-filter in a genomic comparative scan for structured RNAs. Electronic supplementary material The online version of this article (doi:10.1186/s12864-017-4309-y) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Yuki Kato
- Department of RNA Biology and Neuroscience, Graduate School of Medicine, Osaka University, 2-2 Yamadaoka, Suita, 565-0871, Japan; Center for non-coding RNA in Technology and Health (RTH), University of Copenhagen, Groennegaardsvej 3, Frederiksberg, 1870, Denmark
- Jan Gorodkin
- Center for non-coding RNA in Technology and Health (RTH), University of Copenhagen, Groennegaardsvej 3, Frederiksberg, 1870, Denmark
- Jakob Hull Havgaard
- Center for non-coding RNA in Technology and Health (RTH), University of Copenhagen, Groennegaardsvej 3, Frederiksberg, 1870, Denmark
7
Abstract
The secondary structure of an RNA molecule represents the base-pairing interactions within the molecule and fundamentally determines its overall structure. In this chapter, we review the main approaches and existing tools for predicting RNA secondary structures, as well as methods for identifying noncoding RNAs from genomic sequences or RNA sequencing data. We then focus on the identification of a well-known class of small noncoding RNAs, namely microRNAs, which play very important roles in many biological processes by post-transcriptionally regulating gene expression and whose dysregulation has been shown to be involved in several human diseases.
Affiliation(s)
- Fariza Tahi
- IBISC, UEVE/Genopole, 23 bv. de France, 91000, Evry, France.
- IPS2, University of Paris-Saclay, 91190, Gif-sur-Yvette, France.
- Van Du T Tran
- Vital-IT group, SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
- Anouar Boucheham
- IBISC, UEVE/Genopole, 23 bv. de France, 91000, Evry, France
- College of NTIC, Constantine University 2, Constantine, Algeria
8
Kutchko KM, Sanders W, Ziehr B, Phillips G, Solem A, Halvorsen M, Weeks KM, Moorman N, Laederach A. Multiple conformations are a conserved and regulatory feature of the RB1 5' UTR. RNA 2015; 21:1274-85. [PMID: 25999316 PMCID: PMC4478346 DOI: 10.1261/rna.049221.114] [Received: 12/07/2014] [Accepted: 03/27/2015]
Abstract
Folding to a well-defined conformation is essential for the function of structured ribonucleic acids (RNAs) like the ribosome and tRNA. Structured elements in the untranslated regions (UTRs) of specific messenger RNAs (mRNAs) are known to control expression. The importance of unstructured regions adopting multiple conformations, however, is still poorly understood. High-resolution SHAPE-directed Boltzmann suboptimal sampling of the Homo sapiens Retinoblastoma 1 (RB1) 5' UTR yields three distinct conformations compatible with the experimental data. Private single nucleotide variants (SNVs) identified in two patients with retinoblastoma each collapse the structural ensemble to a single but distinct well-defined conformation. The RB1 5' UTRs from Bos taurus (cow) and Trichechus manatus latirostris (manatee) are divergent in sequence from H. sapiens (human) yet maintain structural compatibility with high-probability base pairs. SHAPE chemical probing of the cow and manatee RB1 5' UTRs reveals that they also adopt multiple conformations. Luciferase reporter assays reveal that 5' UTR mutations alter RB1 expression. In a traditional model of disease, causative SNVs disrupt a key structural element in the RNA. For the subset of patients with heritable retinoblastoma-associated SNVs in the RB1 5' UTR, the absence of multiple structures is likely causative of the cancer. Our data therefore suggest that selective pressure will favor multiple conformations in eukaryotic UTRs to regulate expression.
Affiliation(s)
- Katrina M Kutchko
- Department of Biology, University of North Carolina, Chapel Hill, North Carolina 27599-3290, USA; Curriculum in Bioinformatics and Computational Biology, University of North Carolina, Chapel Hill, North Carolina 27599, USA
- Wes Sanders
- Department of Biology, University of North Carolina, Chapel Hill, North Carolina 27599-3290, USA
- Ben Ziehr
- Department of Microbiology and Immunology, University of North Carolina, Chapel Hill, North Carolina 27599, USA; Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina 27599, USA
- Gabriela Phillips
- Department of Biology, University of North Carolina, Chapel Hill, North Carolina 27599-3290, USA
- Amanda Solem
- Department of Biology, University of North Carolina, Chapel Hill, North Carolina 27599-3290, USA
- Matthew Halvorsen
- Institute for Genomic Medicine, Columbia University Medical Center, New York, New York 10032, USA
- Kevin M Weeks
- Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599-3290, USA
- Nathaniel Moorman
- Department of Microbiology and Immunology, University of North Carolina, Chapel Hill, North Carolina 27599, USA; Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina 27599, USA
- Alain Laederach
- Department of Biology, University of North Carolina, Chapel Hill, North Carolina 27599-3290, USA
9
Harrison JG, Zheng YB, Beal PA, Tantillo DJ. Computational approaches to predicting the impact of novel bases on RNA structure and stability. ACS Chem Biol 2013; 8:2354-9. [PMID: 24063428 DOI: 10.1021/cb4006062]
Abstract
The use of computational modeling techniques to gain insight into nucleobase interactions has been a challenging endeavor to date. Accurate treatment requires the tackling of many challenges but also holds the promise of great rewards. The development of effective computational approaches to predict the binding affinities of nucleobases and analogues can, for example, streamline the process of developing novel nucleobase modifications, which should facilitate the development of new RNAi-based therapeutics. This brief review focuses on available computational approaches to predicting base pairing affinity in RNA-based contexts such as nucleobase-nucleobase interactions in duplexes and nucleobase-protein interactions. The challenges associated with such modeling along with potential future directions for the field are highlighted.
Affiliation(s)
- Jason G. Harrison
- Department of Chemistry, University of California−Davis, Davis, California 95616, United States
- Yvonne B. Zheng
- Department of Chemistry, University of California−Davis, Davis, California 95616, United States
- Peter A. Beal
- Department of Chemistry, University of California−Davis, Davis, California 95616, United States
- Dean J. Tantillo
- Department of Chemistry, University of California−Davis, Davis, California 95616, United States
10
Kim Y, Lee G, Jeon E, Sohn EJ, Lee Y, Kang H, Lee DW, Kim DH, Hwang I. The immediate upstream region of the 5'-UTR from the AUG start codon has a pronounced effect on the translational efficiency in Arabidopsis thaliana. Nucleic Acids Res 2013; 42:485-98. [PMID: 24084084 PMCID: PMC3874180 DOI: 10.1093/nar/gkt864]
Abstract
The nucleotide sequence around the translational initiation site is an important cis-acting element for post-transcriptional regulation. However, it has not been fully understood how the sequence context at the 5′-untranslated region (5′-UTR) affects the translational efficiency of individual mRNAs. In this study, we provide evidence that the 5′-UTRs of Arabidopsis genes showing a great difference in the nucleotide sequence vary greatly in translational efficiency with more than a 200-fold difference. Of the four types of nucleotides, the A residue was the most favourable nucleotide from positions −1 to −21 of the 5′-UTRs in Arabidopsis genes. In particular, the A residue in the 5′-UTR from positions −1 to −5 was required for a high-level translational efficiency. In contrast, the T residue in the 5′-UTR from positions −1 to −5 was the least favourable nucleotide in translational efficiency. Furthermore, the effect of the sequence context in the −1 to −21 region of the 5′-UTR was conserved in different plant species. Based on these observations, we propose that the sequence context immediately upstream of the AUG initiation codon plays a crucial role in determining the translational efficiency of plant genes.
Affiliation(s)
- Younghyun Kim
- Department of Life Sciences, School of Bioscience and Bioengineering and Division of Integrative Biosciences and Biotechnology, Pohang University of Science and Technology, Pohang 790-784, Korea
11
Churkin A, Barash D. RNA dot plots: an image representation for RNA secondary structure analysis and manipulations. Wiley Interdiscip Rev RNA 2013; 4:205-16. [PMID: 23386427 DOI: 10.1002/wrna.1154]
Abstract
Dot plots were originally introduced in bioinformatics as dot-containing images used to compare biological sequences and identify regions of close similarity between them. In addition to similarity, dot plots were extended to possibly represent interactions between building blocks of biological sequences, where the dots can vary in size or color according to desired features. In this survey, we first review their use in representing an RNA secondary structure, which has mostly been applied for displaying the output secondary structures as a result of running RNA folding prediction algorithms. Such a result may often contain suboptimal solutions in addition to the optimal one, which can be easily incorporated in the dot plot. We then proceed from their passive use of providing RNA secondary structure snapshots to their active use of illustrating RNA secondary structure manipulations in beneficial ways. While comparison between RNA secondary structures can mostly be done efficiently using a string representation, there are notable advantages in using dot plots for analyzing the suboptimal solutions that convey important information about the structure of the RNA molecule. In addition, structure-based alignment of dot plots has been advanced considerably and the filtering of dot plots that considers chemical and enzymatic data from structure determination experiments has been suggested. We discuss these procedures and how they can be enhanced in the future by using an image representation to analyze RNA secondary structures and examine their manipulations.
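As a small illustration of the representation discussed in this survey, the sketch below builds a dot plot as a matrix of base-pair frequencies from a set of optimal and suboptimal structures given in dot-bracket notation. It assumes equal weights for all structures, whereas folding programs typically report pairing probabilities; the function names and the text rendering are hypothetical.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            pairs.add((stack.pop(), j))
    return pairs


def dot_plot(sample):
    """Upper-triangular matrix whose entry [i][j] is the fraction of
    structures in the sample that contain the base pair (i, j)."""
    if not sample:
        raise ValueError("empty sample")
    n = len(sample[0])
    counts = [[0] * n for _ in range(n)]
    for structure in sample:
        if len(structure) != n:
            raise ValueError("all structures must have the same length")
        for i, j in base_pairs(structure):
            counts[i][j] += 1
    return [[c / len(sample) for c in row] for row in counts]


def render(matrix, threshold=0.5):
    """Text rendering: '#' marks pairs at or above the threshold frequency."""
    return ["".join("#" if value >= threshold else "." for value in row)
            for row in matrix]


if __name__ == "__main__":
    sample = ["((..))", "(....)", "......", "((..))"]
    for line in render(dot_plot(sample)):
        print(line)
```

Because every structure contributes to the same matrix, suboptimal solutions show up as additional, fainter dots next to those of the optimal structure, which is the property the survey highlights over a single string representation.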
Affiliation(s)
- Alexander Churkin
- Department of Computer Science, Ben-Gurion University, Beer-Sheva, Israel
12
Chung WJ, Agius P, Westholm JO, Chen M, Okamura K, Robine N, Leslie CS, Lai EC. Computational and experimental identification of mirtrons in Drosophila melanogaster and Caenorhabditis elegans. Genome Res 2010; 21:286-300. [PMID: 21177960 DOI: 10.1101/gr.113050.110]
Abstract
Mirtrons are intronic hairpin substrates of the dicing machinery that generate functional microRNAs. In this study, we describe experimental assays that defined the essential requirements for entry of introns into the mirtron pathway. These data informed a bioinformatic screen that effectively identified functional mirtrons from the Drosophila melanogaster transcriptome. These included 17 known and six confident novel mirtrons among the top 51 candidates, and additional candidates had limited read evidence in available small RNA data. Our computational model also proved effective on Caenorhabditis elegans, for which the identification of 14 cloned mirtrons among the top 22 candidates more than tripled the number of validated mirtrons in this species. A few low-scoring introns generated mirtron-like read patterns from atypical RNA structures, but their paucity suggests that relatively few such loci were not captured by our model. Unexpectedly, we uncovered examples of clustered mirtrons in both fly and worm genomes, including a <8-kb region in C. elegans harboring eight distinct mirtrons. Altogether, we demonstrate that discovery of functional mirtrons, unlike canonical miRNAs, is amenable to computational methods independent of evolutionary constraint.
Affiliation(s)
- Wei-Jen Chung
- Department of Developmental Biology, Sloan-Kettering Institute, 1017 Rockefeller Research Laboratories, New York, New York 10065, USA