Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Xu X, Ji Y, Stormo GD. RNA Sampler: a new sampling based algorithm for common RNA secondary structure prediction and structural alignment. Bioinformatics 2007;23:1883-91. [PMID: 17537756 DOI: 10.1093/bioinformatics/btm272] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.1] [Indexed: 01/11/2023] Open

Cited by Other Article(s)

Tieng FYF, Abdullah-Zawawi MR, Md Shahri NAA, Mohamed-Hussein ZA, Lee LH, Mutalib NSA. A Hitchhiker's guide to RNA-RNA structure and interaction prediction tools. Brief Bioinform 2023;25:bbad421. [PMID: 38040490 PMCID: PMC10753535 DOI: 10.1093/bib/bbad421] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/31/2023] [Revised: 10/16/2023] [Accepted: 10/26/2023] [Indexed: 12/03/2023] Open

El Fatmi A, Bekri MA, Benhlima S. RNAknot: A new algorithm for RNA secondary structure prediction based on genetic algorithm and GRASP method. J Bioinform Comput Biol 2020;17:1950031. [PMID: 31856666 DOI: 10.1142/s0219720019500318] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/13/2022]

Abstract

The prediction of the optimal secondary structure for a given RNA sequence is a challenging computational problem in bioinformatics. The challenge has grown with the discovery of different pseudoknot classes, complex topologies that play diverse roles in biological processes. Many recent methods have been proposed to predict RNA secondary structure with some pseudoknot classes, but only a few of them have reached satisfying results in terms of both complexity and accuracy. Here we present RNAknot, a new method for predicting RNA secondary structures that contain the following components: stems, hairpin loops, multi-branched loops (multi-loops), bulge loops, and internal loops, in addition to two types of pseudoknots, the H-type pseudoknot and the kissing hairpin. RNAknot is based on a genetic algorithm and the Greedy Randomized Adaptive Search Procedure (GRASP), and it uses the free energy as the fitness function to evaluate the obtained structures. To validate the performance of the presented method, 131 tests were performed using two datasets of 26 and 105 RNA sequences, taken from the databases RNAstrand and Pseudobase, respectively. The results are compared with those of several RNA secondary structure prediction programs: Vs_subopt, CyloFold, IPknot, Kinefold, RNAstructure, and Sfold. This comparative study shows that the prediction accuracy of the proposed approach is significantly improved compared to that of the other programs. For the first dataset, RNAknot has the highest average specificity (SP) (71.23%) and sensitivity (SN) (72.15%) among the compared programs. For the second dataset, the RNA secondary structure predictions obtained by RNAknot have the highest averages of SP (85.49%) and F-measure (79.97%) among the compared programs.
The program is available as a jar file at the link: www.bachmek.umi.ac.ma/wp-content/uploads/RNAknot.0.0.2.rar.
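
The SN, SP, and F-measure figures quoted in this abstract are base-pair-level accuracy measures. The abstract does not state the exact definitions used, so the following is only a minimal sketch under the conventions common in the RNA structure prediction literature: SN = TP/(TP+FN), SP taken as the positive predictive value TP/(TP+FP), and F-measure as their harmonic mean. The function name and the toy base pairs are hypothetical.

```python
def pair_accuracy(predicted, reference):
    """Return (SN, SP, F-measure) for two collections of base pairs.

    Each base pair is an (i, j) tuple of sequence positions with i < j.
    SP is computed as positive predictive value (an assumed convention).
    """
    predicted, reference = set(predicted), set(reference)
    tp = len(predicted & reference)  # predicted pairs present in the reference
    fp = len(predicted - reference)  # predicted pairs absent from the reference
    fn = len(reference - predicted)  # reference pairs that were missed
    sn = tp / (tp + fn) if tp + fn else 0.0
    sp = tp / (tp + fp) if tp + fp else 0.0
    f = 2 * sn * sp / (sn + sp) if sn + sp else 0.0
    return sn, sp, f


# Hypothetical toy case: 3 of 4 reference pairs recovered, plus 1 spurious pair.
reference = {(1, 20), (2, 19), (3, 18), (4, 17)}
predicted = {(1, 20), (2, 19), (3, 18), (6, 15)}
sn, sp, f = pair_accuracy(predicted, reference)
```

In this toy case all three values equal 0.75, because the number of missed pairs equals the number of spurious pairs.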

Barquist L, Burge SW, Gardner PP. Studying RNA Homology and Conservation with Infernal: From Single Sequences to RNA Families. Curr Protoc Bioinformatics 2016;54:12.13.1-12.13.25. [PMID: 27322404 PMCID: PMC5010141 DOI: 10.1002/cpbi.4] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 11/09/2022]

Zhang Y, Huang H, Dong X, Fang Y, Wang K, Zhu L, Wang K, Huang T, Yang J. A Dynamic 3D Graphical Representation for RNA Structure Analysis and Its Application in Non-Coding RNA Classification. PLoS One 2016;11:e0152238. [PMID: 27213271 PMCID: PMC4877074 DOI: 10.1371/journal.pone.0152238] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 12/22/2015] [Accepted: 03/10/2016] [Indexed: 12/21/2022] Open

Prediction of Secondary Structures Conserved in Multiple RNA Sequences. Methods Mol Biol 2016;1490:35-50. [PMID: 27665591 DOI: 10.1007/978-1-4939-6433-8_3] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 01/15/2023]

Chatzou M, Magis C, Chang JM, Kemena C, Bussotti G, Erb I, Notredame C. Multiple sequence alignment modeling: methods and applications. Brief Bioinform 2015;17:1009-1023. [PMID: 26615024 DOI: 10.1093/bib/bbv099] [Citation(s) in RCA: 84] [Impact Index Per Article: 9.3] [Received: 09/10/2015] [Revised: 10/16/2015] [Indexed: 12/20/2022] Open

Song Y, Hua L, Shapiro BA, Wang JTL. Effective alignment of RNA pseudoknot structures using partition function posterior log-odds scores. BMC Bioinformatics 2015;16:39. [PMID: 25727492 PMCID: PMC4339682 DOI: 10.1186/s12859-015-0464-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 05/28/2014] [Accepted: 01/13/2015] [Indexed: 11/18/2022] Open

Sloma MF, Mathews DH. Improving RNA secondary structure prediction with structure mapping data. Methods Enzymol 2015;553:91-114. [PMID: 25726462 DOI: 10.1016/bs.mie.2014.10.053] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Indexed: 12/14/2022]

Theil Have C, Zambach S, Christiansen H. Effects of using coding potential, sequence conservation and mRNA structure conservation for predicting pyrrolysine containing genes. BMC Bioinformatics 2013;14:118. [PMID: 23557142 PMCID: PMC3639795 DOI: 10.1186/1471-2105-14-118] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/11/2012] [Accepted: 03/19/2013] [Indexed: 11/10/2022] Open

Puton T, Kozlowski LP, Rother KM, Bujnicki JM. CompaRNA: a server for continuous benchmarking of automated methods for RNA secondary structure prediction. Nucleic Acids Res 2013;41:4307-23. [PMID: 23435231 PMCID: PMC3627593 DOI: 10.1093/nar/gkt101] [Citation(s) in RCA: 81] [Impact Index Per Article: 7.4] [Indexed: 12/26/2022] Open

Shao J, Zhang J, Zhang Z, Jiang H, Lou X, Huang B, Foltz G, Lan Q, Huang Q, Lin B. Alternative polyadenylation in glioblastoma multiforme and changes in predicted RNA binding protein profiles. OMICS 2013;17:136-49. [PMID: 23421905 DOI: 10.1089/omi.2012.0098] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 11/12/2022]

Sato K, Kato Y, Akutsu T, Asai K, Sakakibara Y. DAFS: simultaneous aligning and folding of RNA sequences via dual decomposition. Bioinformatics 2012;28:3218-24. [PMID: 23060618 DOI: 10.1093/bioinformatics/bts612] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Indexed: 12/23/2022]

Seetin MG, Mathews DH. TurboKnot: rapid prediction of conserved RNA secondary structures including pseudoknots. Bioinformatics 2012;28:792-8. [PMID: 22285566 DOI: 10.1093/bioinformatics/bts044] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Indexed: 01/23/2023]

Xu Z, Almudevar A, Mathews DH. Statistical evaluation of improvement in RNA secondary structure prediction. Nucleic Acids Res 2011;40:e26. [PMID: 22139940 PMCID: PMC3287165 DOI: 10.1093/nar/gkr1081] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Indexed: 11/13/2022] Open

Achawanantakun R, Sun Y, Takyar SS. ncRNA consensus secondary structure derivation using grammar strings. J Bioinform Comput Biol 2011;9:317-37. [PMID: 21523935 DOI: 10.1142/s0219720011005501] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 02/02/2011] [Revised: 02/28/2011] [Accepted: 03/01/2011] [Indexed: 11/18/2022]

Wei D, Alpert LV, Lawrence CE. RNAG: a new Gibbs sampler for predicting RNA secondary structure for unaligned sequences. Bioinformatics 2011;27:2486-93. [PMID: 21788211 PMCID: PMC3167047 DOI: 10.1093/bioinformatics/btr421] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 11/14/2022]

Abstract

MOTIVATION

RNA secondary structure plays an important role in the function of many RNAs, and structural features are often key to their interaction with other cellular components. Thus, there has been considerable interest in the prediction of secondary structures for RNA families. In this article, we present a new global structural alignment algorithm, RNAG, to predict consensus secondary structures for unaligned sequences. It uses a blocked Gibbs sampling algorithm, which has a theoretical advantage in convergence time. This algorithm iteratively samples from the conditional probability distributions P(Structure | Alignment) and P(Alignment | Structure). Not surprisingly, there is considerable uncertainty in the high-dimensional space of this difficult problem, which has so far received limited attention in this field. We show how the samples drawn from this algorithm can be used to more fully characterize the posterior space and to assess the uncertainty of predictions.
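
The alternating scheme described above, drawing from P(Structure | Alignment) and then from P(Alignment | Structure), can be illustrated with a toy two-block Gibbs sampler. This is not RNAG's sampler: as a loudly simplified assumption, two binary variables stand in for the alignment and structure blocks, and the joint distribution `JOINT` is made up for the illustration.

```python
import random
from collections import Counter

# Hypothetical joint distribution over (a, s); "a" stands in for the
# alignment block and "s" for the structure block.
JOINT = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}


def sample_conditional(rng, fixed_index, fixed_value):
    """Draw the free variable from its conditional given the fixed one.

    fixed_index 0 means a is fixed (so s is drawn); 1 means s is fixed.
    Weights are unnormalised joint probabilities, which is sufficient.
    """
    weights = []
    for v in (0, 1):
        key = (fixed_value, v) if fixed_index == 0 else (v, fixed_value)
        weights.append(JOINT[key])
    return rng.choices((0, 1), weights=weights)[0]


def gibbs(n_sweeps, seed=0):
    """Alternate the two conditional draws and return empirical frequencies."""
    rng = random.Random(seed)
    a, s = 0, 0
    counts = Counter()
    for _ in range(n_sweeps):
        s = sample_conditional(rng, 0, a)  # s ~ P(s | a)
        a = sample_conditional(rng, 1, s)  # a ~ P(a | s)
        counts[(a, s)] += 1
    return {state: counts[state] / n_sweeps for state in JOINT}


freqs = gibbs(20000)
```

After enough sweeps, the empirical frequencies in `freqs` should approach the entries of `JOINT`. The same collection of samples is what allows the spread of an ensemble, and not just a single best prediction, to be examined.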

RESULTS

Our analysis of three publicly available datasets showed a substantial improvement in RNA structure prediction by RNAG over extant prediction methods. Additionally, our analysis of 17 RNA families showed that the RNAG sampled structures were generally compact around their ensemble centroids, and at least 11 families had at least two well-separated clusters of predicted structures. In general, the distance between a reference structure and our predicted structure was large relative to the variation among structures within an ensemble.

AVAILABILITY

The Perl implementation of the RNAG algorithm and the data necessary to reproduce the results described in Sections 3.1 and 3.2 are available at http://ccmbweb.ccv.brown.edu/rnag.html

CONTACT

charles_lawrence@brown.edu

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Sahraeian SME, Yoon BJ. PicXAA-Web: a web-based platform for non-progressive maximum expected accuracy alignment of multiple biological sequences. Nucleic Acids Res 2011;39:W8-12. [PMID: 21515632 PMCID: PMC3125727 DOI: 10.1093/nar/gkr244] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 01/16/2023] Open

Harmanci AO, Sharma G, Mathews DH. TurboFold: iterative probabilistic estimation of secondary structures for multiple RNA sequences. BMC Bioinformatics 2011;12:108. [PMID: 21507242 PMCID: PMC3120699 DOI: 10.1186/1471-2105-12-108] [Citation(s) in RCA: 69] [Impact Index Per Article: 5.3] [Received: 09/03/2010] [Accepted: 04/20/2011] [Indexed: 01/07/2023] Open

Abstract

Background

The prediction of secondary structure, i.e. the set of canonical base pairs between nucleotides, is a first step in developing an understanding of the function of an RNA sequence. The most accurate computational methods predict conserved structures for a set of homologous RNA sequences. These methods usually suffer from high computational complexity. In this paper, TurboFold, a novel and efficient method for secondary structure prediction for multiple RNA sequences, is presented.

Results

TurboFold takes, as input, a set of homologous RNA sequences and outputs estimates of the base pairing probabilities for each sequence. The base pairing probabilities for a sequence are estimated by combining intrinsic information, derived from the sequence itself via the nearest neighbor thermodynamic model, with extrinsic information, derived from the other sequences in the input set. For a given sequence, the extrinsic information is computed by using pairwise-sequence-alignment-based probabilities for co-incidence with each of the other sequences, along with estimated base pairing probabilities, from the previous iteration, for the other sequences. The extrinsic information is introduced as free energy modifications for base pairing in a partition function computation based on the nearest neighbor thermodynamic model. This process yields updated estimates of base pairing probability. The updated base pairing probabilities in turn are used to recompute extrinsic information, resulting in the overall iterative estimation procedure that defines TurboFold.

TurboFold is benchmarked on a number of ncRNA datasets and compared against alternative secondary structure prediction methods. The iterative procedure in TurboFold is shown to improve estimates of base pairing probability with each iteration, though only small gains are obtained beyond three iterations. Secondary structures composed of base pairs with estimated probabilities higher than a significance threshold are shown to be more accurate for TurboFold than for alternative methods that estimate base pairing probabilities. TurboFold-MEA, which uses base pairing probabilities from TurboFold in a maximum expected accuracy algorithm for secondary structure prediction, has accuracy comparable to the best performing secondary structure prediction methods. The computational and memory requirements for TurboFold are modest and, in terms of sequence length and number of sequences, scale much more favorably than joint alignment and folding algorithms.
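
The maximum expected accuracy step mentioned above can be sketched as a Nussinov-style dynamic program over base pairing probabilities. This is a generic illustration, not the TurboFold-MEA implementation: the scoring (2·gamma·P(i,j) for a pair, the unpaired probability for a single position), the `gamma` and `min_loop` parameters, and the toy probabilities are all assumptions made for the example.

```python
def mea_structure(pair_prob, n, gamma=1.0, min_loop=3):
    """Return the non-crossing base pairs maximising expected accuracy.

    pair_prob maps (i, j), 0-based with i < j, to a pairing probability.
    A pair (i, j) is allowed only if j - i > min_loop.
    """
    def p(i, j):
        return pair_prob.get((i, j), 0.0)

    # q[i] = probability that position i is unpaired.
    q = [1.0] * n
    for (i, j), v in pair_prob.items():
        q[i] -= v
        q[j] -= v

    # best[i][j] = best score for positions i..j inclusive (empty range = 0).
    best = [[0.0] * n for _ in range(n)]
    for i in range(n):
        best[i][i] = q[i]

    def get(i, j):
        return best[i][j] if i <= j else 0.0

    def pair_score(i, j):
        return get(i + 1, j - 1) + 2.0 * gamma * p(i, j)

    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            # Splits cover "i unpaired" (k = i) and "j unpaired" (k = j - 1).
            cands = [get(i, k) + get(k + 1, j) for k in range(i, j)]
            if j - i > min_loop:
                cands.append(pair_score(i, j))
            best[i][j] = max(cands)

    # Traceback: recompute the same expressions to find which case was used.
    pairs, stack = [], [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        if j - i > min_loop and p(i, j) > 0 and best[i][j] == pair_score(i, j):
            pairs.append((i, j))
            stack.append((i + 1, j - 1))
            continue
        for k in range(i, j):
            if best[i][j] == get(i, k) + get(k + 1, j):
                stack.append((i, k))
                stack.append((k + 1, j))
                break
    return sorted(pairs)


# Hypothetical pairing probabilities for a 10-nucleotide sequence.
probs = {(0, 9): 0.9, (1, 8): 0.8, (2, 7): 0.3, (1, 9): 0.05}
structure = mea_structure(probs, 10)
```

With these toy numbers the two high-probability pairs are kept, while the pair (2, 7) is left out because its two positions contribute more to the score when unpaired.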

Conclusions

TurboFold is an iterative probabilistic method for predicting secondary structures for multiple RNA sequences that efficiently and accurately combines the information from the comparative analysis between sequences with the thermodynamic folding model. Unlike most other multi-sequence structure prediction methods, TurboFold does not enforce strict commonality of structures and is therefore useful for predicting structures for homologous sequences that have diverged significantly. TurboFold can be downloaded as part of the RNAstructure package at http://rna.urmc.rochester.edu.

Sahraeian SME, Yoon BJ. PicXAA-R: efficient structural alignment of multiple RNA sequences using a greedy approach. BMC Bioinformatics 2011;12 Suppl 1:S38. [PMID: 21342569 PMCID: PMC3044294 DOI: 10.1186/1471-2105-12-s1-s38] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.7] [Indexed: 01/30/2023] Open

George AD, Tenenbaum SA. Web-based tools for studying RNA structure and function. Methods Mol Biol 2011;703:67-86. [PMID: 21125484 DOI: 10.1007/978-1-59745-248-9_6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 08/30/2023]

Xu Z, Mathews DH. Multilign: an algorithm to predict secondary structures conserved in multiple RNA sequences. Bioinformatics 2010;27:626-32. [PMID: 21193521 DOI: 10.1093/bioinformatics/btq726] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.4] [Indexed: 01/28/2023]

Chen Q, Li G, Phoebe Chen YP. Interval-based distance function for identifying RNA structure candidates. J Theor Biol 2010;269:280-6. [PMID: 21056578 DOI: 10.1016/j.jtbi.2010.11.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 09/15/2010] [Revised: 11/01/2010] [Accepted: 11/01/2010] [Indexed: 10/18/2022]

Taneda A. Multi-objective pairwise RNA sequence alignment. Bioinformatics 2010;26:2383-90. [DOI: 10.1093/bioinformatics/btq439] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 11/12/2022] Open

Iacoangeli A, Bianchi R, Tiedge H. Regulatory RNAs in brain function and disorders. Brain Res 2010;1338:36-47. [PMID: 20307503 PMCID: PMC3524968 DOI: 10.1016/j.brainres.2010.03.042] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 01/11/2010] [Revised: 03/10/2010] [Accepted: 03/15/2010] [Indexed: 11/17/2022]

Bernhart SH, Hofacker IL. From consensus structure prediction to RNA gene finding. Brief Funct Genomic Proteomic 2009;8:461-71. [PMID: 19833701 DOI: 10.1093/bfgp/elp043] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Indexed: 11/14/2022]

Fan D, Bitterman PB, Larsson O. Regulatory element identification in subsets of transcripts: comparison and integration of current computational methods. RNA 2009;15:1469-82. [PMID: 19553345 PMCID: PMC2714745 DOI: 10.1261/rna.1617009] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 02/25/2009] [Accepted: 05/20/2009] [Indexed: 05/20/2023]

Harmanci AO, Sharma G, Mathews DH. Stochastic sampling of the RNA structural alignment space. Nucleic Acids Res 2009;37:4063-75. [PMID: 19429694 PMCID: PMC2709569 DOI: 10.1093/nar/gkp276] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/13/2022] Open

Tabei Y, Asai K. A local multiple alignment method for detection of non-coding RNA sequences. Bioinformatics 2009;25:1498-505. [PMID: 19376823 DOI: 10.1093/bioinformatics/btp261] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/12/2022]

Abstract

MOTIVATION

Non-coding RNAs (ncRNAs) show a unique evolutionary process in which the substitutions of distant bases are correlated in order to conserve the secondary structure of the ncRNA molecule. Therefore, the multiple alignment method for the detection of ncRNAs should take into account both the primary sequence and the secondary structure. Recently, there has been intense focus on multiple alignment investigations for the detection of ncRNAs; however, most of the proposed methods are designed for global multiple alignments. For this reason, these methods are not appropriate to identify locally conserved ncRNAs among genomic sequences. A more efficient local multiple alignment method for the detection of ncRNAs is required.

RESULTS

We propose a new local multiple alignment method for the detection of ncRNAs. This method uses a local multiple alignment construction procedure inspired by ProDA, which is a local multiple aligner program for protein sequences with repeated and shuffled elements. To align sequences based on secondary structure information, we propose a new alignment model which incorporates secondary structure features. We define the conditional probability of an alignment via a conditional random field and use a gamma-centroid estimator to align sequences. The locally aligned subsequences are clustered into blocks of approximately globally alignable subsequences between pairwise alignments. Finally, these blocks are multiply aligned via MXSCARNA. In benchmark experiments, we demonstrate the strong performance of the implemented software, SCARNA_LM, in local multiple alignment for the detection of ncRNAs.
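
The gamma-centroid estimator mentioned above can be illustrated for pairwise alignment. A gamma-centroid estimate maximises the sum of (gamma + 1)·p(i, j) − 1 over the aligned position pairs, where p(i, j) is the posterior probability that positions i and j are aligned. The sketch below assumes those probabilities are already given (in SCARNA_LM they would come from the conditional random field, which is not reproduced here). The function name and the toy probabilities are hypothetical.

```python
def gamma_centroid_alignment(match_prob, n, m, gamma=1.0):
    """Return aligned position pairs maximising the gamma-centroid gain.

    match_prob maps (i, j) to the posterior probability that position i of
    the first sequence (length n) aligns to position j of the second
    (length m). Pairs are 0-based and returned in increasing order.
    """
    def gain(i, j):
        return (gamma + 1.0) * match_prob.get((i, j), 0.0) - 1.0

    # score[i][j] = best total gain using the first i and first j positions.
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j],                           # skip position in x
                score[i][j - 1],                           # skip position in y
                score[i - 1][j - 1] + gain(i - 1, j - 1),  # align the two
            )

    # Traceback, preferring a match only when it has positive gain.
    matches = []
    i, j = n, m
    while i > 0 and j > 0:
        g = gain(i - 1, j - 1)
        if g > 0 and score[i][j] == score[i - 1][j - 1] + g:
            matches.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif score[i][j] == score[i - 1][j]:
            i -= 1
        else:
            j -= 1
    return matches[::-1]


# Hypothetical posterior match probabilities for two length-3 sequences.
probs = {(0, 0): 0.9, (1, 1): 0.6, (1, 2): 0.3, (2, 2): 0.4}
conservative = gamma_centroid_alignment(probs, 3, 3, gamma=1.0)
permissive = gamma_centroid_alignment(probs, 3, 3, gamma=3.0)
```

With gamma = 1 only matches with probability above 0.5 have positive gain, so the pair (2, 2) is dropped. Raising gamma to 3 lowers that threshold to 0.25 and the pair is included, which shows how gamma trades specificity for sensitivity.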

AVAILABILITY

The C++ source code for SCARNA_LM and its experimental datasets are available at http://www.ncrna.org/software/scarna_lm/download.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Xu X, Ji Y, Stormo GD. Discovering cis-regulatory RNAs in Shewanella genomes by Support Vector Machines. PLoS Comput Biol 2009;5:e1000338. [PMID: 19343219 PMCID: PMC2659441 DOI: 10.1371/journal.pcbi.1000338] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Received: 08/18/2008] [Accepted: 02/24/2009] [Indexed: 12/31/2022] Open

Abstract

An increasing number of cis-regulatory RNA elements have been found to regulate gene expression post-transcriptionally in various biological processes in bacterial systems. Effective computational tools for large-scale identification of novel regulatory RNAs are strongly desired to facilitate our exploration of gene regulation mechanisms and regulatory networks. We present a new computational program named RSSVM (RNA Sampler+Support Vector Machine), which employs Support Vector Machines (SVMs) for efficient identification of functional RNA motifs from random RNA secondary structures. RSSVM uses a set of distinctive features to represent the common RNA secondary structure and structural alignment predicted by RNA Sampler, a tool for accurate common RNA secondary structure prediction, and is trained with functional RNAs from a variety of bacterial RNA motif/gene families covering a wide range of sequence identities. When tested on a large number of known and random RNA motifs, RSSVM shows a significantly higher sensitivity than other leading RNA identification programs while maintaining the same false positive rate. RSSVM performs particularly well on sets with low sequence identities. The combination of RNA Sampler and RSSVM provides a new, fast, and efficient pipeline for large-scale discovery of regulatory RNA motifs. We applied RSSVM to multiple Shewanella genomes and identified putative regulatory RNA motifs in the 5′ untranslated regions (UTRs) in S. oneidensis, an important bacterial organism with extraordinary respiratory and metal reducing abilities and great potential for bioremediation and alternative energy generation. From 1002 sets of 5′-UTRs of orthologous operons, we identified 166 putative regulatory RNA motifs, including 17 of the 19 known RNA motifs from Rfam, an additional 21 RNA motifs that are supported by literature evidence, 72 RNA motifs overlapping predicted transcription terminators or attenuators, and other candidate regulatory RNA motifs. 
Our study provides a list of promising novel regulatory RNA motifs potentially involved in post-transcriptional gene regulation. Combined with the previous cis-regulatory DNA motif study in S. oneidensis, this genome-wide discovery of cis-regulatory RNA motifs may offer more comprehensive views of gene regulation at a different level in this organism. The RSSVM software, predictions, and analysis results on Shewanella genomes are available at http://ural.wustl.edu/resources.html#RSSVM.

RNA is remarkably versatile, acting not only as messengers to transfer genetic information from DNA to protein but also as critical structural components and catalytic enzymes in the cell. More intriguingly, RNA elements in messenger RNAs have been widely found in bacteria to control the expression of their downstream genes. The functions of these RNA elements are intrinsically linked to their secondary structures, which are usually conserved across multiple closely related species during evolution and often shared by genes in the same metabolic pathways. We developed a new computational approach to find putative functional RNA elements by looking for conserved RNA secondary structures that are distinguished from random RNA secondary structures in the orthologous RNA sequences from related species. We applied this approach to multiple Shewanella genomes and predicted putative regulatory RNA elements in Shewanella oneidensis, a bacterium that has extraordinary respiratory and metal reducing abilities and great potential for bioremediation and alternative energy generation. Our findings not only recovered many RNA elements that are known or supported by literature evidence but also included exciting novel RNA elements for further exploration.

Singh V, Somvanshi P. Computational modeling analyses of RNA secondary structures and phylogenetic inference of evolutionary conserved 5S rRNA in the prokaryotes. J Mol Graph Model 2009;27:770-6. [PMID: 19217331 DOI: 10.1016/j.jmgm.2008.11.012] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 04/04/2008] [Revised: 11/16/2008] [Accepted: 11/19/2008] [Indexed: 10/21/2022]

Chiaruttini C, Allemand F, Springer M. Structural probing of RNA thermosensors. Methods Mol Biol 2009;540:233-245. [PMID: 19381564 DOI: 10.1007/978-1-59745-558-9_17] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 05/27/2023]

Kavanaugh LA, Ohler U. Predicting Non-coding RNA Transcripts. Bioinformatics 2009. [DOI: 10.1007/978-0-387-92738-1_4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/07/2023] Open

Taneda A. An efficient genetic algorithm for structural RNA pairwise alignment and its application to non-coding RNA discovery in yeast. BMC Bioinformatics 2008;9:521. [PMID: 19061486 PMCID: PMC2630964 DOI: 10.1186/1471-2105-9-521] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 09/02/2008] [Accepted: 12/05/2008] [Indexed: 11/30/2022] Open

Abstract

Background

Aligning RNA sequences with low sequence identity has been a challenging problem, since such a computation essentially requires an algorithm of high complexity to take structural conservation into account. Although many sophisticated algorithms for this purpose have been proposed to date, further improvement in efficiency is necessary to accelerate large-scale applications, including non-coding RNA (ncRNA) discovery.

Results

We developed a new genetic algorithm, Cofolga2, for simultaneously computing pairwise RNA sequence alignment and consensus folding, and benchmarked it using BRAliBase 2.1. The benchmark results showed that our new algorithm is accurate and efficient in both time and memory usage. Then, combining it with the originally trained SVM, we applied the new algorithm to novel ncRNA discovery, comparing the S. cerevisiae genome with six related genomes in a pairwise manner. By focusing our search on relatively short regions (50 bp to 2,000 bp) flanked by conserved sequences, we predicted 714 intergenic and 1,311 sense or antisense ncRNA candidates, which were found in pairwise alignments with stable consensus secondary structure and low sequence identity (≤ 50%). Compared with previous predictions, more than 92% of the candidates are novel. The estimated rate of false positives among the predicted candidates is 51%. Twenty-five percent of the intergenic candidates have support for expression in the cell, i.e. their genomic positions overlap those of experimentally determined transcripts in the literature. Moreover, by manual inspection of the results, we obtained four multiple alignments with low sequence identity that reveal consensus structures shared by three species/sequences.
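
The sequence identity cutoff (≤ 50%) quoted above depends on how identity is computed from a pairwise alignment, and the abstract does not give the definition used. The sketch below assumes one common convention: identical columns divided by columns where neither sequence has a gap. The function name and the example alignment are hypothetical.

```python
def sequence_identity(aln_a, aln_b, gap="-"):
    """Fraction of identical residues among gap-free alignment columns."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    aligned = identical = 0
    for x, y in zip(aln_a, aln_b):
        if x == gap or y == gap:
            continue  # columns containing a gap are not counted
        aligned += 1
        if x == y:
            identical += 1
    return identical / aligned if aligned else 0.0


# Hypothetical aligned pair: 6 gap-free columns, 5 of them identical.
identity = sequence_identity("GCAU-GC", "GGAUAGC")
is_low_identity = identity <= 0.5
```

Other conventions divide by the full alignment length or by the shorter sequence length, so a threshold such as 50% is only comparable across studies when the same denominator is used.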

Conclusion

The present method gives an efficient tool complementary to sequence-alignment-based ncRNA finders.

Comparative analysis of sequences and secondary structures of the rRNA internal transcribed spacer 2 (ITS2) in pollen beetles of the subfamily Meligethinae (Coleoptera, Nitidulidae): potential use of slippage-derived sequences in molecular systematics. Mol Phylogenet Evol 2008;51:215-26. [PMID: 19059352 DOI: 10.1016/j.ympev.2008.11.004] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 05/15/2008] [Revised: 11/05/2008] [Accepted: 11/06/2008] [Indexed: 11/21/2022]

Informatic resources for identifying and annotating structural RNA motifs. Mol Biotechnol 2008;41:180-93. [PMID: 18979204 DOI: 10.1007/s12033-008-9114-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 09/30/2008] [Accepted: 10/01/2008] [Indexed: 10/21/2022]

Bradley RK, Pachter L, Holmes I. Specific alignment of structured RNA: stochastic grammars and sequence annealing. Bioinformatics 2008;24:2677-83. [PMID: 18796475 DOI: 10.1093/bioinformatics/btn495] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Indexed: 01/22/2023]

Do CB, Foo CS, Batzoglou S. A max-margin model for efficient simultaneous alignment and folding of RNA sequences. Bioinformatics 2008;24:i68-76. [PMID: 18586747 PMCID: PMC2718655 DOI: 10.1093/bioinformatics/btn177] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.3] [Indexed: 11/13/2022] Open

Torarinsson E, Lindgreen S. WAR: Webserver for aligning structural RNAs. Nucleic Acids Res 2008;36:W79-84. [PMID: 18492721 PMCID: PMC2447782 DOI: 10.1093/nar/gkn275] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Indexed: 11/14/2022] Open

Moretti S, Wilm A, Higgins DG, Xenarios I, Notredame C. R-Coffee: a web server for accurately aligning noncoding RNA sequences. Nucleic Acids Res 2008;36:W10-3. [PMID: 18483080 PMCID: PMC2447777 DOI: 10.1093/nar/gkn278] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.2] [Indexed: 11/24/2022] Open

Katoh K, Toh H. Improved accuracy of multiple ncRNA alignment by incorporating structural information into a MAFFT-based framework. BMC Bioinformatics 2008;9:212. [PMID: 18439255 PMCID: PMC2387179 DOI: 10.1186/1471-2105-9-212] [Citation(s) in RCA: 444] [Impact Index Per Article: 27.8] [Received: 12/04/2007] [Accepted: 04/25/2008] [Indexed: 11/10/2022] Open

Wilm A, Higgins DG, Notredame C. R-Coffee: a method for multiple alignment of non-coding RNA. Nucleic Acids Res 2008;36:e52. [PMID: 18420654 PMCID: PMC2396437 DOI: 10.1093/nar/gkn174] [Citation(s) in RCA: 91] [Impact Index Per Article: 5.7] [Indexed: 12/11/2022] Open

Tabei Y, Kiryu H, Kin T, Asai K. A fast structural multiple alignment method for long RNA sequences. BMC Bioinformatics 2008;9:33. [PMID: 18215258 PMCID: PMC2375124 DOI: 10.1186/1471-2105-9-33] [Citation(s) in RCA: 132] [Impact Index Per Article: 8.3] [Received: 09/14/2007] [Accepted: 01/23/2008] [Indexed: 11/10/2022] Open

Lindgreen S, Gardner PP, Krogh A. MASTR: multiple alignment and structure prediction of non-coding RNAs using simulated annealing. Bioinformatics 2007;23:3304-11. [PMID: 18006551 DOI: 10.1093/bioinformatics/btm525] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.3] [Indexed: 11/13/2022] Open